# Wednesday, 30 March 2016

The Microsoft Build conference opened this morning. Some of the announcements covered technologies that had been previously announced but are now more generally available (such as HoloLens); some were brand new, such as Bash on Windows.

I was most excited about the Microsoft Bot Framework, which allows you to build intelligent bots that interact with Cortana to act as a personal assistant within your app; and Cognitive Services, which includes many of the APIs formerly offered under Project Oxford.

Below are the raw notes I took during the keynote. Please let me know (politely) if you discover any typos.

Edge: Biometric authentication support for web sites

Fingerprint login (USAA web site)

Ink

Sticky notes: Cortana recognizes words like "tomorrow" as a date in appointments or reminders

Draw lines on a map: Distance automatically calculated.

Add labels to map: Labels stay in place when map is rotated in 3D space

Low latency: Ink from pen

Virtual ruler on screen to draw straight line

Developer

GPU Effects, e.g., Blur background synchronized with touch movement

<InkCanvas> control

<InkToolbar> control: Allow users to mark up your screen

Over 1000 new APIs and features

Released today: VS 2015, Update 2

Announced: Bash shell is coming to Windows (native Linux binaries)

Gaming

UWP: Achieve 4k at 60 frames/second

Xbox games running on desktop

  • Take advantage of Windows features
  • Consistent input experience

Xbox developer mode

  • Run, test, debug game on Xbox
  • Preview available today

Cortana on Xbox One

Direct X 12

HoloLens

First & only untethered holographic computer

HoloLens shipping today

Releasing app and source code

Apps

  • Holographic anatomy: View internal organs & nerves; collaborate with remote partners
  • Destination Mars by NASA: Virtual tour of another planet

Cortana

Works on lockscreen

Add Cortana to your app.

Insights: When Cortana surfaced in app

Users must give Cortana permission to access their data

Windows, Android, web sites, and (soon) iOS

Skype

Interaction with Cortana

Bots: Remembers context from conversation

  • Location of events, friends in locations
  • Bots available today

Skype for HoloLens

Microservices

Microsoft Bot Framework

https://dev.botframework.com/

Build conversational bots for your application

Rule-based natural language processing

UI to teach the bot to understand your jargon/terms/slang

Cognitive Services

22 APIs

https://www.microsoft.com/cognitive-services/

Many of them were formerly part of Project Oxford

Custom Recognition Intelligent Service (CRIS):

  • Understand speech and language
  • Customize speech-to-text

Demo

Smart Phone / Smart glasses app for blind people

Facial recognition and analysis

Read menu, including headings

Wednesday, 30 March 2016 19:07:15 (GMT Daylight Time, UTC+01:00)
# Monday, 28 March 2016
Monday, 28 March 2016 10:27:00 (GMT Daylight Time, UTC+01:00)
# Saturday, 26 March 2016

In the last article, we described 2-way data binding with the ng-model directive. In this article, I will introduce the ng-repeat directive.

Recall that a directive is a well-known attribute defined by AngularJS or your AngularJS application that is applied to an HTML element to provide some functionality.

The ng-repeat directive acts in a similar way to foreach loops or other constructs in a programming language. We point ng-repeat to an array of objects and Angular will loop through this array, repeating the HTML contained in the attributed element once for each element in the array.

In the following example, our controller creates an array of "customers" and adds it to the $scope object, making the customers stateful and available to the HTML. Each customer has an id, firstName, lastName, and salary property.

var app = angular.module("myApp", []);
app.controller("mainController", function($scope) {
    $scope.customers = [{
      id: 1,
      firstName: "Bill",
      lastName: "Gates",
      salary: 200000
    }, {
      id: 2,
      firstName: "Satya",
      lastName: "Nadella",
      salary: 100000
    }, {
      id: 3,
      firstName: "Steve",
      lastName: "Balmer",
      salary: 300000
    }, {
      id: 4,
      firstName: "David",
      lastName: "Giard",
      salary: 1000000
    }];
}); 

In our HTML, we can loop through every customer in the array by applying the ng-repeat directive to a <tr> element, such as:

<tr ng-repeat="customer in customers"> 

Because the array contains 4 customers, the <tr> tag and all markup within this tag (including any bound data) will be repeated 4 times. As a result, the user will see 4 rows.

Here is the markup for the entire table.

<table>
  <thead>
    <tr>
      <th>First</th>
      <th>Last</th>
      <th>Salary</th>
    </tr>
  </thead>
  <tbody>
    <tr ng-repeat="customer in customers">
      <td>{{customer.firstName}}</td>
      <td>{{customer.lastName}}</td>
      <td style="text-align:right">{{customer.salary | currency}}</td>
    </tr>
  </tbody>
</table> 

The beauty of this is that we don't need to know in advance how many customers we have or how many rows we need. Angular figures it out for us.
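
As an illustration (this helper is not part of the listing above; it is a hypothetical addition), we could add a function to the same controller that pushes a new customer onto the array. Because ng-repeat watches the array, the table grows by one row with no DOM code at all:

$scope.addCustomer = function() {
    // Pushing onto the bound array is all that is needed;
    // ng-repeat notices the change and renders a fifth row.
    $scope.customers.push({
        id: 5,
        firstName: "Ada",
        lastName: "Lovelace",
        salary: 150000
    });
};

Wiring this to a button with Angular's ng-click directive - for example, <button ng-click="addCustomer()">Add</button> - updates the table automatically.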

The ng-repeat directive allows us the flexibility of repeating HTML markup for every element in an array.

Saturday, 26 March 2016 23:35:00 (GMT Standard Time, UTC+00:00)
# Friday, 25 March 2016

In the last article, we discussed Angular controllers. In this article, we will add code to a controller to do 2-way data binding.

The $scope object exists to hold stateful data. To use it, we add this object to the arguments of our controller, as shown in Listing 1.

app.controller("mainController", function($scope) {
...
}); 

Because JavaScript is a dynamic language, we can create new properties on an object simply by assigning values to those properties. We can maintain state by adding properties to the $scope object. For example, in Listing 2, we add the firstName and lastName properties and assign the values "David" and "Giard", respectively.

app.controller("mainController", function($scope) {
  $scope.firstName = "David";
  $scope.lastName = "Giard"; 
});

Now that we have these values assigned, we can bind HTML elements to these properties in our web page, using the ng-model attribute, as shown in Listing 3.

First: <input type="text" ng-model="firstName" />
<br /> 
 
Last: <input type="text" ng-model="lastName" /> 

We don't need to add the "$scope." prefix because it is implied. In this example, we bind these properties to 2 text boxes and the browser will display the property values. But unlike the {{}} data binding syntax, this binding is 2-way. In other words, changing the values in the textboxes will also change the values of the properties themselves. We can demonstrate this by adding a <div> element to the page to output the current value of these properties, as shown in Listing 4.

<div>Hello, {{firstName}} {{lastName}}!</div> 

When the user modifies the text in the 2 textboxes, the text within the div immediately changes because both are bound to the same properties.
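
Putting the pieces together, a minimal page might look like the following sketch (the module and controller names match the ones used earlier in this series; this is illustrative, not a listing from the original project):

<!DOCTYPE html>
<html ng-app="myApp">
<head>
  <script src="https://code.angularjs.org/1.5.0-rc.0/angular.js"></script>
  <script>
    var app = angular.module("myApp", []);
    app.controller("mainController", function($scope) {
      $scope.firstName = "David";
      $scope.lastName = "Giard";
    });
  </script>
</head>
<body ng-controller="mainController">
  First: <input type="text" ng-model="firstName" /><br />
  Last: <input type="text" ng-model="lastName" /><br />
  <!-- Updates as the user types, because it is bound to the same $scope properties -->
  <div>Hello, {{firstName}} {{lastName}}!</div>
</body>
</html>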

We can do the same with objects and their properties, as in the following example:

JavaScript:

$scope.customer = {
  firstName: "David",
  lastName: "Giard"
}; 

HTML:

First: <input type="text" ng-model="customer.firstName" />
<br /> 

Last: <input type="text" ng-model="customer.lastName" /> 

By adding and manipulating properties of the $scope object and using the ng-model directive, we can implement 2-way data binding in our Angular applications.

Friday, 25 March 2016 10:26:00 (GMT Standard Time, UTC+00:00)
# Thursday, 24 March 2016

In the last article, I showed you how to set up your page to use Angular and do simple, one-way data binding.

In this article, I will describe modules and controllers.

A module is a container for the parts of an Angular application, including its controllers. This helps provide a separation of concerns, which helps us organize large, complex applications.

We create a controller in JavaScript with the following syntax (where app is an Angular module, created as shown later in this article):

app.controller("mainController", function() {
  ...
});

In the example above, we named the controller "mainController", but we can give it any name we want. The function argument contains the controller's code and we can pass into this function any Angular objects that we want the function to use.

One common parameter to pass to a controller function is $scope. $scope is a built-in object designed to hold stateful data. We can attach properties to $scope and they will be available in both the view (the web page) and the controller.

The ng-controller directive is an attribute that identifies which controller is available to a page or a section of a page. We can add this attribute to any element, but the controller is only available to that element and the elements contained inside it. If we add it to the body tag, it is available to anything within the body, as in the following example:

<body ng-controller="MyController"> 

which points to the MyController controller and makes it available to everything in the page body.

Once I do this, I can write JavaScript in this controller to add and update properties of the $scope object and those properties become available to the affected part of my page, as in the following example:

JavaScript:

var app = angular.module("myApp", []); 
 
app.controller("mainController", function($scope) {
  $scope.message = "Hello";
  $scope.customer = {
    firstName: "David",
    lastName: "Giard"
  }; 
});

HTML:

<body ng-controller="mainController">
  <h3>
    {{message}}, {{customer.firstName}} {{customer.lastName}}!
  </h3>
</body> 

In the above example, we use the one-way data binding to display the properties of $scope set within the controller. The output of this is:

Hello, David Giard!

This is a simple example, but you can do whatever you want in a controller, and anything created or manipulated in that function is available to your web page.

In this article, we introduced Angular controllers and showed how to use them in an Angular application.

Thursday, 24 March 2016 16:51:50 (GMT Standard Time, UTC+00:00)
# Wednesday, 23 March 2016
AngularJS is a popular framework that takes care of many common tasks, such as data binding, routing, and making Ajax calls, allowing developers to focus on the unique aspects of their application. Angular makes it much easier to maintain a large, complex single page application.

Angular is an open source project that you can use for free and contribute to (if you are skilled and ambitious).

As of this writing, AngularJS is on version 1.x. The Angular team is still actively working on the 1.x version, but they have already begun work on AngularJS 2.x, which is a complete rewrite of the framework. AngularJS 2 is currently in beta and follows some different paradigms than AngularJS 1. This series will initially focus on AngularJS 1, which has been out of beta for many months. In the future, after AngularJS 2 is out of beta, I hope to write more about that version.

To get started with Angular 1, you need to add a reference to the Angular libraries.

You can either download the Angular files from http://angularjs.org or you can point your browser directly to the files on the Angular site. In either case, you need to add a reference to the Angular library, as in the examples shown below.

<script src="https://code.angularjs.org/1.5.0-rc.0/angular.js"></script> 

or

<script src="angular.js"></script> 

Angular uses directives to declaratively add functionality to a web page. A directive is an attribute defined by or within Angular that you add to the elements within your HTML page.

Each Angular page requires the ng-app directive, as in the examples below.

<html ng-app>

or

<html ng-app="MyApp"> 

The second example specifies a module named "MyApp". A module is a container for parts of the application, including its controllers - JavaScript functions that contain code available to the page. We'll talk more about modules and controllers in a later article.

You can add this attribute to any element on your page, but Angular will only work for elements contained within the attributed element, so it generally makes sense to apply it near the top of the DOM (e.g., at the HTML or BODY tag). If I add ng-app to the HTML element, I will have Angular available throughout my page; however, if I add ng-app to a DIV element, Angular is only available to that DIV and to any elements contained within that DIV. Only one ng-app attribute is allowed per page and Angular will use the first one it finds, ignoring all others.
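
For example, in the following sketch (illustrative, not from the original post), Angular evaluates expressions only inside the DIV that carries ng-app; the markup outside that DIV is left untouched:

<body>
  <p>Outside the Angular application: {{1 + 1}} is rendered literally.</p>
  <div ng-app>
    <!-- Angular is active only within this DIV -->
    <p>Inside the Angular application: {{1 + 1}} is rendered as 2.</p>
  </div>
</body>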

Once you have the SCRIPT reference and the ng-app attribute, you can begin using Angular. A simple use of Angular is one-way data binding. There are several ways to perform data binding in Angular. The simplest is with the {{}} markup. In an AngularJS application, anything within these double curly braces is interpreted as data binding. So, something like

The time is {{datetime.now()}} 

will output the current date and time (assuming a datetime object with a now() method is available on the current scope). Below are a few other examples.

<h3>{{"Hello" + " world"}}</h3>
<div>{{5+2+3}}</div> 

which will output the following:

Hello world
10

If your controller contains variables, you can use those as well, such as:

<div>{{x}} + {{y}} = {{x + y}}!</div>
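
For the line above to display values, x and y would need to be defined on the controller's $scope (controllers and $scope are covered elsewhere in this series). A minimal sketch - the values here are made up purely for illustration - might look like this:

app.controller("mainController", function($scope) {
  // x and y become available to any {{ }} expression within this controller's scope
  $scope.x = 3;
  $scope.y = 4;
});

With those values, the DIV above would render as "3 + 4 = 7!".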

Although working with AngularJS can be complex, it only takes a small amount of code to get started.

You can see a live example of the concepts above at this link.

In this article, we introduced the Angular JavaScript framework; then, showed how to add it to your web application and perform simple, one-way data binding.

Wednesday, 23 March 2016 11:54:00 (GMT Standard Time, UTC+00:00)
# Tuesday, 22 March 2016

User expectations for web applications have increased dramatically over the past few years. Users now expect applications to respond quickly to their interactions and to render appropriately on different-sized devices. In addition, users have pushed back against browser plug-ins, such as Flash and Silverlight.

Developers can meet these expectations by writing an application that performs much of its activity on the client, rather than on the server. The default browser client languages are HTML, JavaScript, and CSS. But these are relatively small languages and they were not originally developed with the idea of building large, complex applications.

Enter: Frameworks. A framework is a combination of pre-built components and utilities that sit on top of HTML, JavaScript, and CSS to manage some of the complexity of large applications.

Some frameworks are very specific, such as jQuery, which eases the process of selecting and acting on the DOM elements of a browser, and MustacheJS, which provides automatic data binding. Others are very general frameworks, such as Knockout, Ember, Angular, and React, that provide complex functionality for most aspects of your application and allow you to build custom modules of your own.

Of course, the frameworks themselves add overhead - both in terms of learning time for the developer and download time for the user.  For very simple pages, this overhead might not be worthwhile; but for even moderately complex applications, a framework can manage said complexity, making your code easier to maintain, deploy, debug, and test; and freeing you up to focus less on the application plumbing and more on the code that is unique to your application.

Choosing a framework can be overwhelming. You can find a list of hundreds of JavaScript frameworks and Plug-Ins at http://www.javascripting.com/. Some factors to consider when choosing a framework are:

Does it meet the needs of my application?

Do you need a do-everything framework or just data binding? Is the user interface the most important thing, or is synchronizing with back-end data more important? Each framework has its strengths. Determine what you need; then find the framework that suits you.

How difficult is it to learn?

Look for a framework with good documentation and tutorials. Often, ease of learning is a function of your current knowledge. If you are already familiar with the Model-View-Controller pattern, it may make sense to use a framework that implements this pattern.

How popular is it?

This may strike you as a frivolous criterion, but a popular framework will have more people blogging about it; more people answering forum questions; and bugs will get found and fixed more quickly.

Will it be popular next year?

Future popularity is difficult to predict, but it may be more important than current popularity. You are likely to keep this framework for a long time - possibly the life of your application - and you want your technologies to remain relevant and supported.

Whichever framework you choose, you will learn it best by diving in and beginning your project.

Tuesday, 22 March 2016 11:18:00 (GMT Standard Time, UTC+00:00)
# Monday, 21 March 2016
Monday, 21 March 2016 14:36:08 (GMT Standard Time, UTC+00:00)
# Saturday, 19 March 2016

Project Oxford offers a set of APIs to analyze the content of images. One of these APIs is a REST web service that can determine the words and punctuation contained in a picture. This is accomplished by a simple REST web service call.

To begin, you must register with Project Oxford at http://www.projectoxford.ai.

Then, get the key at https://www.projectoxford.ai/Subscription

Thu04-ShowKey
Figure 1: Subscription key

To call the API, we send a POST request to https://api.projectoxford.ai/vision/v1/ocr

If you like, you may add the optional querystring parameters language and detectionOrientation to the URL; these tell the service which language to expect and whether to check if the text is tilted. If you omit these parameters, Oxford will make an effort to determine the values on its own. As you might guess, it is faster if you provide this information to Oxford.

In the header of the request, you must provide your key as in the following example:

Ocp-Apim-Subscription-Key:15e24a988a179f13a25aac4713aec800

Optionally, you can provide the content-type of the data you are sending. To send a URL, use

Content-Type: application/json

To send an image stream, you can set the Content-Type to application/octet-stream or multipart/form-data.

In the body of the POST request, you can send JSON that includes the URL of the image location. Here is an example:

{ "Url": "http://media.tumblr.com/tumblr_lrbhs0RY2o1qaaiuh.png"}

This web service returns a JSON object containing an array of regions, each of which represents a block of text found in the image. Within each region is an array of lines, and within each line is an array of words.

Region, line, and word objects contain a boundingBox object with coordinates of where to find the corresponding object within the image. Each word object contains the actual text detected, including any punctuation.
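
For reference, the returned JSON has roughly the following shape (an abbreviated, illustrative sketch - the values are made up and some properties are omitted):

{
  "language": "en",
  "regions": [
    {
      "boundingBox": "38,42,466,128",
      "lines": [
        {
          "boundingBox": "38,42,466,53",
          "words": [
            { "boundingBox": "38,42,149,53", "text": "Hello," },
            { "boundingBox": "202,42,302,53", "text": "world!" }
          ]
        }
      ]
    }
  ]
}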

The beauty of a REST web service is that you can call it from any language or platform that supports HTTP requests (which is pretty much all of them).

The following example uses JavaScript and jQuery to call this API. It assumes that you have a DIV tag on the page with id="OutputDiv" and that you have a reference to jQuery before this code.

var myKey="<replace_with_your_subscription_key>";
var url="http://media.tumblr.com/tumblr_lrbhs0RY2o1qaaiuh.png";
$.ajax({
    type: "POST",
    url: "https://api.projectoxford.ai/vision/v1/ocr?language=en",
    headers: { "Ocp-Apim-Subscription-Key":myKey },
    contentType: "application/json",
    data: '{ "Url": "' + url + '" }'
}).done(function (data) {
    var outputDiv = $("#OutputDiv");
    outputDiv.text(""); 
 
    var linesOfText = data.regions[0].lines;
    // Loop through each line of text and create a DIV tag 
    // containing each word, separated by a space
    // Append this newly-created DIV to OutputDiv
    for (var i = 0; i < linesOfText.length; i++) {
        var output = "";
        var thisLine = linesOfText[i];
        var words = thisLine.words;
        for (var j = 0; j < words.length; j++) {
            var thisWord = words[j];
            output += thisWord.text;
            output += " ";
        }
        var newDiv = "<div>" + output + "</div>";
        outputDiv.append(newDiv);
    }
}).fail(function (err) {
    $("#OutputDiv").text("ERROR!" + err.responseText);
}); 

The call to the web service is done with the following lines

$.ajax({
    type: "POST",
    url: "https://api.projectoxford.ai/vision/v1/ocr?language=en",
    headers: { "Ocp-Apim-Subscription-Key":myKey },
    contentType: "application/json",
    data: '{ "Url": "' + url + '" }' 

which sends a POST request and passes the URL as part of a JSON object in the request Body.

This request is asynchronous, so the "done" function is called when it returns successfully.

}).done(function (data) {

The function tied to the "done" event parses through the returned JSON and displays it on the screen.

If an error occurs, we output a simple error message to the user in the "fail" function.

}).fail(function (err) {
    $("#OutputDiv").text("ERROR!" + err.responseText);
}); 

Most of the code above is just formatting the output, so the REST call itself is quite simple. Project Oxford makes this type of analysis much easier for developers, regardless of their platform.

You can find this code at my GitHub repository.

In this article, you learned about the Project Oxford OCR API and how to call it from a JavaScript application.

Saturday, 19 March 2016 17:22:45 (GMT Standard Time, UTC+00:00)
# Thursday, 17 March 2016

Speech recognition is a problem on which computer scientists have been working for years. Project Oxford applies the science of Machine Learning to this problem in order to recognize words spoken and determine their probable meaning based on context.

Project Oxford exposes a REST web service so that you can add speech recognition to your application.

Before you can use the Speech API, you must register at Project Oxford and retrieve the Speech API key.

SpeechKey
Figure 1: Speech API Key

The easiest way to use this API in a .NET application is to use the SpeechRecognition library. A NuGet package makes it easy to add this library to your application. In Visual Studio 2015, create a new WPF application (File | New | Project | Windows | WPF Application). Then, right-click the project in the Solution Explorer and select Manage NuGet Packages. Search for and add the "Microsoft.ProjectOxford.SpeechRecognition" package. Select the "x64" or "x86" version that corresponds with your version of Windows.

NuGet
Figure 2: NuGet dialog

Now, you can start using the library to call the Speech API.

Add the following using statement to the top of a class file:

using Microsoft.ProjectOxford.SpeechRecognition; 

Within the class, declare a private instance of the MicrophoneRecognitionClient class

MicrophoneRecognitionClient _microphoneRecognitionClient; 

To begin listening to speech, instantiate the MicrophoneRecognitionClient object by using the SpeechRecognitionServiceFactory.CreateMicrophoneClient method, passing in the Speech Recognition Mode, the language to listen for, and your Speech subscription key.

The Speech Recognition Mode is an enum that can be either ShortPhrase or LongDictation. These are optimized for shorter or longer voice messages, respectively. Below is an example of creating a new MicrophoneRecognitionClient instance:

var speechRecognitionMode = SpeechRecognitionMode.ShortPhrase;
string language = "en-us";
string subscriptionKey = ConfigurationManager.AppSettings["SpeechKey"].ToString(); 
 
_microphoneRecognitionClient
        = SpeechRecognitionServiceFactory.CreateMicrophoneClient
                        (
                        speechRecognitionMode,
                        language,
                        subscriptionKey
                        ); 

Now that you have a MicrophoneRecognitionClient object, wire up the OnPartialResponseReceived and the OnResponseReceived events to listen for speech and call the API to turn that speech into text.

_microphoneRecognitionClient.OnPartialResponseReceived += OnPartialResponseReceivedHandler;
_microphoneRecognitionClient.OnResponseReceived += OnMicShortPhraseResponseReceivedHandler;

The MicrophoneRecognitionClient object calls the web service frequently - often after every word - to interpret the words it has heard so far. When it makes this call, its OnPartialResponseReceived event fires.

The signature of OnPartialResponseReceivedHandler is:

void OnPartialResponseReceivedHandler(object sender, PartialSpeechResponseEventArgs e)

and you can retrieve Oxford's text interpretation of the spoken words from e.PartialResult. Oxford may revise its interpretation of words spoken at the beginning of a sentence when it receives more of the sentence to provide some context.

After a significant pause, the MicrophoneRecognitionClient object will decide that the user has finished speaking. At this point, it fires the OnResponseReceived event, giving you a chance to clean up. The EndMicAndRecognition method of the MicrophoneRecognitionClient stops listening and severs the connection to the web service.

Here is some code that may be appropriate in the OnResponseReceived event handler:

_microphoneRecognitionClient.EndMicAndRecognition();
_microphoneRecognitionClient.Dispose();
_microphoneRecognitionClient = null; 

I have created a sample WPF app with a single window containing the following XAML:

<StackPanel Name="MainStackPanel" Orientation="Vertical" VerticalAlignment="Top">
    <Button Name="RecordButton" Width="250" Height="100" 
            FontSize="32" VerticalAlignment="Top" 
            Click="RecordButton_Click">
        Start!
    </Button>
    <TextBox Name="OutputTextbox" VerticalAlignment="Top" Width="600" 
        TextWrapping="Wrap" FontSize="18"></TextBox>
</StackPanel> 

The code-behind for this window is listed below. It includes some visual cues that the app is listening and displays the latest text returned from the Speech API.

using System;
using System.Configuration;
using System.Threading;
using System.Windows;
using System.Windows.Media;
using Microsoft.ProjectOxford.SpeechRecognition; 
 
namespace SpeechToTextDemo
{
    /// <summary>
    /// Interaction logic for MainWindow.xaml
    /// </summary>
    public partial class MainWindow : Window
    {
        AutoResetEvent _FinalResponseEvent;
        MicrophoneRecognitionClient _microphoneRecognitionClient; 
 
        public MainWindow()
        {
            InitializeComponent();
            RecordButton.Content = "Start\nRecording";
            _FinalResponseEvent = new AutoResetEvent(false);
            OutputTextbox.Background = Brushes.White;
            OutputTextbox.Foreground = Brushes.Black;
        } 
 
        private void RecordButton_Click(object sender, RoutedEventArgs e)
        {
            RecordButton.Content = "Listening...";
            RecordButton.IsEnabled = false;
            OutputTextbox.Background = Brushes.Green;
            OutputTextbox.Foreground = Brushes.White;
            ConvertTextToSpeech();
        } 
 
        /// <summary>
        /// Start listening. 
        /// </summary>
        private void ConvertTextToSpeech()
        {
            var speechRecognitionMode = SpeechRecognitionMode.ShortPhrase;
            string language = "en-us";
            string subscriptionKey = ConfigurationManager.AppSettings["SpeechKey"].ToString(); 
 
            _microphoneRecognitionClient
                    = SpeechRecognitionServiceFactory.CreateMicrophoneClient
                                    (
                                    speechRecognitionMode,
                                    language,
                                    subscriptionKey
                                    ); 
 
            _microphoneRecognitionClient.OnPartialResponseReceived += OnPartialResponseReceivedHandler;
            _microphoneRecognitionClient.OnResponseReceived += OnMicShortPhraseResponseReceivedHandler;
            _microphoneRecognitionClient.StartMicAndRecognition(); 
 
        } 
 
        void OnPartialResponseReceivedHandler(object sender, PartialSpeechResponseEventArgs e)
        {
            string result = e.PartialResult;
            Dispatcher.Invoke(() =>
            {
                OutputTextbox.Text = (e.PartialResult);
                OutputTextbox.Text += ("\n"); 
 
            });
        } 
 
        /// <summary>
        /// Speaker has finished speaking. Sever connection to server, stop listening, and clean up
        /// </summary>
        /// <param name="sender"></param>
        /// <param name="e"></param>
        void OnMicShortPhraseResponseReceivedHandler(object sender, SpeechResponseEventArgs e)
        {
            Dispatcher.Invoke((Action)(() =>
            {
                _FinalResponseEvent.Set();
                _microphoneRecognitionClient.EndMicAndRecognition();
                _microphoneRecognitionClient.Dispose();
                _microphoneRecognitionClient = null;
                RecordButton.Content = "Start\nRecording";
                RecordButton.IsEnabled = true;
                OutputTextbox.Background = Brushes.White;
                OutputTextbox.Foreground = Brushes.Black; 
 
            }));
        }
    }
}

You can download this project from my GitHub repository.

In this article, you learned how to use the Project Oxford Speech Recognition .NET library to take advantage of the Oxford Speech API and add speech-to-text capabilities to your application.

Thursday, 17 March 2016 12:26:00 (GMT Standard Time, UTC+00:00)
# Wednesday, 16 March 2016

In the last article, we showed how to call the Project Oxford Emotions API via REST in order to determine the emotions of every person in a picture.

In this article, I will show you how to use a .NET library to call this API. A .NET library simplifies the process by abstracting away HTTP calls and providing strongly-typed objects with which to work in your .NET code.

As with the REST call, we begin by signing up for Project Oxford and getting the key for this API, which you can do at https://www.projectoxford.ai/Subscription?popup=True.

Em01-GetKey
Figure 1: Key

To use the .NET library, launch Visual Studio and create a new Universal Windows App (File | New | Project | Windows | Blank (Universal Windows))

Add the Emotions NuGet Package to your project (Right-click project | Manage NuGet Packages); then search for and install Microsoft.ProjectOxford.Emotion. This will add the appropriate references to your project.

In your code, add the following statements to the top of your class file.

using Microsoft.ProjectOxford.Emotion;
using Microsoft.ProjectOxford.Emotion.Contract; 

To use this library, we create an instance of the EmotionServiceClient class, passing in our key to the constructor.

var emotionServiceClient = new EmotionServiceClient(emotionApiKey);

The RecognizeAsync method of this class accepts the URL of an image and returns an array of Emotion objects.

Emotion[] emotionResult = await emotionServiceClient.RecognizeAsync(imageUrl); 

Each emotion object represents a single face detected in the picture and contains the following properties:

FaceRectangle: This indicates the location of the face

Scores: A set of values corresponding to each emotion (anger, contempt, disgust, fear, happiness, neutral, sadness, and surprise) with a value indicating the confidence with which Oxford thinks the face matches this emotion. Confidence values are between 0 and 1, and higher values indicate a higher confidence that this is the correct emotion.

The code below returns a string indicating the most likely emotion for every face in an image.

var sb = new StringBuilder();
var faceNumber = 0;
foreach (Emotion em in emotionResult)
{
    faceNumber++;
    var scores = em.Scores;
    var anger = scores.Anger;
    var contempt = scores.Contempt;
    var disgust = scores.Disgust;
    var fear = scores.Fear;
    var happiness = scores.Happiness;
    var neutral = scores.Neutral;
    var surprise = scores.Surprise;
    var sadness = scores.Sadness; 
 
    var emotionScoresList = new List<EmotionScore>(); // EmotionScore: assumed to be a simple name/value helper class defined elsewhere in the sample project
    emotionScoresList.Add(new EmotionScore("anger", anger));
    emotionScoresList.Add(new EmotionScore("contempt", contempt));
    emotionScoresList.Add(new EmotionScore("disgust", disgust));
    emotionScoresList.Add(new EmotionScore("fear", fear));
    emotionScoresList.Add(new EmotionScore("happiness", happiness));
    emotionScoresList.Add(new EmotionScore("neutral", neutral));
    emotionScoresList.Add(new EmotionScore("surprise", surprise));
    emotionScoresList.Add(new EmotionScore("sadness", sadness)); 
 
    var maxEmotionScore = emotionScoresList.Max(e => e.EmotionValue);
    var likelyEmotion = emotionScoresList.First(e => e.EmotionValue == maxEmotionScore); 
 
    string likelyEmotionText = string.Format("Face {0} is {1:N2}% likely to be experiencing: {2}\n\n", 
        faceNumber, likelyEmotion.EmotionValue * 100, likelyEmotion.EmotionName.ToUpper());
    sb.Append(likelyEmotionText); 
 
}
var resultsText = sb.ToString(); 
 

This will return a string similar to the following:

Face 1 is 99.36% likely to be experiencing: NEUTRAL

Face 2 is 100.00% likely to be experiencing: HAPPINESS

Face 3 is 95.02% likely to be experiencing: SADNESS

You can download this Visual Studio 2015 Universal Windows App project from here.

Full documentation on the Emotion library is available here. You can find a more complete (although more complicated) demo of this library here.

In this article, you learned how to use the .NET libraries to call the Project Oxford Emotion API and detect emotion in the faces of an image.

Wednesday, 16 March 2016 13:11:00 (GMT Standard Time, UTC+00:00)
# Tuesday, 15 March 2016

It's difficult enough for humans to recognize emotions in the faces of other humans. Can a computer accomplish this task? It can if we train it to and if we give it enough examples of different faces with different emotions.

When we supply data to a computer with the objective of training that computer to recognize patterns and predict new data, we call that Machine Learning. And Microsoft has done a lot of Machine Learning with a lot of faces and a lot of data and they are exposing the results for you to use.

The Emotions API in Project Oxford looks at pictures of people and determines their emotions. Possible emotions returned are anger, contempt, disgust, fear, happiness, neutral, sadness, and surprise. Each emotion is assigned a confidence level between 0 and 1 - higher numbers indicate a higher confidence that this is the emotion expressed in the face. If a picture contains multiple faces, the emotion of each face is returned.

The API is a simple REST web service located at https://api.projectoxford.ai/emotion/v1.0/recognize. POST to this service with a header that includes:
Ocp-Apim-Subscription-Key:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

where xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx is your key. You can find your key at https://www.projectoxford.ai/Subscription?popup=True

and a body that includes the following data:

{ "url": "http://xxxx.com/xxxx.jpg" }

where http://xxxx.com/xxxx.jpg is the URL of an image.
The full request looks something like:
POST https://api.projectoxford.ai/emotion/v1.0/recognize HTTP/1.1
Content-Type: application/json
Host: api.projectoxford.ai
Content-Length: 62
Ocp-Apim-Subscription-Key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

{ "url": "http://xxxx.com/xxxx.jpg" }

This will return JSON data identifying each face in the image and a score indicating how confident this API is that the face is expressing each of 8 possible emotions. For example, passing the URL of the picture below of 3 attractive, smiling people

SpartaHack-068-X2[1] 
(found online at https://giard.smugmug.com/Tech-Community/SpartaHack-2016/i-4FPV9bf/0/X2/SpartaHack-068-X2.jpg)

returned the following data:

[
  {
    "faceRectangle": {
      "height": 113,
      "left": 285,
      "top": 156,
      "width": 113
    },
    "scores": {
      "anger": 1.97831262E-09,
      "contempt": 9.096525E-05,
      "disgust": 3.86221245E-07,
      "fear": 4.26409547E-10,
      "happiness": 0.998336,
      "neutral": 0.00156954059,
      "sadness": 8.370223E-09,
      "surprise": 3.06117772E-06
    }
  },
  {
    "faceRectangle": {
      "height": 108,
      "left": 831,
      "top": 169,
      "width": 108
    },
    "scores": {
      "anger": 2.63808062E-07,
      "contempt": 5.387114E-08,
      "disgust": 1.3360991E-06,
      "fear": 1.407629E-10,
      "happiness": 0.9999967,
      "neutral": 1.63170478E-06,
      "sadness": 2.52861843E-09,
      "surprise": 1.91028926E-09
    }
  },
  {
    "faceRectangle": {
      "height": 100,
      "left": 591,
      "top": 168,
      "width": 100
    },
    "scores": {
      "anger": 3.24157673E-10,
      "contempt": 4.90155344E-06,
      "disgust": 6.54665473E-06,
      "fear": 1.73284559E-06,
      "happiness": 0.9999156,
      "neutral": 6.42121E-05,
      "sadness": 7.02297257E-06,
      "surprise": 5.53670576E-09
    }
  }
]

The high values for the 3 happiness scores and the very low values for all the other scores suggest a very high degree of confidence that the people in this photo are happy.
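
As a quick illustration of consuming this response in JavaScript, the following sketch (assuming the JSON above has already been parsed into a variable named results) finds the most likely emotion for each face:

// results is the parsed array returned by the Emotion API (one element per face)
results.forEach(function (face, index) {
    var scores = face.scores;
    // Pick the emotion name with the highest confidence score
    var topEmotion = Object.keys(scores).reduce(function (best, emotion) {
        return scores[emotion] > scores[best] ? emotion : best;
    });
    console.log("Face " + (index + 1) + " is most likely " + topEmotion +
        " (" + (scores[topEmotion] * 100).toFixed(2) + "%)");
});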

Here is the request in the popular HTTP analysis tool Fiddler [http://www.telerik.com/fiddler]:

Request

Em01-Fiddler-Request

Response:

Em02-Fiddler-Response

Sending requests to the Project Oxford REST API makes it simple to analyze the emotions of people in a photograph.

Tuesday, 15 March 2016 09:57:07 (GMT Standard Time, UTC+00:00)
# Monday, 14 March 2016
Monday, 14 March 2016 16:05:00 (GMT Standard Time, UTC+00:00)

Generating a thumbnail image from a larger image sounds easy – just shrink the dimensions of the original, right? But it becomes more complicated if the thumbnail image is a different shape than the original. In this case, we will need to crop or distort the original image. Distorting the image tends to look very bad; and when we crop an image, we will need to ensure that the primary subject of the image remains in the generated thumbnail. To do this, we need to identify the primary subject of the image. That's easy enough for a human observer to do, but a difficult thing for a computer to do, which is necessary if we want to automate this process.

This is where machine learning can help. By analyzing many images, Machine Learning can figure out what parts of a picture are likely to be the main subject. Once this is known, it becomes a simpler matter to crop the picture in such a way that the main subject remains.

Project Oxford uses Machine Learning so that you don't have to. It exposes an API to create an intelligent thumbnail image from any picture.

You can see this in action at www.projectoxford.ai/demo/vision#Thumbnail.


Thu01-LiveDemo
Figure 1

With this live, in-browser demo, you can either select an image from the gallery and view the generated thumbnails; or provide your own image - either from your local computer or from a public URL. The page uses the Thumbnail API to create thumbnails of 6 different dimensions.

Thu02-LiveDemo-2
Figure 2

For your own application, you can either call the REST Web Service directly or (for a .NET application) use a custom library. The library simplifies development by abstracting away HTTP calls via strongly-typed objects.

To get started, you will need a free Project Oxford account and you will need to sign into projectoxford.ai with a Microsoft account.

For this API, you need a key. From the Computer Vision API page (Figure 3), click the [Try for free >] button; then click the "Show" link under the Primary key of the "Computer Vision" section (Figure 4).

Thu03-ComputerVisionApiPage
Figure 3 

Thu04-ShowKey
Figure 4 

To use the SDK, add the Microsoft.ProjectOxford.Vision NuGet package to your project: Right-click on your project, select Manage NuGet Packages, search for "ProjectOxford.Vision", select the package from the list, and click the [Install] button, as shown in Figure 5

Thu05-NuGet
Figure 5

This adds a reference to Microsoft.ProjectOxford.Vision.dll, which contains classes that make it easier to call this API.

Add the following statement to the top of a class file to use this library.

using Microsoft.ProjectOxford.Vision;

Now, you can use the methods in the VisionServiceClient class to interact with the API.

Create a VisionServiceClient with the following code:

string subscriptionKey = "xxxxxxxxxxxxxxxxxxxxxxxxxxx";
IVisionServiceClient visionClient = new VisionServiceClient(subscriptionKey);

where “xxxxxxxxxxxxxxxxxxxxxxxxxxx” is your subscription key.

Next, use the GetThumbnailAsync method to generate a thumbnail image. The following code creates a 200x100 thumbnail of a photo of a buoy in Stockholm, Sweden.

string originalPicture = @"https://giard.smugmug.com/Travel/Sweden-2015/i-ncF6hXw/0/L/IMG_1560-L.jpg";
int width = 200;
int height = 100;
bool smartCropping = true;
byte[] thumbnailResult = null;
thumbnailResult = visionClient.GetThumbnailAsync(originalPicture, width, height, smartCropping).Result;

The result is an array of bytes, but you can save the corresponding image to a file with the following code:

string folder = @"c:\test";
string thumbnaileFullPath = string.Format("{0}\\thumbnailResult_{1:yyyMMddhhmmss}.jpg", folder, DateTime.Now);
using (BinaryWriter binaryWrite = new BinaryWriter(new FileStream(thumbnaileFullPath, FileMode.Create, FileAccess.Write)))
{
    binaryWrite.Write(thumbnailResult);
}

Below is the full listing in a Console App to generate a thumbnail; then open both the original image and the saved thumbnail image for comparison.

using System;
using System.Diagnostics;
using System.IO;
using Microsoft.ProjectOxford.Vision;
 
namespace ThumbNailConsole
{
    class Program
    {
        static void Main(string[] args)
        {
 
            string subscriptionKey = "xxxxxxxxxxxxxxxxxxxxxxxxxxx"; // replace with your subscription key
            IVisionServiceClient visionClient = new VisionServiceClient(subscriptionKey);
 
            string originalPicture = @"https://giard.smugmug.com/Travel/Sweden-2015/i-ncF6hXw/0/L/IMG_1560-L.jpg";
            int width = 200;
            int height = 100;
            bool smartCropping = true;
            byte[] thumbnailResult = null;
            thumbnailResult = visionClient.GetThumbnailAsync(originalPicture, width, height, smartCropping).Result;
 
            string folder = @"c:\test";
            string thumbnaileFullPath = string.Format("{0}\\thumbnailResult_{1:yyyMMddhhmmss}.jpg", folder, DateTime.Now);
            using (BinaryWriter binaryWrite = new BinaryWriter(new FileStream(thumbnaileFullPath, FileMode.Create, FileAccess.Write)))
            {
                binaryWrite.Write(thumbnailResult);
            }
 
            Process.Start(thumbnaileFullPath);
            Process.Start(originalPicture);
 
            Console.WriteLine("Done! Thumbnail is at {0}!", thumbnaileFullPath);
        }
    }
}

The result is shown in Figure 6 below.

Thu06-Output

One thing to note: the Thumbnail API is part of the Computer Vision API. As of this writing, the free version of the Computer Vision API is limited to 5,000 transactions per month. If you want more than that, you will need to upgrade to the Standard version, which charges $1.50 per 1,000 transactions.

But this should be plenty for you to learn this API for free and build and test your applications until you need to put them into production.

The code above can be found on GitHub.

Monday, 14 March 2016 04:01:00 (GMT Standard Time, UTC+00:00)
# Sunday, 13 March 2016

Project Oxford is a set of APIs that take advantage of Machine Learning to provide developers with intelligent ways to analyze images, speech, and language.

These technologies require Machine Learning, which requires a lot of computing power and a lot of data. Most of us have neither, but Microsoft does and has used it to create the APIs in Project Oxford.

Project Oxford provides APIs to analyze pictures and voice and provide intelligent information about them.

There are three broad categories of services: Vision, Voice, and Language.

The Vision APIs analyze pictures and recognize objects in those pictures. For example, several Vision APIs are capable of recognizing faces in an image. One analyzes each face and deduces that person's emotion; another can compare 2 pictures and decide whether or not they show the same person; a third guesses the age of each person in a photo.

The Speech APIs can convert speech to text or text to speech. They can also recognize the voice of a given speaker (if you want to use that for authentication in your app, for example) and infer the intent of the speaker from his or her words and tone.

The Language APIs seem more of a grab bag to me. A spell checker is smart enough to recognize common proper names and homonyms.

All these APIs are currently in Preview, but I've played with them and they appear very solid. Many of them even provide a confidence factor to let you know how confident you should be in the value returned. For example, 2 faces may represent the same person, but it helps to know how closely they match.

You can use these APIs today. To get started, you need a Project Oxford account, but you can get one for free at projectoxford.ai.

Each API offers a free option that restricts the number and/or frequency of calls, but you can break through that boundary for a charge.

You can also find documentation, sample code, and even a place to try out each API live in your browser at projectoxford.ai.

You call each one by passing JSON to and receiving JSON from a RESTful web service, but some of them offer an SDK to make it easier to make that call from a .NET application.

You can see a couple of fun applications of Project Oxford at how-old.net (which guesses the ages of people in photographs) and what-dog.net (which identifies the breed of dog in a photo).

Sign up today and start building apps. It’s fun and it’s free!

Sunday, 13 March 2016 03:14:12 (GMT Standard Time, UTC+00:00)
# Friday, 11 March 2016

The auditorium darkened. The music began and a small light appeared at the front of the room; then more. Students on stage danced and waved lanterns on ropes for an impressive musical light show to kick off the 2016 SpartaHack hackathon.

 

Students came from all over the world to attend this hackathon on the East Lansing campus. Over 200 universities were represented among the applicants. In addition to a number of international students studying on American campuses, I met students who traveled to the hackathon from India, Russia, Germany, and the Philippines.

AnnaMattDavidBrian My colleague Brian Sherwin arrived in East Lansing the day before the hackathon to host an Azure workshop for 30 students - showing them how to use the cloud platform to enhance their applications. Ann Lergaard joined us a day later and we did our best to answer student questions and help them build better projects. Late Friday night, I delivered a tech talk showing off some of the services available in Azure.

Microsoft offered a prize for the best hack using our technology. It was won by 2 students who built an application that allowed users to take a photo of text with their iPhone and, in response to voice commands, have any part of that text read back. The project combined Microsoft's Project Oxford OCR API with an Amazon Echo and its Alexa platform, an iPhone app, and a Firebase database.

A couple other cool hacks were:

  • ValU, an app that analyzed historical stock price data using Excel VBA scripts.
  • Spartifai, which modified a driver, allowing a Kinect device to be used with a MacBook.

JazzBand A hackathon is an event at which students and others come together and build software and/or hardware projects in small teams over the course of a couple days. I attend a lot of hackathons and SpartaHack was one of the better organized that I've seen. Over 500 students spent the weekend building a wide variety of impressive projects - often with technology they had not touched prior to that weekend. The organizers also did a great job of providing fun activities beyond just hacking. A jazz band and a rock band each performed a set for students to enjoy during a break; a Super Smash Brothers tournament was scheduled; and a Blind Coding Contest challenged students to write code without compiling or testing to see if it would run correctly the first time in front of an audience. 

Snowman As sponsors of the event, we tried to provide some fun as well. We gave away prizes for building a snowman and for tweeting about open source technology. We also provided some loaner hardware for students; and we spent a lot of time mentoring students, which resulted in a lack of sleep this weekend.

The MSU campus has changed a great deal since I earned my undergraduate degree there decades ago. It has even changed since my son graduated from there 4 years ago. But it still felt like a homecoming for me.

 

 

Friday, 11 March 2016 16:42:00 (GMT Standard Time, UTC+00:00)
# Monday, 07 March 2016
Monday, 07 March 2016 12:52:00 (GMT Standard Time, UTC+00:00)
# Sunday, 06 March 2016

3/6
Today I am grateful to unexpectedly run into Jody​ and David in Chicago yesterday.

3/5
Today I am grateful for new shelves and less clutter in my apartment.

3/4
Today I am grateful for a home-cooked dinner last night.

3/3
Today I am grateful for dinner last night with Kevin.

3/2
Today I am grateful for:
-an overwhelming number of kind messages on Facebook yesterday.
-a birthday lunch with Chris in Grand Rapids yesterday.

3/1
Today I am grateful for my first visit to Ann Arbor, MI since I sold my house, including: -A personal tour of The Forge by Jeeva -Coffee with Velichka -A great crowd to attend my Azure Mobile Apps talk at Mobile Monday.

2/29
Today I am grateful to the SpartaHack organizers and hackers who contributed to a successful hackathon this past weekend.

2/28
Today I am grateful that this parking ticket was thrown out by the local police.

2/27
Today I am grateful for a day in East Lansing, MI and at Michigan State University.

2/26
Today I am grateful for a good crowd at our Azure Workshop last night at MSU.

2/25
Today I am grateful for the technology that allows me to watch TV and movies when and where I want.

2/24
Today I am grateful to Michael, Chris, Chris, and Murali for helping make yesterday's Cloud Camp a success by bringing real world experience to the presentations.

2/23
Today I am grateful to be able to ride my bike in February in Chicago.

2/22
Today I am grateful to the organizers and participants at HackIllinois who made this weekend's hackathon so successful.

2/21
Today I am grateful to Vanessa, who brought me espresso yesterday each time my energy ran low.

2/20
Today I am grateful for the hundreds of students who came to my Azure talk last night that I re-wrote yesterday afternoon.

2/19
Today I am grateful for this excellent and unexpected moving gift from Betsy

2/18
Today I am grateful for unsolicited praise from a co-worker yesterday.

2/17
Today I am grateful for an unexpected call from my cousin Kevin yesterday.

2/16
Today I am grateful for dinner with Christina last night.

2/15
Today I am grateful for a pristine blanket of snow covering downtown Chicago this morning.

2/14
Today I am grateful for Nyquil, Dayquil, and a Neti Pot.

2/13
Today I am grateful for finally paying off some sleep debt.

2/12
Today I am grateful for: -Lunch yesterday with Matt -A chance to present to a group of University of Chicago students

2/11
Today I am grateful for: -My first day with a personal trainer at the new gym -Attending Founder Institute graduation

2/10
Today I am grateful for my first game at Purdue's Mackey Arena.

2/9
Today I am grateful for the help I received yesterday unpacking dozens of boxes from my move.

2/8
Today I am grateful for an excellent week in Seattle.

Sunday, 06 March 2016 12:51:12 (GMT Standard Time, UTC+00:00)