# Wednesday, 23 March 2016
AngularJS is a popular framework that takes care of many common tasks, such as data binding, routing, and making Ajax calls, allowing developers to focus on the unique aspects of their application. Angular makes it much easier to maintain a large, complex single page application.

Angular is an open source project that you can use for free and contribute to (if you are skilled and ambitious).

As of this writing, AngularJS is on version 1.x. The Angular team is still actively working on the 1.x version, but they have already begun work on AngularJS 2.x, which is a complete rewrite of the framework. AngularJS 2 is currently in beta and follows some different paradigms than AngularJS 1. This series will initially focus on AngularJS 1, which has been out of beta for many months. In the future, after AngularJS 2 is out of beta, I hope to write more about that version.

To get started with Angular 1, you need to add a reference to the Angular libraries.

You can either download the Angular files from http://angularjs.org or you can reference the files hosted on the Angular site directly. In either case, you need to add a reference to the Angular library, as in the examples shown below.

<script src="https://code.angularjs.org/1.5.0-rc.0/angular.js"></script> 

or

<script src="angular.js"></script> 

Angular uses directives to declaratively add functionality to a web page. A directive is an attribute defined by or within Angular that you add to the elements within your HTML page.

Each Angular page requires the ng-app directive, as in the examples below.

<html ng-app>

or

 
<html ng-app="MyApp"> 

The second example specifies a module: a named container for the code (including Controllers) that Angular makes available to the page. We'll talk more about modules and Controllers in a later article.

You can add this attribute to any element on your page, but Angular will only work for elements contained within the attributed element, so it generally makes sense to apply it near the top of the DOM (e.g., at the HTML or BODY tag). If I add ng-app to the HTML element, I will have Angular available throughout my page; however, if I add ng-app to a DIV element, Angular is only available to that DIV and to any elements contained within that DIV. Angular automatically bootstraps only one ng-app per page: it uses the first one it finds and ignores all others.

Once you have the SCRIPT reference and the ng-app attribute, you can begin using Angular. A simple use of Angular is one-way data binding. There are several ways to perform data binding in Angular. The simplest is with the {{}} markup. In an AngularJS application, anything within these double curly braces is interpreted as data binding. So, something like

The time is {{datetime.now()}} 

will output the current date and time, provided the current scope exposes a datetime object with a now() method. (Angular evaluates expressions against the current scope, so plain JavaScript globals are not available inside the braces.) Below are a few other examples, which use only literal values.

<h3>{{"Hello" + " world"}}</h3>
<div>{{5+2+3}}</div> 

which will output the following:

Hello world
10

If your controller adds variables (such as x and y) to the scope, you can bind to those as well, such as:

<div>{{x}} + {{y}} = {{x+y}}!</div>
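
For this binding to work, the page needs a module and a controller that put x and y on the scope. Below is a minimal sketch; the module name ("MyApp") and controller name ("MathController") are made up for illustration and assume the page uses ng-app="MyApp".

// Minimal sketch: define the module named in ng-app and a controller
// that puts x and y on the scope (the names here are illustrative).
angular.module("MyApp", [])
    .controller("MathController", function ($scope) {
        $scope.x = 5;
        $scope.y = 2;
    });

With this script in place, an element that uses the ng-controller directive (covered in a later article), such as <div ng-controller="MathController">{{x}} + {{y}} = {{x+y}}!</div>, renders as "5 + 2 = 7!".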

Although working with AngularJS can be complex, it only takes a small amount of code to get started.

You can see a live example of the concepts above at this link.

In this article, we introduced the Angular JavaScript framework; then we showed how to add it to your web application and perform simple, one-way data binding.

Wednesday, 23 March 2016 11:54:00 (GMT Standard Time, UTC+00:00)
# Tuesday, 22 March 2016

User expectations for web applications have increased exponentially the past few years. Users now expect applications to respond quickly to their interactions and to render appropriately for different size devices. In addition, users have pushed back against using browser plug-ins, such as Flash and Silverlight.

Developers can meet these expectations by writing an application that performs much of its activity on the client, rather than on the server. The default browser client languages are HTML, JavaScript, and CSS. But these are relatively small languages and they were not originally developed with the idea of building large, complex applications.

Enter: Frameworks. A framework is a combination of pre-built components and utilities that sit on top of HTML, JavaScript, and CSS to manage some of the complexity of large applications.

Some frameworks are very specific, such as jQuery, which eases the process of selecting and acting on the DOM elements of a browser, and MustacheJS, which provides logic-less templating. Others are general-purpose frameworks, such as Knockout, Ember, Angular, and React, that provide functionality for most aspects of your application and allow you to build custom modules of your own.

Of course, the frameworks themselves add overhead - both in terms of learning time for the developer and download time for the user.  For very simple pages, this overhead might not be worthwhile; but for even moderately complex applications, a framework can manage said complexity, making your code easier to maintain, deploy, debug, and test; and freeing you up to focus less on the application plumbing and more on the code that is unique to your application.

Choosing a framework can be overwhelming. You can find a list of hundreds of JavaScript frameworks and Plug-Ins at http://www.javascripting.com/. Some factors to consider when choosing a framework are:

Does it meet the needs of my application?

Do you need a do-everything framework or just data binding? Is the user interface the most important thing, or is synchronizing with backend data more important? Each framework has its strengths. Determine what you need; then find the framework that suits you.

How difficult is it to learn?

Look for a framework with good documentation and tutorials. Often, ease of learning is a function of your current knowledge. If you are already familiar with the Model-View-Controller pattern, it may make sense to use a framework that implements this pattern.

How popular is it?

This may strike you as a frivolous criterion, but a popular framework will have more people blogging about it; more people answering forum questions; and bugs will get found and fixed more quickly.

Will it be popular next year?

Future popularity is difficult to predict, but it may matter more than current popularity. You are likely to keep this framework for a long time - possibly the life of your application - and you want your technologies to remain relevant and supported.

Whichever framework you choose, you will learn it best by diving in and beginning your project.

Tuesday, 22 March 2016 11:18:00 (GMT Standard Time, UTC+00:00)
# Monday, 21 March 2016
Monday, 21 March 2016 14:36:08 (GMT Standard Time, UTC+00:00)
# Saturday, 19 March 2016

Project Oxford offers a set of APIs to analyze the content of images. One of these is an optical character recognition (OCR) API: a REST web service that can determine the words and punctuation contained in a picture with a single call.

To begin, you must register with Project Oxford at http://www.projectoxford.ai.

Then, get the key at https://www.projectoxford.ai/Subscription

Figure 1: Subscription key

To call the API we send a POST request to https://api.projectoxford.ai/vision/v1/ocr

If you like, you may add two optional querystring parameters to the URL: language (the language of the text in the image) and detectOrientation (whether the service should automatically determine if the text is tilted) - for example, https://api.projectoxford.ai/vision/v1/ocr?language=en&detectOrientation=true. If you omit these parameters, Oxford will make an effort to determine the values on its own. As you might guess, it is faster if you provide this information yourself.

In the header of the request, you must provide your key as in the following example:

Ocp-Apim-Subscription-Key:15e24a988a179f13a25aac4713aec800

Optionally, you can provide the content-type of the data you are sending. To send a URL, use

Content-Type: application/json

To send an image stream, you can set the Content-Type to application/octet-stream or multipart/form-data.

In the body of the POST request, you can send JSON that includes the URL of the image location. Here is an example:

{ "Url": "http://media.tumblr.com/tumblr_lrbhs0RY2o1qaaiuh.png"}

This web service returns a JSON object containing an array of regions, each of which represents a block of text found in the image. Within each region is an array of lines, and within each line is an array of words.

Region, line, and word objects contain a boundingBox object with coordinates of where to find the corresponding object within the image. Each word object contains the actual text detected, including any punctuation.
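
To illustrate the shape of the response, the sketch below (JavaScript, assuming the parsed response is in a variable named data) drills down to the first word of the first line of the first region:

// Sketch: navigate the parsed OCR response (regions -> lines -> words)
var firstRegion = data.regions[0];
var firstLine = firstRegion.lines[0];
var firstWord = firstLine.words[0];
console.log(firstWord.text);        // the recognized text of this word
console.log(firstWord.boundingBox); // where this word appears in the image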

The beauty of a REST web service is that you can call it from any language or platform that supports HTTP requests (which is pretty much all of them).

The following example uses JavaScript and jQuery to call this API. It assumes that you have a DIV tag on the page with id="OutputDiv" and that you have a reference to jQuery before this code.

var myKey="<replace_with_your_subscription_key>";
var url="http://media.tumblr.com/tumblr_lrbhs0RY2o1qaaiuh.png";
$.ajax({
    type: "POST",
    url: "https://api.projectoxford.ai/vision/v1/ocr?language=en",
    headers: { "Ocp-Apim-Subscription-Key":myKey },
    contentType: "application/json",
    data: '{ "Url": "' + url + '" }'
}).done(function (data) {
    var outputDiv = $("#OutputDiv");
    outputDiv.text(""); 
 
    var linesOfText = data.regions[0].lines;
    // Loop through each line of text and create a DIV tag 
    // containing each word, separated by a space
    // Append this newly-created DIV to OutputDiv
    for (var i = 0; i < linesOfText.length; i++) {
        var output = "";
        var thisLine = linesOfText[i];
        var words = thisLine.words;
        for (var j = 0; j < words.length; j++) {
            var thisWord = words[j];
            output += thisWord.text;
            output += " ";
        }
        var newDiv = "<div>" + output + "</div>";
        outputDiv.append(newDiv);
    }
}).fail(function (err) {
    $("#OutputDiv").text("ERROR!" + err.responseText);
}); 

The call to the web service is made with the following lines:

$.ajax({
    type: "POST",
    url: "https://api.projectoxford.ai/vision/v1/ocr?language=en",
    headers: { "Ocp-Apim-Subscription-Key":myKey },
    contentType: "application/json",
    data: '{ "Url": "' + url + '" }' 

which sends a POST request and passes the URL as part of a JSON object in the request body.

This request is asynchronous, so the "done" function is called when the request returns successfully.

            }).done(function (data) {

The function tied to the "done" event parses through the returned JSON and displays it on the screen.

If an error occurs, we output a simple error message to the user in the "fail" function.

}).fail(function (err) {
    $("#OutputDiv").text("ERROR!" + err.responseText);
}); 

Most of the code above is just formatting the output, so the REST call itself is quite simple. Project Oxford makes this type of analysis much easier for developers, regardless of their platform.
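
If you would rather upload the image bytes than pass a URL, the same call can be adapted as in the sketch below. This is only a sketch: it assumes a file input with id="ImageFile" on the page and a browser and jQuery version that will send a Blob as the request body when processData is false.

var myKey = "<replace_with_your_subscription_key>";
var imageFile = document.getElementById("ImageFile").files[0];
$.ajax({
    type: "POST",
    url: "https://api.projectoxford.ai/vision/v1/ocr?language=en",
    headers: { "Ocp-Apim-Subscription-Key": myKey },
    contentType: "application/octet-stream",
    data: imageFile,    // raw image bytes instead of a JSON body
    processData: false  // prevent jQuery from trying to serialize the file
}).done(function (data) {
    // Parse data.regions / lines / words exactly as in the example above
});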

You can find this code at my Github repository.

In this article, you learned about the Project Oxford OCR API and how to call it from a JavaScript application.

Saturday, 19 March 2016 17:22:45 (GMT Standard Time, UTC+00:00)
# Thursday, 17 March 2016

Speech recognition is a problem on which computer scientists have been working for years. Project Oxford applies the science of Machine Learning to this problem in order to recognize words spoken and determine their probable meaning based on context.

Project Oxford exposes a REST web service so that you can add speech recognition to your application.

Before you can use the Speech API, you must register at Project Oxford and retrieve the Speech API key.

Figure 1: Speech API Key

The easiest way to use this API in a .NET application is to use the SpeechRecognition library. A NuGet package makes it easy to add this library to your application. In Visual Studio 2015, create a new WPF application (File | New | Project | Windows | WPF Application). Then, right-click the project in the Solution Explorer and select Manage NuGet Packages. Search for and add the "Microsoft.ProjectOxford.SpeechRecognition" package. Select the "x64" or "x86" version that corresponds with your version of Windows.

Figure 2: NuGet dialog

Now, you can start using the library to call the Speech API.

Add the following using statement to the top of a class file:

using Microsoft.ProjectOxford.SpeechRecognition; 

Within the class, declare a private instance of the MicrophoneRecognitionClient class

MicrophoneRecognitionClient _microphoneRecognitionClient; 

To begin listening to speech, instantiate the MicrophoneRecognitionClient object by using the SpeechRecognitionServiceFactory.CreateMicrophoneClient method, passing in the Speech Recognition Mode, the language to listen for, and your Speech Subscription Key.

The Speech Recognition Mode is an enum that can be either ShortPhrase or LongDictation. These are optimized for shorter or longer voice messages, respectively. Below is an example of creating a new MicrophoneRecognitionClient instance:

var speechRecognitionMode = SpeechRecognitionMode.ShortPhrase;
string language = "en-us";
string subscriptionKey = ConfigurationManager.AppSettings["SpeechKey"].ToString(); 
 
_microphoneRecognitionClient
        = SpeechRecognitionServiceFactory.CreateMicrophoneClient
                        (
                        speechRecognitionMode,
                        language,
                        subscriptionKey
                        ); 

Now that you have a MicrophoneRecognitionClient object, wire up the OnPartialResponseReceived and the OnResponseReceived events to listen for speech and call the API to turn that speech into text.

_microphoneRecognitionClient.OnPartialResponseReceived += OnPartialResponseReceivedHandler;
_microphoneRecognitionClient.OnResponseReceived += OnMicShortPhraseResponseReceivedHandler;

The MicrophoneRecognitionClient object calls the web service frequently - often after every word - to interpret the words it has heard so far. When it makes this call, its OnPartialResponseReceived event fires. 

The signature of OnPartialResponseReceivedHandler is:

void OnPartialResponseReceivedHandler(object sender, PartialSpeechResponseEventArgs e)

and you can retrieve Oxford's text interpretation of the spoken words from e.PartialResult. Oxford may revise its interpretation of words spoken at the beginning of a sentence when it receives more of the sentence to provide some context.

After a significant pause, the MicrophoneRecognitionClient object will decide that the user has finished speaking. At this point, it fires the OnResponseReceived event, giving you a chance to clean up. The EndMicAndRecognition method of the MicrophoneRecognitionClient stops listening and severs the connection to the web service.

Here is some code that may be appropriate in the OnResponseReceived event handler:

_microphoneRecognitionClient.EndMicAndRecognition();
_microphoneRecognitionClient.Dispose();
_microphoneRecognitionClient = null; 

I have created a sample WPF app with a single window containing the following XAML:

<StackPanel Name="MainStackPanel" Orientation="Vertical" VerticalAlignment="Top">
    <Button Name="RecordButton" Width="250" Height="100" 
            FontSize="32" VerticalAlignment="Top" 
            Click="RecordButton_Click">
        Start!
    </Button>
    <TextBox Name="OutputTextbox" VerticalAlignment="Top" Width="600" 
        TextWrapping="Wrap" FontSize="18"></TextBox>
</StackPanel> 

The code-behind for this window is listed below. It includes some visual cues that the app is listening and displays the latest text returned from the Speech API.

using System;
using System.Configuration;
using System.Threading;
using System.Windows;
using System.Windows.Media;
using Microsoft.ProjectOxford.SpeechRecognition; 
 
namespace SpeechToTextDemo
{
    /// <summary>
    /// Interaction logic for MainWindow.xaml
    /// </summary>
    public partial class MainWindow : Window
    {
        AutoResetEvent _FinalResponseEvent;
        MicrophoneRecognitionClient _microphoneRecognitionClient; 
 
        public MainWindow()
        {
            InitializeComponent();
            RecordButton.Content = "Start\nRecording";
            _FinalResponseEvent = new AutoResetEvent(false);
            OutputTextbox.Background = Brushes.White;
            OutputTextbox.Foreground = Brushes.Black;
        } 
 
        private void RecordButton_Click(object sender, RoutedEventArgs e)
        {
            RecordButton.Content = "Listening...";
            RecordButton.IsEnabled = false;
            OutputTextbox.Background = Brushes.Green;
            OutputTextbox.Foreground = Brushes.White;
            ConvertSpeechToText();
        } 
 
        /// <summary>
        /// Start listening. 
        /// </summary>
        private void ConvertSpeechToText()
        {
            var speechRecognitionMode = SpeechRecognitionMode.ShortPhrase;
            string language = "en-us";
            string subscriptionKey = ConfigurationManager.AppSettings["SpeechKey"].ToString(); 
 
            _microphoneRecognitionClient
                    = SpeechRecognitionServiceFactory.CreateMicrophoneClient
                                    (
                                    speechRecognitionMode,
                                    language,
                                    subscriptionKey
                                    ); 
 
            _microphoneRecognitionClient.OnPartialResponseReceived += OnPartialResponseReceivedHandler;
            _microphoneRecognitionClient.OnResponseReceived += OnMicShortPhraseResponseReceivedHandler;
            _microphoneRecognitionClient.StartMicAndRecognition(); 
 
        } 
 
        void OnPartialResponseReceivedHandler(object sender, PartialSpeechResponseEventArgs e)
        {
            string result = e.PartialResult;
            Dispatcher.Invoke(() =>
            {
                OutputTextbox.Text = (e.PartialResult);
                OutputTextbox.Text += ("\n"); 
 
            });
        } 
 
        /// <summary>
        /// Speaker has finished speaking. Sever connection to server, stop listening, and clean up
        /// </summary>
        /// <param name="sender"></param>
        /// <param name="e"></param>
        void OnMicShortPhraseResponseReceivedHandler(object sender, SpeechResponseEventArgs e)
        {
            Dispatcher.Invoke((Action)(() =>
            {
                _FinalResponseEvent.Set();
                _microphoneRecognitionClient.EndMicAndRecognition();
                _microphoneRecognitionClient.Dispose();
                _microphoneRecognitionClient = null;
                RecordButton.Content = "Start\nRecording";
                RecordButton.IsEnabled = true;
                OutputTextbox.Background = Brushes.White;
                OutputTextbox.Foreground = Brushes.Black; 
 
            }));
        }
    }
}

You can download this project from my GitHub repository.

In this article, you learned how to use the Project Oxford Speech Recognition .NET library to take advantage of the Oxford Speech API and add speech-to-text capabilities to your application.

Thursday, 17 March 2016 12:26:00 (GMT Standard Time, UTC+00:00)
# Wednesday, 16 March 2016

In the last article, we showed how to call the Project Oxford Emotions API via REST in order to determine the emotions of every person in a picture.

In this article, I will show you how to use a .NET library to call this API. A .NET library simplifies the process by abstracting away HTTP calls and providing strongly-typed objects with which to work in your .NET code.

As with the REST call, we begin by signing up for Project Oxford and getting the key for this API, which you can do at https://www.projectoxford.ai/Subscription?popup=True.

Figure 1: Key

To use the .NET library, launch Visual Studio and create a new Universal Windows App (File | New | Project | Windows | Blank (Universal Windows))

Add the Emotions NuGet Package to your project (Right-click project | Manage NuGet Packages); then search for and install Microsoft.ProjectOxford.Emotion. This will add the appropriate references to your project.

In your code, add the following statements to the top of your class file.

using Microsoft.ProjectOxford.Emotion;
using Microsoft.ProjectOxford.Emotion.Contract; 

To use this library, we create an instance of the EmotionServiceClient class, passing in our key to the constructor.

var emotionServiceClient = new EmotionServiceClient(emotionApiKey);

The RecognizeAsync method of this class accepts the URL of an image and returns an array of Emotion objects.

Emotion[] emotionResult = await emotionServiceClient.RecognizeAsync(imageUrl); 

Each emotion object represents a single face detected in the picture and contains the following properties:

FaceRectangle: This indicates the location of the face

Scores: A set of values corresponding to each emotion (anger, contempt, disgust, fear, happiness, neutral, sadness, and surprise) with a value indicating the confidence with which Oxford thinks the face matches this emotion. Confidence values are between 0 and 1, and higher values indicate a higher confidence that this is the correct emotion.

The code below returns a string indicating the most likely emotion for every face in an image. It uses a small EmotionScore helper class (not shown here) that simply pairs an emotion name (EmotionName) with its score (EmotionValue).

var sb = new StringBuilder();
var faceNumber = 0;
foreach (Emotion em in emotionResult)
{
    faceNumber++;
    var scores = em.Scores;
    var anger = scores.Anger;
    var contempt = scores.Contempt;
    var disgust = scores.Disgust;
    var fear = scores.Fear;
    var happiness = scores.Happiness;
    var neutral = scores.Neutral;
    var surprise = scores.Surprise;
    var sadness = scores.Sadness; 
 
    var emotionScoresList = new List<EmotionScore>();
    emotionScoresList.Add(new EmotionScore("anger", anger));
    emotionScoresList.Add(new EmotionScore("contempt", contempt));
    emotionScoresList.Add(new EmotionScore("disgust", disgust));
    emotionScoresList.Add(new EmotionScore("fear", fear));
    emotionScoresList.Add(new EmotionScore("happiness", happiness));
    emotionScoresList.Add(new EmotionScore("neutral", neutral));
    emotionScoresList.Add(new EmotionScore("surprise", surprise));
    emotionScoresList.Add(new EmotionScore("sadness", sadness)); 
 
    var maxEmotionScore = emotionScoresList.Max(e => e.EmotionValue);
    var likelyEmotion = emotionScoresList.First(e => e.EmotionValue == maxEmotionScore); 
 
    string likelyEmotionText = string.Format("Face {0} is {1:N2}% likely to be experiencing: {2}\n\n", 
        faceNumber, likelyEmotion.EmotionValue * 100, likelyEmotion.EmotionName.ToUpper());
    sb.Append(likelyEmotionText); 
 
}
var resultsText = sb.ToString(); 
 

This will return a string similar to the following:

Face 1 is 99.36% likely to be experiencing: NEUTRAL

Face 2 is 100.00% likely to be experiencing: HAPPINESS

Face 3 is 95.02% likely to be experiencing: SADNESS

You can download this Visual Studio 2015 Universal Windows App project from here.

Full documentation on the Emotion library is available here. You can find a more complete (although more complicated) demo of this library here.

In this article, you learned how to use the .NET libraries to call the Project Oxford Emotion API and detect emotion in the faces of an image.

Wednesday, 16 March 2016 13:11:00 (GMT Standard Time, UTC+00:00)
# Tuesday, 15 March 2016

It's difficult enough for humans to recognize emotions in the faces of other humans. Can a computer accomplish this task? It can if we train it to and if we give it enough examples of different faces with different emotions.

When we supply data to a computer with the objective of training that computer to recognize patterns and predict new data, we call that Machine Learning. And Microsoft has done a lot of Machine Learning with a lot of faces and a lot of data and they are exposing the results for you to use.

The Emotions API in Project Oxford looks at pictures of people and determines their emotions. Possible emotions returned are anger, contempt, disgust, fear, happiness, neutral, sadness, and surprise. Each emotion is assigned a confidence level between 0 and 1 - higher numbers indicate a higher confidence that this is the emotion expressed in the face. If a picture contains multiple faces, the emotion of each face is returned.

The API is a simple REST web service located at https://api.projectoxford.ai/emotion/v1.0/recognize. POST to this service with a header that includes:
Ocp-Apim-Subscription-Key:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

where xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx is your key. You can find your key at https://www.projectoxford.ai/Subscription?popup=True

and a body that includes the following data:

{ "url": "http://xxxx.com/xxxx.jpg" }

where http://xxxx.com/xxxx.jpg is the URL of an image.
The full request looks something like:
POST https://api.projectoxford.ai/emotion/v1.0/recognize HTTP/1.1
Content-Type: application/json
Host: api.projectoxford.ai
Content-Length: 62
Ocp-Apim-Subscription-Key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

{ "url": "http://xxxx.com/xxxx.jpg" }

This will return JSON data identifying each face in the image and a score indicating how confident this API is that the face is expressing each of 8 possible emotions. For example, passing a URL with a picture below of 3 attractive, smiling people

(found online at https://giard.smugmug.com/Tech-Community/SpartaHack-2016/i-4FPV9bf/0/X2/SpartaHack-068-X2.jpg)

returned the following data:

[
  {
    "faceRectangle": {
      "height": 113,
      "left": 285,
      "top": 156,
      "width": 113
    },
    "scores": {
      "anger": 1.97831262E-09,
      "contempt": 9.096525E-05,
      "disgust": 3.86221245E-07,
      "fear": 4.26409547E-10,
      "happiness": 0.998336,
      "neutral": 0.00156954059,
      "sadness": 8.370223E-09,
      "surprise": 3.06117772E-06
    }
  },
  {
    "faceRectangle": {
      "height": 108,
      "left": 831,
      "top": 169,
      "width": 108
    },
    "scores": {
      "anger": 2.63808062E-07,
      "contempt": 5.387114E-08,
      "disgust": 1.3360991E-06,
      "fear": 1.407629E-10,
      "happiness": 0.9999967,
      "neutral": 1.63170478E-06,
      "sadness": 2.52861843E-09,
      "surprise": 1.91028926E-09
    }
  },
  {
    "faceRectangle": {
      "height": 100,
      "left": 591,
      "top": 168,
      "width": 100
    },
    "scores": {
      "anger": 3.24157673E-10,
      "contempt": 4.90155344E-06,
      "disgust": 6.54665473E-06,
      "fear": 1.73284559E-06,
      "happiness": 0.9999156,
      "neutral": 6.42121E-05,
      "sadness": 7.02297257E-06,
      "surprise": 5.53670576E-09
    }
  }
]

The high values for the 3 happiness scores and the very low values for all the other scores suggest a very high degree of confidence that the people in this photo are happy.

Here is the request in the popular HTTP analysis tool Fiddler [http://www.telerik.com/fiddler]:

Request:

Figure: The request in Fiddler

Response:

Figure: The response in Fiddler

Sending requests to Project Oxford REST API makes it simple to analyze the emotions of people in a photograph.
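
If you want to try this from the browser, a jQuery call (similar in structure to the OCR example above) might look like the sketch below. The subscription key and image URL are placeholders, and the page is assumed to contain a DIV with id="OutputDiv".

var myKey = "<replace_with_your_subscription_key>";
var imageUrl = "http://xxxx.com/xxxx.jpg";
$.ajax({
    type: "POST",
    url: "https://api.projectoxford.ai/emotion/v1.0/recognize",
    headers: { "Ocp-Apim-Subscription-Key": myKey },
    contentType: "application/json",
    data: '{ "url": "' + imageUrl + '" }'
}).done(function (faces) {
    // The response is an array with one element per face detected;
    // list the happiness score for each face.
    for (var i = 0; i < faces.length; i++) {
        var scores = faces[i].scores;
        $("#OutputDiv").append("<div>Face " + (i + 1) + " happiness: " + scores.happiness + "</div>");
    }
}).fail(function (err) {
    $("#OutputDiv").text("ERROR! " + err.responseText);
});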

Tuesday, 15 March 2016 09:57:07 (GMT Standard Time, UTC+00:00)
# Monday, 14 March 2016
Monday, 14 March 2016 16:05:00 (GMT Standard Time, UTC+00:00)

Generating a thumbnail image from a larger image sounds easy – just shrink the dimensions of the original, right? But it becomes more complicated if the thumbnail image is a different shape than the original. In this case, we will need to crop or distort the original image. Distorting the image tends to look very bad; and when we crop an image, we will need to ensure that the primary subject of the image remains in the generated thumbnail. To do this, we need to identify the primary subject of the image. That's easy enough for a human observer to do, but a difficult thing for a computer to do, which is necessary if we want to automate this process.

This is where machine learning can help. By analyzing many images, Machine Learning can figure out what parts of a picture are likely to be the main subject. Once this is known, it becomes a simpler matter to crop the picture in such a way that the main subject remains.

Project Oxford uses Machine Learning so that you don't have to. It exposes an API to create an intelligent thumbnail image from any picture.

You can see this in action at www.projectoxford.ai/demo/vision#Thumbnail.


Figure 1

With this live, in-browser demo, you can either select an image from the gallery and view the generated thumbnails; or provide your own image - either from your local computer or from a public URL. The page uses the Thumbnail API to create thumbnails of 6 different dimensions.

Figure 2

For your own application, you can either call the REST Web Service directly or (for a .NET application) use a custom library. The library simplifies development by abstracting away HTTP calls via strongly-typed objects.

To get started, you will need a free Project Oxford account and you will need to sign into projectoxford.ai with a Microsoft account.

For this API, you need a key. From the Computer Vision API page (Figure 3), click the [Try for free >] button; then, click the "Show" link under the Primary key of the "Computer Vision" section (Figure 4).

Figure 3 

Figure 4 

To use the SDK, add the Microsoft.ProjectOxford.Vision NuGet package to your project: Right-click on your project, select Manage NuGet Packages, search for "ProjectOxford.Vision", select the package from the list, and click the [Install] button, as shown in Figure 5.

Figure 5

This adds a reference to Microsoft.ProjectOxford.Vision.dll, which contains classes that make it easier to call this API.

Add the following statement to the top of a class file to use this library.

using Microsoft.ProjectOxford.Vision;

Now, you can use the methods in the VisionServiceClient class to interact with the API.

Create a VisionServiceClient with the following code:

string subscriptionKey = "xxxxxxxxxxxxxxxxxxxxxxxxxxx";
IVisionServiceClient visionClient = new VisionServiceClient(subscriptionKey);

where “xxxxxxxxxxxxxxxxxxxxxxxxxxx” is your subscription key.

Next, use the GetThumbnailAsync method to generate a thumbnail image. The following code creates a 200x100 thumbnail of a photo of a buoy in Stockholm, Sweden.

string originalPicture = @"https://giard.smugmug.com/Travel/Sweden-2015/i-ncF6hXw/0/L/IMG_1560-L.jpg";
int width = 200;
int height = 100;
bool smartCropping = true;
byte[] thumbnailResult = null;
thumbnailResult = visionClient.GetThumbnailAsync(originalPicture, width, height, smartCropping).Result;

The result is an array of bytes, but you can save the corresponding image to a file with the following code:

string folder = @"c:\test";
string thumbnaileFullPath = string.Format("{0}\\thumbnailResult_{1:yyyMMddhhmmss}.jpg", folder, DateTime.Now);
using (BinaryWriter binaryWrite = new BinaryWriter(new FileStream(thumbnaileFullPath, FileMode.Create, FileAccess.Write)))
{
    binaryWrite.Write(thumbnailResult);
}

Below is the full listing of a Console App that generates a thumbnail, then opens both the original image and the saved thumbnail image for comparison.

using System;
using System.Diagnostics;
using System.IO;
using Microsoft.ProjectOxford.Vision;
 
namespace ThumbNailConsole
{
    class Program
    {
        static void Main(string[] args)
        {
 
            string subscriptionKey = "15e24a988f484591b17bcc4713aec800";
            IVisionServiceClient visionClient = new VisionServiceClient(subscriptionKey);
 
            string originalPicture = @"https://giard.smugmug.com/Travel/Sweden-2015/i-ncF6hXw/0/L/IMG_1560-L.jpg";
            int width = 200;
            int height = 100;
            bool smartCropping = true;
            byte[] thumbnailResult = null;
            thumbnailResult = visionClient.GetThumbnailAsync(originalPicture, width, height, smartCropping).Result;
 
            string folder = @"c:\test";
            string thumbnaileFullPath = string.Format("{0}\\thumbnailResult_{1:yyyMMddhhmmss}.jpg", folder, DateTime.Now);
            using (BinaryWriter binaryWrite = new BinaryWriter(new FileStream(thumbnaileFullPath, FileMode.Create, FileAccess.Write)))
            {
                binaryWrite.Write(thumbnailResult);
            }
 
            Process.Start(thumbnaileFullPath);
            Process.Start(originalPicture);
 
            Console.WriteLine("Done! Thumbnail is at {0}!", thumbnaileFullPath);
        }
    }
}

The result is shown in Figure 6 below.

Figure 6: Output

One thing to note: the Thumbnail API is part of the Computer Vision API. As of this writing, the free version of the Computer Vision API is limited to 5,000 transactions per month. If you want more than that, you will need to upgrade to the Standard version, which charges $1.50 per 1,000 transactions.

But this should be plenty for you to learn this API for free and build and test your applications until you need to put them into production.

The code above can be found on GitHub.

Monday, 14 March 2016 04:01:00 (GMT Standard Time, UTC+00:00)
# Sunday, 13 March 2016

Project Oxford is a set of APIs that take advantage of Machine Learning to provide developers with capabilities such as computer vision, speech recognition, and language understanding.

These technologies require Machine Learning, which requires a lot of computing power and a lot of data. Most of us have neither, but Microsoft does and has used it to create the APIs in Project Oxford.

Project Oxford provides APIs to analyze pictures and voice and provide intelligent information about them.

There are three broad categories of services: Vision, Voice, and Language.

The Vision APIs analyze pictures and recognize objects in those pictures. For example, several Vision APIs are capable of recognizing faces in an image. One analyzes each face and deduces that person's emotion; another can compare 2 pictures and decide whether or not the 2 photographs are of the same person; a third guesses the age of each person in a photo.

The Speech APIs can convert speech to text or text to speech. They can also recognize the voice of a given speaker (if you want to use that for authentication in your app, for example) and infer the intent of the speaker from their words and tone.

The Language APIs seem more of a grab bag to me. A spell checker is smart enough to recognize common proper names and homonyms.

All these APIs are currently in Preview, but I've played with them and they appear very solid. Many of them even provide a confidence factor to let you know how confident you should be in the value returned. For example, 2 faces may represent the same person, but it helps to know how closely they match.

You can use these APIs today. To get started, you need a Project Oxford account, but you can get one for free at projectoxford.ai.

Each API offers a free option that restricts the number and/or frequency of calls, but you can break through that boundary for a charge.

You can also find documentation, sample code, and even a place to try out each API live in your browser at projectoxford.ai.

You call each one by passing JSON to and receiving JSON from a RESTful web service, but some of them offer an SDK to make it easier to make that call from a .NET application.

You can see a couple of fun applications of Project Oxford at how-old.net (which guesses the ages of people in photographs) and what-dog.net (which identifies the breed of dog in a photo).

Sign up today and start building apps. It’s fun and it’s free!

Sunday, 13 March 2016 03:14:12 (GMT Standard Time, UTC+00:00)