# Tuesday, 22 March 2016

User expectations for web applications have increased dramatically over the past few years. Users now expect applications to respond quickly to their interactions and to render appropriately on devices of different sizes. In addition, users have pushed back against browser plug-ins, such as Flash and Silverlight.

Developers can meet these expectations by writing an application that performs much of its activity on the client, rather than on the server. The default browser client languages are HTML, JavaScript, and CSS. But these are relatively small languages and they were not originally developed with the idea of building large, complex applications.

Enter: Frameworks. A framework is a combination of pre-built components and utilities that sit on top of HTML, JavaScript, and CSS to manage some of the complexity of large applications.

Some frameworks are very specific, such as jQuery, which eases the process of selecting and acting on a browser's DOM elements, and MustacheJS, which provides logic-less templating. Others are very general, such as Knockout, Ember, Angular, and React; these provide functionality for most aspects of your application and allow you to build custom modules of your own.

Of course, frameworks add overhead of their own - both learning time for the developer and download time for the user. For very simple pages, this overhead might not be worthwhile; but for even moderately complex applications, a framework can manage the complexity, making your code easier to maintain, deploy, debug, and test, and freeing you to spend less time on application plumbing and more time on the code that is unique to your application.

Choosing a framework can be overwhelming. You can find a list of hundreds of JavaScript frameworks and plug-ins at http://www.javascripting.com/. Some factors to consider when choosing a framework are:

Does it meet the needs of your application?

Do you need a do-everything framework, or just data binding? Is the user interface the most important thing, or is synchronizing with backend data more important? Each framework has its strengths. Determine what you need; then find the framework that suits you.

How difficult is it to learn?

Look for a framework with good documentation and tutorials. Often, ease of learning is a function of your current knowledge. If you are already familiar with the Model-View-Controller pattern, it may make sense to use a framework that implements this pattern.

How popular is it?

This may strike you as a frivolous criterion, but a popular framework will have more people blogging about it, more people answering forum questions, and bugs that get found and fixed more quickly.

Will it be popular next year?

Future popularity is difficult to predict, but it may be even more important than current popularity. You are likely to keep this framework for a long time - possibly the life of your application - and you want your technologies to remain relevant and supported.

Whichever framework you choose, you will learn it best by diving in and beginning your project.

Tuesday, 22 March 2016 11:18:00 (GMT Standard Time, UTC+00:00)
# Saturday, 19 March 2016

Project Oxford offers a set of APIs to analyze the content of images. One of these APIs can determine the words and punctuation contained in a picture; this is accomplished with a simple call to a REST web service.

To begin, you must register with Project Oxford at http://www.projectoxford.ai.

Then, get the key at https://www.projectoxford.ai/Subscription.

Figure 1: Subscription key

To call the API, we send a POST request to https://api.projectoxford.ai/vision/v1/ocr

Optionally, you may add querystring parameters to the URL - language and detectOrientation - to tell the service the language of the text and to have it determine automatically whether the text is tilted. If you omit these parameters, Oxford will make an effort to determine the values on its own; as you might guess, the call is faster if you provide this information.

In the header of the request, you must provide your key as in the following example:

Ocp-Apim-Subscription-Key:15e24a988a179f13a25aac4713aec800

Optionally, you can provide the content-type of the data you are sending. To send a URL, use

Content-Type: application/json

To send an image stream instead, set the Content-Type to application/octet-stream or multipart/form-data.

In the body of the POST request, you can send JSON that includes the URL of the image location. Here is an example:

{ "Url": "http://media.tumblr.com/tumblr_lrbhs0RY2o1qaaiuh.png"}
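
Putting these pieces together, a complete request looks something like the following (the subscription key shown is a placeholder):

POST https://api.projectoxford.ai/vision/v1/ocr?language=en HTTP/1.1
Content-Type: application/json
Host: api.projectoxford.ai
Ocp-Apim-Subscription-Key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

{ "Url": "http://media.tumblr.com/tumblr_lrbhs0RY2o1qaaiuh.png"}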

This web service returns a JSON object containing an array of regions, each of which represents a block of text found in the image. Within each region is an array of lines, and within each line is an array of words.

Region, line, and word objects contain a boundingBox object with coordinates of where to find the corresponding object within the image. Each word object contains the actual text detected, including any punctuation.
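
To give you an idea of the shape of the data, a response for a simple image looks something like this (the values below are illustrative, not actual output):

{
  "language": "en",
  "regions": [
    {
      "boundingBox": "38,42,366,85",
      "lines": [
        {
          "boundingBox": "38,42,366,30",
          "words": [
            { "boundingBox": "38,42,92,30", "text": "Hello," },
            { "boundingBox": "142,42,110,30", "text": "world" }
          ]
        }
      ]
    }
  ]
}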

The beauty of a REST web service is that you can call it from any language or platform that supports HTTP requests (which is pretty much all of them).

The following example uses JavaScript and jQuery to call this API. It assumes that you have a DIV tag on the page with id="OutputDiv" and that you have a reference to jQuery before this code.

var myKey="<replace_with_your_subscription_key>";
var url="http://media.tumblr.com/tumblr_lrbhs0RY2o1qaaiuh.png";
$.ajax({
    type: "POST",
    url: "https://api.projectoxford.ai/vision/v1/ocr?language=en",
    headers: { "Ocp-Apim-Subscription-Key":myKey },
    contentType: "application/json",
    data: '{ "Url": "' + url + '" }'
}).done(function (data) {
    var outputDiv = $("#OutputDiv");
    outputDiv.text(""); 
 
    var linesOfText = data.regions[0].lines;
    // Loop through each line of text and create a DIV tag 
    // containing each word, separated by a space
    // Append this newly-created DIV to OutputDiv
    for (var i = 0; i < linesOfText.length; i++) {
        var output = "";
        var thisLine = linesOfText[i];
        var words = thisLine.words;
        for (var j = 0; j < words.length; j++) {
            var thisWord = words[j];
            output += thisWord.text;
            output += " ";
        }
        var newDiv = "<div>" + output + "</div>";
        outputDiv.append(newDiv);
    }
}).fail(function (err) {
    $("#OutputDiv").text("ERROR!" + err.responseText);
}); 

The call to the web service is made with this code:

$.ajax({
    type: "POST",
    url: "https://api.projectoxford.ai/vision/v1/ocr?language=en",
    headers: { "Ocp-Apim-Subscription-Key":myKey },
    contentType: "application/json",
    data: '{ "Url": "' + url + '" }' 

which sends a POST request and passes the URL as part of a JSON object in the request body.

Because this request is asynchronous, the "done" function is called when the request returns successfully.

            }).done(function (data) {

The function tied to the "done" event parses through the returned JSON and displays it on the screen.

If an error occurs, we output a simple error message to the user in the "fail" function.

}).fail(function (err) {
    $("#OutputDiv").text("ERROR!" + err.responseText);
}); 

Most of the code above is just formatting the output, so the REST call itself is quite simple. Project Oxford makes this type of analysis much easier for developers, regardless of their platform.

You can find this code at my GitHub repository.

In this article, you learned about the Project Oxford OCR API and how to call it from a JavaScript application.

Saturday, 19 March 2016 17:22:45 (GMT Standard Time, UTC+00:00)
# Thursday, 17 March 2016

Speech recognition is a problem on which computer scientists have been working for years. Project Oxford applies the science of Machine Learning to this problem in order to recognize words spoken and determine their probable meaning based on context.

Project Oxford exposes a REST web service so that you can add speech recognition to your application.

Before you can use the Speech API, you must register at Project Oxford and retrieve the Speech API key.

Figure 1: Speech API Key

The easiest way to use this API in a .NET application is to use the SpeechRecognition library. A NuGet package makes it easy to add this library to your application. In Visual Studio 2015, create a new WPF application (File | New | Project | Windows | WPF Application). Then, right-click the project in the Solution Explorer and select Manage NuGet Packages. Search for and add the "Microsoft.ProjectOxford.SpeechRecognition" package. Select the "x64" or "x86" version that corresponds with your version of Windows.

Figure 2: NuGet dialog

Now, you can start using the library to call the Speech API.

Add the following using statement to the top of a class file:

using Microsoft.ProjectOxford.SpeechRecognition; 

Within the class, declare a private instance of the MicrophoneRecognitionClient class

MicrophoneRecognitionClient _microphoneRecognitionClient; 

To begin listening to speech, instantiate the MicrophoneRecognitionClient object by using the SpeechRecognitionServiceFactory.CreateMicrophoneClient method and pass in the Speech Recognition Mode, the language to listen for, and your Speech Subscription Key.

The Speech Recognition Mode is an enum that can be either ShortPhrase or LongDictation. These are optimized for shorter or longer voice messages, respectively. Below is an example of creating a new MicrophoneRecognitionClient instance:

var speechRecognitionMode = SpeechRecognitionMode.ShortPhrase;
string language = "en-us";
string subscriptionKey = ConfigurationManager.AppSettings["SpeechKey"].ToString(); 
 
_microphoneRecognitionClient
        = SpeechRecognitionServiceFactory.CreateMicrophoneClient
                        (
                        speechRecognitionMode,
                        language,
                        subscriptionKey
                        ); 
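
The example above reads the subscription key from configuration via ConfigurationManager, so it assumes your App.config contains an appSettings entry similar to the following (the value shown is a placeholder for your actual key):

<configuration>
  <appSettings>
    <add key="SpeechKey" value="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" />
  </appSettings>
</configuration>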

Now that you have a MicrophoneRecognitionClient object, wire up the OnPartialResponseReceived and the OnResponseReceived events to listen for speech and call the API to turn that speech into text.

_microphoneRecognitionClient.OnPartialResponseReceived += OnPartialResponseReceivedHandler;
_microphoneRecognitionClient.OnResponseReceived += OnMicShortPhraseResponseReceivedHandler;

The MicrophoneRecognitionClient object calls the web service frequently - often after every word - to interpret the words it has heard so far. When it makes this call, its OnPartialResponseReceived event fires.

The signature of OnPartialResponseReceivedHandler is:

void OnPartialResponseReceivedHandler(object sender, PartialSpeechResponseEventArgs e)

and you can retrieve Oxford's text interpretation of the spoken words from e.PartialResult. Oxford may revise its interpretation of words spoken at the beginning of a sentence as it receives more of the sentence and gains context.

After a significant pause, the MicrophoneRecognitionClient object will decide that the user has finished speaking. At this point, it fires the OnResponseReceived event, giving you a chance to clean up. The EndMicAndRecognition method of the MicrophoneRecognitionClient stops listening and severs the connection to the web service.

Here is some code that may be appropriate in the OnResponseReceived event handler:

_microphoneRecognitionClient.EndMicAndRecognition();
_microphoneRecognitionClient.Dispose();
_microphoneRecognitionClient = null; 

I have created a sample WPF app with a single window containing the following XAML:

<StackPanel Name="MainStackPanel" Orientation="Vertical" VerticalAlignment="Top">
    <Button Name="RecordButton" Width="250" Height="100" 
            FontSize="32" VerticalAlignment="Top" 
            Click="RecordButton_Click">
        Start!
    </Button>
    <TextBox Name="OutputTextbox" VerticalAlignment="Top" Width="600" 
        TextWrapping="Wrap" FontSize="18"></TextBox>
</StackPanel> 

The code-behind for this window is listed below. It includes some visual cues that the app is listening and displays the latest text returned from the Speech API.

using System;
using System.Configuration;
using System.Threading;
using System.Windows;
using System.Windows.Media;
using Microsoft.ProjectOxford.SpeechRecognition; 
 
namespace SpeechToTextDemo
{
    /// <summary>
    /// Interaction logic for MainWindow.xaml
    /// </summary>
    public partial class MainWindow : Window
    {
        AutoResetEvent _FinalResponseEvent;
        MicrophoneRecognitionClient _microphoneRecognitionClient; 
 
        public MainWindow()
        {
            InitializeComponent();
            RecordButton.Content = "Start\nRecording";
            _FinalResponseEvent = new AutoResetEvent(false);
            OutputTextbox.Background = Brushes.White;
            OutputTextbox.Foreground = Brushes.Black;
        } 
 
        private void RecordButton_Click(object sender, RoutedEventArgs e)
        {
            RecordButton.Content = "Listening...";
            RecordButton.IsEnabled = false;
            OutputTextbox.Background = Brushes.Green;
            OutputTextbox.Foreground = Brushes.White;
            ConvertTextToSpeech();
        } 
 
        /// <summary>
        /// Start listening. 
        /// </summary>
        private void ConvertTextToSpeech()
        {
            var speechRecognitionMode = SpeechRecognitionMode.ShortPhrase;
            string language = "en-us";
            string subscriptionKey = ConfigurationManager.AppSettings["SpeechKey"].ToString(); 
 
            _microphoneRecognitionClient
                    = SpeechRecognitionServiceFactory.CreateMicrophoneClient
                                    (
                                    speechRecognitionMode,
                                    language,
                                    subscriptionKey
                                    ); 
 
            _microphoneRecognitionClient.OnPartialResponseReceived += OnPartialResponseReceivedHandler;
            _microphoneRecognitionClient.OnResponseReceived += OnMicShortPhraseResponseReceivedHandler;
            _microphoneRecognitionClient.StartMicAndRecognition(); 
 
        } 
 
        void OnPartialResponseReceivedHandler(object sender, PartialSpeechResponseEventArgs e)
        {
            string result = e.PartialResult;
            Dispatcher.Invoke(() =>
            {
                OutputTextbox.Text = (e.PartialResult);
                OutputTextbox.Text += ("\n"); 
 
            });
        } 
 
        /// <summary>
        /// Speaker has finished speaking. Sever connection to server, stop listening, and clean up
        /// </summary>
        /// <param name="sender"></param>
        /// <param name="e"></param>
        void OnMicShortPhraseResponseReceivedHandler(object sender, SpeechResponseEventArgs e)
        {
            Dispatcher.Invoke((Action)(() =>
            {
                _FinalResponseEvent.Set();
                _microphoneRecognitionClient.EndMicAndRecognition();
                _microphoneRecognitionClient.Dispose();
                _microphoneRecognitionClient = null;
                RecordButton.Content = "Start\nRecording";
                RecordButton.IsEnabled = true;
                OutputTextbox.Background = Brushes.White;
                OutputTextbox.Foreground = Brushes.Black; 
 
            }));
        }
    }
}

You can download this project from my GitHub repository.

In this article, you learned how to use the Project Oxford Speech Recognition .NET library to take advantage of the Oxford Speech API and add speech-to-text capabilities to your application.

Thursday, 17 March 2016 12:26:00 (GMT Standard Time, UTC+00:00)
# Wednesday, 16 March 2016

In the last article, we showed how to call the Project Oxford Emotions API via REST in order to determine the emotions of every person in a picture.

In this article, I will show you how to use a .NET library to call this API. A .NET library simplifies the process by abstracting away HTTP calls and providing strongly-typed objects with which to work in your .NET code.

As with the REST call, we begin by signing up for Project Oxford and getting the key for this API, which you can do at https://www.projectoxford.ai/Subscription?popup=True.

Figure 1: Key

To use the .NET library, launch Visual Studio and create a new Universal Windows App (File | New | Project | Windows | Blank (Universal Windows))

Add the Emotions NuGet Package to your project (Right-click project | Manage NuGet Packages); then search for and install Microsoft.ProjectOxford.Emotion. This will add the appropriate references to your project.

In your code, add the following statements to the top of your class file:

using Microsoft.ProjectOxford.Emotion;
using Microsoft.ProjectOxford.Emotion.Contract; 

To use this library, we create an instance of the EmotionServiceClient class, passing in our key to the constructor.

var emotionServiceClient = new EmotionServiceClient(emotionApiKey);

The RecognizeAsync method of this class accepts the URL of an image and returns an array of Emotion objects.

Emotion[] emotionResult = await emotionServiceClient.RecognizeAsync(imageUrl); 

Each emotion object represents a single face detected in the picture and contains the following properties:

FaceRectangle: This indicates the location of the face

Scores: A set of values corresponding to each emotion (anger, contempt, disgust, fear, happiness, neutral, sadness, and surprise), with a value indicating the confidence with which Oxford thinks the face matches this emotion. Confidence values are between 0 and 1; higher values indicate a higher confidence that this is the correct emotion.

The code below returns a string indicating the most likely emotion for every face in an image.

var sb = new StringBuilder();
var faceNumber = 0;
foreach (Emotion em in emotionResult)
{
    faceNumber++;
    var scores = em.Scores;
    var anger = scores.Anger;
    var contempt = scores.Contempt;
    var disgust = scores.Disgust;
    var fear = scores.Fear;
    var happiness = scores.Happiness;
    var neutral = scores.Neutral;
    var surprise = scores.Surprise;
    var sadness = scores.Sadness; 
 
    var emotionScoresList = new List<EmotionScore>();
    emotionScoresList.Add(new EmotionScore("anger", anger));
    emotionScoresList.Add(new EmotionScore("contempt", contempt));
    emotionScoresList.Add(new EmotionScore("disgust", disgust));
    emotionScoresList.Add(new EmotionScore("fear", fear));
    emotionScoresList.Add(new EmotionScore("happiness", happiness));
    emotionScoresList.Add(new EmotionScore("neutral", neutral));
    emotionScoresList.Add(new EmotionScore("surprise", surprise));
    emotionScoresList.Add(new EmotionScore("sadness", sadness)); 
 
    var maxEmotionScore = emotionScoresList.Max(e => e.EmotionValue);
    var likelyEmotion = emotionScoresList.First(e => e.EmotionValue == maxEmotionScore); 
 
    string likelyEmotionText = string.Format("Face {0} is {1:N2}% likely to be experiencing: {2}\n\n", 
        faceNumber, likelyEmotion.EmotionValue * 100, likelyEmotion.EmotionName.ToUpper());
    sb.Append(likelyEmotionText); 
 
}
var resultsText = sb.ToString(); 
 

This will return a string similar to the following:

Face 1 is 99.36% likely to be experiencing: NEUTRAL

Face 2 is 100.00% likely to be experiencing: HAPPINESS

Face 3 is 95.02% likely to be experiencing: SADNESS
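
The snippet above assumes a small helper class named EmotionScore that pairs each emotion name with its confidence value (it also needs using statements for System.Collections.Generic, System.Linq, and System.Text). A minimal version of that class might look like this:

public class EmotionScore
{
    // Name of the emotion (e.g. "anger") and the confidence value returned by Oxford
    public string EmotionName { get; set; }
    public float EmotionValue { get; set; }
 
    public EmotionScore(string emotionName, float emotionValue)
    {
        EmotionName = emotionName;
        EmotionValue = emotionValue;
    }
}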

You can download this Visual Studio 2015 Universal Windows App project from here.

Full documentation on the Emotion library is available here. You can find a more complete (although more complicated) demo of this library here.

In this article, you learned how to use the .NET libraries to call the Project Oxford Emotion API and detect emotion in the faces of an image.

Wednesday, 16 March 2016 13:11:00 (GMT Standard Time, UTC+00:00)
# Tuesday, 15 March 2016

It's difficult enough for humans to recognize emotions in the faces of other humans. Can a computer accomplish this task? It can if we train it to and if we give it enough examples of different faces with different emotions.

When we supply data to a computer with the objective of training that computer to recognize patterns and predict new data, we call that Machine Learning. And Microsoft has done a lot of Machine Learning with a lot of faces and a lot of data and they are exposing the results for you to use.

The Emotions API in Project Oxford looks at pictures of people and determines their emotions. Possible emotions returned are anger, contempt, disgust, fear, happiness, neutral, sadness, and surprise. Each emotion is assigned a confidence level between 0 and 1 - higher numbers indicate a higher confidence that this is the emotion expressed in the face. If a picture contains multiple faces, the emotion of each face is returned.

The API is a simple REST web service located at https://api.projectoxford.ai/emotion/v1.0/recognize. POST to this service with a header that includes:
Ocp-Apim-Subscription-Key:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

where xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx is your key. You can find your key at https://www.projectoxford.ai/Subscription?popup=True

and a body that includes the following data:

{ "url": "http://xxxx.com/xxxx.jpg" }

where http://xxxx.com/xxxx.jpg is the URL of an image.
The full request looks something like:
POST https://api.projectoxford.ai/emotion/v1.0/recognize HTTP/1.1
Content-Type: application/json
Host: api.projectoxford.ai
Content-Length: 62
Ocp-Apim-Subscription-Key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

{ "url": "http://xxxx.com/xxxx.jpg" }

This will return JSON data identifying each face in the image and a score indicating how confident this API is that the face is expressing each of 8 possible emotions. For example, passing the URL of the picture below of 3 attractive, smiling people

Figure: Photo of 3 smiling people at SpartaHack 2016
(found online at https://giard.smugmug.com/Tech-Community/SpartaHack-2016/i-4FPV9bf/0/X2/SpartaHack-068-X2.jpg)

returned the following data:

[
  {
    "faceRectangle": {
      "height": 113,
      "left": 285,
      "top": 156,
      "width": 113
    },
    "scores": {
      "anger": 1.97831262E-09,
      "contempt": 9.096525E-05,
      "disgust": 3.86221245E-07,
      "fear": 4.26409547E-10,
      "happiness": 0.998336,
      "neutral": 0.00156954059,
      "sadness": 8.370223E-09,
      "surprise": 3.06117772E-06
    }
  },
  {
    "faceRectangle": {
      "height": 108,
      "left": 831,
      "top": 169,
      "width": 108
    },
    "scores": {
      "anger": 2.63808062E-07,
      "contempt": 5.387114E-08,
      "disgust": 1.3360991E-06,
      "fear": 1.407629E-10,
      "happiness": 0.9999967,
      "neutral": 1.63170478E-06,
      "sadness": 2.52861843E-09,
      "surprise": 1.91028926E-09
    }
  },
  {
    "faceRectangle": {
      "height": 100,
      "left": 591,
      "top": 168,
      "width": 100
    },
    "scores": {
      "anger": 3.24157673E-10,
      "contempt": 4.90155344E-06,
      "disgust": 6.54665473E-06,
      "fear": 1.73284559E-06,
      "happiness": 0.9999156,
      "neutral": 6.42121E-05,
      "sadness": 7.02297257E-06,
      "surprise": 5.53670576E-09
    }
  }
]

The high values for the 3 happiness scores and the very low values for all the other scores suggest a very high degree of confidence that the people in this photo are happy.

Here are the request and response in the popular HTTP analysis tool Fiddler [http://www.telerik.com/fiddler]:

Figure: The request in Fiddler

Figure: The response in Fiddler
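
You can, of course, build the same request in code. Here is a minimal C# sketch using HttpClient (the subscription key and image URL are placeholders, and error handling is omitted):

using System;
using System.Net.Http;
using System.Text;
 
class EmotionApiDemo
{
    static void Main()
    {
        var client = new HttpClient();
        client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx");
 
        // The body contains the URL of the image to analyze
        var body = new StringContent("{ \"url\": \"http://xxxx.com/xxxx.jpg\" }", Encoding.UTF8, "application/json");
 
        // POST to the Emotion API and display the returned JSON (an array of faces with emotion scores)
        var response = client.PostAsync("https://api.projectoxford.ai/emotion/v1.0/recognize", body).Result;
        string json = response.Content.ReadAsStringAsync().Result;
        Console.WriteLine(json);
    }
}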

Sending requests to the Project Oxford REST API makes it simple to analyze the emotions of people in a photograph.

Tuesday, 15 March 2016 09:57:07 (GMT Standard Time, UTC+00:00)
# Monday, 14 March 2016

Generating a thumbnail image from a larger image sounds easy - just shrink the dimensions of the original, right? But it becomes more complicated if the thumbnail image is a different shape than the original. In this case, we will need to crop or distort the original image. Distorting the image tends to look very bad; and when we crop an image, we need to ensure that the primary subject of the image remains in the generated thumbnail. To do this, we need to identify the primary subject of the image. That's easy enough for a human observer to do, but it is a difficult task for a computer - and a computer must do it if we want to automate the process.

This is where machine learning can help. By analyzing many images, Machine Learning can figure out what parts of a picture are likely to be the main subject. Once this is known, it becomes a simpler matter to crop the picture in such a way that the main subject remains.

Project Oxford uses Machine Learning so that you don't have to. It exposes an API to create an intelligent thumbnail image from any picture.

You can see this in action at www.projectoxford.ai/demo/vision#Thumbnail.


Figure 1: The live Thumbnail demo

With this live, in-browser demo, you can either select an image from the gallery and view the generated thumbnails; or provide your own image - either from your local computer or from a public URL. The page uses the Thumbnail API to create thumbnails of 6 different dimensions.

Figure 2: Thumbnails generated by the demo

For your own application, you can either call the REST Web Service directly or (for a .NET application) use a custom library. The library simplifies development by abstracting away HTTP calls via strongly-typed objects.

To get started, you will need a free Project Oxford account and you will need to sign into projectoxford.ai with a Microsoft account.

For this API, you need a key. From the Computer Vision API page (Figure 3), click the [Try for free >] button; then, click the "Show" link under the Primary key of the "Computer Vision" section (Figure 4).

Figure 3: Computer Vision API page

Figure 4: Subscription key

To use the SDK, add the Microsoft.ProjectOxford.Vision NuGet package to your project: Right-click on your project, select Manage NuGet Packages, search for "ProjectOxford.Vision", select the package from the list, and click the [Install] button, as shown in Figure 5.

Figure 5: NuGet Package Manager

This adds a reference to Microsoft.ProjectOxford.Vision.dll, which contains classes that make it easier to call this API.

Add the following statement to the top of a class file to use this library.

using Microsoft.ProjectOxford.Vision;

Now, you can use the methods in the VisionServiceClient class to interact with the API.

Create a VisionServiceClient with the following code:

string subscriptionKey = "xxxxxxxxxxxxxxxxxxxxxxxxxxx";
IVisionServiceClient visionClient = new VisionServiceClient(subscriptionKey);

where “xxxxxxxxxxxxxxxxxxxxxxxxxxx” is your subscription key.

Next, use the GetThumbnailAsync method to generate a thumbnail image. The following code creates a 200x100 thumbnail of a photo of a buoy in Stockholm, Sweden.

string originalPicture = @"https://giard.smugmug.com/Travel/Sweden-2015/i-ncF6hXw/0/L/IMG_1560-L.jpg";
int width = 200;
int height = 100;
bool smartCropping = true;
byte[] thumbnailResult = null;
thumbnailResult = visionClient.GetThumbnailAsync(originalPicture, width, height, smartCropping).Result;

The result is an array of bytes, but you can save the corresponding image to a file with the following code:

string folder = @"c:\test";
string thumbnaileFullPath = string.Format("{0}\\thumbnailResult_{1:yyyyMMddHHmmss}.jpg", folder, DateTime.Now);
using (BinaryWriter binaryWrite = new BinaryWriter(new FileStream(thumbnaileFullPath, FileMode.Create, FileAccess.Write)))
{
    binaryWrite.Write(thumbnailResult);
}

Below is the full listing of a Console App that generates a thumbnail, then opens both the original image and the saved thumbnail image for comparison.

using System;
using System.Diagnostics;
using System.IO;
using Microsoft.ProjectOxford.Vision;
 
namespace ThumbNailConsole
{
    class Program
    {
        static void Main(string[] args)
        {
 
            string subscriptionKey = "xxxxxxxxxxxxxxxxxxxxxxxxxxx";
            IVisionServiceClient visionClient = new VisionServiceClient(subscriptionKey);
 
            string originalPicture = @"https://giard.smugmug.com/Travel/Sweden-2015/i-ncF6hXw/0/L/IMG_1560-L.jpg";
            int width = 200;
            int height = 100;
            bool smartCropping = true;
            byte[] thumbnailResult = null;
            thumbnailResult = visionClient.GetThumbnailAsync(originalPicture, width, height, smartCropping).Result;
 
            string folder = @"c:\test";
            string thumbnaileFullPath = string.Format("{0}\\thumbnailResult_{1:yyyyMMddHHmmss}.jpg", folder, DateTime.Now);
            using (BinaryWriter binaryWrite = new BinaryWriter(new FileStream(thumbnaileFullPath, FileMode.Create, FileAccess.Write)))
            {
                binaryWrite.Write(thumbnailResult);
            }
 
            Process.Start(thumbnaileFullPath);
            Process.Start(originalPicture);
 
            Console.WriteLine("Done! Thumbnail is at {0}!", thumbnaileFullPath);
        }
    }
}

The result is shown in Figure 6 below.

Figure 6: The original image and the generated thumbnail

One thing to note: the Thumbnail API is part of the Computer Vision API. As of this writing, the free version of the Computer Vision API is limited to 5,000 transactions per month. If you want more than that, you will need to upgrade to the Standard version, which charges $1.50 per 1,000 transactions.

But this should be plenty for you to learn this API for free and build and test your applications until you need to put them into production.

The code above can be found on GitHub.

Monday, 14 March 2016 04:01:00 (GMT Standard Time, UTC+00:00)
# Sunday, 13 March 2016

Project Oxford is a set of APIs that take advantage of Machine Learning to provide developers with services for analyzing images, speech, and language.

These technologies require Machine Learning, which requires a lot of computing power and a lot of data. Most of us have neither, but Microsoft does and has used it to create the APIs in Project Oxford.

Project Oxford provides APIs to analyze pictures and voice and provide intelligent information about them.

There are three broad categories of services: Vision, Voice, and Language.

The Vision APIs analyze pictures and recognize objects in those pictures. For example, several Vision APIs are capable of recognizing faces in an image. One analyzes each face and deduces that person's emotion; another can compare 2 pictures and decide whether or not they show the same person; a third guesses the age of each person in a photo.

The Speech APIs can convert speech to text or text to speech. They can also recognize the voice of a given speaker (if you want to use voice for authentication in your app, for example) and infer the intent of the speaker from his words and tone.

The Language APIs seem more of a grab bag to me. For example, a spell checker is smart enough to recognize common proper names and homonyms.

All these APIs are currently in Preview, but I've played with them and they appear very solid. Many of them even provide a confidence factor to let you know how confident you should be in the value returned. For example, 2 faces may represent the same person, but it helps to know how closely they match.

To get started, you need a Project Oxford account, which you can get for free at projectoxford.ai.

Each API offers a free option that restricts the number and/or frequency of calls, but you can break through that boundary for a charge.

You can also find documentation, sample code, and even a place to try out each API live in your browser at projectoxford.ai.

You call each one by sending JSON to and receiving JSON from a RESTful web service, but some of them offer an SDK that makes it easier to make that call from a .NET application.

You can see a couple of fun applications of Project Oxford at how-old.net (which guesses the ages of people in photographs) and what-dog.net (which identifies the breed of dog in a photo).

Sign up today and start building apps. It’s fun and it’s free!

Sunday, 13 March 2016 03:14:12 (GMT Standard Time, UTC+00:00)
# Friday, 11 March 2016

The auditorium darkened. The music began and a small light appeared at the front of the room; then more. Students on stage danced and waved lanterns on ropes for an impressive musical light show to kick off the 2016 SpartaHack hackathon.

 

Students came from all over the world to attend this hackathon on the East Lansing campus. Over 200 universities were represented among the applicants. In addition to a number of international students studying on American campuses, I met students who traveled to the hackathon from India, Russia, Germany, and the Philippines.

My colleague Brian Sherwin arrived in East Lansing the day before the hackathon to host an Azure workshop for 30 students - showing them how to use the cloud platform to enhance their applications. Ann Lergaard joined us a day later and we did our best to answer student questions and help them build better projects. Late Friday night, I delivered a tech talk showing off some of the services available in Azure.

Microsoft offered a prize for the best hack using our technology. It was won by 2 students who built an application that allowed users to take a photo of text with their iPhone and, in response to voice commands, read back any part of that text. The project combined Microsoft's Project Oxford OCR API with an Amazon Echo and its Alexa platform, an iPhone app, and a Firebase database.

A couple other cool hacks were:

  • ValU, an app that analyzed historical stock price data using Excel VBA scripts.
  • Spartifai, which modified a driver, allowing a Kinect device to be used with a MacBook.

A hackathon is an event at which students and others come together and build software and/or hardware projects in small teams over the course of a couple days. I attend a lot of hackathons and SpartaHack was one of the better organized that I've seen. Over 500 students spent the weekend building a wide variety of impressive projects - often with technology they had not touched prior to that weekend. The organizers also did a great job of providing fun activities beyond just hacking. A jazz band and a rock band each performed a set for students to enjoy during a break; a Super Smash Brothers tournament was scheduled; and a Blind Coding Contest challenged students to write code without compiling or testing to see if it would run correctly the first time in front of an audience.

As sponsors of the event, we tried to provide some fun as well. We gave away prizes for building a snowman and for tweeting about open source technology. We also provided some loaner hardware for students; and we spent a lot of time mentoring students, which resulted in a lack of sleep this weekend.

The MSU campus has changed a great deal since I earned my undergraduate degree there decades ago. It has even changed since my son graduated from there 4 years ago. But it still felt like a homecoming for me.

 

 

Friday, 11 March 2016 16:42:00 (GMT Standard Time, UTC+00:00)