In a previous article, I described the details of the OCR Service, which is part of the Microsoft Cognitive Services Computer Vision API.
To make this API useful, you need to write some code and build an application that calls this service.
If you want to follow along, you can find all the code in the "OCRDemo" project, included in this set of demos.
To use this demo project, you will first need to create a Computer Vision API service, as described here.
Read the project's read.me file, which explains the setup required to run the demo with your own account.
If you open index.html in the browser, you will see that it displays an image of a poem, along with some controls on the left:
- A dropdown list to change the poem image
- A dropdown list to select the language of the poem text
- A [Get Text] button that calls the web service
Fig. 1 shows index.html when it first loads:
The page contains an empty div with id="OutputDiv".
In the first two lines, we select this div and set its text to "Thinking…" while the web service is being called.
var outputDiv = $("#OutputDiv");
outputDiv.text("Thinking…");
Next, we get the URL of the image containing the currently displayed poem and the selected language. These both come from the selected items of the two dropdowns.
Then, we get the API key by calling the getKey() function, which is defined in the getkey.js file. You will need to update this file yourself, adding your own key, as described in the read.me.
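For orientation, getkey.js could be as small as the sketch below. Only the function name getKey() comes from the demo; the placeholder string is an assumption, and the read.me remains the authority on what the file should contain.

```javascript
// getkey.js (sketch): returns the subscription key for your own
// Computer Vision API service. The value below is a placeholder,
// not a real key; replace it with the key from your account.
function getKey() {
  return "YOUR_COMPUTER_VISION_API_KEY";
}
```

Keeping the key in its own file makes it easier to leave your real key out of source control.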
Now, it's time to call the web service. My Computer Vision API service was created in the West Central US region, so I've hard-coded the URL for that region. You may need to change this if you created your service in a different region.
I add a querystring parameter to the URL to indicate the selected language.
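As a rough sketch (not the demo's exact code), the URL could be assembled as shown below. The function name buildOcrUrl is hypothetical, and the host pattern and the /vision/v1.0/ocr path are assumptions based on version 1.0 of the Computer Vision API; check your service's settings for the endpoint that applies to you.

```javascript
// Sketch only: builds the OCR endpoint URL for a given region and language.
// Assumption: the endpoint follows the pattern
//   https://<region>.api.cognitive.microsoft.com/vision/v1.0/ocr
function buildOcrUrl(region, language) {
  const url = new URL(
    "https://" + region + ".api.cognitive.microsoft.com/vision/v1.0/ocr"
  );
  // Add the selected language as a querystring parameter.
  url.searchParams.set("language", language);
  return url.toString();
}

// Example: the West Central US region with English selected.
const ocrUrl = buildOcrUrl("westcentralus", "en");
```

Passing the region as a parameter, rather than hard-coding it, means only one value changes if your service lives somewhere else.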
Then, I call the web service by submitting an HTTP POST request to the web service URL, passing in the appropriate headers and constructing a JSON document to pass in the request body.
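The request itself might look something like the following sketch, which builds a settings object suitable for jQuery's $.ajax. The function name buildOcrRequest is hypothetical. The Ocp-Apim-Subscription-Key header and the {"url": ...} body shape are assumptions based on the Computer Vision API documentation rather than taken from the demo code.

```javascript
// Sketch only: returns a settings object that could be passed to $.ajax.
// Assumptions: the key travels in the Ocp-Apim-Subscription-Key header,
// and the request body is a JSON document with a single "url" property.
function buildOcrRequest(serviceUrl, apiKey, imageUrl) {
  return {
    url: serviceUrl,
    method: "POST",
    contentType: "application/json",
    headers: {
      "Ocp-Apim-Subscription-Key": apiKey
    },
    // The body tells the service where to find the image to read.
    data: JSON.stringify({ url: imageUrl })
  };
}

// Possible use in the page, where jQuery is loaded:
// $.ajax(buildOcrRequest(serviceUrl, getKey(), imageUrl));
```

Building the settings in a separate function keeps the headers and body easy to inspect before anything is sent.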
Finally, I process the results when the HTTP response returns.
The returned JSON contains an array of regions; each region contains an array of lines; and each line contains an array of words.
In this example, I simply loop through each word in each line in each region, concatenating the words and adding some HTML to format line breaks.
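The nested loop could be sketched as follows. The function names are hypothetical, and two details are assumptions rather than the demo's confirmed behavior: words are joined with a single space, and each word is HTML-escaped before being added to the page. The sample object only mimics the regions, lines, and words shape described above.

```javascript
// Escape characters that would otherwise be interpreted as HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Sketch only: walks regions -> lines -> words and returns HTML in which
// each recognized line of text ends with a <br> tag.
function ocrResultToHtml(result) {
  const htmlLines = [];
  for (const region of result.regions || []) {
    for (const line of region.lines || []) {
      const words = (line.words || []).map(function (word) {
        return escapeHtml(word.text);
      });
      htmlLines.push(words.join(" ") + "<br>");
    }
  }
  return htmlLines.join("\n");
}

// Made-up sample data in the shape described above.
const sampleResult = {
  regions: [
    {
      lines: [
        { words: [{ text: "Roses" }, { text: "are" }, { text: "red" }] },
        { words: [{ text: "Violets" }, { text: "are" }, { text: "blue" }] }
      ]
    }
  ]
};

const sampleHtml = ocrResultToHtml(sampleResult);
// sampleHtml is "Roses are red<br>\nViolets are blue<br>"
```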
Then, I append this HTML to the outputDiv and follow it with a horizontal rule to mark the end of the text.
I also catch any errors that might occur, displaying a generic message in the outputDiv, where the returned text would have been.
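One way to organize the success and failure paths is sketched below. The function names and the wording of the error message are hypothetical, and clearing the "Thinking…" placeholder before appending is an assumption about the desired behavior. Both functions expect a jQuery object such as $("#OutputDiv").

```javascript
// Success path: clear the placeholder, append the recognized text,
// then add a horizontal rule to mark the end of the output.
function showOcrText(outputDiv, html) {
  outputDiv.empty();
  outputDiv.append(html);
  outputDiv.append("<hr>");
}

// Failure path: show a generic message where the text would have appeared.
function showOcrError(outputDiv) {
  outputDiv.text("Sorry, something went wrong while reading the image.");
}

// Possible wiring in the page, where jQuery is loaded and "settings"
// stands for whatever request settings you pass to $.ajax:
// $.ajax(settings)
//   .done(function (result) { /* build HTML from result, then call showOcrText */ })
//   .fail(function () { showOcrError($("#OutputDiv")); });
```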
Fig. 2 shows the results after a successful web service call.
Try this yourself to see it in action. The process is very similar in other languages.