# Monday, July 1, 2019

Episode 570

Laurent Bugnion on Migrating Data to Azure

Laurent Bugnion describes how he migrated from on-premises MongoDB and SQL Server databases to Cosmos DB and Azure SQL Database running in Microsoft Azure, using both native tools and the Database Migration Service.

Monday, July 1, 2019 9:39:00 AM (GMT Daylight Time, UTC+01:00)
# Sunday, June 30, 2019

From a distance, you would swear he was half his 75 years. You would never guess he underwent heart surgery two months ago. Only the cracks in his face revealed Mick Jagger's age. Not his body, which gyrated and strutted and danced for 2 hours as the legendary Rolling Stones performed before an overflowing Soldier Field in Chicago Tuesday night.

Most viewers saw him from a distance in the cavernous stadium. But the energy was high, and the audience sang and danced along with the band. Mick, Keith Richards, Ron Wood, and Charlie Watts have been recording and touring together for decades. People may think of Darryl Jones (of Chicago's South Side), who joined the band when bassist Bill Wyman retired in 1993, and Chuck Leavell, the former Allman Brothers keyboardist, who has been with the band since 1982, as the new members; but each has a tenure longer than most bands exist. Yet they are newbies when compared with their septuagenarian teammates.

This was my first time seeing the Stones, and it may be their last visit to Chicago. Now in their 50th year, the band is taking this year's "No Filter" tour to 13 cities in the U.S., and they chose to open in Chicago after Jagger's illness forced them to reshuffle the tour schedule.

To the delight of the crowd, the show was full of references to Chicago. Mick noted the band had played the city nearly 40 times, and he introduced the new Chicago mayor and the Illinois governor, who were in attendance, noting that Governor Pritzker had signed legislation that day legalizing cannabis beginning in January. "Some of you may have jumped the gun," he quipped.

Of course, the Rolling Stones drew heavily from their catalog of hit songs - from opening with "Jumpin' Jack Flash" and "It's Only Rock 'n' Roll" to their encores: "Gimme Shelter" and "Satisfaction". But they included a few deeper album tracks, like "Bitch" and "Slipping Away".

It was mostly an evening of high-energy rock and roll and blues, but a highlight of the night came when Mick, Keith, Ron, and Charlie brought their instruments (including a small drum kit) to a platform that extended 30 yards out into the audience to play two acoustic numbers: "Play with Fire" and "Sweet Virginia".

When the evening ended, it felt like they had given all they had and all we needed.

After 50 years, the band knew every note by heart, but still brought energy and made us feel they were having a good time after all this time.

Sunday, June 30, 2019 9:45:00 AM (GMT Daylight Time, UTC+01:00)
# Saturday, June 29, 2019

Last month, I visited Copenhagen, Denmark for the first time and came away thinking that I'd never seen a city as bicycle-friendly.

But Amsterdam has Copenhagen beat by far in this regard. During morning rush hour, bicycle commuters easily outnumber automobiles; and I was told that Amsterdam has more bicycles than people.

I was mostly on foot, but I kept my head on a swivel as I walked around town, looking one way for cars and in the other direction for bicycles each time I ventured across the street.

It was my first visit to Amsterdam, and I came to work at an OpenHack - a workshop designed to teach a specific technology through problem-solving and hands-on experience. The OpenHack was a great success for everyone. Nearly all the feedback we received was positive, and people seemed to appreciate my coaching and enjoyed a presentation I delivered on Azure Data Factory. In addition, I led two "Envisioning Sessions" - explorations of projects a customer is considering that involve the use of cloud technology.

I arrived Sunday and spent most of the day resting before meeting up with Mike Amundsen, an old friend from Cincinnati, who happened to be in Amsterdam to speak at the GOTO conference.

The next two days consisted of hard work during the day, followed by dinners with my teammates in the evening. I made up for the rich food by walking miles around the city.

When the OpenHack wrapped up on Day 3, I headed over to the museum area and spent about an hour exploring the Stedelijk Museum of Modern Art, before meeting Brent and David at an Indonesian restaurant. The Indonesian food was amazing, consisting of samples of dozens of different dishes.

After dinner, Brent and I took a boat ride along the canals. It featured a pre-recorded guided tour of the city. Because the canals cover almost all of Amsterdam, we got to see nearly all the city in this way.

Friday I had to myself, so I bought tickets to two museums: the Van Gogh Museum, which features the works of the famous Dutch painter, along with those who influenced him; and the Rijksmuseum, which primarily features works of classic European artists from the Middle Ages to the present. It has a particularly impressive Rembrandt collection.

Friday morning, I had a special treat as I learned that Austin Haeberle - a friend I grew up with - was arriving in Amsterdam the same morning I was flying home. We had not seen each other in 30 years, so we met at the airport for breakfast and picked up where we left off.

If I had more time and energy, I could have spent days just exploring museums. I did not see the house where Anne Frank famously hid from the Nazis and wrote in her diary, nor the National Maritime Museum. I also did not get a chance to view Amsterdam's (in)famous Red-Light district. When I return, I will try to visit these places, as well as the rest of the Netherlands, which is a small enough country that one could drive across it in a couple of hours.

The point is that I would love to return.


More photos

Saturday, June 29, 2019 9:11:00 AM (GMT Daylight Time, UTC+01:00)
# Friday, June 28, 2019

Azure Data Factory (ADF) is an example of an Extract, Transform, and Load (ETL) tool, meaning that it is designed to extract data from a source system, optionally transform its format, and load it into a different destination system.

The source and destination data can reside in different locations, in different data stores, and can support different data structures.

For example, you can extract data from an Azure SQL database and load it into an Azure Blob storage container.

To create a new Azure Data Factory, log into the Azure Portal, click the [Create a resource] button (Fig. 1) and select Integration | Data Factory from the menu, as shown in Fig. 2.

df01-CreateResource
Fig. 1

df02-IntegrationDataFactory
Fig. 2

The "New data factory" blade displays, as shown in Fig. 3.

df03-NewDataFactory
Fig. 3

At the "Name" field, enter a unique name for this Data Factory.

At the Subscription dropdown, select the subscription with which you want to associate this Data Factory. Most of you will only have one subscription, making this an easy choice.

At the "Resource Group" field, select an existing Resource Group or create a new Resource Group which will contain your Data Factory.

At the "Version" dropdown, select "V2".

At the "Location" dropdown, select the Azure region in which you want your Data Factory to reside. Consider the location of the data with which it will interact and try to keep the Data Factory close to this data, in order to reduce latency.

Check the "Enable GIT" checkbox, if you want to integrate your ETL code with a source control system.

After the Data Factory is created, you can search for it by name or within the Resource Group containing it. Fig. 4 shows the "Overview" blade of a Data Factory.

df04-OverviewBlade
Fig. 4

To begin using the Data Factory, click the [Author & Monitor] button in the middle of the blade.

The "Azure Data Factory Getting Started" page displays in a new browser tab, as shown in Fig. 5.

df05-GetStarted
Fig. 5

Click the [Copy Data] button (Fig. 6) to display the "Copy Data" wizard, as shown in Fig. 7.

df06-CopyDataIcon
Fig. 6

df07-Properties
Fig. 7

This wizard steps you through the process of creating a Pipeline and its associated artifacts. A Pipeline performs an ETL on a single source and destination and may be run on demand or on a schedule.
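
Once the wizard has created a pipeline, you can also start it on demand from code rather than waiting for a schedule. The following is a rough sketch using the azure-mgmt-datafactory package; the subscription, resource group, factory, and pipeline names are placeholders.

```python
# Start an existing pipeline run on demand (names below are hypothetical examples).
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), "<your-subscription-id>")

run_response = adf_client.pipelines.create_run(
    "MyResourceGroup",
    "mydatafactory123",
    "CopySqlToBlobPipeline",   # the pipeline created by the Copy Data wizard (example name)
    parameters={},
)
print("Started pipeline run:", run_response.run_id)
```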

At the "Task name" field, enter a descriptive name to identify this pipeline later.

Optionally, you can add a description to your task.

You have the option to run the task on a regular or semi-regular schedule (Fig. 8); but you can set this later, so I prefer to select "Run once now" until I know it is working properly.

df08-Schedule
Fig. 8

Click the [Next] button to advance to the "Source data store" page, as shown in Fig. 9.

df09-Source
Fig. 9

Click the [+ Create new connection] button to display the "New Linked Service" dialog, as shown in Fig. 10.

df10-NewLinkedService
Fig. 10

This dialog lists all the supported data stores.
At the top of the dialog is a search box and a set of links, which allow you to filter the list of data stores, as shown in Fig. 11.

df11-AzureSql
Fig. 11

Fig. 12 shows the next dialog if you select Azure SQL Database as your data source.

df12-AzureSqlDetails
Fig. 12

In this dialog, you can enter information specific to the database from which you are extracting data. When complete, click the [Test connection] button to verify your entries are correct; then click the [Finish] button to close the dialog.

After successfully creating a new connection, the connection appears in the "Source data store" page, as shown in Fig. 13.

df13-Source
Fig. 13

Click the [Next] button to advance to the next page in the wizard, which asks questions specific to the type of data in your data source. Fig. 14 shows the page for Azure SQL databases, which allows you to select which tables to extract.

df14-SelectTables
Fig. 14

Click the [Next] button to advance to the "Destination data store", as shown in Fig. 15.

df15-Destination
Fig. 15

Click the [+ Create new connection] button to display the "New Linked Service" dialog, as shown in Fig. 16.

df16-NewLinkedService
Fig. 16

As with the source data connection, you can filter this list via the search box and top links, as shown in Fig. 17. Here we are selecting Azure Data Lake Storage Gen2 as our destination data store.

df17-NewLinkedService-ADL
Fig. 17

After selecting a service, click the [Continue] button to display a dialog requesting information about the data service you selected. Fig. 18 shows the page for Azure Data Lake. When complete, click the [Test connection] button to verify your entries are correct; then click the [Finish] button to close the dialog.

df18-ADLDetails
Fig. 18

After successfully creating a new connection, the connection appears in the "Destination data store" page, as shown in Fig. 19.

df19-Destination
Fig. 19

Click the [Next] button to advance to the next page in the wizard, which asks questions specific to the type of data in your data destination. Fig. 20 shows the page for Azure Data Lake, which allows you to select the destination folder and file name.

df20-ChooseOutput
Fig. 20

Click the [Next] button to advance to the "File format settings" page, as shown in Fig. 21.

df21-FileFormatSettings
Fig. 21

At the "File format" dropdown, select a format in which to structure your output file. The prompts change depending on the format you select. Fig.  21 shows the prompts for a Text format file.

Complete the page and click the [Next] button to advance to the "Settings" page, as shown in Fig. 22.

df22-Settings
Fig. 22

The important setting here is "Fault tolerance": when an error occurs, do you want to abort the entire activity, skipping the remaining records, or do you want to log the error, skip the bad record, and continue with the remaining records?

Click the [Next] button to advance to the "Summary" page as shown in Fig. 23.

df23-Summary
Fig. 23

This page lists the selections you have made to this point. You may edit a section if you want to change any settings. When satisfied with your changes, click the [Next] button to kick off the activity and advance to the "Deployment complete" page, as shown in Fig. 24.

df24-DeploymentComplete
Fig. 24

You will see the progress of the major steps in this activity as they run. You can click the [Monitor] button to see a more detailed real-time progress report, or you can click the [Finish] button to close the wizard.
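
If you prefer to watch progress from code rather than the Monitor view, the run can also be polled with the same management SDK. A minimal sketch, again with placeholder names, assuming you kept the run ID returned when the pipeline run was started:

```python
# Poll a pipeline run until it finishes (names and IDs below are placeholders).
import time

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), "<your-subscription-id>")
run_id = "<run-id returned by pipelines.create_run>"

while True:
    run = adf_client.pipeline_runs.get("MyResourceGroup", "mydatafactory123", run_id)
    print("Pipeline run status:", run.status)
    if run.status not in ("Queued", "InProgress"):
        break           # Succeeded, Failed, or Cancelled
    time.sleep(15)      # check again every 15 seconds
```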

In this article, you learned about the Azure Data Factory and how to create a new data factory with an activity to copy data from a source to a destination.

Friday, June 28, 2019 9:04:00 AM (GMT Daylight Time, UTC+01:00)
# Thursday, June 27, 2019

GCast 54:

Azure Storage Replication

Learn about the data replication options in Azure Storage and how to set the option appropriate for your needs.

Azure | Database | GCast | Screencast | Video
Thursday, June 27, 2019 4:16:00 PM (GMT Daylight Time, UTC+01:00)
# Wednesday, June 26, 2019

Azure IoT Hub allows you to route incoming messages to specific endpoints without having to write any code.

Refer to previous articles (here, here, and here) to learn how to create an Azure IoT Hub and how to add a device to that hub.

To perform automatic routing, you must:

  1. Create an endpoint
  2. Create and configure a route that points to that endpoint
  3. Specify the criteria to invoke that route

Navigate to the Azure Portal and log in.

Open your IoT Hub, as shown in Fig. 1.

ir01-IotHubOverviewBlade
Fig. 1

Click the [Message routing] button (Fig. 2) under the "Messaging" section to open the "Routing" tab, as shown in Fig. 3.

ir02-RoutingButton
Fig. 2

ir03-RoutingBlade
Fig. 3

Click the [Add] button to open the "Add a route" blade, as shown in Fig. 4.

ir04-AddRouteBlade
Fig. 4

At the "Name" field, enter a name for your route. I like to use something descripting, like "SendAllMessagesToBlobContainer".

At the "Endpoint" field, you can select an existing endpoint to which to send messages. An Endpoint is a destination to send any messages that meet the specified criteria. By default, only the "Events" endpoint exists. For a new hub, you will probably want to create a new endpoint. To create a new endpoint, click the [Add] button. This displays the "Add Endpoint" dialog, as shown in Fig. 5.

ir05-AddEndpoint
Fig. 5

At the "Endpoint" dropdown, select the type of endpoint you want to create. Fig. 6 shows the "Add a storage endpoint" dialog that displays if you select "Blob Storage".

ir06-AddStorageEndpointBlade
Fig. 6

At the "Endpoint name", enter a descriptive name for the new endpoint.

Click the [Pick a container] button to display a list of Storage accounts, as shown in Fig. 7.

ir07-PickStorageAccount
Fig. 7

Select an existing storage account or click the [+ Storage account] button to create a new one. After you select a storage account, the "Containers" dialog displays, listing all blob containers in the selected storage account, as shown in Fig. 8.

ir08-PickContainer
Fig. 8

Select an existing container or click the [+Container] button to create a new container. Messages matching the specified criteria will be stored in this blob container.

Back at the "Add a storage endpoint" dialog (Fig. 6), you have options to set the Batch frequency, Chunk size window, and Blob file name format.

Multiple messages are bundled together into a single blob.

The Batch frequency determines how frequently messages get bundled together. Lowering this value decreases latency; but doing so creates more files and requires more compute resources.

Chunk size window sets the maximum size of a blob. If a bundle of messages would exceed this value, the messages will be split into separate blobs.

The Blob file name format allows you to specify the name and folder structure of the blob. Each value within curly braces ({}) represents a variable. Each of the variables shown is required, but you can reorder them, remove slashes to change folders into parts of the file name, or add more text to the name, such as a file extension.

Click the [Create] button to create the endpoint and return to the "Add a route" blade, as shown in Fig. 9.

ir09-SaveRoute
Fig. 9

At the "Endpoint" dropdown, select the endpoint you just created.

At the "Data source" dropdown, you can select exactly what data gets routed to the endpoint. Choices are "Device Telemetry Messages"; "Device Twin Change Events"; and "Device Lifecycle Events".

The "Routing query" field allows you to specify the conditions under which messages will be routed to this endpoint.

If you leave this value as 'true', all messages will be routed to the specified endpoint.

But you can filter which messages are routed by entering something else in the "Routing query" field. Query syntax is described here.
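
Routing queries can test application properties and, for UTF-8 encoded JSON messages, the message body, using expressions such as level = "critical" or $body.temperature > 30. As a rough illustration of the device side, the sketch below uses the azure-iot-device Python package to send a message that such a query could match; the connection string, property names, and values are placeholders, not part of this walkthrough.

```python
# Send a device-to-cloud message that a routing query could filter on.
# The connection string and values are placeholders.
import json

from azure.iot.device import IoTHubDeviceClient, Message

conn_str = "<device-connection-string>"
client = IoTHubDeviceClient.create_from_connection_string(conn_str)

msg = Message(json.dumps({"deviceId": "thermostat-01", "temperature": 32.5}))
msg.content_encoding = "utf-8"                 # required for queries against the message body
msg.content_type = "application/json"
msg.custom_properties["level"] = "critical"    # routing queries can also test this property

client.send_message(msg)
client.disconnect()
```

Setting the content type and encoding matters: without them, a routing query that references $body will not match the message.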

Click the [Save] button to create this route.

In this article, you learned how to perform automatic routing for an Azure IoT Hub.

IoT
Wednesday, June 26, 2019 8:55:00 AM (GMT Daylight Time, UTC+01:00)
# Tuesday, June 25, 2019

Data Lake storage is a type of Azure Storage that supports a hierarchical structure.

There are no pre-defined schemas in a Data Lake, so you have a lot of flexibility on the type of data you want to store. You can store structured data or unstructured data or both. In fact, you can store data of different data types and structures in the same Data Lake.

Typically a Data Lake is used for ingesting raw data in order to preserve that data in its original format. The low cost, lack of schema enforcement, and optimization for inserts make it ideal for this. From the Microsoft docs: "The idea with a data lake is to store everything in its original, untransformed state."

After saving the raw data, you can then use ETL tools, such as SSIS or Azure Data Factory to copy and/or transform this data in a more usable format in another location.

Like most solutions in Azure, it is inherently highly scalable and highly reliable.

Data in Azure Data Lake is stored in a Data Lake Store.

Under the hood, a Data Lake Store is simply an Azure Storage account with some specific properties set.

To create a new Data Lake storage account, navigate to the Azure Portal, log in, and click the [Create a Resource] button (Fig. 1).

dl01-CreateResource
Fig. 1

From the menu, select Storage | Storage Account, as shown in Fig. 2.

dl02-MenuStorageAccount
Fig. 2

The "Create Storage Account" dialog with the "Basic" tab selected displays, as shown in Fig. 3.

dl03-Basics
Fig. 3

At the "Subscription" dropdown, select the subscription with which you want to associate this account. Most of you will have only one subscription.

At the "Resource group" field, select a resource group in which to store your service or click "Create new" to store it in a newly-created resource group. A resource group is a logical container for Azure resources.

At the "Storage account name" field, enter a unique name for the storage account.

At the "Location" field, select the Azure Region in which to store this service. Consider where the users of this service will be, so you can reduce latency.

At the "Performance" field, select the "Standard" radio button. You can select the "Premium" performance button to achieve faster reads; however, there may be better ways to store your data if performance is your primary objective.

At the "Account kind" field, select "Storage V2"

At the "Replication" dropdown, select your preferred replication. Replication is explained here.

At the "Access tier" field, select the "Hot" radio button.

Click the [Next: Advanced>] button to advance to the "Advanced" tab, as shown in Fig. 4.

dl04-Advanced
Fig. 4

The important field on this tab is "Hierarchical namespace". Select the "Enabled" radio button at this field.

Click the [Review + Create] button to advance to the "Review + Create" tab, as shown in Fig. 5.

dl05-Review
Fig. 5

Verify all the information on this tab; then click the [Create] button to begin creating the Data Lake Store.
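
For completeness, here is a hedged sketch of creating the same kind of account from Python with the azure-mgmt-storage package. The key detail is the is_hns_enabled flag, which corresponds to the "Hierarchical namespace" setting described above; the subscription, resource group, account name, and region are placeholders.

```python
# Create a StorageV2 account with the hierarchical namespace enabled (a Data Lake Gen2 store).
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient
from azure.mgmt.storage.models import Sku, StorageAccountCreateParameters

storage_client = StorageManagementClient(DefaultAzureCredential(), "<your-subscription-id>")

params = StorageAccountCreateParameters(
    location="westeurope",
    kind="StorageV2",
    sku=Sku(name="Standard_LRS"),
    is_hns_enabled=True,   # the "Hierarchical namespace" toggle that makes this a Data Lake
)

poller = storage_client.storage_accounts.begin_create(
    "MyResourceGroup",        # hypothetical resource group
    "mydatalakestore123",     # hypothetical, globally unique account name
    params,
)
account = poller.result()     # wait for the long-running creation to finish
print(account.name, account.provisioning_state)
```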

After a minute or so, a storage account is created. Navigate to this storage account and click the [Data Lake Gen2 file systems] button, as shown in Fig. 6.

dl06-Services
Fig. 6

The "File Systems" blade displays, as shown in Fig. 7.

dl07-FileSystem
Fig. 7

Data Lake data is partitioned into file systems, so you must create at least one file system. Click the [+ File System] button and enter a name for the file system you wish to create, as shown in Fig. 8.

dl08-AddFileSystem
Fig. 8

Click the [OK] button to add this file system and close the dialog. The newly-created file system displays, as shown in Fig. 9.

dl09-FileSystem
Fig. 9

If you double-click the file system in the list, a page displays where you can set access control and read about how to manage the files in this Data Lake Storage account, as shown in Fig. 10.

dl10-FileSystem
Fig. 10
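
You can also create a file system and load your first raw files from code. The sketch below uses the azure-storage-file-datalake package and is illustrative only; the account name, key, file system name, and paths are placeholders.

```python
# Create a file system and upload a first raw file into a folder hierarchy.
from azure.storage.filedatalake import DataLakeServiceClient

account_name = "mydatalakestore123"          # hypothetical account name
account_key = "<storage-account-key>"

service_client = DataLakeServiceClient(
    account_url=f"https://{account_name}.dfs.core.windows.net",
    credential=account_key,
)

# Same step as the [+ File System] button in the portal
file_system_client = service_client.create_file_system(file_system="rawdata")

# Create a folder path and upload a small CSV file into it
directory_client = file_system_client.create_directory("sales/2019/06")
file_client = directory_client.create_file("orders.csv")
file_client.upload_data(b"OrderId,Amount\n1,100\n2,250\n", overwrite=True)
```

Because the hierarchical namespace is enabled, the "sales/2019/06" path behaves like a real folder tree rather than a flat list of blob names, which is what makes this account a Data Lake rather than ordinary blob storage.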

In this article, you learned how to create a Data Lake Storage and a file system within it.

Tuesday, June 25, 2019 10:10:00 AM (GMT Daylight Time, UTC+01:00)
# Monday, June 24, 2019

Episode 569

John Alexander on ML.NET

John Alexander describes how .NET developers can use ML.NET to build and consume Machine Learning solutions.

Monday, June 24, 2019 9:01:00 AM (GMT Daylight Time, UTC+01:00)
# Sunday, June 23, 2019

Frank and April Wheeler were living the 1950s American dream. Frank had a steady - if unfulfilling - job in New York City, April was the attractive wife he always wanted, and they owned a large home in a quiet neighborhood in suburban Connecticut.

But, like nearly all their neighbors, the Wheelers were far from happy.

They were bored suburbanites, working dead-end jobs, in loveless marriages, talking about their dreams.

They talked of how they didn't belong - of how they were so much better than the rest of the sheep who surrendered to the conformity of the world. But they take no action to correct their circumstances, and the fact is that they are not nearly as superior as they believe.

April suggests that the Wheelers move to Paris and start a new life, so that Frank can explore his potential. But Frank is not interested in his potential or in self-exploration. He likes the low expectations that come with his job. And, when he is given an opportunity for a promotion, he leaps at the chance.

Frank and April are self-aware enough to believe they are superior to their neighbors and co-workers, but not self-aware enough to realize they are not. They either don't know themselves or they refuse to see themselves.

They are under the illusion that their problems are easily fixable - move to Paris; get a promotion; have an affair. News flash: they are not.

Instead they continue their pretentious life of drunken lunches and adultery and deluding themselves that they are destined for more. No one takes responsibility for his or her own actions, choosing instead to blame others or the expectations of society.

The only honest person in the book is John Givings, a son of the Wheelers' neighbors, who has been literally certified insane and institutionalized. But John is so shockingly rude that it's difficult for anyone to listen to him or to take him seriously.

Inevitably, the story ends in tragedy, with no lessons learned and everyone continuing to face their troubles alone.

Don't read Revolutionary Road by Richard Yates to feel good about yourself. Read it as a warning about buying too much into the American dream. The sad part is how relevant this warning feels today.

Sunday, June 23, 2019 7:29:00 AM (GMT Daylight Time, UTC+01:00)
# Saturday, June 22, 2019

It isn't obvious until well into Never Let Me Go by Kazuo Ishiguro that this is a story of a dystopian society. Ishiguro drops hints throughout the story, slowly revealing the situation in which the characters find themselves. Words like "donations", "Possible", and "Completion" are introduced, and we know they have some mysterious meaning, but are not told that meaning until much later.

Kathy H is a 31-year-old “Carer” looking back on her life - particularly her time at Hailsham - a boarding school in rural England. Life is good at Hailsham, but the students are secluded and are given almost no knowledge of the outside world, other than being told they will someday have a special place in it.

Everyone has a name like "Kathy H" or "Tommy D". At first, I thought this was a literary device, with the author pretending to protect identities; but, on reflection, I think the students were not given last names as one more way to dehumanize them.

Never Let Me Go is a story of false hope; of what it means to be human and to have a soul; and of how much control each of us has over our destiny. It is told in a believable manner in a world not very different from ours and referencing technology that does not sound far-fetched.

It is a dystopian nightmare, disguised as a coming-of-age story.

Saturday, June 22, 2019 9:56:00 AM (GMT Daylight Time, UTC+01:00)