# Saturday, July 6, 2019

Housekeeping by Marilynne Robinson is filled with water and filled with tragedy.

It opens with a train crashing into a lake, killing hundreds, including the grandfather of Ruthie and Lucille. Later, the girls' mother commits suicide by driving her car into the same lake. Abandoned years earlier by their father, the girls grow up under the care of their grandmother and aunts until eccentric Aunt Silvie shows up and moves in.

Silvie is a former transient who sometimes falls asleep on park benches. She is not cut out for motherhood, and the girls withdraw into one another, skipping school and making no friends other than each other. When the local authorities begin to question their situation, everyone in this family is forced to make a choice.

Housekeeping is a simple story, built on the strength of the characters. Robinson presents humor and tragedy in an eloquent style that keeps the reader engaged. For such a short novel, we see a full picture of the three main characters. It is worth the time to read.

Saturday, July 6, 2019 9:12:00 AM (GMT Daylight Time, UTC+01:00)
# Friday, July 5, 2019

Azure Databricks is a web-based platform built on top of Apache Spark and deployed to Microsoft's Azure cloud platform.

Databricks provides a web-based interface that makes it simple for users to create and scale clusters of Spark servers and deploy jobs and Notebooks to those clusters. Spark provides a general-purpose compute engine ideal for working with big data, thanks to its built-in parallelization engine.

Apache Spark is open source and Databricks is owned by the Databricks company; but Microsoft adds value by providing the hardware and fabric on which these tools are deployed, including capacity on which to scale and built-in fault tolerance.
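To give you a sense of what you will eventually run on these clusters, here is a minimal PySpark sketch of the kind of parallelized work Spark performs. This is illustrative only: in a Databricks notebook, a SparkSession named `spark` is already created for you, so the builder line is only needed outside Databricks.

```python
# A minimal, illustrative PySpark example. In a Databricks notebook,
# a SparkSession named "spark" already exists; the builder line below
# is only needed when running outside Databricks.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("demo").getOrCreate()

# Build a 10-million-row DataFrame and aggregate it. Spark splits the work
# into partitions and distributes them across the cluster's worker nodes.
df = spark.range(0, 10_000_000)
df.select(F.avg("id").alias("avg_id"), F.max("id").alias("max_id")).show()
```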

To create an Azure Databricks environment, navigate to the Azure Portal, log in, and click the [Create Resource] button (Fig. 1).

db01-CreateResourceButton
Fig. 1

From the menu, select Analytics | Azure Databricks, as shown in Fig. 2.

db02-NewDataBricksMenu
Fig. 2

The "Azure Databricks service" blade displays, as shown in Fig. 3.

db03-NewDataBricksBlade
Fig. 3

At the "Workspace name" field, enter a unique name for the Databricks workspace you will create.

At the "Subscription" field, select the subscription associated with this workspace. Most of you will have only one subscription.

At the "Resource group" field, click the "Use existing" radio button and select an existing Resource Group from the dropdown below; or click the "Create new" button and enter the name and region of a new Resource Group when prompted.

At the "Location" field, select the location in which to store your workspace. Considerations include the location of the data on which you will be working and the location of developers and users who will access this workspace.

At the "Pricing Tier" dropdown, select the desired pricing tier. The Pricing Tier options are shown in Fig. 4.

db04-PricingTier
Fig. 4

If you wish to deploy this workspace to a particular virtual network, select the "Yes" radio button at this question.

When completed, the blade should look similar to Fig. 5.

db05-NewDataBricksBlade-Completed
Fig. 5

Click the [Create] button to create the new Databricks service. This may take a few minutes.

Navigate to the Databricks service, as shown in Fig. 6.

db06-OverviewBlade
Fig. 6

Click the [Launch Workspace] button (Fig. 7) to open the Azure Databricks page, as shown in Fig. 8.

db07-LaunchWorkspaceButton
Fig. 7

db08-DatabricksHomePage
Fig. 8

In this article, I showed you how to create a new Azure Databricks service. In future articles, I will show how to create clusters, notebooks, and otherwise make use of your Databricks service.

Friday, July 5, 2019 9:00:00 AM (GMT Daylight Time, UTC+01:00)
# Thursday, July 4, 2019

GCast 55:

GitHub Deployment to an Azure Web App

Learn how to set up automated deployment from a GitHub repository to an Azure Web App.

Thursday, July 4, 2019 9:58:00 AM (GMT Daylight Time, UTC+01:00)
# Wednesday, July 3, 2019

Source control is an important part of software development - from collaborating with other developers to enabling continuous integration and continuous deployment to providing the ability to roll back changes.

Azure Data Factory (ADF) provides the ability to integrate with source control systems, such as GitHub or Azure DevOps.

I will walk you through doing this, using GitHub.

Before you get started, you must have the following:

A GitHub account (Free at https://github.com)

A GitHub repository created in your account, with at least one file in it. You can easily add a "readme.md" file to a repository from within the GitHub portal.

An ADF service, created as described in this article.

Open the "Author & Monitor" page (Fig. 1) and click the "Set up Code Repository" button (Fig. 2)

ar01-ADFOverviewPage
Fig. 1

ar02-SetupCodeRepositoryButton
Fig. 2

The "Repository Settings" blade displays, as shown in Fig. 3.

ar03-RepositoryType
Fig. 3

At the "Repository Type", dropdown, select the type of source control you are using. The current options are "Azure DevOps Git" and "GitHub". For this demo, I have selected "GitHub".

When you select a Repository type, the rest of the dialog expands with prompts relevant to that type. Fig. 4 shows the prompts when you select "GitHub".

ar04-RepositoryName
Fig. 4

The first prompt asks whether you are using a GitHub Enterprise account. I don't have one, so I left this checkbox unchecked.

At the "GitHub Account" field, enter the name of your GitHub account. You don't need the full URL - just the name. For example, my GitHub account name is "davidgiard", which you can find online at https://github.com/davidgiard; so, I entered "davidgiard" into the "GitHub Account" field.

The first time you enter this account, you may be prompted to sign in and to authorize Azure to access your GitHub account.

Once you enter a valid GitHub account, the "Git repository name" dropdown is populated with a list of your repositories. Select the repository you created to hold your ADF assets.

After you select a repository, you are prompted for more specific information, as shown in Fig. 5.

ar05-RepositorySettings
Fig. 5

At the "Collaboration branch", select "master". If you are working in a team environment or with multiple releases, it might make sense to check into a different branch in order control when changes are merged. To do this, you will need to create a new branch in GitHub.

At the "Root folder", select a folder of the repository in which to store your ADF assets. I typically leave this at "/" to store everything in the root folder; but, if you are storing multiple ADF services in a single repository, it might make sense to organize them into separate folders.

Check the "Import existing Data Factory resources to repository" checkbox. This causes any current assets in this ADF asset to be added to the repository as soon as you save. If you have not yet created any pipelines, this setting is irrelevant.

At the "Branch to import resources into" radio buttons, select "Use Collaboration".

Click the [Save] button to save your changes and push any current assets into the GitHub repository.

Within seconds, any pipelines, linked services, or datasets in this ADF service will be pushed into GitHub. Refresh the repository page to see them, as shown in Fig. 6.

ar06-GitHub
Fig. 6

Fig. 7 shows a pipeline asset. Notice that it is saved as JSON, which can easily be deployed to another server.

ar07-GitHub
Fig. 7
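For reference, here is a stripped-down sketch of what one of these JSON files looks like. The names here are hypothetical, and a real asset contains many more properties; but the overall shape - a name plus a "properties" object holding a list of activities - is what you will see in your repository.

```json
{
  "name": "CopyPipeline",
  "properties": {
    "activities": [
      {
        "name": "CopyData",
        "type": "Copy",
        "typeProperties": {
          "source": { "type": "AzureSqlSource" },
          "sink": { "type": "BlobSink" }
        }
      }
    ]
  }
}
```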

In this article, you learned how to connect your ADF service to a GitHub repository, storing and versioning all ADF assets in source control.

Wednesday, July 3, 2019 6:56:40 PM (GMT Daylight Time, UTC+01:00)
# Tuesday, July 2, 2019

GitHub provides a good way to create and manage source code repositories.

But, how do you delete a repository when you no longer need it? I found this to be non-intuitive when I needed to delete one.

Here are the steps.

Log into GitHub and open your repository, as shown in Fig. 1.

dr01-repo
Fig. 1

Click the [Settings] tab (Fig. 2) near the top of the page.

dr02-SettingsButton
Fig. 2

The "Settings" page displays, as shown in Fig. 3

dr03-SettingsPage
Fig. 3

Scroll to the "Danger Zone" section at the bottom of the "Settings" page, as shown in Fig. 4.

dr04-DangerZone
Fig. 4

Click the [Delete this repository] button.

A confirmation popup (Fig. 5) displays, warning you that this action cannot be undone (which is why it is in the Danger Zone).

dr05-AreYouSure
Fig. 5

If you are sure you want to delete this repository, type the repository name in the textbox and click the [I understand the consequences, delete this repository] button.

If all goes well, a confirmation message displays indicating that your repository was successfully deleted, as shown in Fig. 6.

dr06-Confirmation
Fig. 6

Congratulations! Your repository is no more! It is an ex-repository!
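As an aside, if you prefer scripting to clicking, the GitHub REST API exposes the same operation. Here is a minimal Python sketch; the account name, repository name, and token below are placeholders, and the personal access token must have the "delete_repo" scope.

```python
# Hedged sketch: delete a GitHub repository via the REST API.
# OWNER, REPO, and TOKEN are placeholders you must replace.
import requests

OWNER = "your-account"
REPO = "your-old-repository"
TOKEN = "<personal-access-token>"  # requires the "delete_repo" scope

response = requests.delete(
    f"https://api.github.com/repos/{OWNER}/{REPO}",
    headers={"Authorization": f"token {TOKEN}"},
)

# GitHub returns 204 No Content when the repository is deleted.
print(response.status_code)
```

Be just as careful here as in the portal: the API deletes the repository immediately, with no confirmation prompt.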

Tuesday, July 2, 2019 9:55:00 AM (GMT Daylight Time, UTC+01:00)
# Monday, July 1, 2019

Episode 570

Laurent Bugnion on Migrating Data to Azure

Laurent Bugnion describes how he migrated from on-premises MongoDB and SQL Server databases to Cosmos DB and Azure SQL Database running in Microsoft Azure, using both native tools and the Azure Database Migration Service.

Monday, July 1, 2019 9:39:00 AM (GMT Daylight Time, UTC+01:00)
# Sunday, June 30, 2019

From a distance, you would swear he was half his 75 years. You would never guess he underwent heart surgery two months ago. Only the cracks in his face revealed Mick Jagger's age. Not his body, which gyrated and strutted and danced for 2 hours as the legendary Rolling Stones performed before an overflowing Soldier Field in Chicago Tuesday night.

Most viewers saw him from a distance in the cavernous stadium. But the energy was high, and the audience sang and danced along with the band. Mick, Keith Richards, Ron Wood, and Charlie Watts have been recording and touring together for decades. People may think of Darryl Jones (of south Chicago), who joined when bassist Bill Wyman retired in 1993, and Chuck Leavell, the former Allman Brothers keyboardist, who has been with the band since 1982, as the new members; but each has a tenure longer than most bands exist. Yet they are newbies compared with their septuagenarian teammates.

This was my first time seeing the Stones, and it may be their last visit to Chicago. The band is now in its 50th year, and this year's "No Filter" tour will take them to 13 cities in the U.S. They chose to open in Chicago, after Jagger's illness forced them to reshuffle the tour schedule.

To the delight of the crowd, Mick made many references to Chicago. He noted the band had played the city nearly 40 times. And he introduced the new Chicago mayor and governor, who were in attendance, noting that Governor Pritzker had signed legislation that day legalizing cannabis as of January. "Some of you may have jumped the gun," he quipped.

Of course, the Rolling Stones drew heavily from their catalog of hit songs - from opening with "Jumpin' Jack Flash" and "It's Only Rock 'n' Roll" to their encores: "Gimme Shelter" and "Satisfaction". But they included a few deeper album tracks, like "Bitch" and "Slipping Away".

It was mostly an evening of high-energy rock and roll and blues, but a highlight of the night came when Mick, Keith, Ron, and Charlie brought their instruments (including a small drum kit) to a platform that extended 30 yards out into the audience to play two acoustic numbers: "Play with Fire" and "Sweet Virginia".

When the evening ended, it felt like they had given all they had and all we needed.

After 50 years, the band knew every note by heart but still brought energy and made us feel they were having a good time.

Sunday, June 30, 2019 9:45:00 AM (GMT Daylight Time, UTC+01:00)
# Saturday, June 29, 2019

Last month, I visited Copenhagen, Denmark for the first time and came away thinking that I'd never seen a city as bicycle-friendly.

But Amsterdam has Copenhagen beat by far in this regard. During morning rush hour, bicycle commuters easily outnumber automobiles; and I was told that Amsterdam has more bicycles than people.

I was mostly on foot, but I kept my head on a swivel as I walked around town, looking one way for cars and in the other direction for bicycles each time I ventured across the street.

It was my first visit to Amsterdam, and I came to work at an OpenHack - a workshop designed to teach a specific technology via problem-solving and hands-on experience. The OpenHack was a great success for everyone. Nearly all the feedback we received was positive; people seemed to appreciate my coaching and enjoyed a presentation I delivered on Azure Data Factory. In addition, I led two "Envisioning Sessions" - an exploration of a project a customer is considering that involves use of cloud technology.

I arrived Sunday and spent most of the day resting before meeting up with Mike Amundsen, an old friend from Cincinnati, who happened to be in Amsterdam to speak at the GOTO conference.

The next two days consisted of hard work during the day, followed by dinners with my teammates in the evening. I made up for the rich food by walking miles around the city.

When the OpenHack wrapped up on Day 3, I headed over to the museum area and spent about an hour exploring the Stedelijk Museum of Modern Art, before meeting Brent and David at an Indonesian restaurant. The Indonesian food was amazing, consisting of samples of dozens of different dishes.

After dinner, Brent and I took a boat ride along the canals, which featured a pre-recorded guided tour of the city. Because the canals run through almost all of Amsterdam, we got to see nearly all of the city this way.

Friday I had to myself, so I bought tickets to 2 museums: The Van Gogh Museum, which features the works of the famous Dutch painter, along with those who influenced him; and the Rijksmuseum, which primarily features works of classic European artists from the Middle Ages to the present. It has a particularly impressive Rembrandt collection.

Friday morning, I had a special treat as I learned that Austin Haeberle - a friend I grew up with - was arriving in Amsterdam the same morning I was flying home. We had not seen each other in 30 years, so we met at the airport for breakfast and picked up where we left off.

If I had more time and energy, I could have spent days just exploring museums. I did not see the house where Anne Frank famously hid from the Nazis and wrote in her diary, nor the National Maritime Museum. I also did not get a chance to view Amsterdam's (in)famous Red-Light district. When I return, I will try to visit these places, as well as the rest of the Netherlands, which is a small enough country that one could drive across it in a couple of hours.

The point is that I would love to return.

Amsterdam (2)

More photos

Saturday, June 29, 2019 9:11:00 AM (GMT Daylight Time, UTC+01:00)
# Friday, June 28, 2019

Azure Data Factory (ADF) is an example of an Extract, Transform, and Load (ETL) tool, meaning that it is designed to extract data from a source system, optionally transform its format, and load it into a different destination system.

The source and destination data can reside in different locations, in different data stores, and can support different data structures.

For example, you can extract data from an Azure SQL database and load it into an Azure Blob storage container.

To create a new Azure Data Factory, log into the Azure Portal, click the [Create a resource] button (Fig. 1) and select Integration | Data Factory from the menu, as shown in Fig. 2.

df01-CreateResource
Fig. 1

df02-IntegrationDataFactory
Fig. 2

The "New data factory" blade displays, as shown in Fig. 3.

df03-NewDataFactory
Fig. 3

At the "Name" field, enter a unique name for this Data Factory.

At the "Subscription" dropdown, select the subscription with which you want to associate this Data Factory. Most of you will only have one subscription, making this an easy choice.

At the "Resource Group" field, select an existing Resource Group or create a new Resource Group which will contain your Data Factory.

At the "Version" dropdown, select "V2".

At the "Location" dropdown, select the Azure region in which you want your Data Factory to reside. Consider the location of the data with which it will interact and try to keep the Data Factory close to this data, in order to reduce latency.

Check the "Enable GIT" checkbox, if you want to integrate your ETL code with a source control system.

After the Data Factory is created, you can search for it by name or within the Resource Group containing it. Fig. 4 shows the "Overview" blade of a Data Factory.

df04-OverviewBlade
Fig. 4

To begin using the Data Factory, click the [Author & Monitor] button in the middle of the blade.

The "Azure Data Factory Getting Started" page displays in a new browser tab, as shown in Fig. 5.

df05-GetStarted
Fig. 5

Click the [Copy Data] button (Fig. 6) to display the "Copy Data" wizard, as shown in Fig. 7.

df06-CopyDataIcon
Fig. 6

df07-Properties
Fig. 7

This wizard steps you through the process of creating a Pipeline and its associated artifacts. A Pipeline performs an ETL on a single source and destination and may be run on demand or on a schedule.

At the "Task name" field, enter a descriptive name to identify this pipeline later.

Optionally, you can add a description to your task.

You have the option to run the task on a regular or semi-regular schedule (Fig. 8); but you can set this later, so I prefer to select "Run once now" until I know it is working properly.

df08-Schedule
Fig. 8
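If you do set a schedule later, ADF models it as a trigger attached to the pipeline, stored as JSON like every other ADF asset. A simplified sketch of a daily schedule trigger (with a hypothetical name) looks something like this:

```json
{
  "name": "DailyTrigger",
  "properties": {
    "type": "ScheduleTrigger",
    "typeProperties": {
      "recurrence": {
        "frequency": "Day",
        "interval": 1,
        "startTime": "2019-07-01T09:00:00Z"
      }
    }
  }
}
```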

Click the [Next] button to advance to the "Source data store" page, as shown in Fig. 9.

df09-Source
Fig. 9

Click the [+ Create new connection] button to display the "New Linked Service" dialog, as shown in Fig. 10.

df10-NewLinkedService
Fig. 10

This dialog lists all the supported data stores. At the top of the dialog is a search box and a set of links, which allow you to filter the list of data stores, as shown in Fig. 11.

df11-AzureSql
Fig. 11

Fig. 12 shows the next dialog if you select Azure SQL Database as your data source.

df12-AzureSqlDetails
Fig. 12

In this dialog, you can enter information specific to the database from which you are extracting data. When complete, click the [Test connection] button to verify your entries are correct; then click the [Finish] button to close the dialog.

After successfully creating a new connection, the connection appears in the "Source data store" page, as shown in Fig. 13.

df13-Source
Fig. 13

Click the [Next] button to advance to the next page in the wizard, which asks questions specific to the type of data in your data source. Fig. 14 shows the page for Azure SQL databases, which allows you to select which tables to extract.

df14-SelectTables
Fig. 14

Click the [Next] button to advance to the "Destination data store" page, as shown in Fig. 15.

df15-Destination
Fig. 15

Click the [+ Create new connection] button to display the "New Linked Service" dialog, as shown in Fig. 16.

df16-NewLinkedService
Fig. 16

As with the source data connection, you can filter this list via the search box and top links, as shown in Fig. 17. Here we are selecting Azure Data Lake Storage Gen2 as our destination data store.

df17-NewLinkedService-ADL
Fig. 17

After selecting a service, click the [Continue] button to display a dialog requesting information about the data service you selected. Fig. 18 shows the page for Azure Data Lake. When complete, click the [Test connection] button to verify your entries are correct; then click the [Finish] button to close the dialog.

df18-ADLDetails
Fig. 18

After successfully creating a new connection, the connection appears in the "Destination data store" page, as shown in Fig. 19.

df19-Destination
Fig. 19

Click the [Next] button to advance to the next page in the wizard, which asks questions specific to the type of data in your data destination. Fig. 20 shows the page for Azure Data Lake, which allows you to select the destination folder and file name.

df20-ChooseOutput
Fig. 20

Click the [Next] button to advance to the "File format settings" page, as shown in Fig. 21.

df21-FileFormatSettings
Fig. 21

At the "File format" dropdown, select a format in which to structure your output file. The prompts change depending on the format you select. Fig.  21 shows the prompts for a Text format file.

Complete the page and click the [Next] button to advance to the "Settings" page, as shown in Fig. 22.

df22-Settings
Fig. 22

The important question here is "Fault tolerance": when an error occurs, do you want to abort the entire activity, skipping the remaining records; or do you want to log the error, skip the bad record, and continue with the remaining records?

Click the [Next] button to advance to the "Summary" page, as shown in Fig. 23.

df23-Summary
Fig. 23

This page lists the selections you have made to this point. You may edit a section if you want to change any settings. When satisfied with your changes, click the [Next] button to kick off the activity and advance to the "Deployment complete" page, as shown in Fig. 24.

df24-DeploymentComplete
Fig. 24

You will see progress of the major steps in this activity as they run. You can click the [Monitor] button to see a more detailed real-time progress report, or you can click the [Finish] button to close the wizard.

In this article, you learned about the Azure Data Factory and how to create a new data factory with an activity to copy data from a source to a destination.

Friday, June 28, 2019 9:04:00 AM (GMT Daylight Time, UTC+01:00)
# Thursday, June 27, 2019

GCast 54:

Azure Storage Replication

Learn about the data replication options in Azure Storage and how to set the option appropriate for your needs.

Thursday, June 27, 2019 4:16:00 PM (GMT Daylight Time, UTC+01:00)