# Monday, 28 May 2018
Monday, 28 May 2018 10:13:00 (GMT Daylight Time, UTC+01:00)
# Sunday, 27 May 2018

Two weeks ago, I attended my first Imagine Cup event - the US Finals in San Francisco. Yesterday, I attended my second.

The Canadian Imagine Cup Finals took place in Vancouver, BC. Six finalists competed for the right to advance to the International Finals in Redmond in July.

This event was smaller and shorter than the corresponding US event, but equal in energy. Every one of the teams showed great creativity, strong technical skills, and impressive presentations.

I was excited that all 6 finalists came from schools with which I work. First place went to SmartArm - a project that provides affordable prosthetic arms for amputees - a team I helped mentor at the UTHacks hackathon in January in Toronto.

I'm getting used to these Imagine Cup projects and competitions and I'm looking forward to the world finals.

Sunday, 27 May 2018 17:35:17 (GMT Daylight Time, UTC+01:00)
# Friday, 25 May 2018

One of the challenges of working with data is what to do with missing data.

Missing data in a column can take several forms, including, but not limited to, the following:

  • No text or numbers between 2 column delimiters
  • An empty string ("")
  • A blank string (e.g., "   ")
  • The number 0
  • A special indicator, such as "NA" or "NONE"
  • An inconsistent data type, such as a number where a string is expected
  • A value that makes no sense in the context of the data.

The last one requires some domain knowledge about the data, so it is often difficult to spot.
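As a rough sketch of how several of these forms could be normalized into a single "missing" marker with Pandas: the data, the column names, and the rule that an area must be positive are all assumptions made up for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are made up for illustration.
df = pd.DataFrame({
    "city": ["Chicago", "", "   ", "NA", "Toronto"],
    "area": [120.0, 0.0, 85.5, np.nan, -40.0],
})

# Treat empty strings, whitespace-only strings, and "NA"/"NONE" markers as missing.
df["city"] = df["city"].replace(r"^\s*$", np.nan, regex=True)
df["city"] = df["city"].replace(["NA", "NONE"], np.nan)

# Assumption: an area must be positive, so 0 and negative values count as missing.
df["area"] = df["area"].where(df["area"] > 0)

# Count the missing values in each column.
missing_per_column = df.isna().sum()
print(missing_per_column)
```

Once every form of missing data is represented as NaN, the same Pandas functions can handle them all.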

There are two strategies for dealing with missing data:

  1. Delete or ignore the entire row.
  2. Replace the missing value with a reasonable value.

If only a few rows contain missing data, it may be efficient to simply delete these rows.

But if many rows contain missing data, it probably makes sense to keep them as other columns may contain valuable information. In this case, we will want to replace the missing data with a reasonable value.

But what is a reasonable value?

Options include replacing the missing value with an average value, such as the mean or median of the non-missing values. Of course, this is only valid for numeric data that is ordinal, that is, data in which higher numbers indicate a higher value and not simply a discrete category.
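Here is a small sketch of that idea. The dataframe and its "area" and "zone" columns are hypothetical, and using the most common value for a categorical column is just one possible choice, not a rule.

```python
import numpy as np
import pandas as pd

# Hypothetical data: "area" is a measured quantity, "zone" is a category label.
df = pd.DataFrame({
    "area": [100.0, np.nan, 300.0, 1000.0],
    "zone": ["A", "B", None, "B"],
})

# mean() and median() skip NaN by default, so they describe the non-missing values.
area_mean = df["area"].mean()
area_median = df["area"].median()

# The median is less sensitive to the large value (1000) than the mean.
df["area_filled"] = df["area"].fillna(area_median)

# An average is meaningless for category labels; the most common value is one option.
zone_mode = df["zone"].mode()[0]
df["zone_filled"] = df["zone"].fillna(zone_mode)
```

Whether the mean or the median is the better replacement depends on the data; the median is often safer when a few extreme values are present.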

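The Pandas library contains some simple functions for deleting rows and replacing values. The fillna function is the simplest way to replace missing values. It returns a new dataframe rather than changing the original, so assign the result back to keep the change.

# Replace all missing values with 0
df = df.fillna(0)

# Replace all missing values with the string 'Missing'
df = df.fillna('Missing')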
  

You can delete invalid or missing rows by overwriting a dataset with a filtered version of that set, as in the following examples:

# Delete all rows with area = 0 
df = df[df.area != 0]

# Delete all rows with null area 
df = df[df.area.notnull()] 
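Pandas also has a dropna function that can express the null filter more directly. The sketch below uses made-up data with hypothetical "area" and "rooms" columns.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "area": [120.0, np.nan, 85.5, 40.0],
    "rooms": [3.0, 2.0, np.nan, 1.0],
})

# Drop rows in which any column is missing.
complete_rows = df.dropna()

# Drop rows only when "area" is missing; gaps in other columns are tolerated.
rows_with_area = df.dropna(subset=["area"])
```

Like fillna, dropna returns a new dataframe and leaves the original unchanged.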
  

But for values that are not missing, but are inappropriate (e.g., using 0 to represent a missing data point, when 0 could be a valid measurement), we can use the map function.

Below, we use the map function to replace any value of 0 in the 'area' column with the mean of the non-zero values in this column. We pass map a function rather than a dictionary, because a dictionary would convert every value that is not one of its keys to NaN.

# Replace 0 area with the mean of the non-zero areas
mean_area = df.loc[df['area'] != 0, 'area'].mean()
df['area'] = df['area'].map(lambda area: mean_area if area == 0 else area)
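As an alternative sketch, the replace function changes only the values that match and leaves all others alone. The sample data is hypothetical, and the last line is included only to show what happens when map is given a dictionary that does not list every value.

```python
import pandas as pd

# Hypothetical sample data in which 0 stands in for "not measured".
df = pd.DataFrame({"area": [100.0, 0.0, 200.0, 0.0]})

# Average only the real measurements so the placeholder zeros do not drag the mean down.
mean_area = df.loc[df["area"] != 0, "area"].mean()

# Series.replace changes matching values and leaves everything else untouched.
replaced = df["area"].replace(0, mean_area)

# By contrast, Series.map with a dict returns NaN for every value that is not a key.
mapped = df["area"].map({0: mean_area})
```

In this sketch, the non-zero areas survive in "replaced" but become NaN in "mapped".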
  

In this article, we discussed ways to use Pandas to handle missing data in a dataframe.

Friday, 25 May 2018 00:44:34 (GMT Daylight Time, UTC+01:00)
# Monday, 21 May 2018
Monday, 21 May 2018 16:37:00 (GMT Daylight Time, UTC+01:00)
# Friday, 18 May 2018

I was working with a dataset in Azure ML Studio and I needed to replace values in a column.

Reasons for replacing values include:

  • Replacing codes with a more readable word or words
  • Consistency when combining 2 sets of data
  • Converting to numeric values in order to assign values to discrete strings
  • Converting to numeric values in order to work with an algorithm that only accepts numeric values

There is no built-in shape to do this, but you can do so with a couple of lines of code.

I can demonstrate by creating a new ML studio experiment and dragging the "Automobile Price Data" sample dataset onto the experiment design surface, as shown in Fig. 1.

MLRe01-AutomobilePriceData
Fig. 1

If we click on this shape and select "Visualize" (Fig. 2), we can see the data in the dataset. Click the "drive-wheels" column to see details about the data in that column (Fig. 3).

MLRe02DataSetMenu
Fig. 2

MLRe03VisualizeData-Before
Fig. 3

You can see from the visualization that the "drive-wheels" column contains 3 distinct values: "fwd", "rwd", and "4wd".
Imagine I wanted to replace these with "FRONT", "REAR", and "FOUR", respectively. (Maybe to be consistent with a second dataset I plan to merge with this one.)

Drag an "Execute Python Script" shape to the experiment and connect its input to the output of the data shape (Fig. 4).

MLRe04-TwoShapes
Fig. 4

In the Properties of the "Execute Python Script" shape, replace the existing code with the following:

import pandas as pd
def azureml_main(dataframe1 = None):
    dataframe1['drive-wheels'] = dataframe1['drive-wheels'].map({'fwd': 'FRONT', 'rwd': 'REAR', '4wd': 'FOUR'})
    return dataframe1,
    

The azureml_main function is required by ML Studio. It accepts one parameter - a dataframe, which we name "dataframe1".

The first line of the function maps the 3 existing drive-wheels values to 3 new values for every row and saves the new values back to the dataframe. By returning that dataframe, we make this updated data the output of this shape, so it can be used by later steps in our experiment.
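The same mapping can be tried outside ML Studio with plain Pandas. In this sketch, the small dataframe stands in for the sample dataset, and the "awd" value is a hypothetical extra that is not in the dictionary; it shows that map returns NaN for unlisted values, and one way to keep them.

```python
import pandas as pd

# Hypothetical stand-in for the "drive-wheels" column, plus one unexpected value.
cars = pd.DataFrame({"drive-wheels": ["fwd", "rwd", "4wd", "awd"]})

labels = {"fwd": "FRONT", "rwd": "REAR", "4wd": "FOUR"}

# map returns NaN for any value that is not a key in the dictionary ("awd" here).
mapped = cars["drive-wheels"].map(labels)

# Falling back to the original value keeps unmapped entries instead of losing them.
cars["drive-wheels"] = mapped.fillna(cars["drive-wheels"])
```

The Automobile Price Data sample contains only the three listed values, so this fallback is not needed there, but it may be useful with other datasets.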

After running this experiment, we can click the script shape and Visualize the output and see that each value in "drive-wheels" has been replaced, as shown in Fig. 5.

MLRe05-VisualizeData-After
Fig. 5

This article shows a simple way to replace values in a dataframe column with new values in Azure ML Studio.

Friday, 18 May 2018 22:15:08 (GMT Daylight Time, UTC+01:00)
# Thursday, 17 May 2018

Many Machine Learning solutions have the same steps in common. For example, you will need to retrieve data from one or more sources; you will want to split your data into training and testing subsets; and you will want to clean up your source data. I refer to these as "plumbing tasks" because they are common to so many projects, and spending time coding these tasks takes time away from working on your data and your solution.

Azure Machine Learning Studio can help.

Azure Machine Learning Studio or "ML Studio" is a graphical design tool for building machine learning solutions. It includes a design surface and a set of shapes to perform specific tasks.

To work with ML Studio, drag a shape onto the design surface, set some properties, and connect the inputs and/or outputs to other shapes to build the workflow of your solution. For example, you can drag an "Import Data" shape onto the design surface and set information about the data source and data type. This is "plumbing" code that you do not have to write.
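To give a sense of the plumbing involved, here is a rough sketch of what importing, cleaning, and splitting data might look like when written by hand in Python with Pandas. The CSV content is made up for illustration, and the 80/20 split is just one common choice.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real data source.
csv_text = """make,horsepower,price
audi,102,13950
bmw,101,16430
dodge,68,5572
honda,76,6529
mazda,68,5195
"""

# Import the data (roughly what an "Import Data" shape does).
data = pd.read_csv(io.StringIO(csv_text))

# Clean up: drop rows with missing values.
data = data.dropna()

# Split into training and testing subsets (80% / 20%), with a fixed seed for repeatability.
train = data.sample(frac=0.8, random_state=0)
test = data.drop(train.index)
```

None of this is difficult, but it is code that must be written, tested, and maintained for every project.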

MLStudio-01

If a shape for a desired task does not exist, there are shapes that allow you to write custom code. Supported languages are Python and R.

When you finish building and testing your solution, buttons at the bottom allow you to configure and deploy a web service, so that your model is accessible via a simple API. There is even a test page, allowing you to call this API from within your browser.

You can get a free trial at https://studio.azureml.net/

There are limits to the free version. You cannot configure the size and number of instances on which it will run, and you are limited to 10 GB storage. If you cannot work within these restrictions, you can sign up for an Azure account and pay for the resources you use. Current pricing is available at this link.

MLStudio-02

If you are looking for a quick and simple way to build a machine learning solution, Azure Machine Learning Studio may be the tool for you.

Thursday, 17 May 2018 23:58:00 (GMT Daylight Time, UTC+01:00)
# Tuesday, 15 May 2018

I have been working with North American universities since joining Microsoft almost 5 years ago.

The school year is winding down for most colleges, so I'd like to talk about ways that Microsoft is helping university students in North America.

Azure for Students

This offering provides $100 Azure cloud computing credit free to any student. It is a good opportunity to learn about cloud computing and test the services in Azure. You can sign up with an EDU email address at http://aka.ms/Azure4Students. No credit card is required.

  • $100 credit for 1 year
  • Access to all of Azure
  • No credit card required

Microsoft Student Partner

Microsoft Student Partners (or MSPs) serve as Microsoft advocates on campus, providing workshops and information about Microsoft technology. In exchange, they receive education from Microsoft and (sometimes) a few prizes. It is a good way to increase the networking, education, and employability of a student. You can learn more at https://imagine.microsoft.com/msp

Student Ambassador

This program is similar to the MSP program, but it is run by the Microsoft Recruiting organization.

Student Hackathons

Student hackathons have become very popular over the last few years. At a hackathon, students get together on campus and build hardware and software projects in teams. Many hackathons offer prizes for the best projects and students from other universities are often welcome. Microsoft has sponsored a great many hackathons over the years, offering money, prizes, Azure credits, hardware, and mentors to answer student questions. I have personally been involved in dozens of hackathons over the last 3 years.

Imagine Cup

Imagine Cup is an international competition for teams of students who build amazing projects and want to turn them into a business. The top teams in each participating country are invited to the national finals for a chance to pitch their projects to a panel of judges, who select a few teams to advance to the International Finals. The top prize for this competition is $100,000 US and a mentoring session with Microsoft CEO Satya Nadella. You can learn more about Imagine Cup at https://imagine.microsoft.com/Compete

DataFests

Recently, I have been involved in a few DataFests. A DataFest is a competition on campus in which students are provided a set of data they have not yet seen and asked to provide insights into the data. Students are free to use any tools they want and many present summaries, visualizations, and predictive analyses about the data. For the 3 DataFests in which I was involved (two at the University of Toronto and one at Duke University), Microsoft provided funds for food, free Azure credits, a workshop to show how to use Microsoft's data science tools, and mentors to answer student questions.

Internships

Microsoft offers opportunities for university students to intern with the company. Most take place in Redmond in the summer. This is a great chance to work with a product team, learn new skills, and enhance your resume. These internships are very competitive, so students are encouraged to apply early in the school year. You can learn more and apply for internships at https://careers.microsoft.com/us/en/students-and-graduates

Academic CSE Team

This is the team for which I have been working this past year. We have coordinated and executed many of Microsoft's programs around university education. My responsibilities have included talking with professors, TAs, and students about how to incorporate Azure into their classes, meeting with MSPs, and mentoring at hackathons and DataFests.

And So...

Microsoft is committed to helping students learn about software and computer science. The list above covers some of the opportunities that Microsoft provides for students. Microsoft’s new fiscal year begins in a few weeks and there is no guarantee these programs will remain the same next year. In fact, it’s likely there will be some changes. I don't know what programs will be offered going forward, but I expect a continuing strong commitment from my employer.

Tuesday, 15 May 2018 14:54:00 (GMT Daylight Time, UTC+01:00)
# Monday, 14 May 2018
Monday, 14 May 2018 09:33:00 (GMT Daylight Time, UTC+01:00)
# Sunday, 13 May 2018

Freddy Cole looked every bit of his 83 years as he was helped onto the stage last night at the Jazz Showcase in Chicago's Printer's Row.

Until he sat at the piano. At that point he was transformed. For an hour and a half, he showed a strength and grace that belied his 8 decades. His command of piano and vocals was as strong as that of a man half his age.

He launched from one song to the next, never taking a break to chat with the audience until the final few minutes.

Although Cole can claim 3 Grammy nominations, he will always be remembered as the younger brother of legendary singer Nat "King" Cole. But he does not shun that comparison, as his set included three of Nat's songs (Paper Moon, L-O-V-E, and A Blossom Fell) and he recently released an album of songs made famous by his brother. Close your eyes and the richness of Freddy's voice is reminiscent of his late brother's talents.

Cole stuck mostly to ballads, but pleased the local crowd near the end of his set with a rendition of Ray Price's swinging "On the South Side of Chicago", which brought an ovation from his hometown.

His piano and vocals were accompanied by drums, upright bass, and guitar. Of course, the attention was mostly on Cole, but his talented guitarist Adam Moezinia took many solos. Bassist Elias Bailey stayed in the background until the last few songs, when he became more and more bold with his solos and complex playing.

In the end, Freddy Cole closed the set with a song called "Goodbye", accepted a standing ovation, and was helped from the stage, again transformed into a fragile old man. Until the night's second set.

Sunday, 13 May 2018 14:49:00 (GMT Daylight Time, UTC+01:00)
# Saturday, 12 May 2018

Day 2 of Microsoft's annual Build conference began with a keynote presentation hosted by Corporate Vice President Joe Belfiore. This was much shorter than the Day 1 keynote and focused on Microsoft 365. The presentation was split into the following "chapters":  

  • Windows  
  • Windows Developers  
  • Office Development  
  • Microsoft Graph

For me, the most interesting topic was Adaptive Cards - a technology that allows you to add functionality to Office Applications, Microsoft Teams, or SharePoint. Organizations can create Cards that access user and group data in Microsoft Graph and share data across applications.
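To give a rough idea of what a card looks like, here is a minimal sketch that builds an Adaptive Card payload as JSON from Python. It follows the public Adaptive Cards schema as I understand it; the text and the URL are placeholders, and this is not code shown in the keynote.

```python
import json

# A minimal card payload. The text and URL are placeholders chosen for illustration.
card = {
    "$schema": "http://adaptivecards.io/schemas/adaptive-card.json",
    "type": "AdaptiveCard",
    "version": "1.0",
    "body": [
        {"type": "TextBlock", "text": "Build 2018 Day 2 keynote", "weight": "Bolder"},
        {"type": "TextBlock", "text": "Notes are ready for review.", "wrap": True},
    ],
    "actions": [
        {"type": "Action.OpenUrl", "title": "Open notes", "url": "https://example.com/notes"},
    ],
}

# Serialize the card so it can be handed to a host application that renders cards.
card_json = json.dumps(card, indent=2)
print(card_json)
```

The host application decides how the card is rendered, so the same JSON can look native in each application that displays it.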

Many of the features discussed in the keynote are available by joining the Insiders Program and using early releases of Windows and Office. Information on the Insiders Program is here.

You can watch the full keynote below or click this link.

Below are my raw notes as I watched.

Windows
    Timeline
    Apps save data to Graph (in cloud)
    Data available across devices
   
    Shipping in-box PC app
        Data from phone available on PC
        e.g., read and reply text from PC
        Sets (available in Windows insider build)
            Office / Graph / Web working together
           
   
Windows Developers
    Fluent Design System
    Decoupling parts of Windows to make it easier to add to apps
    UWP XAML Islands
        All Windows Applications can access Fluent Design System
        No threading across processes
    Some controls designed for Win10
    <WPF:WebView />
        Edge-based
        Available in WPF app
    Microsoft is using ML to improve products
        e.g., Grammar checking in Word
    Windows UI Library (WinUI)
        available via NuGet
    .NET Core 3 preview available later this year
    MSIX
        Application Containers
        Simpler deployment
    Android Emulator compatible with Hyper-V
    Notepad now supports Linux line feeds
    Change to MS Store revenue sharing
        Consumer apps: Increase dev revenue share to 85%
        95% if your campaign drives user to store (via web site or app)
   
Office Development
    Deployment
        Deploy custom functions to all users in your organization
        Deployment centralized
    Adaptive cards
        Post / update GitHub comments and issues from Outlook
        Payments from Outlook email message
        Build your own cards at adaptivecards.io
    Customization to MS Teams
        Tab Extension
        Same as SharePoint extension
    Sample app: Click in Teams to launch PowerBI report
    Build Adaptive Card for MS Teams
    MS Store has "Teams" section
        
Microsoft Graph
    Microsoft Graph Powers Microsoft 365
    Users sign into app with Microsoft Graph identity
    Get user data across apps: Provide personalized experience
    Extend Graph group or user schema: Add new properties
    Microsoft Graph UWP Controls available today
        Open Source
        https://aka.ms/windowstoolkit

Saturday, 12 May 2018 04:36:38 (GMT Daylight Time, UTC+01:00)