Introduction to ML.NET (Part 1 of 5)

What is ML.NET? (https://dot.net/ml)

It is a free, cross-platform (Windows/macOS/Linux), open source machine learning framework made specifically for .NET developers.

It lets existing .NET developers build custom machine learning models and integrate them into their applications while learning the basics of machine learning along the way.

ML.NET is an extensible platform that powers already proven Microsoft products like Windows Hello, Bing Ads, PowerPoint Design Ideas, and more.

As of this writing, ML.NET is at version 0.3, available here: https://github.com/dotnet/machinelearning

ML.NET originated in Microsoft Research and evolved into a significant framework over the last decade. With this preview release, ML.NET enables ML tasks like classification (e.g. text classification, sentiment analysis) and regression (e.g. price prediction). Along with these ML capabilities, this release of ML.NET also brings a first draft of the .NET APIs for training models and using models for predictions, as well as the core components of the framework such as learning algorithms, transforms, and ML data structures.

Microsoft is building ML.NET as an extensible framework, with support for LightGBM, Accord.NET, CNTK, and TensorFlow coming soon.

Installation

Once you have an app, you can install the ML.NET NuGet package from the .NET Core CLI using:

dotnet add package Microsoft.ML

or from the NuGet package manager:

Install-Package Microsoft.ML

Let us try writing some Machine Learning Code in .NET

I am using Visual Studio for Mac here, but you can also use Visual Studio for Windows. The code is fully compatible with both versions of Visual Studio.

Open the UCI Machine Learning Repository: Iris Data Set page, copy the comma-separated data into a text editor (e.g. Notepad), and save it as iris-data.txt in the project directory. This is the data set we will train on, so include the file in the Visual Studio project and set its 'Copy to Output Directory' property to 'Copy always' so that the program can find it at run time.

Data Set Characteristics: Multivariate

Number of Instances: 150

Attribute Characteristics: Real

Number of Attributes: 4

Associated Tasks: Classification

Missing Values? No
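To make the file format concrete, here is a minimal sketch of how one line of iris-data.txt breaks down into four numeric measurements followed by a text label. The IrisLineParser class and its Parse method are hypothetical names made up for this illustration; they are not part of ML.NET, which does its own loading through TextLoader later in this post. The sketch is a standalone program, not something to paste into the FlowerPredictor project.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration only; it is not part of ML.NET.
public static class IrisLineParser
{
    // Splits one comma-separated line into its four measurements and its label.
    public static (float[] Features, string Label) Parse(string line)
    {
        string[] parts = line.Split(',');
        if (parts.Length != 5)
        {
            throw new FormatException($"Expected 5 columns but found {parts.Length}.");
        }

        var features = new float[4];
        for (int i = 0; i < 4; i++)
        {
            // InvariantCulture keeps '.' as the decimal separator on every machine.
            features[i] = float.Parse(parts[i], CultureInfo.InvariantCulture);
        }

        return (features, parts[4].Trim());
    }

    public static void Main()
    {
        // First row of the UCI Iris data set:
        // sepal length, sepal width, petal length, petal width, species.
        var row = Parse("5.1,3.5,1.4,0.2,Iris-setosa");
        Console.WriteLine($"Measurements: {row.Features.Length}, Label: {row.Label}");
    }
}
```

The column order here (positions 0 to 3 for the measurements, position 4 for the label) is the same order the IrisData class relies on further down.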

What we are going to do, in 6 steps:

STEP 1: Define your data structures

STEP 2: Create a pipeline and load your data

STEP 3: Transform your data

STEP 4: Add learner

STEP 5: Train your model based on the dataset

STEP 6: Use your model to make a prediction

Go to File -> New Project, choose a .NET Core based Console App, and name it “FlowerPredictor”. Then add the Microsoft.ML NuGet package to this project.

Now add a nested class “IrisPrediction” inside your Program class. This class contains the PredictedLabels field, which will hold the result of our prediction.

// STEP 1: Define your data structures
// IrisPrediction is the result returned from prediction operations
public class IrisPrediction
{
    // Maps this field to the "PredictedLabel" column produced by the pipeline
    [ColumnName("PredictedLabel")]
    public string PredictedLabels;
}


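Add one more class named “IrisData”. This class describes the input data used for training the machine learning model.

// IrisData describes one row of iris-data.txt.
// The Column attribute gives the zero-based position of each value in the row.
public class IrisData
{
    [Column("0")]
    public float SepalLength;

    [Column("1")]
    public float SepalWidth;

    [Column("2")]
    public float PetalLength;

    [Column("3")]
    public float PetalWidth;

    // The fifth column is the flower type that the model learns to predict
    [Column("4")]
    [ColumnName("Label")]
    public string Label;
}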


Now our Main method will look like this:

// Namespaces needed at the top of Program.cs for the ML.NET 0.3 preview API:
// using System;
// using Microsoft.ML;
// using Microsoft.ML.Data;
// using Microsoft.ML.Runtime.Api;
// using Microsoft.ML.Trainers;
// using Microsoft.ML.Transforms;

static void Main(string[] args)
{
    // STEP 2: Create a pipeline and load your data
    var pipeline = new LearningPipeline();

    // If working in Visual Studio, make sure the 'Copy to Output Directory'
    // property of iris-data.txt is set to 'Copy always'
    string dataPath = "iris-data.txt";
    pipeline.Add(new TextLoader(dataPath).CreateFrom<IrisData>(separator: ','));

    // STEP 3: Transform your data
    // Assign numeric values to text in the "Label" column, because only
    // numbers can be processed during model training
    pipeline.Add(new Dictionarizer("Label"));

    // Puts all features into a vector
    pipeline.Add(new ColumnConcatenator("Features", "SepalLength", "SepalWidth", "PetalLength", "PetalWidth"));

    // STEP 4: Add learner
    // Add a learning algorithm to the pipeline.
    // This is a classification scenario (What type of iris is this?)
    pipeline.Add(new StochasticDualCoordinateAscentClassifier());

    // Convert the Label back into original text (after converting to number in step 3)
    pipeline.Add(new PredictedLabelColumnOriginalValueConverter() { PredictedLabelColumn = "PredictedLabel" });

    // STEP 5: Train your model based on the data set
    var model = pipeline.Train<IrisData, IrisPrediction>();

    // STEP 6: Use your model to make a prediction
    // You can change these numbers to test different predictions
    var prediction = model.Predict(new IrisData()
    {
        SepalLength = 3.3f,
        SepalWidth = 1.6f,
        PetalLength = 0.2f,
        PetalWidth = 5.1f,
    });

    Console.WriteLine($"Predicted flower type is: {prediction.PredictedLabels}");
}


Once you run this project, the output window shows the predicted flower type as “Iris-virginica”.
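STEP 3 turns the text labels into numbers, and the last pipeline step turns the predicted number back into text. The sketch below shows that round trip in plain C#. It only illustrates the idea and is not how ML.NET implements Dictionarizer or PredictedLabelColumnOriginalValueConverter. The LabelEncoder class is a hypothetical name invented for this example, and the actual numbers ML.NET assigns internally may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of label encoding; this is not ML.NET's implementation.
public class LabelEncoder
{
    private readonly Dictionary<string, int> _toIndex = new Dictionary<string, int>();
    private readonly List<string> _toLabel = new List<string>();

    // Returns the number for a label, assigning the next free number on first sight.
    public int Encode(string label)
    {
        if (!_toIndex.TryGetValue(label, out int index))
        {
            index = _toLabel.Count;
            _toIndex[label] = index;
            _toLabel.Add(label);
        }
        return index;
    }

    // Maps a number back to the original text label.
    public string Decode(int index) => _toLabel[index];
}

public static class LabelEncoderDemo
{
    public static void Main()
    {
        var encoder = new LabelEncoder();
        var labels = new[] { "Iris-setosa", "Iris-versicolor", "Iris-setosa", "Iris-virginica" };

        foreach (string label in labels)
        {
            // A repeated label gets the same number it received the first time.
            Console.WriteLine($"{label} -> {encoder.Encode(label)}");
        }

        // The trainer works with the numbers; decoding restores the text for display.
        Console.WriteLine($"2 -> {encoder.Decode(2)}");
    }
}
```

In this sketch the three flower types end up as 0, 1, and 2, and decoding 2 gives back "Iris-virginica", which is the same kind of conversion that lets the program print a flower name instead of a number.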

Here is the Git repo for the solution built in this post:

https://github.com/abhiongithub/ML-for-Dot-Net-developers

In the next blog post (Part 2), we will apply clustering concepts to the existing solution.

Read Part 2 here:

http://cloudandmobileblog.com/2018/07/15/clustering-in-machinelearning-net

 
