I think it was more than a year and a half ago that I read “Real-World Machine Learning” by Henrik Brink, Joseph Richards, and Mark Fetherolf, a book that is easy to read and gets straight to the point. I’m sure you know what I mean.

At the time, the only thing that kept me from really enjoying the book's samples was that there was no easy way to “translate” them to .NET Core.

Well, times have changed, and now ML.NET (aka Microsoft.ML) exists: an open source, cross-platform machine learning framework for .NET.

The following sample shows how to train a model with a binary classification algorithm to predict whether a passenger aboard the Titanic survived (based on the sample provided in chapter 3 of the book).

1. Create your project

Open a command prompt and run:

mkdir titanic.ml
cd titanic.ml
dotnet new console

2. Add a reference to ML.NET

The NuGet package for ML.NET is Microsoft.ML, so add a reference to it:

dotnet add package Microsoft.ML -v 0.2.0
dotnet restore

3. Replace the content of Program.cs

Replace the content of Program.cs with the following:

using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Runtime.Api;
using Microsoft.ML.Trainers;
using Microsoft.ML.Transforms;
using System;
using System.IO;
using System.Threading.Tasks;

namespace Titanic.ML
{
    // TitanicData and TitanicPrediction are the input and output classes.
    // They are not part of this listing.
    class Program
    {
        // Note: an async Main requires C# 7.1 or later.
        static async Task Main(string[] args)
        {
            // Create a learning pipeline
            var pipeline = new LearningPipeline();

            // Load training data and add it to the pipeline.
            // Path.Combine keeps the relative path valid on Windows, Linux, and macOS.
            string dataPath = Path.Combine("data", "titanic.training.csv");
            var data = new TextLoader(dataPath).CreateFrom<TitanicData>(useHeader: true, separator: ',');
            pipeline.Add(data);

            // Transform the text features into numeric (one-hot encoded) values
            pipeline.Add(new CategoricalOneHotVectorizer(
                "Sex",
                "Ticket",
                "Fare",
                "Cabin",
                "Embarked"));

            // Put all features into a single vector column named "Features"
            pipeline.Add(new ColumnConcatenator(
                "Features",
                "Pclass",
                "Sex",
                "Age",
                "SibSp",
                "Parch",
                "Ticket",
                "Fare",
                "Cabin",
                "Embarked"));

            // Add a learning algorithm to the pipeline.
            // This is a classification scenario (Did this passenger survive?)
            pipeline.Add(new FastTreeBinaryClassifier() { NumLeaves = 5, NumTrees = 5, MinDocumentsInLeafs = 2 });

            // Train your model based on the data set
            Console.WriteLine("Training Titanic.ML model...");
            var model = pipeline.Train<TitanicData, TitanicPrediction>();

            // Optional: Save the model to a file. You can use the model in another program!
            var modelPath = Path.Combine("data", "titanic.model");
            await model.WriteAsync(modelPath);

            // Use your model to make a prediction. Let's predict what happened to this passenger
            var prediction = model.Predict(new TitanicData()
            {
                Pclass = 3f,
                Name = "Braund, Mr. Owen Harris",
                Sex = "male",
                Age = 31,
                SibSp = 0,
                Parch = 0,
                Ticket = "335097",
                Fare = "7.75",
                Cabin = "",
                Embarked = "Q"
            });

            Console.WriteLine($"Did this passenger survive? {(prediction.Survived ? "Yes" : "No")}");

            // Evaluate the model using the test data. In other words, let's test the model
            // to see how accurate it is.
            Console.WriteLine("Evaluating Titanic.ML model...");
            dataPath = Path.Combine("data", "titanic.csv");
            data = new TextLoader(dataPath).CreateFrom<TitanicData>(useHeader: true, separator: ',');
            var evaluator = new Microsoft.ML.Models.BinaryClassificationEvaluator();
            var metrics = evaluator.Evaluate(model, data);

            Console.WriteLine($"Accuracy: {metrics.Accuracy:P2}");
            Console.WriteLine($"Auc: {metrics.Auc:P2}");
            Console.WriteLine($"F1Score: {metrics.F1Score:P2}");
        }
    }
}
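
Program.cs refers to TitanicData and TitanicPrediction, which the listing does not define. Here is a minimal sketch of what they could look like with the ML.NET 0.2.0 attributes. Keep in mind that the column ordinals assume the usual Kaggle-style column order (PassengerId, Survived, Pclass, ...) and that the field types are inferred from the prediction call in the sample, so check both against the header of your CSV file. SavedModel is a hypothetical helper name; it only illustrates how another program might load the saved model file.

```csharp
using Microsoft.ML;
using Microsoft.ML.Runtime.Api;
using System.Threading.Tasks;

namespace Titanic.ML
{
    // Input schema: one field per CSV column.
    // ASSUMPTION: ordinals follow the common Kaggle Titanic column order.
    public class TitanicData
    {
        [Column("0")] public float PassengerId;

        // The trainer looks for a column named "Label" by default.
        [Column("1", name: "Label")] public float Survived;

        [Column("2")] public float Pclass;
        [Column("3")] public string Name;
        [Column("4")] public string Sex;
        [Column("5")] public float Age;
        [Column("6")] public float SibSp;
        [Column("7")] public float Parch;
        [Column("8")] public string Ticket;

        // Fare is a string because the sample one-hot encodes it as a category.
        [Column("9")] public string Fare;

        [Column("10")] public string Cabin;
        [Column("11")] public string Embarked;
    }

    // Output schema: the binary classifier writes its result to "PredictedLabel".
    public class TitanicPrediction
    {
        [ColumnName("PredictedLabel")]
        public bool Survived;
    }

    // Hypothetical helper: load a model saved with WriteAsync and reuse it.
    public static class SavedModel
    {
        public static async Task<bool> PredictSurvivalAsync(string modelPath, TitanicData passenger)
        {
            var model = await PredictionModel.ReadAsync<TitanicData, TitanicPrediction>(modelPath);
            return model.Predict(passenger).Survived;
        }
    }
}
```

Any program that references the same two classes can call SavedModel.PredictSurvivalAsync with the path of the saved model file, so there is no need to retrain on every run.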

Note: I split the original dataset from the book for this post.
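
If you want to do a similar split on your own data, something like the following could work. It is only a sketch using the standard library: the 80/20 ratio, the fixed seed, and the CsvSplitter name are assumptions for illustration, not necessarily how the files for this post were produced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CsvSplitter
{
    // Splits CSV lines (header row first) into a training set and a test set.
    // Both results start with the header row.
    // ASSUMPTION: every record fits on one line (no line breaks inside quoted fields).
    public static (List<string> Train, List<string> Test) Split(
        IReadOnlyList<string> lines, double trainFraction = 0.8, int seed = 42)
    {
        if (lines.Count == 0)
            throw new ArgumentException("Expected at least a header row.", nameof(lines));

        var header = lines[0];
        var rng = new Random(seed);

        // Shuffle the data rows so both sets get a random sample of passengers.
        var rows = lines.Skip(1).OrderBy(_ => rng.Next()).ToList();
        int trainCount = (int)Math.Round(rows.Count * trainFraction);

        var train = new List<string> { header };
        train.AddRange(rows.Take(trainCount));

        var test = new List<string> { header };
        test.AddRange(rows.Skip(trainCount));

        return (train, test);
    }
}
```

To use it, read the source file with File.ReadAllLines, call CsvSplitter.Split, and write each list out with File.WriteAllLines. Keeping the seed fixed makes the split reproducible between runs.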

Please find the code and data files here.

Hope it helps!