Machine learning (ML) enables computers to make predictions and
automate decision-making based on data.
In .NET applications, ML.NET
provides an easy way to integrate machine learning models. ML.NET
allows
developers to train, build, and deploy custom ML models without
requiring prior ML expertise. It is
open-source, cross-platform, and works offline, making it ideal for cloud and on-premises applications.
In this article, we will explore how to use ML.NET to build a
language detection model. We will use
AutoML to train the model and Visual Studio's
Model Builder to simplify the process. The resulting model
can be used to detect the language of a given text, which is useful for applications that process
multilingual content.
Why are we going to do it?
You don't need machine learning expertise to use
Model Builder. All you need is some data, and a problem
to solve.
Traditional programming relies on explicit rules and control flow statements
such as if, else, switch, foreach, and while, which makes complex problems
like text classification or language detection hard to solve.
ML.NET overcomes this by allowing
.NET developers to leverage machine learning
without deep ML knowledge. With AutoML
and Visual Studio's Model Builder, even
beginners can train and
deploy models effortlessly. This simplifies AI adoption in
.NET applications, enabling powerful
data-driven solutions.
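To make the contrast concrete, here is a minimal sketch of what a rule-based detector might look like. The class name, the marker words, and the three languages are hypothetical and chosen only for illustration; they do not come from the article's dataset.

```csharp
using System;
using System.Linq;

public static class RuleBasedLanguageDetector
{
    // Hypothetical hand-written rules: a couple of marker words per language.
    public static string Detect(string text)
    {
        string[] words = text.ToLowerInvariant()
            .Split(new[] { ' ', ',', '.', '!', '?' }, StringSplitOptions.RemoveEmptyEntries);

        if (words.Contains("the") || words.Contains("and"))
            return "English";
        else if (words.Contains("le") || words.Contains("et"))
            return "French";
        else if (words.Contains("der") || words.Contains("und"))
            return "German";

        // Every new language or edge case needs another hand-written branch.
        return "Unknown";
    }
}
```

A sentence such as "Hello world" contains none of the marker words and falls through to "Unknown", which is the kind of gap a trained classifier is meant to close.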
How are we going to do it?
Now let's learn how to use data classification to detect language
with the help of AutoML and ML.NET, step by step.
Install Visual Studio 2022: Download and
install Visual Studio 2022 Community Edition
from the official website and make sure to install the extension ML.NET Model
Builder 2022.
To classify text into different languages, we need training
data. Since our client has historically handled multi-locale content,
we have a rich dataset to train on. We exported the data as a
CSV file with two columns: the first column is
Language, and the second column is Message,
the text written in that language.
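Model Builder reads the CSV for you, but it can help to see the two-column layout expressed in code. The sketch below assumes the Microsoft.ML NuGet package is referenced and that the file has a header row; the class names LanguageMessage and TrainingData are hypothetical.

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical row type mirroring the two CSV columns described above.
public class LanguageMessage
{
    [LoadColumn(0)]
    public string Language { get; set; } = "";

    [LoadColumn(1)]
    public string Message { get; set; } = "";
}

public static class TrainingData
{
    // Loads the CSV lazily into an IDataView; the path is supplied by the caller.
    public static IDataView Load(MLContext mlContext, string csvPath) =>
        mlContext.Data.LoadFromTextFile<LanguageMessage>(
            path: csvPath,
            separatorChar: ',',
            hasHeader: true,
            allowQuoting: true); // messages may contain commas inside quotes
}
```

Calling TrainingData.Load(new MLContext(), "language-messages.csv") would then produce a data view with a Language column and a Message column.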
To create a Model Builder config file, right-click the project
in Solution Explorer and select Add > Machine
Learning Model. Name the file
LanguageDetectionModel.mbconfig, then open and configure it.
To select a scenario, choose Data Classification and use the
Local CPU as the training environment.
To specify the training data, select File
as the data source and choose language-messages.csv.
Ensure the correct column headers are set, with Language as the
label and Message as the
feature.
To train the model, click Start Training, set the training
duration, and check the model's performance.
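For readers who prefer code over the wizard, a roughly equivalent training run can be sketched with the AutoML API from the Microsoft.ML.AutoML NuGet package. This is an approximation under stated assumptions, not the exact pipeline Model Builder runs: the class name AutoMlTraining is hypothetical, and the data view is assumed to contain a text column named Language to use as the label.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.AutoML;
using Microsoft.ML.Data;

public static class AutoMlTraining
{
    // Runs a time-boxed AutoML search and returns the best model it found.
    public static ITransformer Train(MLContext mlContext, IDataView trainData, uint seconds)
    {
        var experiment = mlContext.Auto()
            .CreateMulticlassClassificationExperiment(maxExperimentTimeInSeconds: seconds);

        // "Language" is assumed to be the label column; the remaining columns are features.
        ExperimentResult<MulticlassClassificationMetrics> result =
            experiment.Execute(trainData, labelColumnName: "Language");

        RunDetail<MulticlassClassificationMetrics> best = result.BestRun;
        Console.WriteLine($"Best trainer: {best.TrainerName}");
        Console.WriteLine($"Macro accuracy: {best.ValidationMetrics.MacroAccuracy:F4}");
        return best.Model;
    }
}
```

The seconds parameter plays the same role as the training duration in Model Builder: a longer budget lets AutoML try more trainers and settings.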
Once training completes, Model Builder automatically selects the best model and
displays its macro accuracy. The closer the value is to 1, the better the model,
though this depends on the quality of the data you
provide. Meanwhile, you can check the output logs to
see what is happening.
Training also updates the csproj file with the necessary NuGet
packages and references.
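Macro accuracy is the accuracy computed separately for each class and then averaged across classes, so a rarely seen language counts as much as a common one. The following sketch computes it by hand next to micro accuracy (the plain fraction of correct predictions). The class name AccuracyMetrics and the tuple shape are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AccuracyMetrics
{
    // Micro accuracy: fraction of all samples that were classified correctly.
    public static double MicroAccuracy(IReadOnlyList<(string Actual, string Predicted)> results) =>
        results.Count(r => r.Actual == r.Predicted) / (double)results.Count;

    // Macro accuracy: accuracy computed per actual class, then averaged over the classes.
    public static double MacroAccuracy(IReadOnlyList<(string Actual, string Predicted)> results) =>
        results.GroupBy(r => r.Actual)
               .Average(g => g.Count(r => r.Actual == r.Predicted) / (double)g.Count());
}
```

As a made-up example, take 8 English messages that are all classified correctly and 2 French messages of which only 1 is correct. Micro accuracy is 9/10 = 0.9, while macro accuracy is (1.0 + 0.5) / 2 = 0.75, which exposes the weaker performance on the smaller class.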
To evaluate the model, enter a sample message, click Predict to
check accuracy, and adjust training if needed.
You can see that the model classified the entered message as English with 25%
confidence, the highest score among the candidate languages.
This again depends on the data you provide for training.
To consume the model, copy the generated C# code, integrate it
into the Console App or your .NET app, and use the Predict method for language
detection.
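A consumption call typically looks like the sketch below. It assumes the code Model Builder generated for LanguageDetectionModel.mbconfig is part of the project and follows the usual shape (nested ModelInput and ModelOutput classes plus a static Predict method); the property names depend on your CSV headers, and the sample message is made up.

```csharp
using System;
using System.Linq;
// Also add a using directive for the namespace of your generated code,
// which Model Builder normally derives from the project name.

// Assumption: ModelInput exposes a Message property named after the CSV column.
var input = new LanguageDetectionModel.ModelInput
{
    Message = "How are you today?"
};

LanguageDetectionModel.ModelOutput output = LanguageDetectionModel.Predict(input);

// PredictedLabel holds the winning language; Score holds one value per language.
Console.WriteLine($"Predicted language: {output.PredictedLabel}");
Console.WriteLine($"Confidence: {output.Score.Max():P0}");
```

The highest value in Score corresponds to the predicted label, which is the confidence figure shown during evaluation.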
As in the evaluation step, the console app reports the entered message as English with 25%
confidence, the highest score among the candidate languages.
This again depends on the data you provide for training.
The final solution and code look as follows.
Code Sample - Language Detection Using AutoML in ML.NET
Summary
In this article, we learned how to use ML.NET and
AutoML to build a language detection model. We used
Visual Studio's Model Builder to simplify the process, making it
easy for beginners to train and deploy
models. The resulting model can detect the language of a given text,
which is useful for applications
that process multilingual content.