Saturday, December 14, 2024

.NET Aspire and Semantic Kernel AI

 Let's learn how to use the .NET Aspire Azure OpenAI client. We will familiarize ourselves with the Aspire.Azure.AI.OpenAI library, which is used to register an OpenAIClient in the dependency injection (DI) container for consuming Azure OpenAI or OpenAI functionality. In addition, it enables corresponding logging and telemetry.

Companion Video:
Final Solution Code:


  • .NET 9.0
  • Visual Studio Code
  • .NET Aspire Workload
  • "C# Dev Kit" extension for VS Code

Getting Started

We will start by cloning a simple C# solution that contains two projects that use Semantic Kernel - namely a console project (ConsoleAI) and a razor-pages project (RazorPagesAI). Clone the project in a working directory on your computer by executing these commands in a terminal window:

git clone
cd AspireAI

Both applications do pretty much the same thing. The objective of today’s exercise is to:

  • use .NET Aspire so that both projects get started from one place 
  • pass environment variables to the console and razor-pages web apps from the .AppHost project that belongs to .NET Aspire

Open the solution in VS Code and update the values in the following appsettings.json files with your access parameters for Azure OpenAI and/or OpenAI:


The most important settings are the connection strings. They are identical in both projects:

"ConnectionStrings": {
  "azureOpenAi": "Endpoint=Azure-OpenAI-Endpoint-Here;Key=Azure-OpenAI-Key-Here;",
  "openAi": "Key=OpenAI-Key-Here"
}
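
To make the connection string format more concrete, here is a minimal sketch that splits such a string into its Endpoint and Key parts using DbConnectionStringBuilder from the .NET base class library. The helper name ParseAiConnectionString and the sample values are hypothetical, and this is not necessarily how the Aspire client parses the string internally.

```csharp
using System;
using System.Data.Common;

// Hypothetical helper: splits "Endpoint=...;Key=...;" into its two parts.
// Either part may be missing (the OpenAI string above only has a Key).
static (string? Endpoint, string? Key) ParseAiConnectionString(string connectionString)
{
    var parser = new DbConnectionStringBuilder { ConnectionString = connectionString };
    parser.TryGetValue("Endpoint", out var endpointValue);
    parser.TryGetValue("Key", out var keyValue);
    return (endpointValue?.ToString(), keyValue?.ToString());
}

// Sample values only; they are not real credentials.
var parsed = ParseAiConnectionString("Endpoint=https://example.openai.azure.com/;Key=not-a-real-key;");
Console.WriteLine($"Endpoint: {parsed.Endpoint}");
Console.WriteLine($"Key present: {parsed.Key is not null}");
```

A string with only a Key part, like the openAi entry, simply yields a null Endpoint.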

After you update your access parameters, try each application separately to see what it does:

Here is my experience using the console application (ConsoleAI) with AzureOrOpenAI set to “OpenAI”:

cd ConsoleAI
dotnet run

I then changed the AzureOrOpenAI setting to “Azure” and ran the console application (ConsoleAI) again:

Next, try the razor pages web application (RazorPagesAI) with AzureOrOpenAI set to “OpenAI”:

cd ../RazorPagesAI
dotnet watch

In the RazorPagesAI web app’s appsettings.json file, I changed AzureOrOpenAI to “Azure”, resulting in a similar experience.

In the root folder, add .NET Aspire to the solution:

cd ..
dotnet new aspire --force

Add the previous projects to the newly created .sln file with:

dotnet sln add ./AiLibrary/AiLibrary.csproj
dotnet sln add ./RazorPagesAI/RazorPagesAI.csproj
dotnet sln add ./ConsoleAI/ConsoleAI.csproj

Add the following .NET Aspire agent packages to the client ConsoleAI and RazorPagesAI projects with:

dotnet add ./ConsoleAI/ConsoleAI.csproj package Aspire.Azure.AI.OpenAI --prerelease
dotnet add ./RazorPagesAI/RazorPagesAI.csproj package Aspire.Azure.AI.OpenAI --prerelease

To add Azure hosting support to your IDistributedApplicationBuilder, install the 📦 Aspire.Hosting.Azure.CognitiveServices NuGet package in the .AppHost project:

dotnet add ./AspireAI.AppHost/AspireAI.AppHost.csproj package Aspire.Hosting.Azure.CognitiveServices

In VS Code, add the following references:

  1. Add a reference from the .AppHost project into ConsoleAI project.
  2. Add a reference from the .AppHost project into RazorPagesAI project.
  3. Add a reference from the ConsoleAI project into .ServiceDefaults project.
  4. Add a reference from the RazorPagesAI project into .ServiceDefaults project.
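
If you prefer the terminal over the VS Code UI, the same four references can be sketched with the dotnet CLI. The paths below assume the folder layout produced by running "dotnet new aspire" in a folder named AspireAI; the ServiceDefaults path in particular is an assumption, so adjust it if your layout differs.

```shell
# Run from the solution root folder.
# The .AppHost project references both application projects.
dotnet add ./AspireAI.AppHost/AspireAI.AppHost.csproj reference ./ConsoleAI/ConsoleAI.csproj
dotnet add ./AspireAI.AppHost/AspireAI.AppHost.csproj reference ./RazorPagesAI/RazorPagesAI.csproj

# Both application projects reference the .ServiceDefaults project.
dotnet add ./ConsoleAI/ConsoleAI.csproj reference ./AspireAI.ServiceDefaults/AspireAI.ServiceDefaults.csproj
dotnet add ./RazorPagesAI/RazorPagesAI.csproj reference ./AspireAI.ServiceDefaults/AspireAI.ServiceDefaults.csproj
```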

Copy the AI and ConnectionStrings blocks from either the console (ConsoleAI) or web app (RazorPagesAI)  appsettings.json file into the appsettings.json file of the .AppHost project. The appsettings.json file in the .AppHost project will look similar to this:

"AI": {
  "AzureOrOpenAI": "OpenAI",
  "OpenAiChatModel": "gpt-3.5-turbo",
  "AzureChatDeploymentName": "gpt-35-turbo"
},
"ConnectionStrings": {
  "azureOpenAi": "Endpoint=Azure-OpenAI-Endpoint-Here;Key=Azure-OpenAI-Key-Here;",
  "openAi": "Key=OpenAI-Key-Here"
}

Add the following code to the Program.cs file in the .AppHost project just before builder.Build().Run()

IResourceBuilder<IResourceWithConnectionString> openai;
var AzureOrOpenAI = builder.Configuration["AI:AzureOrOpenAI"] ?? "Azure";
var chatDeploymentName = builder.Configuration["AI:AzureChatDeploymentName"];
var openAiChatModel = builder.Configuration["AI:OpenAiChatModel"];
// Register an Azure OpenAI resource.
// The AddAzureOpenAI method reads connection information
// from the app host's configuration
if (AzureOrOpenAI.ToLower() == "azure") {
    openai = builder.ExecutionContext.IsPublishMode
        ? builder.AddAzureOpenAI("azureOpenAi")
        : builder.AddConnectionString("azureOpenAi");
} else {
    openai = builder.ExecutionContext.IsPublishMode
        ? builder.AddAzureOpenAI("openAi")
        : builder.AddConnectionString("openAi");
}
// Register the RazorPagesAI project and pass to it environment variables.
// WithReference method passes connection info to client project.
// The resource name "razorpagesai" is a placeholder; any name will do.
builder.AddProject<Projects.RazorPagesAI>("razorpagesai")
    .WithReference(openai)
    .WithEnvironment("AI__AzureChatDeploymentName", chatDeploymentName)
    .WithEnvironment("AI__AzureOrOpenAI", AzureOrOpenAI)
    .WithEnvironment("AI__OpenAiChatModel", openAiChatModel);
// Register the ConsoleAI project and pass to it environment variables.
// The resource name "consoleai" is a placeholder; any name will do.
builder.AddProject<Projects.ConsoleAI>("consoleai")
    .WithReference(openai)
    .WithEnvironment("AI__AzureChatDeploymentName", chatDeploymentName)
    .WithEnvironment("AI__AzureOrOpenAI", AzureOrOpenAI)
    .WithEnvironment("AI__OpenAiChatModel", openAiChatModel);
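
The environment variable names matter here. .NET configuration treats a double underscore in a variable name as the ":" section separator, so a name only lands in the AI section if it uses "__". The sketch below mimics that documented rule with a hypothetical helper named ToConfigurationKey; it does not call the real configuration provider.

```csharp
using System;

// Hypothetical helper that mimics the documented naming rule:
// "__" in an environment variable name stands for ":" in a configuration key.
static string ToConfigurationKey(string environmentVariableName) =>
    environmentVariableName.Replace("__", ":");

Console.WriteLine(ToConfigurationKey("AI__AzureOrOpenAI"));   // AI:AzureOrOpenAI
Console.WriteLine(ToConfigurationKey("AI_OpenAiChatModel"));  // AI_OpenAiChatModel
```

A name with a single underscore stays a flat key, so a lookup such as config["AI:OpenAiChatModel"] would not find it and would fall back to its default value.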

We need to add .NET Aspire agents in both our console and web apps. Let us start with the web app. Add this code to the Program.cs file in the RazorPagesAI project right before “var app = builder.Build()”: 


In the same Program.cs of the web app (RazorPagesAI), comment out the if (azureOrOpenAi.ToLower() == "openai") { …. } else { ….. } block and replace it with this code:

if (azureOrOpenAi.ToLower() == "openai") {
} else {
}

In the above code, we call the extension method that registers an OpenAIClient in the dependency injection container. The method takes a connection name parameter. We also register Semantic Kernel with the DI container.

Also, in the Program.cs file in the ConsoleAI project, add this code right below the using statements:

var hostBuilder = Host.CreateApplicationBuilder();

In the same Program.cs of the console app (ConsoleAI), comment out the if (azureOrOpenAi.ToLower() == "azure") { …. } else { ….. } block and replace it with this code:

if (azureOrOpenAI.ToLower() == "azure") {
    var azureChatDeploymentName = config["AI:AzureChatDeploymentName"] ?? "gpt-35-turbo";
} else {
    var openAiChatModel = config["AI:OpenAiChatModel"] ?? "gpt-3.5-turbo";
}
var app = hostBuilder.Build();

Replace “var kernel = builder.Build();” with this code:

var kernel = app.Services.GetRequiredService<Kernel>();

You can now test the .NET Aspire orchestration of both the console and web apps. Stop all applications, then, in a terminal window, go to the .AppHost project and run the following command:

dotnet watch

You will see the .NET Aspire dashboard:

Click on View under the Logs column. You will see this output indicating that the console application ran successfully:

Click on the link for the web app under the Endpoints column. It opens the razor pages web app in another tab in your browser. Test it out and verify that it works as expected.

Stop the .AppHost application, then comment out the AI and ConnectionStrings blocks in the appsettings.json files in both the console and web apps. If you run the .AppHost project again, you will discover that it works equally well because the environment variables are being passed from the .AppHost project into the console and web apps respectively.

One last refinement we can make to the console application is to do away with the ConfigurationBuilder, because we can get a configuration object from the host application builder. Therefore, comment out the following code in the console application:

var config = new ConfigurationBuilder()
    .AddJsonFile("appsettings.json", optional: true, reloadOnChange: true)
    .Build();

Replace the above code with the following:

var config = hostBuilder.Configuration;

You can delete the following package from the ConsoleAI.csproj file:

<PackageReference Include="Microsoft.Extensions.Configuration.Json" Version="9.0.0" />

Everything works just as it did before.

Wednesday, November 20, 2024

Using Aspire with gRPC

We start with a simple gRPC application that involves a gRPC server and Blazor client. The gRPC server connects to a SQLite database. To test the sample solution, we must first start the gRPC server app, then start the client app. This is somewhat tedious. By introducing .NET Aspire into the mix, we only need to start one app to get the solution to work. .NET Aspire also gives us many more benefits.

Start source code:
Companion Video:


In order to continue with this tutorial, you will need the following:

  • .NET 9.0
  • Visual Studio Code
  • 'C# Dev Kit' extension for Visual Studio Code

.NET Aspire Setup

In a terminal window, from any folder, run the following command before you install .NET Aspire:

dotnet workload update 

To install the .NET Aspire workload from the .NET CLI, execute this command:

dotnet workload install aspire

Check your version of .NET Aspire with this command:

dotnet workload list

Startup Application

We will start with a .NET 9.0 solution that involves a gRPC backend and a Blazor frontend. Clone the code from this GitHub site with:

git clone

To run the solution, we must first start the backend, then start the frontend. To get a good sense of what the application does, follow these steps:

1) Inside the GrpcStudents folder, run the following command in a terminal window:

dotnet run

2) Next, start the frontend. Inside a terminal window in the BlazorGrpcClient folder, run this command:

dotnet watch

Try the application by adding, updating, and deleting data saved in a SQLite database on the gRPC server. However, it is a pain to have to start both projects to get the solution to work. This is where .NET Aspire comes to the rescue.

Converting solution to .NET Aspire

Stop both applications by pressing CTRL+C in each terminal window.

Add the basic .NET Aspire projects to our solution by running the following command inside the root GrpcBlazorSolution folder:

dotnet new aspire --force

We use the --force switch because the above command will overwrite the .sln file with a new one that only includes two new projects: GrpcBlazorSolution.AppHost and GrpcBlazorSolution.ServiceDefaults.

NOTE: At the time of writing this article, the two .NET Aspire projects are created using .NET 8.0. This will likely change with the passage of time.

We will add our previous gRPC & Blazor projects to the newly created .sln file by executing the following commands inside the root GrpcBlazorSolution folder:

dotnet sln add ./GrpcStudents/GrpcStudents.csproj
dotnet sln add ./BlazorGrpcClient/BlazorGrpcClient.csproj

Open the solution in Visual Studio Code.

We will add references in the GrpcBlazorSolution.AppHost project to the GrpcStudents and BlazorGrpcClient projects. This can be done in the "Solution Explorer" tab in Visual Studio Code. 

Right-click on "GrpcBlazorSolution.AppHost" then select "Add Project Reference". 

Choose BlazorGrpcClient.

Do the same for the "GrpcStudents" project. Right-click on "GrpcBlazorSolution.AppHost" then select "Add Project Reference". 

Choose GrpcStudents.

Also, both GrpcStudents and BlazorGrpcClient projects need to have references into GrpcBlazorSolution.ServiceDefaults.

Right-click on BlazorGrpcClient then select "Add Project Reference". 

Choose GrpcBlazorSolution.ServiceDefaults.

Similarly, add a reference to GrpcBlazorSolution.ServiceDefaults from GrpcStudents:

Right-click on GrpcStudents then select "Add Project Reference". 

Choose GrpcBlazorSolution.ServiceDefaults once again.

Then, in the Program.cs files of both GrpcStudents and BlazorGrpcClient projects, add this agent code right before "var app = builder.Build();":

// Add service defaults & Aspire components.
builder.AddServiceDefaults();

In the Program.cs file in GrpcBlazorSolution.AppHost, add this code right before “builder.Build().Run();”:

var grpc = builder.AddProject<Projects.GrpcStudents>("backend");

builder.AddProject<Projects.BlazorGrpcClient>("frontend")
    .WithReference(grpc);

The relative name for the gRPC app is “backend”. Therefore, edit Program.cs in the BlazorGrpcClient project. At around line 15, change the address from http://localhost:5099 to simply http://backend so that the statement looks like this:

builder.Services.AddGrpcClient<StudentRemote.StudentRemoteClient>(options =>
{
    options.Address = new Uri("http://backend");
});

Test .NET Aspire Solution

To test the solution,  start the application in the GrpcBlazorSolution.AppHost folder with:

dotnet watch

NOTE: If you are asked to enter a token, copy and paste it from the value in your terminal window:

This is what you should see in your browser:

Click on the app represented by the frontend link on the second row. You should experience the Blazor app:

.NET Aspire has orchestrated for us the connection between multiple projects and produced a single starting point in the Host project. 

Mission accomplished. We have achieved our objective by adding .NET Aspire into the mix of projects and wiring up a couple of agents.

Sunday, November 3, 2024

Using Dependency Injection with Semantic Kernel in ASP.NET


In this video I will show you how Dependency Injection can be used with Semantic Kernel in an ASP.NET Razor Pages application. The same principles can be used with MVC. We will use the Phi-3 model hosted on GitHub. Developers can use a multitude of AI models on GitHub for free.

Source Code:
Companion Video:


This walkthrough was done using .NET 8.0.

Getting Started

There are many AI models at GitHub from a variety of vendors that you can choose from. The starting point is to visit At the time of writing, these are a subset of the models available:

For this article, I will use the "Phi-3.5-mini instruct (128k)" model highlighted above. If you click on that model you will be taken to the model's landing page:

Click on the green "Get started" button.

The first thing we need to do is get a 'personal access token' by clicking on the indicated button above.

Choose 'Generate new token', which happens to be in beta at the time of writing.

Give your token a name, set the expiration, and optionally describe the purpose of the token. Thereafter, click on the green 'Generate token' button at the bottom of the page.

Copy the newly generated token and keep it in a safe place because you cannot view this token again once you leave the above page. 

Let's use Semantic Kernel in ASP.NET Razor Pages

In a working directory, create a Razor Pages web app named AspWithSkDI inside a terminal window with the following command:

dotnet new razor -n AspWithSkDI

Change into the newly created directory AspWithSkDI with:

cd AspWithSkDI

Next, let's add the Semantic Kernel package to our application with:

dotnet add package Microsoft.SemanticKernel -v 1.25.0

Open the project in VS Code and add this directive to the .csproj file right below: <Nullable>enable</Nullable>:


Add the following to appsettings.json:

    "AI": {
      "Endpoint": "",
      "Model": "Phi-3.5-mini-instruct",
      "PAT": "fake-token"
    }

Replace "fake-token" with the personal access token that you got from GitHub. 

Adding Dependency Injection support

Next, open Program.cs in an editor. Add the following code right above the statement "var app = builder.Build();" :

var modelId = builder.Configuration["AI:Model"]!;
var uri = builder.Configuration["AI:Endpoint"]!;
var githubPAT = builder.Configuration["AI:PAT"]!;

var client = new OpenAIClient(new ApiKeyCredential(githubPAT), new OpenAIClientOptions { Endpoint = new Uri(uri) });

var kernel = builder.Services.AddKernel()
    .AddOpenAIChatCompletion(modelId, client);

It is the last statement above that is key to making Semantic Kernel available to all other classes through Dependency Injection.

ASP.NET Razor Pages


Add the following instance variable and property to the code-behind file named Pages/Index.cshtml.cs:

private readonly Kernel _kernel;

public string? Reply { get; set; }

Replace the class constructor with the following:

public IndexModel(ILogger<IndexModel> logger, Kernel kernel) {
    _logger = logger;
    _kernel = kernel;
}

We can now get access to Semantic Kernel through the _kernel object.

Add the following OnPostAsync() method to the IndexModel class:

// action method that receives user prompt from the form
public async Task<IActionResult> OnPostAsync(string userPrompt) {
    // get a chat completion service
    var chatCompletionService = _kernel.GetRequiredService<IChatCompletionService>();
    // Create a new chat by specifying the assistant
    ChatHistory chat = new(@"
        You are an AI assistant that helps people find information about baking. 
        The baked item must be easy, tasty, and cheap. 
        I don't want to spend more than $10 on ingredients.
        I don't want to spend more than 30 minutes preparing.
        I don't want to spend more than 30 minutes baking."
    );
    // Add the user's prompt to the chat
    chat.AddUserMessage(userPrompt);
    var response = await chatCompletionService.GetChatMessageContentAsync(chat, kernel: _kernel);
    Reply = response.Content!.Replace("\n", "<br>");
    return Page();
}

In the above OnPostAsync() method, the user prompt is received and passed on to the Phi-3 SLM. In this case our assistant specializes in suggesting easy, fast, and cheap baking ideas.
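
Because the reply is later rendered as raw HTML, one cautious variation is to HTML-encode the model's text before turning newlines into line breaks. The sketch below uses WebUtility.HtmlEncode from the base class library; the helper name ToSafeHtml is hypothetical, and whether you need this depends on how much you trust the model output.

```csharp
using System;
using System.Net;

// Hypothetical helper: encode the model's text first, then add <br> tags,
// so only the line breaks we insert are treated as markup.
static string ToSafeHtml(string? modelReply) =>
    WebUtility.HtmlEncode(modelReply ?? string.Empty).Replace("\n", "<br>");

Console.WriteLine(ToSafeHtml("Step 1: mix <b>flour</b>\nStep 2: bake"));
// Step 1: mix &lt;b&gt;flour&lt;/b&gt;<br>Step 2: bake
```

In the page model this would read Reply = ToSafeHtml(response.Content); in place of the plain Replace call.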


The Index.cshtml file represents the view that the user sees when interacting with our web app. Replace the content of Index.cshtml with the following:

@page
@model IndexModel
@{
    ViewData["Title"] = "SK Dependency Injection in ASP.NET Razor Pages";
}

<div class="text-center">
    <h3 class="display-6">@ViewData["Title"]</h3>
    <form method="post" onsubmit="showPleaseWaitMessage()">
        <input type="text" name="userPrompt" size="80" 
            required placeholder="What do you want to bake today?"/>
        <input type="submit" value="Submit" />
    </form>
</div>

<div id="please-wait-message" style="display:none;">
    <p class="alert alert-info">Please wait...</p>
</div>

<div id="response-message">
    @if (Model.Reply != null) {
        <p class="alert alert-success">@Html.Raw(Model.Reply)</p>
    }
</div>

@section Scripts {
    <script>
        function showPleaseWaitMessage() {
            document.getElementById('response-message').innerHTML = '';
            document.getElementById('please-wait-message').style.display = 'block';
        }
    </script>
}

There is some JavaScript that was added to the view to display a "Please wait ...." message while the user waits for the AI model to respond.

Run the application

In a terminal window in the root of the application, run the following command:

dotnet watch

The home page of the web app displays in your default browser and it looks like this:

I entered "Apple Pie" then clicked on Submit. A "Please wait ..." message appeared while the AI model processed my request.

After about 40 seconds the response came back.


It is easy and straightforward to use dependency injection to create a kernel.

Monday, September 30, 2024

Using Semantic Kernel with SLM AI models downloaded to your computer

In this walkthrough, I will demonstrate how you can download the Phi-3 AI SLM (small language model) from Hugging Face and use it in a C# application. 

What is Hugging Face?

Hugging Face provides AI/ML researchers & developers with access to thousands of curated datasets, machine learning models, and AI-powered demo apps. We will download the Phi-3 SLM model in ONNX format onto our computers from

What is ONNX?

ONNX is an open format built to represent machine learning models. Visit for more information.

Getting Started

We will download the Phi-3 Mini SLM for the ONNX runtime from Hugging Face. Run the following command from within a terminal window so that the destination is a location of your choice. In the below example the destination is a folder named phi-3-mini on a Windows C: drive.

git clone C:/phi-3-mini


   This example only works on the Windows Operating System.

Be patient as the download could take some time. On my Windows computer the size of the download is 30.1 GB comprising 97 files and 48 folders.

We will be using the files in the cpu_and_mobile folder. Inside that folder, navigate into the cpu-int4-rtn-block-32 folder where you will find this pair of files that contain the AI ONNX model:


Getting Started

In a working directory, create a C# console app named LocalAiModelSK inside a terminal window with the following command:

dotnet new console -n LocalAiModelSK 

Change into the newly created directory LocalAiModelSK with:

cd LocalAiModelSK

Next, let's add two packages to our console application with:

dotnet add package Microsoft.SemanticKernel -v 1.16.2

dotnet add package Microsoft.SemanticKernel.Connectors.Onnx -v 1.16.2-alpha

Open the project in VS Code and add this directive to the .csproj file right below: <Nullable>enable</Nullable>:


Replace the contents of Program.cs with the following C# code:

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI; 
// PHI-3 local model location 
var modelPath = @"C:\phi-3-mini\cpu_and_mobile\cpu-int4-rtn-block-32"; 
// Load the model and services
var builder = Kernel.CreateBuilder();
builder.AddOnnxRuntimeGenAIChatCompletion("phi-3", modelPath); 
// Build Kernel
var kernel = builder.Build(); 
// Create services such as chatCompletionService and embeddingGeneration
var chatCompletionService = kernel.GetRequiredService<IChatCompletionService>(); 
// Start the conversation
while (true) {
    // Get user input
    Console.ForegroundColor = ConsoleColor.Yellow;
    Console.Write("User : ");
    var question = Console.ReadLine()!; 
    OpenAIPromptExecutionSettings openAIPromptExecutionSettings = new() {
        MaxTokens = 200
    };
    var response = kernel.InvokePromptStreamingAsync(
        promptTemplate: @"{{$input}}",
        arguments: new KernelArguments(openAIPromptExecutionSettings){
            { "input", question }
        });
    Console.ForegroundColor = ConsoleColor.Green;
    Console.Write("\nAssistant : ");
    string combinedResponse = string.Empty;
    await foreach (var message in response) {
        // Write the response to the console
        Console.Write(message);
        combinedResponse += message;
    }
    Console.WriteLine();
}

In the above code, make sure that modelPath points to the proper location of the model on your computer.
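
A quick way to catch a wrong path before the model loads is to check the folder up front. The sketch below is only an illustration: the helper name LooksLikeOnnxModelFolder is hypothetical, and it assumes the model folder holds at least one file with the .onnx extension.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper: true when the folder exists and contains at least one .onnx file.
static bool LooksLikeOnnxModelFolder(string folder) =>
    Directory.Exists(folder) &&
    Directory.EnumerateFiles(folder, "*.onnx").Any();

// Same path as in the walkthrough; change it to match your download location.
var modelPath = @"C:\phi-3-mini\cpu_and_mobile\cpu-int4-rtn-block-32";
if (!LooksLikeOnnxModelFolder(modelPath))
{
    Console.Error.WriteLine($"No ONNX model found in '{modelPath}'. Check the path before continuing.");
}
```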

I asked the question: How long do mosquito live?

This is the response I received:


You can choose from a variety of SLMs at Hugging Face. Of course, the penalty is that the ONNX model files are large, which in some circumstances makes it more desirable to use a model that resides online.

Saturday, September 28, 2024

Using Semantic Kernel with AI models hosted on GitHub


In this article I will show you how you can experiment with AI models hosted on GitHub. GitHub AI Models are intended for learning, experimentation and proof-of-concept activities. The feature is subject to various limits (including requests per minute, requests per day, tokens per request, and concurrent requests) and is not designed for production use cases.

Companion Video:

Getting Started

There are many AI models from a variety of vendors that you can choose from. The starting point is to visit At the time of writing, these are a subset of the models available:

For this article, I will use the "Phi-3.5-mini instruct (128k)" model highlighted above. If you click on that model you will be taken to the model's landing page:

Click on the green "Get started" button.

The first thing we need to do is get a 'personal access token' by clicking on the indicated button above.

Choose 'Generate new token', which happens to be in beta at the time of writing.

Give your token a name, set the expiration, and optionally describe the purpose of the token. Thereafter, click on the green 'Generate token' button at the bottom of the page.

Copy the newly generated token and keep it in a safe place because you cannot view this token again once you leave the above page. 

Let's use Semantic Kernel

In a working directory, create a C# console app named GitHubAiModelSK inside a terminal window with the following command:

dotnet new console -n GitHubAiModelSK

Change into the newly created directory GitHubAiModelSK with:

cd GitHubAiModelSK

Next, let's add two packages to our console application with:

dotnet add package Microsoft.SemanticKernel -v 1.25.0

dotnet add package Microsoft.Extensions.Configuration.Json

Open the project in VS Code and add this directive to the .csproj file right below: <Nullable>enable</Nullable>:


Create a file named appsettings.json. Add this to appsettings.json:

    "AI": {
      "Endpoint": "",
      "Model": "Phi-3.5-mini-instruct",
      "PAT": "fake-token"
    }

Replace "fake-token" with the personal access token that you got from GitHub.
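
It is easy to forget this step, so a small guard at startup can help. The sketch below is a hypothetical check (the helper name IsPlaceholderToken is not part of any library) that treats an empty value or the literal "fake-token" as not yet configured.

```csharp
using System;

// Hypothetical guard: true when the token is missing or still the placeholder.
static bool IsPlaceholderToken(string? token) =>
    string.IsNullOrWhiteSpace(token) || token == "fake-token";

Console.WriteLine(IsPlaceholderToken("fake-token")); // True
Console.WriteLine(IsPlaceholderToken(""));           // True
```

In Program.cs you could call it on the value read from the "AI:PAT" setting and stop with a clear message before any request is sent.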

Next, open Program.cs in an editor and delete all contents of the file. Add this code to Program.cs:

using Microsoft.SemanticKernel;
using System.Text;
using Microsoft.SemanticKernel.ChatCompletion;
using OpenAI;
using System.ClientModel;
using Microsoft.Extensions.Configuration;

var config = new ConfigurationBuilder()
    .AddJsonFile("appsettings.json", optional: true, reloadOnChange: true)
    .Build();

var modelId = config["AI:Model"]!;
var uri = config["AI:Endpoint"]!;
var githubPAT = config["AI:PAT"]!;

var client = new OpenAIClient(new ApiKeyCredential(githubPAT), new OpenAIClientOptions { Endpoint = new Uri(uri) });

// Initialize the Semantic kernel
var builder = Kernel.CreateBuilder();

builder.AddOpenAIChatCompletion(modelId, client);
var kernel = builder.Build();

// get a chat completion service
var chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();

// Create a new chat by specifying the assistant
ChatHistory chat = new(@"
    You are an AI assistant that helps people find information. 
    The response must be brief and should not exceed one paragraph.
    If you do not know the answer then simply say 'I do not know the answer'."
);

// Instantiate a StringBuilder
StringBuilder strBuilder = new();

// User question & answer loop
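while (true)
{
    // Get the user's question
    Console.Write("Q: ");
    chat.AddUserMessage(Console.ReadLine()!);

    // Clear contents of the StringBuilder
    strBuilder.Clear();

    // Get the AI response streamed back to the console
    await foreach (var message in chatCompletionService.GetStreamingChatMessageContentsAsync(chat, kernel: kernel))
    {
        Console.Write(message);
        strBuilder.Append(message.Content);
    }
    Console.WriteLine();

    // Keep the assistant's reply in the chat history
    chat.AddAssistantMessage(strBuilder.ToString());
}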

Run the application:

I asked the question "How many pyramids are there in Egypt?" and the AI answered as shown above. 

Using a different model

How about we use a different AI model? For example, I will try the 'Meta-Llama-3.1-405B-Instruct' model. We need to get the model ID. Click on the model on the page.

Change Model in appsettings.json to "Meta-Llama-3.1-405B-Instruct".

Run the application again. This is what I experienced with the AI model meta-llama-3.1-405b-instruct:


GitHub AI models are easy to access. I hope you come up with great AI-driven applications that make a difference to our world.

Thursday, September 19, 2024

Phi-3 Small Language Model (SLM) in a C# console application with Ollama and Semantic Kernel

What is a small language model (SLM)?

A small language model (SLM) is a machine learning model typically based on a large language model (LLM) but of greatly reduced size. An SLM retains much of the functionality of the LLM from which it is built but with far less complexity and computing resource demand.

What is Ollama?

Ollama is an application you can download onto your computer or server to run open source generative AI small-language-models (SLMs) such as Meta's Llama 3 and Microsoft's Phi-3. You can see the many models available at

What is Semantic Kernel?

Semantic Kernel is a lightweight, open-source development kit that lets you easily build AI agents and integrate the latest AI models into your C#, Python, or Java codebase.


In this tutorial, we will see how easy it is to use the Phi-3 small language model in a C# application. The best part is that it is free and runs entirely on your local device. Ollama will be used to serve the Phi-3 small language model and Semantic Kernel will be the C# development kit we will use for code development. 

Companion Video:

Getting Started

Download Ollama installer from

Once you have installed Ollama, run these commands from a terminal window:

ollama pull phi3:latest
ollama list
ollama show phi3:latest

In a working directory, create a C# console app named SlmSK inside a terminal window with the following command:

dotnet new console -n SlmSK

Change into the newly created directory SlmSK with:

cd SlmSK

Next, let's add the Semantic Kernel package to our console application with:

dotnet add package Microsoft.SemanticKernel -v 1.19.0

Open the project in VS Code and add this directive to the .csproj file right below: <Nullable>enable</Nullable>:


The Code

Replace contents of Program.cs with the following code:

using Microsoft.SemanticKernel;
using System.Text;
using Microsoft.SemanticKernel.ChatCompletion;

// Initialize the Semantic kernel
var builder = Kernel.CreateBuilder();

// We will use Semantic Kernel OpenAI API
builder.AddOpenAIChatCompletion(
        modelId: "phi3",
        apiKey: null,
        endpoint: new Uri("http://localhost:11434"));

var kernel = builder.Build();

// get a chat completion service
var chatCompletionService = kernel.GetRequiredService<IChatCompletionService>(); 

// Create a new chat by specifying the assistant
ChatHistory chat = new(@"
    You are an AI assistant that helps people find information. 
    The response must be brief and should not exceed one paragraph.
    If you do not know the answer then simply say 'I do not know the answer'."
);
// Instantiate a StringBuilder
StringBuilder strBuilder = new();

// User question & answer loop
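while (true) {
    // Get the user's question
    Console.Write("Q: ");
    chat.AddUserMessage(Console.ReadLine()!);

    // Clear contents of the StringBuilder
    strBuilder.Clear();

    // Get the AI response streamed back to the console
    await foreach (var message in chatCompletionService.GetStreamingChatMessageContentsAsync(chat, kernel: kernel))
    {
        Console.Write(message);
        strBuilder.Append(message.Content);
    }
    Console.WriteLine();

    // Keep the assistant's reply in the chat history
    chat.AddAssistantMessage(strBuilder.ToString());
}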



Running the app

Run the application with:

dotnet run

I entered this question: How long does a direct flight take from Los Angles to Frankfurt?

Then, I got the following response:

Use another SLM

How about we use another SLM? Let's try llama3.1. Pull the model with:

ollama pull llama3.1:latest

In the code, REPLACE modelId: "phi3" WITH modelId: "llama3.1". Run your application with the same question. This is what I got:

The response is quite similar to that received from Phi-3, even though it shows that the flight to Frankfurt takes longer.
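
If you expect to switch models often, one option is to read the model ID from the command line instead of editing the code each time. This is only a sketch: the helper name ResolveModelId is hypothetical, and "phi3" is kept as the fallback to match the walkthrough.

```csharp
using System;

// Hypothetical helper: use the first command-line argument as the model ID,
// falling back to "phi3" when no argument is given.
static string ResolveModelId(string[] cliArgs, string fallback = "phi3") =>
    cliArgs.Length > 0 && !string.IsNullOrWhiteSpace(cliArgs[0]) ? cliArgs[0] : fallback;

var selectedModelId = ResolveModelId(args);
Console.WriteLine($"Using model: {selectedModelId}");
```

You would then pass selectedModelId as the modelId argument and start the app with, for example, dotnet run -- llama3.1.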


We can package our applications with a local SLM. This makes our applications cheaper, faster, connection-free, and self-contained.