Sunday, October 19, 2025

Small Language Models with AI Toolkit Extension in VS Code

In this article, we will see how to work with small language models (SLMs) through the AI Toolkit extension in VS Code. Though the toolkit can do other things, our focus is consuming an ONNX SLM hosted by Visual Studio Code from a C# application. We will first look at an example based on the OpenAI packages. We will later build a similar example based on the Semantic Kernel approach.

Prerequisites

You will need:

  • The latest version of VS Code
  • .NET version 9.0 or higher

What are small language models (SLMs)?

Small Language Models (SLMs) are compact versions of large language models (LLMs), designed to deliver strong performance in natural language tasks while using significantly fewer computational resources.

What is the AI Toolkit Extension in VS Code?

The AI Toolkit Extension for Visual Studio Code is a powerful, all-in-one environment for building, testing, and deploying generative AI applications—especially useful for developers working with small language models (SLMs).

Getting Started

Install the "AI Toolkit for Visual Studio Code" extension:


Click on the three dots (...) in the left navigation of VS Code, and choose "AI Toolkit".

Click on "Model Catalog".

Scroll down the list until you find “Local Models” >> ONNX >> Mistral 7B – (CPU – Small, Standard) >> + Add Model.

Once the model is fully downloaded, it will appear under Models >> ONNX.

Right-click on the model and select “Copy Model Name”.

I copied the following name for the "Mistral 7B" model: 

mistral-7b-v02-int4-cpu
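If you want to double-check the name, the toolkit's local server exposes an OpenAI-compatible models endpoint. A minimal sketch, assuming the server is listening on port 5272 (the port used by the examples later in this article) and that the standard `/v1/models` route is available:

```csharp
using System.Net.Http;

// Query the toolkit's local OpenAI-compatible server for its hosted models.
// Assumes the AI Toolkit model server is running on port 5272.
using var http = new HttpClient();
var json = await http.GetStringAsync("http://localhost:5272/v1/models");
Console.WriteLine(json); // each "id" field in the JSON is a model name you can copy
```

The `id` values in the response should match the name copied from the "Copy Model Name" menu item.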

Using OpenAI packages

Create a C# console application named AIToolkitOpenAI and add the required packages to it with the following terminal commands:

dotnet new console -n AIToolkitOpenAI
cd AIToolkitOpenAI
dotnet add package OpenAI
dotnet add package Azure.Core

Start VS Code with:

code .

Click on the "AI Toolkit" tab in VS Code and make sure that the "Mistral 7B" model is running.

Replace the content of Program.cs with this code:

using OpenAI;
using OpenAI.Chat;
using System.ClientModel;
using System.Text;

var model = "mistral-7b-v02-int4-cpu";
var baseUrl = "http://localhost:5272/v1/"; // root URL for local OpenAI-like server
var apikey = "unused";

OpenAIClientOptions options = new OpenAIClientOptions();
options.Endpoint = new Uri(baseUrl);
ApiKeyCredential credential = new ApiKeyCredential(apikey);
ChatClient client = new OpenAIClient(credential, options).GetChatClient(model);

// Build the prompt
StringBuilder prompt = new StringBuilder();
prompt.AppendLine("You will analyze the sentiment of the following product reviews.");
prompt.AppendLine("Each line is its own review. Output the sentiment of each review in");
prompt.AppendLine("a bulleted list and then provide a general sentiment of all reviews.");
prompt.AppendLine();
prompt.AppendLine("I bought this product and it's amazing. I love it!");
prompt.AppendLine("This product is terrible. I hate it.");
prompt.AppendLine("I'm not sure about this product. It's okay.");
prompt.AppendLine("I found this product based on the other reviews. It worked for me.");

// send the prompt to the model and wait for the chat completion
var response = await client.CompleteChatAsync(prompt.ToString());
// display the response
Console.WriteLine(response.Value.Content[0].Text);

Run the application with:

dotnet run

The application performs sentiment analysis on the customer reviews of a product.

This is a sample of the output:

* I bought this product and it's amazing. I love it!: Positive sentiment
* This product is terrible. I hate it.: Negative sentiment
* I'm not sure about this product. It's okay.: Neutral sentiment
* I found this product based on the other reviews. It worked for me.: Positive sentiment

General sentiment: The reviews contain both positive and negative sentiments. Some customers expressed their love for the product, while others expressed their dislike. Neutral sentiment was also expressed by one customer. Overall, the reviews suggest that the product has the potential to elicit strong feelings from customers, both positive and negative.
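The example above sends the prompt with the model's default generation settings. You can also pass options with the request. A minimal sketch, assuming the property names of recent versions of the OpenAI .NET library (the model name and endpoint are the ones used above):

```csharp
using System.ClientModel;
using OpenAI;
using OpenAI.Chat;

var options = new OpenAIClientOptions { Endpoint = new Uri("http://localhost:5272/v1/") };
ChatClient client = new OpenAIClient(new ApiKeyCredential("unused"), options)
    .GetChatClient("mistral-7b-v02-int4-cpu");

// Optional generation settings passed alongside the messages.
var chatOptions = new ChatCompletionOptions
{
    Temperature = 0.2f,        // lower = more consistent sentiment labels
    MaxOutputTokenCount = 300, // cap the length of the reply
};
var response = await client.CompleteChatAsync(
    new ChatMessage[] { new UserChatMessage("Classify the sentiment: I love this product!") },
    chatOptions);
Console.WriteLine(response.Value.Content[0].Text);
```

A lower temperature is useful for classification-style tasks like this one, where you want the same review to get the same label on every run.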

Using Semantic Kernel packages

Create a C# console application named AIToolkitSK and add the required package to it with the following terminal commands:

dotnet new console -n AIToolkitSK
cd AIToolkitSK
dotnet add package Microsoft.SemanticKernel

Start VS Code with:

code .

Click on the "AI Toolkit" tab in VS Code and make sure that the "Mistral 7B" model is running.

Replace the content of Program.cs with this code:

using System.Text;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;

var model = "mistral-7b-v02-int4-cpu";
var baseUrl = "http://localhost:5272/v1/";
var apikey = "unused";

// Create a chat completion service
var kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion(modelId: model, apiKey: apikey, endpoint: new Uri(baseUrl))
    .Build();
var chat = kernel.GetRequiredService<IChatCompletionService>();
var history = new ChatHistory();
history.AddSystemMessage("You are a useful chatbot. Always reply in a funny way with short answers.");
var settings = new OpenAIPromptExecutionSettings
{
    MaxTokens = 500,
    Temperature = 1,
};

while (true)
{
    Console.Write("\nUser: ");
    var userInput = Console.ReadLine();
    if (string.IsNullOrWhiteSpace(userInput)) break;

    history.AddUserMessage(userInput);

    var responseBuilder = new StringBuilder();
    Console.Write("\nAI: ");
    // Stream the reply using the full chat history so the system
    // message and earlier turns are included in the prompt.
    await foreach (var message in chat.GetStreamingChatMessageContentsAsync(history, settings, kernel))
    {
        responseBuilder.Append(message);
        Console.Write(message);
    }
    // Keep the assistant's reply in the history for the next turn.
    history.AddAssistantMessage(responseBuilder.ToString());
}

This is a simple chat completion app.

Run the application with:

dotnet run

My prompt was:

Red or white wine with beef steak?

The response was:

AI:  Both red and white wines can pair well with beef steak, but a red wine is generally the more traditional choice. Red wines, such as Cabernet Sauvignon, Merlot, or Pinot Noir, have flavors that complement the rich and savory flavors of beef. However, if you prefer a lighter taste, a white wine such as Pinot Noir or Chardonnay can also work well with beef steak. Ultimately, it comes down to personal preference.
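If you don't need token-by-token streaming, IChatCompletionService also offers a non-streaming call that returns the whole reply at once. A minimal single-turn sketch using the same model name, endpoint, and settings as the app above:

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;

var kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion(
        modelId: "mistral-7b-v02-int4-cpu",
        apiKey: "unused",
        endpoint: new Uri("http://localhost:5272/v1/"))
    .Build();
var chat = kernel.GetRequiredService<IChatCompletionService>();

var history = new ChatHistory();
history.AddSystemMessage("You are a useful chatbot. Always reply in a funny way with short answers.");
history.AddUserMessage("Red or white wine with beef steak?");

// Non-streaming: wait for the complete reply in one call.
var reply = await chat.GetChatMessageContentAsync(
    history,
    new OpenAIPromptExecutionSettings { MaxTokens = 500, Temperature = 1 },
    kernel);
Console.WriteLine(reply.Content);
```

Streaming is nicer for interactive chat because the user sees text as it is generated; the non-streaming call is simpler when the reply is processed as a whole.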

Conclusion

We have seen how to use SLMs hosted by VS Code through the AI Toolkit extension. We were able to communicate with the model from two C# applications: (1) an app that uses the OpenAI packages, and (2) an app that uses Semantic Kernel.
