In this walkthrough, I will demonstrate how to download the Phi-3 small language model (SLM) from Hugging Face and use it in a C# application.
What is Hugging Face?
Hugging Face (https://huggingface.co/) provides AI/ML researchers and developers with access to thousands of curated datasets, machine learning models, and AI-powered demo apps. We will download the Phi-3 SLM in ONNX format onto our computers from https://huggingface.co/models.
What is ONNX?
ONNX is an open format built to represent machine learning models. Visit https://onnx.ai/ for more information.
Getting Started
We will download the Phi-3 Mini SLM for the ONNX runtime from Hugging Face. Run the following command from a terminal window, specifying a destination of your choice. In the example below, the destination is a folder named phi-3-mini on the Windows C: drive.
git clone https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx C:/phi-3-mini
Be patient, as the download could take some time. On my Windows computer, the download is 30.1 GB, comprising 97 files and 48 folders. The quantized CPU model files we will use are in the cpu_and_mobile/cpu-int4-rtn-block-32 folder:
phi3-mini-4k-instruct-cpu-int4-rtn-block-32.onnx
phi3-mini-4k-instruct-cpu-int4-rtn-block-32.onnx.data
Creating the Console Application
From a terminal window, create a C# console app named LocalAiModelSK in a working directory with the following command:
dotnet new console -n LocalAiModelSK
Change into the newly created directory LocalAiModelSK with:
cd LocalAiModelSK
Next, let's add two packages to our console application with:
dotnet add package Microsoft.SemanticKernel -v 1.16.2
dotnet add package Microsoft.SemanticKernel.Connectors.Onnx -v 1.16.2-alpha
Open the project in VS Code and add the following line to the .csproj file, directly below <Nullable>enable</Nullable>:
<NoWarn>SKEXP0070</NoWarn>
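For reference, the relevant PropertyGroup in LocalAiModelSK.csproj should then look similar to the sketch below (the TargetFramework value is an assumption and may differ depending on your installed SDK):

```xml
<PropertyGroup>
  <OutputType>Exe</OutputType>
  <TargetFramework>net8.0</TargetFramework>
  <ImplicitUsings>enable</ImplicitUsings>
  <Nullable>enable</Nullable>
  <!-- Suppresses the warning raised because the ONNX connector is still experimental -->
  <NoWarn>SKEXP0070</NoWarn>
</PropertyGroup>
```

The NoWarn entry is required because the Semantic Kernel ONNX connector APIs are marked experimental and would otherwise fail the build.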
Replace the contents of Program.cs with the following C# code:
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;

// PHI-3 local model location
var modelPath = @"C:\phi-3-mini\cpu_and_mobile\cpu-int4-rtn-block-32";

// Load the model and services
var builder = Kernel.CreateBuilder();
builder.AddOnnxRuntimeGenAIChatCompletion("phi-3", modelPath);

// Build Kernel
var kernel = builder.Build();

// Create services such as chatCompletionService and embeddingGeneration
var chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();

// Start the conversation
while (true)
{
    // Get user input
    Console.ForegroundColor = ConsoleColor.Yellow;
    Console.Write("User : ");
    var question = Console.ReadLine()!;

    OpenAIPromptExecutionSettings openAIPromptExecutionSettings = new()
    {
        MaxTokens = 200
    };

    var response = kernel.InvokePromptStreamingAsync(
        promptTemplate: @"{{$input}}",
        arguments: new KernelArguments(openAIPromptExecutionSettings)
        {
            { "input", question }
        });

    Console.ForegroundColor = ConsoleColor.Green;
    Console.Write("\nAssistant : ");
    string combinedResponse = string.Empty;
    await foreach (var message in response)
    {
        // Write the streamed response to the console
        Console.Write(message);
        combinedResponse += message;
    }
    Console.WriteLine();
}
In the above code, make sure that modelPath points to the proper location of the model on your computer.
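With the model path set, you can build and run the application from the project directory:

```shell
dotnet run
```

Note that the first prompt may take a while to answer, since the model must first be loaded into memory.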
I asked the question: How long do mosquitoes live?
This is the response I received:
Conclusion
You can choose from a variety of SLMs at Hugging Face. Of course, the trade-off is that the ONNX model files are large, which can make it more practical in some circumstances to use a model hosted online.