Thursday, February 29, 2024

OpenAI Function Calling with Semantic Kernel, C#, & Entity Framework

In this article, we will create a Semantic Kernel plugin with four functions that query live SQLite data through Entity Framework. The goal is to let an OpenAI language model answer natural-language questions about our own data.

Source code: https://github.com/medhatelmasry/EfFuncCallSK

Companion Video: https://youtu.be/4sKRwflEyHk

Getting Started

Let’s start by creating an ASP.NET Razor pages web application. Select a suitable working folder on your computer, then enter the following terminal window commands:

dotnet new razor --auth individual -o EfFuncCallSK
cd EfFuncCallSK

The above creates a Razor Pages app with support for Entity Framework and SQLite.

Add these packages:

dotnet add package CsvHelper
dotnet add package Microsoft.SemanticKernel 
dotnet add package Microsoft.EntityFrameworkCore.Design
dotnet add package Microsoft.EntityFrameworkCore.Tools
dotnet add package Microsoft.EntityFrameworkCore
dotnet add package Microsoft.EntityFrameworkCore.SQLite.Design

The CsvHelper package will help us load a list of students from a CSV file named students.csv and hydrate a list of Student objects. The second package is needed to work with Semantic Kernel. The rest of the packages support Entity Framework and SQLite.

Let’s Code

appsettings.json

Add these to appsettings.json:

"AIService": "OpenAI", /* Azure or OpenAI */
"AzureOpenAiSettings": {
   "Endpoint": "https://YOUR_RESOURCE_NAME.openai.azure.com/",
   "Model": "gpt-35-turbo",
   "ApiKey": "fake-key-fake-key-fake-key-fake-key"
},
"OpenAiSettings": {
  "ModelType": "gpt-3.5-turbo",
  "ApiKey": "fake-key-fake-key-fake-key-fake-key"
}

The first setting allows you to choose between using OpenAI or Azure OpenAI.

Of course, you need to adjust the Endpoint setting to match your Azure OpenAI resource (the OpenAI settings need no endpoint). Also, enter the correct ApiKey value for whichever service you use.

NOTE: You can use OpenAI or Azure OpenAI, or both.

Data

Create a folder named Models. Inside the Models folder, add the following Student class: 

public class Student {
   public int StudentId { get; set; }
   public string? FirstName { get; set; }
   public string? LastName { get; set; }
   public string? School { get; set; }
 
   public override string ToString() {
      return $"Student ID: {StudentId}, First Name: {FirstName}, Last Name: {LastName}, School: {School}";
   }
}

Sample data makes it easier to verify that a data-driven application behaves as expected, so we will seed the database from a CSV file. Copy CSV data from this link and save it in a text file wwwroot/students.csv.

Add the following code inside the ApplicationDbContext class located inside the Data folder:

public DbSet<Student> Students => Set<Student>();    
 
protected override void OnModelCreating(ModelBuilder modelBuilder) {
    base.OnModelCreating(modelBuilder);
    modelBuilder.Entity<Student>().HasData(LoadStudents());
}  
 
// Load students from a csv file named students.csv in the wwwroot folder
public static List<Student> LoadStudents() {
    var students = new List<Student>();
    using (var reader = new StreamReader(Path.Combine("wwwroot", "students.csv"))) {
        using var csv = new CsvReader(reader, CultureInfo.InvariantCulture);
        students = csv.GetRecords<Student>().ToList();
    }
    return students;
}

Let us add a migration and subsequently update the database. Execute the following CLI commands in a terminal window.

dotnet ef migrations add Students -o Data/Migrations
dotnet ef database update

At this point the database and tables are created in a SQLite database named app.db.

Helper Methods

We need a couple of static helper methods to assist us along the way. In the Models folder, add a class named Utils and add to it the following class definition:

public class Utils {
  public static string GetConfigValue(string config) {
    IConfigurationBuilder builder = new ConfigurationBuilder();
    if (System.IO.File.Exists("appsettings.json"))
      builder.AddJsonFile("appsettings.json", false, true);
    if (System.IO.File.Exists("appsettings.Development.json"))
      builder.AddJsonFile("appsettings.Development.json", false, true);
    IConfigurationRoot root = builder.Build();
    return root[config]!;
  }
 
  public static ApplicationDbContext GetDbContext() {
    var optionsBuilder = new DbContextOptionsBuilder<ApplicationDbContext>();
    var connStr = Utils.GetConfigValue("ConnectionStrings:DefaultConnection");
    optionsBuilder.UseSqlite(connStr);
    ApplicationDbContext db = new ApplicationDbContext(optionsBuilder.Options);
    return db;
  }
}

The GetConfigValue() method reads a value from appsettings.json (or appsettings.Development.json) and can be called from any static method. The GetDbContext() method likewise returns an instance of the ApplicationDbContext class from any static method.

Plugins

Create a folder named Plugins and add to it the following class file named StudentPlugin.cs with this code:

public class StudentPlugin {
  [KernelFunction, Description("Get student details by first name and last name")]
  public static string? GetStudentDetails(
  [Description("student first name, e.g. Kim")]
  string firstName,
  [Description("student last name, e.g. Ash")]
  string lastName
  ) {
    var db = Utils.GetDbContext();
    var studentDetails = db.Students
      .Where(s => s.FirstName == firstName && s.LastName == lastName).FirstOrDefault();
    if (studentDetails == null)
      return null;
    return studentDetails.ToString();
  }

  [KernelFunction, Description("Get students in a school given the school name")]
  public static string? GetStudentsBySchool(
    [Description("The school name, e.g. Nursing")]
    string school
  ) {
    var studentsBySchool = Utils.GetDbContext().Students
      .Where(s => s.School == school).ToList();
    if (studentsBySchool.Count == 0)
      return null;
    return JsonSerializer.Serialize(studentsBySchool);
  }


  [KernelFunction, Description("Get the school with most or least students. Takes boolean argument with true for most and false for least.")]
  static public string? GetSchoolWithMostOrLeastStudents(
    [Description("isMost is a boolean argument with true for most and false for least. Default is true.")]
    bool isMost = true
  ) {
    var students = Utils.GetDbContext().Students.ToList();
    IGrouping<string, Student>? schoolGroup = null;
    if (isMost)
      schoolGroup = students.GroupBy(s => s.School)
          .OrderByDescending(g => g.Count()).FirstOrDefault()!;
    else
        schoolGroup = students.GroupBy(s => s.School)
            .OrderBy(g => g.Count()).FirstOrDefault()!;
    if (schoolGroup != null)
      return $"{schoolGroup.Key} has {schoolGroup.Count()} students";
    else
      return null;
  }

  [KernelFunction, Description("Get students grouped by school.")]
  static public string? GetStudentsInSchool() {
    var students = Utils.GetDbContext().Students.ToList().GroupBy(s => s.School)
      .OrderByDescending(g => g.Count());
    if (students == null)
      return null;
    else
      return JsonSerializer.Serialize(students);
  }
}

In the above code, there are four methods with these purposes:

GetStudentDetails(): Gets student details given first and last names.
GetStudentsBySchool(): Gets students in a school given the name of the school.
GetSchoolWithMostOrLeastStudents(): Takes a Boolean value isMost. A value of true returns the school with the most students and false returns the school with the fewest students.
GetStudentsInSchool(): Takes no arguments and returns the students grouped by school as JSON, largest school first.
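
As written, GetStudentsInSchool() serializes the groups themselves. If only a count per school is wanted, each group can be projected to a school name and a count before serializing. The following is a minimal sketch of that idea, assuming a small in-memory array in place of db.Students; the school assignments in it are made up for illustration and are not taken from students.csv.

```csharp
using System;
using System.Linq;
using System.Text.Json;

// Hypothetical in-memory rows standing in for db.Students (schools are invented)
var students = new[] {
    (FirstName: "Kim", LastName: "Ash", School: "Nursing"),
    (FirstName: "Jan", LastName: "Fry", School: "Nursing"),
    (FirstName: "Mat", LastName: "Tan", School: "Mining"),
};

// Group by school, then project each group to a name and a count
var counts = students
    .GroupBy(s => s.School)
    .Select(g => new { School = g.Key, Count = g.Count() })
    .OrderByDescending(x => x.Count)
    .ToList();

// Serialize the compact result instead of the full groups
string json = JsonSerializer.Serialize(counts);
Console.WriteLine(json);
```

A projection like this keeps the text returned to the model short, which may matter when the table is large.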

The User Interface

We will re-purpose the Index.cshtml and Index.cshtml.cs files so the user can enter a prompt in natural language and receive a response that comes from the OpenAI model working with our semantic kernel plugin. 

Index.cshtml

Replace the content of Pages/Index.cshtml with:

@page
@model IndexModel
@{
    ViewData["Title"] = Model.Service + " Function Calling with Semantic Kernel";
}
<div class="text-center">
    <h3 class="display-6">@ViewData["Title"]</h3>
    <form method="post">
        <input type="text" name="prompt" size="80" required />
        <input type="submit" value="Submit" />
    </form>
    <div style="text-align: left">
        <h5>Example prompts:</h5>
        <p>Which school does Mat Tan go to?</p>
        <p>Which school has the most students?</p>
        <p>Which school has the least students?</p>
        <p>Get the count of students in each school.</p>
        <p>How many students are there in the school of Mining?</p>
        <p>What is the ID of Jan Fry and which school does she go to?</p>
        <p>Which students belong to the school of Business? Respond only in JSON format.</p>
        <p>Which students in the school of Nursing have their first or last name start with the letter 'J'?</p>
    </div>
    @if (Model.Reply != null)
    {
        <p class="alert alert-success">@Model.Reply</p>
    }
</div>

The above markup displays an HTML form that accepts a prompt from a user. The prompt is then submitted to the server and the response is displayed in a paragraph (<p> tag) with a green background (Bootstrap class alert-success).

Below the form, the page lists some suggested prompts:

Which school does Mat Tan go to?
Which school has the most students?
Which school has the least students?
Get the count of students in each school.
How many students are there in the school of Mining?
What is the ID of Jan Fry and which school does she go to?
Which students belong to the school of Business? Respond only in JSON format.
Which students in the school of Nursing have their first or last name start with the letter 'J'?

Index.cshtml.cs

Replace the IndexModel class definition in Pages/Index.cshtml.cs with:

public class IndexModel : PageModel {
  private readonly ILogger<IndexModel> _logger;
  private readonly IConfiguration _config;
 
  [BindProperty]
  public string? Reply { get; set; }
 
  [BindProperty]
  public string? Service { get; set; }
 
  public IndexModel(ILogger<IndexModel> logger, IConfiguration config) {
    _logger = logger;
    _config = config;
    Service = _config["AIService"]!;
  }
  public void OnGet() { }
  // action method that receives prompt from the form
  public async Task<IActionResult> OnPostAsync(string prompt) {
    // send the prompt to the model through Semantic Kernel
    var response = await CallFunction(prompt);
    Reply = response;
    return Page();
  }
 
  private async Task<string> CallFunction(string question) {
    string azEndpoint = _config["AzureOpenAiSettings:Endpoint"]!;
    string azApiKey = _config["AzureOpenAiSettings:ApiKey"]!;
    string azModel = _config["AzureOpenAiSettings:Model"]!;
    string oaiModelType = _config["OpenAiSettings:ModelType"]!;
    string oaiApiKey = _config["OpenAiSettings:ApiKey"]!;
    string oaiModel = _config["OpenAiSettings:Model"]!;
    string oaiOrganization = _config["OpenAiSettings:Organization"]!;
    var builder = Kernel.CreateBuilder();
    if (Service!.ToLower() == "openai")
      builder.Services.AddOpenAIChatCompletion(oaiModelType, oaiApiKey);
    else
      builder.Services.AddAzureOpenAIChatCompletion(azModel, azEndpoint, azApiKey);
    builder.Services.AddLogging(c => c.AddDebug().SetMinimumLevel(LogLevel.Trace));
    builder.Plugins.AddFromType<StudentPlugin>();
    var kernel = builder.Build();
    // Create chat history
    ChatHistory history = [];
    // Get chat completion service
    var chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();
    // Get user input
    history.AddUserMessage(question);
    // Enable auto function calling
    OpenAIPromptExecutionSettings openAIPromptExecutionSettings = new() {
      ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
    };
    // Get the response from the AI
    var result = chatCompletionService.GetStreamingChatMessageContentsAsync(
      history,
      executionSettings: openAIPromptExecutionSettings,
      kernel: kernel);
    string fullMessage = "";
    await foreach (var content in result) {
      fullMessage += content.Content;
    }
    // Add the message to the chat history
    history.AddAssistantMessage(fullMessage);
    return fullMessage;
  }
}

In the above code, the prompt entered by the user is posted to the OnPostAsync() method. The prompt is then passed to the CallFunction() method, which returns the final response from OpenAI or Azure OpenAI.

The CallFunction() method reads the OpenAI or Azure OpenAI settings from appsettings.json, depending on the AIService key.

A builder object is created from Semantic Kernel. If we are using OpenAI, then the AddOpenAIChatCompletion service is added. Otherwise, the AddAzureOpenAIChatCompletion service is added.

The StudentPlugin is then added to the builder object Plugins collection.

The builder Build() method is then called returning a kernel object. From the kernel object we then get a chatCompletionService object by calling the GetRequiredService() method.

Thereafter:

  • Add the prompt to the history
  • Make a call to the chat message service and receive a response
  • Concatenate response into a single string
  • Return the concatenated message
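
The concatenation step can be sketched in isolation. The example below is only an illustration: FakeStream() is a hypothetical stand-in for GetStreamingChatMessageContentsAsync(), and the reply text is invented. It assumes that some streamed pieces may carry no text (null), and it uses a StringBuilder rather than repeated string concatenation.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;

// Hypothetical stand-in for the streaming call: yields the reply in pieces,
// one of which is null to mimic a chunk that carries no text.
async IAsyncEnumerable<string?> FakeStream() {
    foreach (var piece in new string?[] { "Mat Tan ", null, "goes to ", "Mining." }) {
        await Task.Yield();
        yield return piece;
    }
}

var sb = new StringBuilder();
await foreach (var piece in FakeStream()) {
    sb.Append(piece); // appending null adds nothing
}
string fullMessage = sb.ToString();
Console.WriteLine(fullMessage);
```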

Trying the application

In a terminal window, at the root of the Razor Pages web application, enter the following command:

dotnet watch

The following page will display in your default browser:


You can enter any of the suggested prompts to ensure we are getting the proper results. I entered the last prompt and got these results:




Conclusion

We have seen how Semantic Kernel and Function Calling can be used with data coming from a database. In this example we are using SQLite. However, any other database can be used with the same technique.

Thursday, February 15, 2024

Azure OpenAI Function Calling - a practical example using C# and ASP.NET Razor Pages

While one can obtain valuable information by prompting the various OpenAI language models, these models become even more valuable when they are integrated with custom systems and tools. In this demo, we will integrate an LLM with the Azure OpenAI Function Calling capability to query local data in the form of a products.csv file. Of course, this concept can easily be extended to more complex systems and tools. The sample application is based on ASP.NET Razor Pages. It receives a natural language prompt from the user and goes through this three-step process before responding:

  1. A call is made to a chat completions API with function definitions and the user’s prompt.
  2. The model’s response initiates calls to a custom function.
  3. The chat completion API is called again with the response from the custom function, resulting in a final response.
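
The three steps above can be simulated without any network call. The sketch below is not the Azure OpenAI client: FakeModel() is a hypothetical stub that first asks for a function and then, once it sees a function result, produces a final answer. The dictionary of local functions and the message format are also assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical local functions, keyed by the name advertised to the model
var functions = new Dictionary<string, Func<string, string>> {
    ["get_product_details"] = name => $"Product Name: {name}, Unit Price: 23.25"
};

// Stub standing in for the chat completions API (an assumption, not a real client)
(string? FunctionName, string? Argument, string? Content) FakeModel(List<string> history) {
    string last = history[^1];
    if (last.StartsWith("function:", StringComparison.Ordinal)) {
        // A function result is available, so produce the final answer
        return (null, null, "Answer based on: " + last.Substring("function:".Length));
    }
    // Otherwise ask the caller to run a function
    return ("get_product_details", "Pavlova", null);
}

var history = new List<string> { "user:What is the unit price of Pavlova?" };

var reply = FakeModel(history);                 // step 1: prompt plus function definitions
while (reply.FunctionName != null) {
    string result = functions[reply.FunctionName](reply.Argument!); // step 2: run the function
    history.Add("function:" + result);
    reply = FakeModel(history);                 // step 3: call again with the result
}
Console.WriteLine(reply.Content);
```

The loop matters because a model may request more than one function call before it has enough information to answer.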

Source Code: https://github.com/medhatelmasry/OaiFuncCall

Companion Video: https://youtu.be/3yyq3GWIj4o

Getting Started

Let’s start by creating an ASP.NET Razor pages web application. Select a suitable working folder on your computer, then enter the following terminal window commands:

dotnet new razor -o OaiFuncCall
cd OaiFuncCall

Add these packages:

dotnet add package CsvHelper
dotnet add package Azure.AI.OpenAI -v 1.0.0-beta.13

The CsvHelper package will help us load a list of products from a CSV file named products.csv and hydrate a list of Product objects. The second package is needed to work with Azure OpenAI.

Let’s Code

appsettings.json

Add this to appsettings.json:

"AzureOpenAiSettings": {
  "Endpoint": "https://YOUR_RESOURCE_NAME.openai.azure.com/",
  "Model": "gpt-35-turbo-16k",
  "ApiKey": "fake-key-fake-key-fake-key-fake-key"
}

Of course, you need to adjust the endpoint setting with the appropriate value that pertains to the Azure OpenAI service that you created. Also, enter the correct value for the ApiKey.

products.csv

Create a text file named products.csv in the wwwroot folder. Copy some sample data from  https://gist.github.com/medhatelmasry/b250023f3b4b5b14713cfc5165f1d030 and paste it into your products.csv file. 

The contents of products.csv look like this:

ProductId,ProductName,UnitsInStock,UnitPrice
1,Aniseed Syrup,39,18
2,Chef Anton's Cajun Seasoning,17,19
3,Chef Anton's Gumbo Mix,13,10
4,Grandma's Boysenberry Spread,53,22
5,Uncle Bob's Organic Dried Pears,0,21.35
6,Northwoods Cranberry Sauce,120,25
7,Mishi Kobe Niku,15,30
8,Ikura,6,40
9,Queso Cabrales,29,97
10,Queso Manchego La Pastora,31,31
11,Konbu,22,21
12,Tofu,86,38
13,Genen Shouyu,24,6
14,Pavlova,35,23.25
15,Alice Mutton,39,15.5
16,Carnarvon Tigers,29,17.45
17,Teatime Chocolate Biscuits,0,39
18,Sir Rodney's Marmalade,42,62.5
19,Sir Rodney's Scones,25,9.2
20,Gustaf's Knäckebröd,40,81
21,Tunnbröd,3,10
22,Guaraná Fantástica,104,21
23,NuNuCa Nuß-Nougat-Creme,61,9
24,Gumbär Gummibärchen,20,4.5
25,Schoggi Schokolade,76,14
26,Rössle Sauerkraut,15,31.23
27,Thüringer Rostbratwurst,49,43.9
28,Nord-Ost Matjeshering,26,45.6
29,Gorgonzola Telino,0,123.79
30,Mascarpone Fabioli,10,25.89
31,Geitost,0,12.5
32,Sasquatch Ale,9,32
33,Steeleye Stout,112,2.5
34,Inlagd Sill,111,14
35,Gravad lax,20,18
36,Côte de Blaye,112,19
37,Chartreuse verte,11,26
38,Boston Crab Meat,17,263.5
39,Jack's New England Clam Chowder,69,18
40,Singaporean Hokkien Fried Mee,123,18.4
41,Ipoh Coffee,85,9.65
42,Gula Malacca,26,14
43,Rogede sild,17,46
44,Spegesild,27,19.45
45,Zaanse koeken,5,9.5
46,Chocolade,95,12
47,Maxilaku,36,9.5
48,Valkoinen suklaa,15,12.75
49,Manjimup Dried Apples,10,20
50,Filo Mix,65,16.25
51,Perth Pasties,20,53
52,Tourtire,38,7
53,P‰tŽ chinois,0,32.8
54,Gnocchi di nonna Alice,21,7.45
55,Ravioli Angelo,115,24
56,Escargots de Bourgogne,21,38
57,Raclette Courdavault,36,19.5
58,Camembert Pierrot,62,13.25
59,Sirop d'érable,79,55
60,Tarte au sucre,19,34
61,Vegie-spread,113,28.5
62,Wimmers gute Semmelknödel,17,49.3
63,Louisiana Fiery Hot Pepper Sauce,24,43.9
64,Louisiana Hot Spiced Okra,22,33.25
65,Laughing Lumberjack Lager,76,21.05
66,Scottish Longbreads,4,17
67,Gudbrandsdalsost,52,14
68,Outback Lager,6,12.5
69,Flotemysost,26,36
70,Mozzarella di Giovanni,15,15
71,Röd Kaviar,26,21.5
72,Longlife Tofu,14,34.8
73,Rhönbräu Klosterbier,101,15
74,Lakkalikööri,4,10
75,Original Frankfurter grüne Soße,125,7.75

Note that the above data was taken from the Northwind database that used to come with early versions of SQL Server.

Function Definitions

To demonstrate the power of Azure OpenAI Function Calling, we will create two separate function definitions. The first is a class named ProductAgent, which returns details about a product given the ‘Product Name’. The second is a class named MostExpensiveProductAgent, which returns the most expensive product.

Create a folder named Models and add to it three C# classes, namely: Product, ProductAgent, and MostExpensiveProductAgent.

Product.cs

Add the following class to Product.cs:

public class Product{
  public int ProductId { get; set; }
  public string? ProductName { get; set; }
  public int UnitsInStock { get; set; }
  public float UnitPrice { get; set; } 
  
  // Load products from a csv file named products.csv in the wwwroot folder
  public static List<Product> LoadProducts() {
    var products = new List<Product>();
    using (var reader = new StreamReader(Path.Combine("wwwroot", "products.csv"))) {
      using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture)) {
        products = csv.GetRecords<Product>().ToList();
      }
    }
    return products;
  }
  
  public override string ToString() {
    return $"Product ID: {ProductId}, Product Name: {ProductName}, Units In Stock: {UnitsInStock}, Unit Price: {UnitPrice}";
  }
}

The above code declares a class named Product. The class properties match the column names in the products.csv file. The LoadProducts() static method reads contents of the CSV file and returns a hydrated list of Product objects. Note that the Product class also has a ToString() method.

ProductAgent.cs

public class ProductAgent {
  static public string Name = "get_product_details";
  static private List<Product> products = Product.LoadProducts(); 
  
  // Return the function metadata
  static public FunctionDefinition GetFunctionDefinition() {
    return new FunctionDefinition() {
        Name = Name,
        Description = "Get product details by product name",
        Parameters = BinaryData.FromObjectAsJson(
        new{
          Type = "object",
          Properties = new {
            ProductName = new {
              Type = "string",
              Description = "The product name, e.g. Pavlova",
            }
          },
          Required = new[] { "productName" },
        },
        new JsonSerializerOptions() { PropertyNamingPolicy = JsonNamingPolicy.CamelCase }),
    };
  }
  
  static public string? GetProductDetails(string product){
    var productDetails = products.Where(p => p.ProductName == product).FirstOrDefault();
    if (productDetails == null){
      return null;
    }
    return productDetails.ToString();
  }
} 
  
// Argument for the function
public class ProductInput {
    public string ProductName { get; set; } = string.Empty;
}

The above code does the following:

  • The function definition is created as a JSON object with information about the function name, description, and required parameters.
  • The GetProductDetails() method receives a product parameter (essentially ProductName) and returns a string with product details.
  • The ProductInput class, also defined in ProductAgent.cs, models the arguments of the function. It is used later to deserialize the arguments supplied by the model, so that ProductName can be passed to the GetProductDetails() method.
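
To make the shape of the function definition concrete, the sketch below serializes the same anonymous object with the camel-case naming policy and prints the resulting JSON schema. It then reads a product name out of an arguments string. That arguments string is a hypothetical example of what a model might return, and JsonDocument is used here only to keep the sketch free of extra types; the article's code deserializes into ProductInput instead.

```csharp
using System;
using System.Text.Json;

var camelCase = new JsonSerializerOptions { PropertyNamingPolicy = JsonNamingPolicy.CamelCase };

// Same anonymous-object shape as in GetFunctionDefinition()
var parameters = new {
    Type = "object",
    Properties = new {
        ProductName = new { Type = "string", Description = "The product name, e.g. Pavlova" }
    },
    Required = new[] { "productName" }
};

// The naming policy turns Type, Properties, ProductName, ... into camel case
string schemaJson = JsonSerializer.Serialize(parameters, camelCase);
Console.WriteLine(schemaJson);

// Hypothetical arguments string of the kind a model returns for this function
string arguments = "{\"productName\":\"Pavlova\"}";
using JsonDocument doc = JsonDocument.Parse(arguments);
string productName = doc.RootElement.GetProperty("productName").GetString() ?? "";
Console.WriteLine(productName);
```

Note that the name listed under Required must match the camel-cased property name, which is why the definition uses "productName" rather than "ProductName".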

MostExpensiveProductAgent.cs

public class MostExpensiveProductAgent {
  static public string Name = "get_most_expensive_product";
  static private List<Product> products = Product.LoadProducts(); 
  
  // Return the function metadata
  static public FunctionDefinition GetFunctionDefinition() {
    return new FunctionDefinition() {
      Name = Name,
      Description = "Get details of the most expensive product",
    };
  }
  
  static public string? GetMostExpensiveProductDetails() {
    var productDetails = products.OrderByDescending(p => p.UnitPrice).FirstOrDefault();
    if (productDetails == null) {
      return null;
    }
    return productDetails.ToString();
  }
}

The above code does the following:

  • As with the previous ProductAgent class, the function definition is created with the function name and description. No parameters are declared because the function takes no arguments.
  • The GetMostExpensiveProductDetails() method takes no arguments and simply returns a string with details of the most expensive product.

The User Interface

We will re-purpose the Index.cshtml and Index.cshtml.cs files so the user can enter a prompt in natural language and receive a response that comes from the OpenAI model working with our custom function. 

Index.cshtml

Replace the content of Pages/Index.cshtml with:

@page
@model IndexModel
@{
    ViewData["Title"] = "OpenAI Function Calling";
}
<div class="text-center">
    <h1 class="display-4">@ViewData["Title"]</h1>
    <form method="post">
        <input type="text" name="prompt" size="80" required/>
        <input type="submit" value="Submit" />
    </form>
    <pre style="text-align: left">
        
        Example prompts:
        
        What is the id number of product: Louisiana Hot Spiced Okra?
        What is the unit price of product: Sir Rodney's Marmalade?
        How many units in stock for product: Tofu?
        What is the most expensive product?
    </pre>
    @if (Model.Reply != null) {
        <p class="alert alert-success">@Model.Reply</p>
    }
</div>

The above markup displays an HTML form that accepts a prompt from a user. The prompt is then submitted to the server and the response is displayed in a paragraph (<p> tag) with a green background (Bootstrap class alert-success).

Below the form, the page lists some suggested prompts:

What is the id number of product: Louisiana Hot Spiced Okra?
What is the unit price of product: Sir Rodney's Marmalade?
How many units in stock for product: Tofu?
What is the most expensive product?

Index.cshtml.cs

Replace the IndexModel class definition in Pages/Index.cshtml.cs with:

public class IndexModel : PageModel {
  private readonly ILogger<IndexModel> _logger;
  private readonly IConfiguration _config;
  
  [BindProperty]
  public string? Reply { get; set; }
  
  public IndexModel(ILogger<IndexModel> logger, IConfiguration config) {
    _logger = logger;
    _config = config;
  }
  
  public void OnGet() { }
  
  // action method that receives prompt from the form
  public async Task<IActionResult> OnPostAsync(string prompt) {
    // send the prompt to Azure OpenAI, which may request calls to our local functions
    var response = await CallFunction(prompt);
    Reply = response;
    return Page();
  }
  
  private async Task<string> CallFunction(string question) {
    string endpoint = _config["AzureOpenAiSettings:Endpoint"]!;
    string apiKey = _config["AzureOpenAiSettings:ApiKey"]!;
    string model = _config["AzureOpenAiSettings:Model"]!;
    
    Uri openAIUri = new(endpoint);
    
    // Instantiate OpenAIClient for Azure Open AI.
    OpenAIClient client = new(openAIUri, new AzureKeyCredential(apiKey));
    ChatCompletionsOptions chatCompletionsOptions = new();
    chatCompletionsOptions.DeploymentName = model;
    ChatChoice responseChoice;
    Response<ChatCompletions> responseWithoutStream;
    
    // Add function definitions
    FunctionDefinition getProductFunctionDefinition = ProductAgent.GetFunctionDefinition();
    FunctionDefinition getMostExpensiveProductDefinition = MostExpensiveProductAgent.GetFunctionDefinition();
    chatCompletionsOptions.Functions.Add(getProductFunctionDefinition);
    chatCompletionsOptions.Functions.Add(getMostExpensiveProductDefinition);

    chatCompletionsOptions.Messages.Add(
        new ChatRequestUserMessage(question)
    );
    responseWithoutStream =
        await client.GetChatCompletionsAsync(chatCompletionsOptions);
    responseChoice = responseWithoutStream.Value.Choices[0];
    
    while (responseChoice.FinishReason!.Value == CompletionsFinishReason.FunctionCall) {
      // Add message as a history.
      chatCompletionsOptions.Messages.Add(new ChatRequestUserMessage(responseChoice.Message.ToString()));
      if (responseChoice.Message.FunctionCall.Name == ProductAgent.Name) {
        string unvalidatedArguments = responseChoice.Message.FunctionCall.Arguments;
        ProductInput input = JsonSerializer.Deserialize<ProductInput>(unvalidatedArguments,
          new JsonSerializerOptions() { PropertyNamingPolicy = JsonNamingPolicy.CamelCase })!;
        var functionResultData = ProductAgent.GetProductDetails(input.ProductName);
        var functionResponseMessage = new ChatRequestFunctionMessage(
          ProductAgent.Name,
          JsonSerializer.Serialize(
            functionResultData,
            new JsonSerializerOptions() { PropertyNamingPolicy = JsonNamingPolicy.CamelCase }));
        chatCompletionsOptions.Messages.Add(functionResponseMessage);
      } else if (responseChoice.Message.FunctionCall.Name == MostExpensiveProductAgent.Name) {
        
        var functionResultData = MostExpensiveProductAgent.GetMostExpensiveProductDetails();
        var functionResponseMessage = new ChatRequestFunctionMessage(
          MostExpensiveProductAgent.Name,
          JsonSerializer.Serialize(
            functionResultData,
            new JsonSerializerOptions() { PropertyNamingPolicy = JsonNamingPolicy.CamelCase }));
        chatCompletionsOptions.Messages.Add(functionResponseMessage);
      }

      // Call LLM again to generate the response.
      responseWithoutStream = await client.GetChatCompletionsAsync(chatCompletionsOptions);
      responseChoice = responseWithoutStream.Value.Choices[0];
    }
    return responseChoice.Message.Content;
  }
}

In the above code, the prompt entered by the user is posted to the OnPostAsync() method. The prompt is then passed to the CallFunction() method, which returns the final response from Azure OpenAI.

  • The CallFunction() method reads the Azure OpenAI settings from appsettings.json.
  • An OpenAIClient class is instantiated.
  • A ChatCompletionsOptions object is instantiated and the original prompt from the user is added to its Messages collection.
  • Function definitions for ProductAgent and MostExpensiveProductAgent are obtained and added to the options.
  • A GetChatCompletionsAsync call is then made to Azure OpenAI. If the model decides that a function is needed, its response names the function and supplies the arguments. The model picks the function based on the context of the prompt, and our code runs it locally. The result of each local function call is added to the message history and sent back to Azure OpenAI. Once no more function calls are requested, the final message is returned.

Trying the application

In a terminal window, at the root of the Razor Pages web application, enter the following command:

dotnet watch

The following page will display in your default browser:

After separately entering the four suggested prompts, you will receive the following responses:


Conclusion

The opportunities that Function Calling opens up are enormous. I am simply scratching the surface of what is possible.


Wednesday, February 7, 2024

Base64 images with Azure OpenAI Dall-E 3, Semantic Kernel, and C#

We will generate Base64 images using the OpenAI Dall-E 3 service and Semantic Kernel. The Base64 representation of the image will be saved in a text file. Thereafter, we will read the text file from an index.html page using JavaScript and subsequently render the image on a web page.

Source Code: https://github.com/medhatelmasry/DalleImageBase64/

What is Semantic Kernel?

This is the official definition obtained from Create AI agents with Semantic Kernel | Microsoft Learn:

Semantic Kernel is an open-source SDK that lets you easily build agents that can call your existing code. As a highly extensible SDK, you can use Semantic Kernel with models from OpenAI, Azure OpenAI, Hugging Face, and more! 

Getting Started

In a suitable directory, create a console application named DalleImageBase64 and add to it three packages needed for our application with the following terminal window commands:

dotnet new console -o DalleImageBase64
cd DalleImageBase64
dotnet add package Microsoft.SemanticKernel
dotnet add package System.Configuration.ConfigurationManager  
dotnet add package SkiaSharp 
 

Create a file named App.config in the root folder of the console application and add to it the important parameters that allow access to the Azure OpenAI service. Contents of App.config are like the following:

<?xml version="1.0"?>
<configuration>
    <appSettings>
        <add key="endpoint" value="https://fake.openai.azure.com/" />
        <add key="azure-api-key" value="fake-azure-openai-key" />
        <add key="openai-api-key" value="fake-openai-key" />
        <add key="openai-org-id" value="fake-openai-org-id" />
        <add key="gpt-deployment" value="gpt-4o-mini" />
        <add key="dalle-deployment" value="dall-e-3" />
        <add key="openai_or_azure" value="openai" />
    </appSettings>
</configuration> 

NOTE: The endpoint and API key values shown above are fake placeholders. Replace them with your own values.

Let's Code

Open Program.cs and delete all its contents. Add the following using statements at the top:

using System.Configuration;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.TextToImage;

We need to read the App.config file settings into our application. We will use the ConfigurationManager from namespace System.Configuration. To read settings from App.config with ConfigurationManager, append the following code to Program.cs:

// Get configuration settings from App.config
string _endpoint = ConfigurationManager.AppSettings["endpoint"]!;
string _azureApiKey = ConfigurationManager.AppSettings["azure-api-key"]!;
string _openaiApiKey = ConfigurationManager.AppSettings["openai-api-key"]!;
string _dalleDeployment = ConfigurationManager.AppSettings["dalle-deployment"]!;
string _gptDeployment = ConfigurationManager.AppSettings["gpt-deployment"]!;
string _openai_or_azure = ConfigurationManager.AppSettings["openai_or_azure"]!;
string _openaiOrgId = ConfigurationManager.AppSettings["openai-org-id"]!;
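
Note that the ! operator only silences the compiler's null warning; a missing key still yields null at runtime. As a minimal sketch of one way to fail fast instead, the helper below (RequireSetting is a hypothetical name, not part of the original program) works on any NameValueCollection, which is the type of ConfigurationManager.AppSettings:

```csharp
using System;
using System.Collections.Specialized;

// Hypothetical helper (not in the original program): look up a key and
// fail fast with a clear message when it is missing or empty.
static string RequireSetting(NameValueCollection settings, string key)
{
    string? value = settings[key];
    if (string.IsNullOrWhiteSpace(value))
    {
        throw new InvalidOperationException($"Missing setting '{key}' in App.config.");
    }
    return value;
}

// Demo with an in-memory collection; in the app you would pass
// ConfigurationManager.AppSettings instead.
var demoSettings = new NameValueCollection { { "openai_or_azure", "openai" } };
Console.WriteLine(RequireSetting(demoSettings, "openai_or_azure"));
```

With this in place, a typo in a key name surfaces as an immediate exception rather than as a confusing authentication error later on.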

Some of the Semantic Kernel APIs used here are currently marked as experimental, so we need to suppress the related compiler warnings by adding the following to the .csproj file inside the <PropertyGroup> block:

<NoWarn>SKEXP0001, SKEXP0010</NoWarn>

Then, append this code to Program.cs:

// Create a kernel builder
var builder = Kernel.CreateBuilder(); 
 
// Add OpenAI services to the kernel
if (_openai_or_azure == "azure") {
    // use azure openai services
    builder.AddAzureOpenAIChatCompletion(_gptDeployment, _endpoint, _azureApiKey);
    builder.AddAzureOpenAITextToImage(_dalleDeployment, _endpoint, _azureApiKey);
} else {
    // use openai services
    builder.AddOpenAIChatCompletion(_gptDeployment, _openaiApiKey, _openaiOrgId);
    builder.AddOpenAITextToImage(_openaiApiKey, _openaiOrgId);
}
 
// Build the kernel
var kernel = builder.Build();

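We created a builder object from Semantic Kernel, added the chat completion and text-to-image services (AddAzureOpenAIChatCompletion and AddAzureOpenAITextToImage for Azure OpenAI, or AddOpenAIChatCompletion and AddOpenAITextToImage for OpenAI), then obtained an instance of the kernel object.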

Get an instance of the "Dall-E" service from the kernel with the following code:

// Get AI service instance used to generate images
var dallE = kernel.GetRequiredService<ITextToImageService>();

Let us create a prompt that generates an image representing a phrase entered by the user. Append this code to Program.cs:

// create a prompt template; {{$input}} is replaced with the user's phrase
var prompt = @"
Think about an image that represents {{$input}}.";

We then configure the prompt execution settings with:

var executionSettings = new OpenAIPromptExecutionSettings {
    MaxTokens = 256,
    Temperature = 1
};

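Temperature controls how random (and therefore how creative) the model's output is. The OpenAI API accepts values from 0 to 2: lower values make the output more focused and deterministic, while higher values make it more varied. MaxTokens caps the length of the generated response.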
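
To give a feel for what temperature does, here is a rough sketch of the textbook "softmax with temperature" formula. It is an illustration of the general idea only, not the code OpenAI runs; the function name SoftmaxWithTemperature and the scores are made up for the example:

```csharp
using System;
using System.Linq;

// Illustrative only: turn raw token scores (logits) into probabilities,
// scaled by a temperature. Textbook formula, not OpenAI's implementation.
static double[] SoftmaxWithTemperature(double[] logits, double temperature)
{
    if (temperature <= 0)
    {
        // Treat 0 as "always pick the top-scoring token"
        int best = Array.IndexOf(logits, logits.Max());
        return logits.Select((_, i) => i == best ? 1.0 : 0.0).ToArray();
    }
    double max = logits.Max(); // subtract the max for numerical stability
    double[] exps = logits.Select(l => Math.Exp((l - max) / temperature)).ToArray();
    double sum = exps.Sum();
    return exps.Select(e => e / sum).ToArray();
}

double[] scores = { 2.0, 1.0, 0.5 }; // made-up scores for three candidate tokens
foreach (double t in new[] { 0.2, 1.0, 2.0 })
{
    string probs = string.Join(", ", SoftmaxWithTemperature(scores, t).Select(p => p.ToString("F3")));
    Console.WriteLine($"temperature {t}: {probs}");
}
```

At a low temperature almost all of the probability lands on the top-scoring token, while a higher temperature spreads it across the alternatives, which is why the output feels more varied.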

We will create a semantic function from our prompt with:

// create a semantic function from the prompt
var genImgFunction = kernel.CreateFunctionFromPrompt(prompt, executionSettings);

Let us ask the user for input with this code:

// Get a phrase from the user
Console.WriteLine("Enter a phrase to generate an image from: ");
string? phrase = Console.ReadLine();
if (string.IsNullOrEmpty(phrase)) {
    Console.WriteLine("No phrase entered.");
    return;
}

Next, we will ask the kernel to render the prompt with the input received from the user and send it to the chat model, which returns a description of the image.

// Invoke the semantic function to generate an image description
var imageDescResult = await kernel.InvokeAsync(genImgFunction, new() { ["input"] = phrase });
var imageDesc = imageDescResult.ToString();
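
To make the {{$input}} placeholder concrete, the sketch below mimics the substitution step with a simple regular expression. It is a simplified stand-in written for this article (RenderPrompt is a hypothetical name), not Semantic Kernel's actual template engine, which supports more than plain variables:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Simplified stand-in for prompt rendering: replace each {{$name}} with its value.
// Semantic Kernel's real template engine does more than this.
static string RenderPrompt(string template, IReadOnlyDictionary<string, string> arguments)
{
    return Regex.Replace(template, @"\{\{\$(\w+)\}\}", match =>
    {
        string name = match.Groups[1].Value;
        // Unknown variables render as empty text in this sketch
        return arguments.TryGetValue(name, out var value) ? value : string.Empty;
    });
}

var rendered = RenderPrompt(
    "Think about an image that represents {{$input}}.",
    new Dictionary<string, string> { ["input"] = "a camel roaming the streets of New York" });
Console.WriteLine(rendered);
```

The rendered text is what gets sent to the chat model; the model's reply, not the rendered prompt itself, becomes the image description.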

Finally, ask the Dall-E service to do the important work of generating an image based on the description. It returns an image URL. This is done with the following code:

// Use DALL-E 3 to generate an image. 
// In this case, OpenAI returns a URL (though you can ask to return a base64 image)
var imageUrl = await dallE.GenerateImageAsync(imageDesc.Trim(), 1024, 1024);

Let’s print the output URL so that the user can pop it into a browser to see what it looks like:

// Display the image URL    
Console.WriteLine($"Image URL:\n\n{imageUrl}"); 

We will next use the SkiaSharp package (installed earlier on) to save the image to the computer file system. Create a helper class named SkiaUtils (for example, in a new file named SkiaUtils.cs) with the following code:

using SkiaSharp;

public static class SkiaUtils {

    public static async Task<string> SaveImageToFile(string url, int width, int height, string filename = "image.png") {

        SKImageInfo info = new SKImageInfo(width, height);
        SKSurface surface = SKSurface.Create(info);
        SKCanvas canvas = surface.Canvas;
        canvas.Clear(SKColors.White);
        var httpClient = new HttpClient();
        using (Stream stream = await httpClient.GetStreamAsync(url))
        using (MemoryStream memStream = new MemoryStream()) {
            await stream.CopyToAsync(memStream);
            memStream.Seek(0, SeekOrigin.Begin);
            SKBitmap webBitmap = SKBitmap.Decode(memStream);
            canvas.DrawBitmap(webBitmap, 0, 0, null);
            surface.Draw(canvas, 0, 0, null);
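        }
        // Dispose the file stream so the encoded PNG is fully flushed to disk
        using (FileStream fileStream = new FileStream(filename, FileMode.Create)) {
            surface.Snapshot().Encode(SKEncodedImageFormat.Png, 100).SaveTo(fileStream);
        }
        return filename;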
    }

    public static async Task<string> GetImageToBase64String(string url, int width, int height) {

        SKImageInfo info = new SKImageInfo(width, height);
        SKSurface surface = SKSurface.Create(info);
        SKCanvas canvas = surface.Canvas;
        canvas.Clear(SKColors.White);
        var httpClient = new HttpClient();
        using (Stream stream = await httpClient.GetStreamAsync(url))
        using (MemoryStream memStream = new MemoryStream())  {
            await stream.CopyToAsync(memStream);
            memStream.Seek(0, SeekOrigin.Begin);
            SKBitmap webBitmap = SKBitmap.Decode(memStream);
            canvas.DrawBitmap(webBitmap, 0, 0, null);
            surface.Draw(canvas, 0, 0, null);
        }
        using (MemoryStream memStream = new MemoryStream()) {
            surface.Snapshot().Encode(SKEncodedImageFormat.Png, 100).SaveTo(memStream);
            byte[] imageBytes = memStream.ToArray();
            return Convert.ToBase64String(imageBytes);
        }
    }
}

The above SkiaUtils class contains two static methods: SaveImageToFile() and GetImageToBase64String(). The method names are self-explanatory. Let us use these methods in our application. Add the following code to the bottom of Program.cs:

// generate a random number between 0 and 199 to be used in the filenames
var random = new Random().Next(0, 200);

// use SkiaUtils class to save the image as a .png file
string filename = await SkiaUtils.SaveImageToFile(imageUrl, 1024, 1024, $"{random}-image.png");

// use SkiaUtils class to get base64 string representation of the image
var base64Image = await SkiaUtils.GetImageToBase64String(imageUrl, 1024, 1024);

// save base64 string representation of the image to a text file
File.WriteAllText($"{random}-base64image.txt", base64Image);
 
// Display the image filename
Console.WriteLine($"\nImage saved as {filename}");
 
// Display the base64 image filename
Console.WriteLine($"\nBase64 image saved as {random}-base64image.txt");
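
If you want to sanity-check the saved text file, one option is to decode the Base64 string and compare its first bytes with the standard eight-byte PNG signature. The following is a minimal sketch; LooksLikePngBase64 is a hypothetical helper name, and the sample input is just the signature bytes rather than a real image:

```csharp
using System;
using System.Linq;

// Hypothetical helper: true when the Base64 text decodes to data that
// starts with the standard PNG signature.
static bool LooksLikePngBase64(string base64)
{
    byte[] pngSignature = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };
    byte[] bytes;
    try
    {
        bytes = Convert.FromBase64String(base64.Trim());
    }
    catch (FormatException)
    {
        return false; // not valid Base64 at all
    }
    return bytes.Length >= pngSignature.Length
        && bytes.Take(pngSignature.Length).SequenceEqual(pngSignature);
}

// Stand-in data for the demo: only the signature bytes, not a real image.
string sampleBase64 = Convert.ToBase64String(new byte[] { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A });
Console.WriteLine(LooksLikePngBase64(sampleBase64)); // signature present
Console.WriteLine(LooksLikePngBase64("aGVsbG8="));   // Base64 of the text "hello"
```

In the application you could pass the contents of the generated text file, for example File.ReadAllText($"{random}-base64image.txt"), to the same helper.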

Running App

Let’s try it out. Run the app in a terminal window with:

dotnet run

The user is prompted with “Enter a phrase to generate an image from:”. I entered “a camel roaming the streets of New York”. This is the output I received:


I copied and pasted the URL into my browser. This is what the image looked like:

Two files were created in the root folder of the console application - namely: 94-image.png and 94-base64image.txt. Note that your filenames could be different because the numbers in the name are randomly generated.

You can double-click on the .png image file to view it in the default image app on your computer.

Viewing Base64 representation of image in a web page

In the root folder of your console application, create a file named index.html and add to it the following HTML/JavaScript code:

<!DOCTYPE html>
<html lang="en">
 
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, 
                 initial-scale=1.0">
    <title>Read Base64 image</title>
</head>
 
<body>
    <input type="file" id="fileInput" />
    <img src="" id="img"/>
    <script>
        document.getElementById('fileInput')
            .addEventListener('change', (event) => {
                const file = event.target.files[0];
                const reader = new FileReader();
 
                reader.onload = function () {
                    const content = reader.result;
                    console.log(content);
                    document.getElementById('img')
                        .src = 'data:image/png;base64,' + content;
                };
 
                reader.onerror = function () {
                    console.error('Error reading the file');
                };
 
                reader.readAsText(file, 'utf-8');
            });
    </script>
</body>
 
</html>

The JavaScript in the above index.html file reads the text file and sets its Base64 content to the src attribute of an image tag.
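
The same data URI idea can also be applied on the C# side. As a rough sketch, assuming you would rather generate a ready-made HTML page than pick the text file by hand, the hypothetical BuildImagePage helper below embeds a Base64 string directly in an img tag (the placeholder value is only a PNG signature, not a real image):

```csharp
using System;

// Hypothetical helper: wrap a Base64-encoded PNG in a minimal HTML page
// by placing it in a data URI on the img tag.
static string BuildImagePage(string base64Png)
{
    string dataUri = "data:image/png;base64," + base64Png.Trim();
    return "<!DOCTYPE html>\n"
         + "<html lang=\"en\">\n"
         + "<body>\n"
         + $"    <img src=\"{dataUri}\" alt=\"Generated image\" />\n"
         + "</body>\n"
         + "</html>\n";
}

// Placeholder Base64 for the demo; in the app you would pass base64Image.
string previewPage = BuildImagePage("iVBORw0KGgo=");
Console.WriteLine(previewPage);
// To view it in a browser, the page could be saved with
// File.WriteAllText("preview.html", previewPage); (the file name is arbitrary).
```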

View Base64 representation of the image

Double-click the index.html file in your file system to open it in a browser.
Click the file input, navigate to the text file that contains the Base64 representation of the image, and select it. You will see the same image that you saw earlier loaded into the web page.


Conclusion

You can take the image URL returned by the Dall-E 3 API and either save the image to your computer or generate a Base64 representation of it.