
Saturday, January 13, 2024

Generate Images with Azure OpenAI Dall-E 3, Semantic Kernel, and C#

It is very easy to generate images using the OpenAI Dall-E 3 service and Semantic Kernel. You provide text describing what you want, and OpenAI generates the image for you. In this tutorial, we will use Semantic Kernel and Azure OpenAI to do exactly that.

Source Code: https://github.com/medhatelmasry/DalleImage.git

Companion Video: https://youtu.be/Dr727OhX4HU

What is Semantic Kernel?

This is the official definition obtained from Create AI agents with Semantic Kernel | Microsoft Learn:

Semantic Kernel is an open-source SDK that lets you easily build agents that can call your existing code. As a highly extensible SDK, you can use Semantic Kernel with models from OpenAI, Azure OpenAI, Hugging Face, and more! 

Getting Started

In a suitable directory, create a console application named DalleImage and add to it two packages needed for our application with the following terminal window commands:

dotnet new console -o DalleImage
cd DalleImage
dotnet add package Microsoft.SemanticKernel
dotnet add package System.Configuration.ConfigurationManager

Create a file named App.config in the root folder of the console application and add to it the important parameters that allow access to the Azure OpenAI service. The contents of App.config should look like the following:

<?xml version="1.0"?>
<configuration>
    <appSettings>
        <add key="endpoint" value="https://fake.openai.azure.com/" />
        <add key="api-key" value="fakekey-fakekey-fakekey-fakekey" />
        <add key="gpt-deployment" value="gpt-35-turbo" />
        <add key="dalle-deployment" value="dall-e-3" />
    </appSettings>
</configuration>

NOTE: Since I cannot share the endpoint and apiKey with you, I have fake values for these settings.

Currently, the Dall-E 3 model is in preview and is only available in the "Sweden Central" Azure data centre, according to https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#dall-e-models-preview

Let's Code

Open Program.cs and delete all its contents. Add the following using statements at the top:

using System.Configuration;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.TextToImage;

We need to read the App.config file settings into our application. We will use the ConfigurationManager from namespace System.Configuration. To read settings from App.config with ConfigurationManager, append the following code to Program.cs:

// Get configuration settings from App.config
string _endpoint = ConfigurationManager.AppSettings["endpoint"]!;
string _apiKey = ConfigurationManager.AppSettings["api-key"]!;
string _dalleDeployment = ConfigurationManager.AppSettings["dalle-deployment"]!;
string _gptDeployment = ConfigurationManager.AppSettings["gpt-deployment"]!;

Currently, we need to disable certain warning directives by adding the following into the .csproj file inside the <PropertyGroup> block:

<NoWarn>SKEXP0001, SKEXP0002, SKEXP0011, SKEXP0012</NoWarn>
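For context, after adding the NoWarn element, the <PropertyGroup> in DalleImage.csproj might look something like this (your TargetFramework and other settings may differ):

```xml
<PropertyGroup>
  <OutputType>Exe</OutputType>
  <TargetFramework>net8.0</TargetFramework>
  <ImplicitUsings>enable</ImplicitUsings>
  <Nullable>enable</Nullable>
  <!-- Suppress experimental-API warnings from Semantic Kernel -->
  <NoWarn>SKEXP0001, SKEXP0002, SKEXP0011, SKEXP0012</NoWarn>
</PropertyGroup>
```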

Then, append this code to Program.cs:

// Create a kernel builder
var builder = Kernel.CreateBuilder(); 
 
// Add OpenAI services to the kernel
builder.AddAzureOpenAITextToImage(_dalleDeployment, _endpoint, _apiKey);
builder.AddAzureOpenAIChatCompletion(_gptDeployment, _endpoint, _apiKey); 
 
// Build the kernel
var kernel = builder.Build();

We created a builder object from Semantic Kernel, added the AddAzureOpenAITextToImage and AddAzureOpenAIChatCompletion services, then obtained an instance of the kernel object.

Get an instance of the "Dall-E" service from the kernel with the following code:

// Get AI service instance used to generate images
var dallE = kernel.GetRequiredService<ITextToImageService>();

Let us create a prompt that generates an image representing a phrase entered by the user. Append this code to Program.cs:

// create execution settings for the prompt
var prompt = @"
Think about an artificial object that represents {{$input}}.";

We then configure the prompt execution settings with:

var executionSettings = new OpenAIPromptExecutionSettings {
    MaxTokens = 256,
    Temperature = 1
};

Temperature is a measure of how creative you want the AI to be. This ranges from 0 to 1, where 0 is least creative and 1 is most creative.
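As an illustrative sketch, if you wanted more deterministic, repeatable image descriptions, you could lower the temperature (the values here are my own choices, not recommendations):

```csharp
// Lower temperature => more predictable, less varied output
var conservativeSettings = new OpenAIPromptExecutionSettings {
    MaxTokens = 256,
    Temperature = 0.2  // closer to 0: the model picks the most likely wording
};
```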

We will create a semantic function from our prompt with:

// create a semantic function from the prompt
var genImgFunction = kernel.CreateFunctionFromPrompt(prompt, executionSettings);

Let us ask the user for input with:

// Get a phrase from the user
Console.WriteLine("Enter a phrase to generate an image from: ");
string? phrase = Console.ReadLine();
if (string.IsNullOrEmpty(phrase)) {
    Console.WriteLine("No phrase entered.");
    return;
}

Next, ask the kernel to combine the prompt with the input received from the user, producing a description.

// Invoke the semantic function to generate an image description
var imageDescResult = await kernel.InvokeAsync(genImgFunction, new() { ["input"] = phrase });
var imageDesc = imageDescResult.ToString();

Finally, ask the Dall-E service to do the important work of generating an image based on the description. It returns an image URL. This is done with the following code:

// Use DALL-E 3 to generate an image. 
// In this case, OpenAI returns a URL (though you can ask to return a base64 image)
var imageUrl = await dallE.GenerateImageAsync(imageDesc.Trim(), 1024, 1024);

Let’s print the output URL so that the user can pop it into a browser to see what it looks like:

Console.WriteLine($"Image URL:\n\n{imageUrl}");
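If you would rather save the image locally instead of pasting the URL into a browser, a minimal sketch using HttpClient could look like this (the file name is a hypothetical choice of mine):

```csharp
// Optional: download the generated image and save it to disk
using var http = new HttpClient();
byte[] imageBytes = await http.GetByteArrayAsync(imageUrl);
await File.WriteAllBytesAsync("generated-image.png", imageBytes);
Console.WriteLine("Saved image to generated-image.png");
```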

Running App

Let’s try it out. Run the app in a terminal window with:

dotnet run

The user is prompted with “Enter a phrase to generate an image from:”. I entered “a lobster flying over the pyramids in giza”, and received this output:


I find it pretty fascinating how OpenAI can generate images based on text-based descriptions. I hope you do too.

Wednesday, January 10, 2024

Build simple C# completion app with Azure OpenAI and Semantic Kernel Tool

In this walkthrough, I will show you how easy it is to use the 'Semantic Kernel Tool' in Visual Studio Code to create a cake baking skill without a single line of code. We will then build a C# console application that uses the skill.

Source: https://github.com/medhatelmasry/sk-library

Companion Video: https://youtu.be/eI5Pr58gFZg

What is Semantic Kernel?

This is the official definition obtained from Create AI agents with Semantic Kernel | Microsoft Learn:

Semantic Kernel is an open-source SDK that lets you easily build agents that can call your existing code. As a highly extensible SDK, you can use Semantic Kernel with models from OpenAI, Azure OpenAI, Hugging Face, and more! 

We now have an extension for Visual Studio Code that makes it very easy to build AI apps that use the large language models (LLMs) available through OpenAI.

In order to proceed with this tutorial, you will need the following prerequisites:

  1. .NET 8.0 Framework
  2. Visual Studio Code
  3. Access to Azure OpenAI
  4. Install the 'Semantic Kernel Tool' extension into Visual Studio Code.

Getting Started

In a suitable working directory, create a folder named sk-library, change directory into the new folder, then start Visual Studio Code with the following terminal commands:

mkdir sk-library
cd sk-library
code .

In Visual Studio Code, select: View >> Command Palette...

Select "Add AI Endpoint" from the list.

You will be asked to choose between AzureOpenAI and OpenAI. I will choose AzureOpenAI in this example.


Next, you are asked for the name of the model that was created on the Azure portal in the Azure OpenAI service. A suitable model for this completion task is text-davinci-003.

We will be asked to enter the Azure OpenAI Endpoint, which you can obtain from Azure.

Finally, we must enter the Azure OpenAI Key.


If all goes well, you will receive this comforting message.

Create a skill without any coding

We can now create a skill without a single line of code. Create sub-folders Skills/Baking with the following terminal commands:

mkdir Skills
cd Skills 
mkdir Baking
cd ..

Start the "Semantic Kernel" view in Visual Studio Code.

Click on "Add Semantic Skill" tool beside Functions.


Click on "Create a new skill folder for the function".


Choose the Skills/Baking folder.

Enter CakeRecipe for the function name.

A description for the function is required. Enter "Recipe for making a cake" for the description.

Two files get created in the Skills/Baking/CakeRecipe folder: skprompt.txt and config.json

skprompt.txt

config.json

Replace contents of skprompt.txt with the following:

I want to bake a fabulous cake. Give me a recipe using the input provided. The cake must be easy, tasty, and cheap. I don't want to spend more than $10 on ingredients. I don't want to spend more than 30 minutes preparing the cake. I don't want to spend more than 30 minutes baking the cake. 

[INPUT]

{{$input}}

[END INPUT]

The above file contains a prompt and a variable {{$input}}. The AI should give us a recipe for the type of cake that the user enters in place of the {{$input}} variable.
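For reference, the generated config.json describes the function and its completion settings. Its exact contents vary with the extension version, but it looks something like this (the values shown are illustrative):

```json
{
  "schema": 1,
  "type": "completion",
  "description": "Recipe for making a cake",
  "completion": {
    "max_tokens": 256,
    "temperature": 0.7,
    "top_p": 1
  },
  "input": {
    "parameters": [
      {
        "name": "input",
        "description": "The type of cake to bake",
        "defaultValue": ""
      }
    ]
  }
}
```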

Testing our baking skill

Creating a skill with the Visual Studio Code 'Semantic Kernel Tool' is painless. We can now test our baking skill. Click on the arrow on the top-right of the panel.


You will be asked to enter a type of cake that you are interested in baking. I entered: chocolate.


If you check the OUTPUT tab at the bottom of Visual Studio Code, you will see the results.


The actual output I received was:

AI Provider: AzureOpenAI
Model: text-davinci-003
Execute: Baking.CakeRecipe
Parameters:
input: chocolate
Prompt:
I want to bake a fabulous cake. Give me a recipe using the input
provided. The cake must be easy, tasty and cheap. I don't want to spend more than
$10 on ingredients. I don't want to spend more than 30 minutes preparing the
cake. I don't want to spend more than 30 minutes baking the cake. 
[INPUT]
chocolate
[END INPUT]
Result:
Easy Chocolate Cake Recipe
Ingredients:
- 1 ½ cups all-purpose flour
- 1 cup granulated sugar
- ¾ cup cocoa powder
- 1 teaspoon baking soda
- ½ teaspoon baking powder
- ½ teaspoon salt
- 2 eggs
- 1 cup buttermilk
- ½ cup vegetable oil
- 1 teaspoon vanilla extract
Instructions:
1. Preheat oven to 350°F. Grease and flour a 9-inch round cake pan.
2. In a large bowl, whisk together the flour, sugar, cocoa powder, baking soda, baking powder, and salt.
3. In a separate bowl, whisk together the eggs, buttermilk, oil, and vanilla extract.
4. Pour the wet ingredients into the dry ingredients and mix until just combined.
5. Pour the batter into the prepared cake pan and bake for 25-30 minutes, or until a toothpick inserted into the center comes out clean.
6. Allow the cake to cool in the pan for 10 minutes before transferring to a wire rack to cool completely.
7. Serve and enjoy!
Tokens:
Input tokens: 88
Output tokens: 237
Total: 325
Duration:
00:00:11.971
========== Function execution was finished. ==========

Using our baking skill in C# console app

Let us first create a console application in the root sk-library folder, with:

dotnet new console

We need to add two packages: one for Semantic Kernel, and the other for ConfigurationManager, which allows us to read settings from the App.config XML file.

dotnet add package Microsoft.SemanticKernel
dotnet add package System.Configuration.ConfigurationManager

Create a file named App.config in the root folder of the console application and add to it the important parameters that allow access to your Azure OpenAI service. The contents of App.config should look like the following:

<?xml version="1.0"?>
<configuration>
    <appSettings>
        <add key="endpoint" value="https://fake.openai.azure.com/" />
        <add key="api-key" value="fakekey-fakekey-fakekey-fakekey" />
        <add key="deployment-name" value="text-davinci-003" />
    </appSettings>
</configuration>

NOTE: Since I cannot share the endpoint and apiKey, I have fake values for these settings.

Replace the code in Program.cs with the following code:

using System.Configuration;
using Microsoft.SemanticKernel;

string _endpoint = ConfigurationManager.AppSettings["endpoint"]!;
string _apikey = ConfigurationManager.AppSettings["api-key"]!;
string _deploymentname = ConfigurationManager.AppSettings["deployment-name"]!;

var builder = Kernel.CreateBuilder();

builder.Services
    .AddAzureOpenAITextGeneration(
        _deploymentname
        , _endpoint
        , _apikey);

var kernel = builder.Build();

var functionDirectory = Path.Combine(Directory.GetCurrentDirectory(), "Skills", "Baking");
var semanticFunctions = kernel.ImportPluginFromPromptDirectory(functionDirectory);

/* request user for input */
Console.WriteLine("Enter a cake type you want to bake:");
var cakeType = Console.ReadLine();
if (string.IsNullOrEmpty(cakeType)) {
    Console.WriteLine("No cake type entered.");
    return;
}
var functionResult = await kernel.InvokeAsync(semanticFunctions["CakeRecipe"],
    new KernelArguments {
        { "input", cakeType }
    });
Console.WriteLine(functionResult);
Console.WriteLine();

Run the app with:

dotnet run

You will be asked to enter a type of cake. I entered: lemon.


This was the output given by the AI.

Lemon Sponge Cake 
Ingredients: 
- 2 cups all-purpose flour 
- 2 teaspoons baking powder 
- ½ teaspoon salt 
- 4 tablespoons butter 
- 1 cup sugar 
- 2 eggs 
- 1 cup milk 
- juice and zest of one lemon
Instructions: 
1. Preheat oven to 350°F (175°C). Grease and flour an 8-inch cake pan.
2. In a medium bowl, sift together the flour, baking powder, and salt. 
3. In a large bowl, beat the butter and sugar together until light and fluffy.
4. Beat in the eggs, one at a time. 
5. Beat in the flour mixture alternately with the milk, beginning and ending with the flour mixture. 
6. Stir in the lemon juice and zest. 
7. Pour the batter into the prepared cake pan. 
8. Bake for 25-30 minutes or until a toothpick inserted into the center comes out clean. 
9. Allow the cake to cool in the pan before serving.

You can build applications with a variety of AI skills. 

Happy Coding.