
Saturday, December 14, 2024

.NET Aspire and Semantic Kernel AI

 Let's learn how to use the .NET Aspire Azure OpenAI client. We will familiarize ourselves with the Aspire.Azure.AI.OpenAI library, which is used to register an OpenAIClient in the dependency injection (DI) container for consuming Azure OpenAI or OpenAI functionality. In addition, it enables corresponding logging and telemetry.

Companion Video: https://youtu.be/UuLnCRdYvEI
Final Solution Code: https://github.com/medhatelmasry/AspireAI_Final

Pre-requisites:

  • .NET 9.0
  • Visual Studio Code
  • .NET Aspire Workload
  • "C# Dev Kit" extension for VS Code

Getting Started

We will start by cloning a simple C# solution that contains two projects that use Semantic Kernel - namely a console project (ConsoleAI) and a razor-pages project (RazorPagesAI). Clone the project in a working directory on your computer by executing these commands in a terminal window:

git clone https://github.com/medhatelmasry/AspireAI.git
cd AspireAI

The cloned solution contains a console application (ConsoleAI) and a razor-pages application (RazorPagesAI). They both do pretty much the same thing. The objective of today’s exercise is to:

  • use .NET Aspire so that both projects get started from one place 
  • pass environment variables to the console and razor-pages web apps from the .AppHost project that belongs to .NET Aspire

Open the solution in VS Code and update the values in the following appsettings.json files with your access parameters for Azure OpenAI and/or OpenAI:

ConsoleAI/appsettings.json
RazorPagesAI/appsettings.json

The most important settings are the connection strings. They are identical in both projects:

"ConnectionStrings": {
  "azureOpenAi": "Endpoint=Azure-OpenAI-Endpoint-Here;Key=Azure-OpenAI-Key-Here;",
  "openAi": "Key=OpenAI-Key-Here"
}
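The Aspire client library parses these semicolon-delimited connection strings into their Endpoint and Key parts for you. As a rough illustration of the format only (the Parse helper below is hypothetical, not part of any library), here is a minimal sketch:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only: splits a semicolon-delimited
// "Name=Value" connection string into its parts, the way the Aspire client
// interprets "Endpoint=...;Key=...;" internally.
Dictionary<string, string> Parse(string connectionString) {
    var parts = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
    foreach (var pair in connectionString.Split(';', StringSplitOptions.RemoveEmptyEntries)) {
        int idx = pair.IndexOf('=');
        if (idx > 0)
            parts[pair[..idx].Trim()] = pair[(idx + 1)..].Trim();
    }
    return parts;
}

var settings = Parse("Endpoint=https://myresource.openai.azure.com/;Key=abc123;");
Console.WriteLine(settings["Endpoint"]); // https://myresource.openai.azure.com/
Console.WriteLine(settings["Key"]);      // abc123
```

Note that the OpenAI (non-Azure) connection string carries only a Key, since the endpoint is implied.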

After you update your access parameters, try each application separately to see what it does:

Here is my experience using the console application (ConsoleAI) with AzureOrOpenAI set to “OpenAI”:

cd ConsoleAI
dotnet run


I then changed the AzureOrOpenAI setting to “Azure” and ran the console application (ConsoleAI) again:

Next, try the razor pages web application (RazorPagesAI) with AzureOrOpenAI set to “OpenAI”:

cd ../RazorPagesAI
dotnet watch


In the RazorPagesAI web app’s appsettings.json file, I changed AzureOrOpenAI to “Azure”, resulting in a similar experience.


In the root folder, add .NET Aspire to the solution:

cd ..
dotnet new aspire --force

Add the previous projects to the newly created .sln file with:

dotnet sln add ./AiLibrary/AiLibrary.csproj
dotnet sln add ./RazorPagesAI/RazorPagesAI.csproj
dotnet sln add ./ConsoleAI/ConsoleAI.csproj

Add the .NET Aspire Azure OpenAI client package to the ConsoleAI and RazorPagesAI projects with:

dotnet add ./ConsoleAI/ConsoleAI.csproj package Aspire.Azure.AI.OpenAI --prerelease
dotnet add ./RazorPagesAI/RazorPagesAI.csproj package Aspire.Azure.AI.OpenAI --prerelease

To add Azure hosting support to your IDistributedApplicationBuilder, install the 📦 Aspire.Hosting.Azure.CognitiveServices NuGet package in the .AppHost project:

dotnet add ./AspireAI.AppHost/AspireAI.AppHost.csproj package Aspire.Hosting.Azure.CognitiveServices

In VS Code, add the following references:

  1. Add a reference from the .AppHost project into ConsoleAI project.
  2. Add a reference from the .AppHost project into RazorPagesAI project.
  3. Add a reference from the ConsoleAI project into .ServiceDefaults project.
  4. Add a reference from the RazorPagesAI project into .ServiceDefaults project.

Copy the AI and ConnectionStrings blocks from either the console (ConsoleAI) or web app (RazorPagesAI) appsettings.json file into the appsettings.json file of the .AppHost project. The appsettings.json file in the .AppHost project will look similar to this:

"AI": {
  "AzureOrOpenAI": "OpenAI",
  "OpenAiChatModel": "gpt-3.5-turbo",
  "AzureChatDeploymentName": "gpt-35-turbo"
},
"ConnectionStrings": {
  "azureOpenAi": "Endpoint=Azure-OpenAI-Endpoint-Here;Key=Azure-OpenAI-Key-Here;",
  "openAi": "Key=OpenAI-Key-Here"
}

Add the following code to the Program.cs file in the .AppHost project, just before builder.Build().Run():

IResourceBuilder<IResourceWithConnectionString> openai;
var AzureOrOpenAI = builder.Configuration["AI:AzureOrOpenAI"] ?? "Azure";
var chatDeploymentName = builder.Configuration["AI:AzureChatDeploymentName"];
var openAiChatModel = builder.Configuration["AI:OpenAiChatModel"];
 
// Register an Azure OpenAI resource. 
// The AddAzureAIOpenAI method reads connection information
// from the app host's configuration
if (AzureOrOpenAI.ToLower() == "azure") {
    openai = builder.ExecutionContext.IsPublishMode
        ? builder.AddAzureOpenAI("azureOpenAi")
        : builder.AddConnectionString("azureOpenAi");
} else {
    openai = builder.ExecutionContext.IsPublishMode
        ? builder.AddAzureOpenAI("openAi")
        : builder.AddConnectionString("openAi");
}
 
// Register the RazorPagesAI project and pass to it environment variables.
//  WithReference method passes connection info to client project
builder.AddProject<Projects.RazorPagesAI>("razor")
    .WithReference(openai)
    .WithEnvironment("AI__AzureChatDeploymentName", chatDeploymentName)
    .WithEnvironment("AI__AzureOrOpenAI", AzureOrOpenAI)
    .WithEnvironment("AI__OpenAiChatModel", openAiChatModel);
 
// Register the ConsoleAI project and pass to it environment variables
builder.AddProject<Projects.ConsoleAI>("console")
    .WithReference(openai)
    .WithEnvironment("AI__AzureChatDeploymentName", chatDeploymentName)
    .WithEnvironment("AI__AzureOrOpenAI", AzureOrOpenAI)
    .WithEnvironment("AI__OpenAiChatModel", openAiChatModel);
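The double underscore in names like AI__AzureChatDeploymentName is significant: .NET configuration treats __ in an environment variable name as the : hierarchy separator, so the child apps read the value back as AI:AzureChatDeploymentName. A minimal sketch of that naming convention (an illustrative string transform only; the real mapping is done by the environment-variables configuration provider):

```csharp
using System;

// Illustration of .NET's convention: a double underscore in an environment
// variable name maps to the ':' hierarchy separator in configuration keys.
// (The actual work is done by the environment-variables configuration provider.)
string ToConfigKey(string envVarName) => envVarName.Replace("__", ":");

Console.WriteLine(ToConfigKey("AI__AzureChatDeploymentName")); // AI:AzureChatDeploymentName
Console.WriteLine(ToConfigKey("AI__AzureOrOpenAI"));           // AI:AzureOrOpenAI
```

This is why a value set with WithEnvironment("AI__AzureOrOpenAI", ...) can be read in the child project as Configuration["AI:AzureOrOpenAI"].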

We need to wire up the .NET Aspire service defaults in both our console and web apps. Let us start with the web app. Add this code to the Program.cs file in the RazorPagesAI project, right before “var app = builder.Build()”:

builder.AddServiceDefaults();

In the same Program.cs of the web app (RazorPagesAI), comment out the if (azureOrOpenAi.ToLower() == "openai") { …. } else { ….. } block and replace it with this code:

if (azureOrOpenAi.ToLower() == "openai") {
    builder.AddOpenAIClient("openAi");
    builder.Services.AddKernel()
        .AddOpenAIChatCompletion(openAiChatModel);
} else {
    builder.AddAzureOpenAIClient("azureOpenAi");
    builder.Services.AddKernel()
        .AddAzureOpenAIChatCompletion(azureChatDeploymentName);
}

In the above code, we call an extension method that registers an OpenAIClient with the dependency injection container; the method takes a connection name parameter. We also register Semantic Kernel with the DI container.

Also, in the Program.cs file in the ConsoleAI project, add this code right below the using statements:

var hostBuilder = Host.CreateApplicationBuilder();
hostBuilder.AddServiceDefaults();

In the same Program.cs of the console app (ConsoleAI), comment out the if (azureOrOpenAi.ToLower() == "azure") { …. } else { ….. } block and replace it with this code:

if (azureOrOpenAI.ToLower() == "azure") {
    var azureChatDeploymentName = config["AI:AzureChatDeploymentName"] ?? "gpt-35-turbo";
    hostBuilder.AddAzureOpenAIClient("azureOpenAi");
    hostBuilder.Services.AddKernel()
        .AddAzureOpenAIChatCompletion(azureChatDeploymentName);
} else {
    var openAiChatModel = config["AI:OpenAiChatModel"] ?? "gpt-3.5-turbo";
    hostBuilder.AddOpenAIClient("openAi");
    hostBuilder.Services.AddKernel()
        .AddOpenAIChatCompletion(openAiChatModel);
}
var app = hostBuilder.Build();

Replace “var kernel = builder.Build();” with this code:

var kernel = app.Services.GetRequiredService<Kernel>();
app.Start();

You can now test the .NET Aspire orchestration of both the console and web apps. Stop all applications, then, in a terminal window, go to the .AppHost project and run the following command:

dotnet watch

You will see the .NET Aspire dashboard:


Click on Views under the Logs column. You will see this output indicating that the console application ran successfully:


Click on the link for the web app under the Endpoints column. It opens the razor pages web app in another tab in your browser. Test it out and verify that it works as expected.

Stop the .AppHost application, then comment out the AI and ConnectionStrings blocks in the appsettings.json files in both the console and web apps. If you run the .AppHost project again, you will discover that it works equally well, because the environment variables are passed from the .AppHost project into the console and web apps respectively.

One last refinement we can make to the console application is to do away with the ConfigurationBuilder, because we can get a configuration object from the host application builder. Therefore, comment out the following code in the console application:

var config = new ConfigurationBuilder()
    .SetBasePath(Directory.GetCurrentDirectory())
    .AddJsonFile("appsettings.json", optional: true, reloadOnChange: true)
    .Build();

Replace the above code with the following:

var config = hostBuilder.Configuration;

You can delete the following package from the ConsoleAI.csproj file:

<PackageReference Include="Microsoft.Extensions.Configuration.Json" Version="9.0.0" />

Everything works just as it did before.


Wednesday, February 7, 2024

Base64 images with Azure OpenAI Dall-E 3, Semantic Kernel, and C#

We will generate Base64 images using the OpenAI Dall-E 3 service and Semantic Kernel. The Base64 representation of the image will be saved in a text file. Thereafter, we will read the text file from an index.html page using JavaScript and subsequently render the image on a web page.

Source Code: https://github.com/medhatelmasry/DalleImageBase64/

What is Semantic Kernel?

This is the official definition obtained from Create AI agents with Semantic Kernel | Microsoft Learn:

Semantic Kernel is an open-source SDK that lets you easily build agents that can call your existing code. As a highly extensible SDK, you can use Semantic Kernel with models from OpenAI, Azure OpenAI, Hugging Face, and more! 

Getting Started

In a suitable directory, create a console application named DalleImageBase64 and add to it three packages needed for our application with the following terminal window commands:

dotnet new console -o DalleImageBase64
cd DalleImageBase64
dotnet add package Microsoft.SemanticKernel
dotnet add package System.Configuration.ConfigurationManager  
dotnet add package SkiaSharp 
 

Create a file named App.config in the root folder of the console application and add to it the important parameters that allow access to the Azure OpenAI service. Contents of App.config are like the following:

<?xml version="1.0"?>
<configuration>
    <appSettings>
        <add key="endpoint" value="https://fake.openai.azure.com/" />
        <add key="azure-api-key" value="fake-azure-openai-key" />
        <add key="openai-api-key" value="fake-openai-key" />
        <add key="openai-org-id" value="fake-openai-org-id" />
        <add key="gpt-deployment" value="gpt-4o-mini" />
        <add key="dalle-deployment" value="dall-e-3" />
        <add key="openai_or_azure" value="openai" />
    </appSettings>
</configuration> 

NOTE: Since I cannot share the endpoint and apiKey with you, I have fake values for these settings.

Let's Code

Open Program.cs and delete all its contents. Add the following using statements at the top:

using System.Configuration;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.TextToImage;

We need to read the App.config file settings into our application. We will use the ConfigurationManager from namespace System.Configuration. To read settings from App.config with ConfigurationManager, append the following code to Program.cs:

// Get configuration settings from App.config
string _endpoint = ConfigurationManager.AppSettings["endpoint"]!;
string _azureApiKey = ConfigurationManager.AppSettings["azure-api-key"]!;
string _openaiApiKey = ConfigurationManager.AppSettings["openai-api-key"]!;
string _dalleDeployment = ConfigurationManager.AppSettings["dalle-deployment"]!;
string _gptDeployment = ConfigurationManager.AppSettings["gpt-deployment"]!;
string _openai_or_azure = ConfigurationManager.AppSettings["openai_or_azure"]!;
string _openaiOrgId = ConfigurationManager.AppSettings["openai-org-id"]!;

Currently, we need to disable certain warning directives by adding the following into the .csproj file inside the <PropertyGroup> block:

<NoWarn>SKEXP0001, SKEXP0010</NoWarn>

Then, append this code to Program.cs:

// Create a kernel builder
var builder = Kernel.CreateBuilder(); 
 
// Add OpenAI services to the kernel
if (_openai_or_azure == "azure") {
    // use azure openai services
    builder.AddOpenAIChatCompletion(_gptDeployment, _endpoint, _azureApiKey);
    builder.AddOpenAITextToImage(_dalleDeployment, _endpoint, _azureApiKey);
} else {
    // use openai services
    builder.AddOpenAIChatCompletion(_gptDeployment, _openaiApiKey, _openaiOrgId);
    builder.AddOpenAITextToImage(_openaiApiKey, _openaiOrgId);
}
 
// Build the kernel
var kernel = builder.Build();

We created a builder object from Semantic Kernel, added the chat completion and text-to-image services, then obtained an instance of the kernel object.

Get an instance of the "Dall-E" service from the kernel with the following code:

// Get AI service instance used to generate images
var dallE = kernel.GetRequiredService<ITextToImageService>();

Let us create a prompt that generates an image representing a phrase entered by the user. Append this code to Program.cs:

// create a prompt template
var prompt = @"
Think about an image that represents {{$input}}.";
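The {{$input}} token is a Semantic Kernel template variable that gets replaced with the argument supplied at invocation time. As a rough sketch of that substitution (a simplified stand-in for illustration; not SK's actual template engine):

```csharp
using System;

// Simplified stand-in for Semantic Kernel's template substitution:
// "{{$input}}" is replaced by the value passed at invocation time.
// (SK's real template engine supports much more than this.)
string RenderPrompt(string template, string input) =>
    template.Replace("{{$input}}", input);

var template = @"
Think about an image that represents {{$input}}.";

Console.WriteLine(RenderPrompt(template, "a camel roaming the streets of New York"));
```

When we later call kernel.InvokeAsync with new() { ["input"] = phrase }, the kernel performs this substitution before sending the prompt to the model.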

We then configure the prompt execution settings with:

var executionSettings = new OpenAIPromptExecutionSettings {
    MaxTokens = 256,
    Temperature = 1
};

Temperature is a measure of how creative you want the AI to be. This ranges from 0 to 1, where 0 is least creative and 1 is most creative.

We will create a semantic function from our prompt with:

// create a semantic function from the prompt
var genImgFunction = kernel.CreateFunctionFromPrompt(prompt, executionSettings);

Let us ask the user for input with this code:

// Get a phrase from the user
Console.WriteLine("Enter a phrase to generate an image from: ");
string? phrase = Console.ReadLine();
if (string.IsNullOrEmpty(phrase)) {
    Console.WriteLine("No phrase entered.");
    return;
}

Next, we will ask the kernel to combine the prompt with the input received from the user.

// Invoke the semantic function to generate an image description
var imageDescResult = await kernel.InvokeAsync(genImgFunction, new() { ["input"] = phrase });
var imageDesc = imageDescResult.ToString();

Finally, ask Dall-E service to do the important work of generating an image based on the description. It returns an image url. This is done with the following code:

// Use DALL-E 3 to generate an image. 
// In this case, OpenAI returns a URL (though you can ask to return a base64 image)
var imageUrl = await dallE.GenerateImageAsync(imageDesc.Trim(), 1024, 1024);

Let’s print the output URL so that the user can pop it into a browser to see what it looks like:

// Display the image URL    
Console.WriteLine($"Image URL:\n\n{imageUrl}"); 

We will next use the SkiaSharp package (installed earlier) to save the image to the computer file system. Create a helper class named SkiaUtils with the following code:

using SkiaSharp;

public static class SkiaUtils {
    public static async Task<string> SaveImageToFile(string url, int width, int height, string filename = "image.png") {
        SKImageInfo info = new SKImageInfo(width, height);
        SKSurface surface = SKSurface.Create(info);
        SKCanvas canvas = surface.Canvas;
        canvas.Clear(SKColors.White);
        var httpClient = new HttpClient();
        using (Stream stream = await httpClient.GetStreamAsync(url))
        using (MemoryStream memStream = new MemoryStream()) {
            await stream.CopyToAsync(memStream);
            memStream.Seek(0, SeekOrigin.Begin);
            SKBitmap webBitmap = SKBitmap.Decode(memStream);
            canvas.DrawBitmap(webBitmap, 0, 0, null);
        }
        using (FileStream fileStream = new FileStream(filename, FileMode.Create)) {
            surface.Snapshot().Encode(SKEncodedImageFormat.Png, 100).SaveTo(fileStream);
        }
        return filename;
    }

    public static async Task<string> GetImageToBase64String(string url, int width, int height) {
        SKImageInfo info = new SKImageInfo(width, height);
        SKSurface surface = SKSurface.Create(info);
        SKCanvas canvas = surface.Canvas;
        canvas.Clear(SKColors.White);
        var httpClient = new HttpClient();
        using (Stream stream = await httpClient.GetStreamAsync(url))
        using (MemoryStream memStream = new MemoryStream()) {
            await stream.CopyToAsync(memStream);
            memStream.Seek(0, SeekOrigin.Begin);
            SKBitmap webBitmap = SKBitmap.Decode(memStream);
            canvas.DrawBitmap(webBitmap, 0, 0, null);
        }
        using (MemoryStream memStream = new MemoryStream()) {
            surface.Snapshot().Encode(SKEncodedImageFormat.Png, 100).SaveTo(memStream);
            byte[] imageBytes = memStream.ToArray();
            return Convert.ToBase64String(imageBytes);
        }
    }
}

The above SkiaUtils class contains two static methods: SaveImageToFile() and GetImageToBase64String(). The method names are self-explanatory. Let us use these methods in our application. Add the following code to the bottom of Program.cs:

// generate a random number between 0 and 200 to be used for filename
var random = new Random().Next(0, 200);

// use SkiaUtils class to save the image as a .png file
string filename = await SkiaUtils.SaveImageToFile(imageUrl, 1024, 1024, $"{random}-image.png");

// use SkiaUtils class to get base64 string representation of the image
var base64Image = await SkiaUtils.GetImageToBase64String(imageUrl, 1024, 1024);

// save base64 string representation of the image to a text file
File.WriteAllText($"{random}-base64image.txt", base64Image);
 
// Display the image filename
Console.WriteLine($"\nImage saved as {filename}");
 
// Display the base64 image filename
Console.WriteLine($"\nBase64 image saved as {random}-base64image.txt");

Running App

Let’s try it out. Run the app in a terminal window with:

dotnet run

The user is prompted with “Enter a phrase to generate an image from:”. I entered “a camel roaming the streets of New York”. This is the output I received:


I copied and pasted the URL into my browser. This is what the image looked like:

Two files were created in the root folder of the console application - namely: 94-image.png and 94-base64image.txt. Note that your filenames could be different because the numbers in the name are randomly generated.

You can double-click on the .png image file to view it in the default image app on your computer.

Viewing Base64 representation of image in a web page

In the root folder of your console application, create a file named index.html and add to it the following HTML/JavaScript code:

<!DOCTYPE html>
<html lang="en">
 
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Read Base64 image</title>
</head>
 
<body>
    <input type="file" id="fileInput" />
    <img src="" id="img"/>
    <script>
        document.getElementById('fileInput')
            .addEventListener('change', (event) => {
                const file = event.target.files[0];
                const reader = new FileReader();
 
                reader.onload = function () {
                    const content = reader.result;
                    console.log(content);
                    document.getElementById('img')
                        .src = 'data:image/png;base64,' + content;
                };
 
                reader.onerror = function () {
                    console.error('Error reading the file');
                };
 
                reader.readAsText(file, 'utf-8');
            });
    </script>
</body>
 
</html>

The JavaScript in the above index.html file reads the text file and sets its Base64 content to the src attribute of an image tag.
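The key trick above is the data URI: prefixing the Base64 text with data:image/png;base64, lets the browser decode it back into an image. A minimal C# sketch of the same round trip (placeholder bytes stand in for real PNG data):

```csharp
using System;
using System.Text;

// Build a data URI from raw bytes, exactly as the JavaScript above does
// with the contents of the Base64 text file. The bytes here are placeholders,
// not a real PNG.
byte[] fakePngBytes = Encoding.UTF8.GetBytes("not-a-real-png");
string base64 = Convert.ToBase64String(fakePngBytes);
string dataUri = "data:image/png;base64," + base64;

Console.WriteLine(dataUri.StartsWith("data:image/png;base64,")); // True

// Round trip: decoding the Base64 part recovers the original bytes.
byte[] roundTrip = Convert.FromBase64String(dataUri.Substring("data:image/png;base64,".Length));
Console.WriteLine(Encoding.UTF8.GetString(roundTrip)); // not-a-real-png
```

The browser performs exactly this decode step when the data URI is assigned to the img tag's src attribute.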

View Base64 representation of the image

Double-click the index.html file on your file system to open it in your browser. Click the file input, navigate to the text file that contains the Base64 representation of the image, and select it. You will see the same image you saw earlier loaded on the web page.


Conclusion

You can use the image URL generated by the Dall-E 3 API, save the image to your computer, or generate a Base64 representation of the image and render it on a web page.

Wednesday, January 24, 2024

Using C# code generated by Azure OpenAI Studio

In this article, I will show how you can use C# code generated in "Azure OpenAI Studio" to create your own application. Of course, it is assumed that you have an Azure subscription and access to Azure OpenAI.

Getting Started

Go to https://oai.azure.com.

Let us play in the “Chat playground”. Click on “Chat playground”.


We will start by grounding our AI bot. We can do that by telling the AI that it specializes in baking recipes. Enter the following to the "System Message":

You are an AI assistant that helps people with baking.

Next, add a prompt requesting a recipe:

Give me an easy recipe for baking a loaf of bread that takes about 30 minutes preparation and about 40 minutes in the oven.

The AI will suggest to you a recipe. This is the response I received:

Certainly! Here's a simple recipe for baking a loaf of bread with a total time of approximately 1 hour and 10 minutes:

Ingredients:
3 cups all-purpose flour
2 teaspoons active dry yeast
1 teaspoon salt
1 tablespoon sugar
1 cup warm water (around 110°F/43°C)
2 tablespoons olive oil

Instructions:
In a large mixing bowl, combine the warm water and sugar. Stir until the sugar is dissolved, then sprinkle the yeast over the water. Let it sit for about 5 minutes until the yeast becomes frothy.

Add the salt and olive oil to the yeast mixture, and gradually add the flour, one cup at a time. Stir well after each addition until a shaggy dough forms.

Once the dough starts to come together, transfer it onto a lightly floured surface and knead for about 5-7 minutes until it becomes smooth and elastic. If the dough is too sticky, you can add a little more flour as needed.

Shape the dough into a ball, and place it in a greased bowl. Cover the bowl with a clean kitchen towel or plastic wrap, and let the dough rise in a warm place for approximately 30 minutes or until it doubles in size.

Preheat your oven to 375°F (190°C) while the dough is rising.

After the dough has risen, punch it down gently to release any air bubbles. Transfer it to a greased loaf pan, and shape it into a loaf by tucking the edges under.

Cover the loaf pan again with the kitchen towel or plastic wrap, and let it rise for an additional 10 minutes.

Once the dough has risen for the second time, remove the cover and place the pan in the preheated oven. Bake for about 35-40 minutes or until the bread is golden brown and sounds hollow when tapped on the bottom.

Once baked, remove the bread from the oven and let it cool in the pan for a few minutes. Then transfer it to a wire rack to cool completely before slicing.

Enjoy your freshly baked bread!

Using generated C# code

Click on “View code”.


 Change the language to C#.

Note the message at the top. You are asked to add a specific version of a package in your C# app.

// Note: The Azure OpenAI client library for .NET is in preview.
// Install the .NET library via NuGet: dotnet add package Azure.AI.OpenAI --version 1.0.0-beta.5

To get started with a basic console application, click on the “Learn more” link at the bottom.

Choose C#.


Under "Set up" you will be asked to create a new app and add a package to it.

dotnet new console -n azure-openai-quickstart
cd azure-openai-quickstart
dotnet add package Azure.AI.OpenAI --prerelease

Run the above commands, then replace the code in Program.cs with the code that was generated by “Azure OpenAI Studio”.

You will need to enter the AZURE_OPENAI_API_KEY at around line 8 in Program.cs. This is given to you just below the sample code in “Azure OpenAI Studio”.

Copy and paste the key into your code. This is what my code looked like after pasting the key:

If you run “dotnet build”, you will see some errors. This is because we did not use the specific preview version of the Azure.AI.OpenAI package that was suggested. Most likely you have a more recent version. The version I have at the time of writing (January 2024) is 1.0.0-beta.12. All the errors pertain to the ChatMessage property when creating ChatCompletionsOptions. Replace the code for responseWithoutStream with the following:

Response<ChatCompletions> responseWithoutStream = await client.GetChatCompletionsAsync(
    new ChatCompletionsOptions() {
        DeploymentName = "gpt-35-turbo",
        Messages =
        {
            new ChatRequestSystemMessage(@"You are an AI assistant that helps people find information."),
            new ChatRequestUserMessage(@"Give me an easy recipe for baking a loaf of bread that takes about 30 minutes preparation and about 40 minutes in the oven."),
        },
        Temperature = (float)0.7,
        MaxTokens = 800,
        NucleusSamplingFactor = (float)0.95,
        FrequencyPenalty = 0,
        PresencePenalty = 0,
    });

Since nothing is output, let us display the AI response. Add the following code to the bottom of Program.cs:

Console.WriteLine(responseWithoutStream.Value.Choices[0].Message.Content);

Run the app and you will see the response from the AI. In my case I received a very similar response to what I previously got.

Conclusion

"Azure OpenAI Studio" can help you get started with the development of a C# app that utilizes services from "Azure OpenAI".


Monday, January 15, 2024

PHP meets OpenAI with image generation

Let's generate images using OpenAI's dall-e-3 service. When using PHP, the open-source openai-php/client library facilitates the process. Check it out at https://github.com/openai-php/client.

In this article, we will learn how easy it is to generate an image with OpenAI and PHP. 

Source code: https://github.com/medhatelmasry/openai-dalle3-php

Prerequisites

In order to proceed, you will need the following:

  1. Subscription with OpenAI - If you do not have a subscription, go ahead and register at https://openai.com/.
  2. PHP - You need to have PHP version 8.2 (or higher) installed on your computer. You can download the latest version from https://www.php.net/downloads.php.
  3. Composer – If you do not have Composer yet, download and install it for your operating system from https://getcomposer.org/download/.

Getting Started

In a suitable working directory, create a folder named openai-dalle3-php with the following terminal window command:

mkdir openai-dalle3-php

Change into the newly created folder with:

cd openai-dalle3-php

Using Composer, install the openai-php/client package by running this command:

composer require openai-php/client

We will be reading our OpenAI API key from the .env text file. We need to install the vlucas/phpdotenv package in order to do that.

composer require vlucas/phpdotenv

Getting an API KEY from OpenAI

With your OpenAI credentials, log in at https://openai.com/. Click on API.

In the left navigation, click on "API keys". 


Click on the "+ Create new secret key" button.


Give the new key a name. I named it 20240115 representing the date when it was created. You may wish to give it a more meaningful or creative name. Once you are happy with the name, click on the "Create secret key" to generate the key.


Click on the copy button beside the key and paste the API-Key somewhere safe as we will need to use it later on. Note that you cannot view this key again. Click on Done to dismiss the dialog.

Create a text file named .env in the openai-dalle3-php folder with the following content:

OPENAI_API_KEY=put-your-openai-api-key-here

Set the API-Key as the value of OPENAI_API_KEY. This may look like this:

OPENAI_API_KEY=sk-OOghjTs8GsuHQklTCWOeT3BasdGJAjklBr3tr8ViZKv21BRN

Let's get coding

We can generate images by obtaining a URL to the newly created image, or by getting the Base-64 representation of the image. We will try both ways.

1) Get URL of the image

In the openai-dalle3-php folder, create a file named image-url.php with the following content:

<?php

require_once __DIR__ . '/vendor/autoload.php';

$dotenv = Dotenv\Dotenv::createImmutable(__DIR__);
$dotenv->load();
 
$client = OpenAI::client($_ENV['OPENAI_API_KEY']);

$response = $client->images()->create([
    'model' => 'dall-e-3',
    'prompt' => 'A panda flying over Paris at night',
    'n' => 1,
    'size' => '1024x1024',
    'response_format' => 'url',
]);

foreach ($response->data as $data) {
    $data->url; 
    $data->b64_json; // null
}

// display the image
echo '<img src="' . $data->url . '" />';

?>

In the above code, note the following:

  • we read in the API Key from .env file and pass it in the OpenAI::client($_ENV['OPENAI_API_KEY']); statement
  • we request a URL response with an image size 1024 x 1024
  • we prompt the dall-e-3 service to generate an image of 'A panda flying over Paris at night'.

To run the app, start the PHP web server to listen on port number 8888 with the following command in the openai-dalle3-php folder.

php -S localhost:8888

You can view the resulting image that gets created by OpenAI by pointing your browser to the following URL:

http://localhost:8888/image-url.php

This is the image that got created for me:


Every time you run the app you will likely get a different image.

2) Get Base64 encoding of the image

Create another file named image-b64.php with the following content:

<?php

require_once __DIR__ . '/vendor/autoload.php';

$dotenv = Dotenv\Dotenv::createImmutable(__DIR__);
$dotenv->load();
 
$client = OpenAI::client($_ENV['OPENAI_API_KEY']);

$response = $client->images()->create([
    'model' => 'dall-e-3',
    'prompt' => 'A panda flying over Paris at night',
    'n' => 1,
    'size' => '1024x1024',
    'response_format' => 'b64_json',
]);

foreach ($response->data as $data) {
    $data->url; // null
    $data->b64_json; 
}

// display base64 encoded image
echo '<img src="data:image/jpeg;base64,' . $data->b64_json. '" />';

?>

The only changes that were made are in the following lines of code:

(1) 'response_format' => 'b64_json',

Whereas previously, we requested a URL, this time we request base-64 encoding.

(2) echo '<img src="data:image/jpeg;base64,' . $data->b64_json. '" />';

When rendering the image, we use base64 rendering instead of an image URL.

Point your browser to http://localhost:8888/image-b64.php. This is what I experienced:

Conclusion

There are many services that you can consume from OpenAI, like chat completion, text completion, embeddings, etc. Now that you know how things work, go ahead and try some of the other services.

Saturday, January 13, 2024

Generate Images with Azure OpenAI Dall-E 3, Semantic Kernel, and C#

It is very easy to generate images using the OpenAI Dall-E 3 service and Semantic Kernel. You provide the text describing what you want and OpenAI will generate for you the image. In this tutorial, we will use Semantic Kernel and Azure OpenAI to do exactly that.

Source Code: https://github.com/medhatelmasry/DalleImage.git

Companion Video: https://youtu.be/Dr727OhX4HU

What is Semantic Kernel?

This is the official definition obtained from Create AI agents with Semantic Kernel | Microsoft Learn:

Semantic Kernel is an open-source SDK that lets you easily build agents that can call your existing code. As a highly extensible SDK, you can use Semantic Kernel with models from OpenAI, Azure OpenAI, Hugging Face, and more! 

Getting Started

In a suitable directory, create a console application named DalleImage and add to it two packages needed for our application with the following terminal window commands:

dotnet new console -o DalleImage
cd DalleImage
dotnet add package Microsoft.SemanticKernel
dotnet add package System.Configuration.ConfigurationManager

Create a file named App.config in the root folder of the console application and add to it the important parameters that allow access to the Azure OpenAI service. The contents of App.config should look like the following:

<?xml version="1.0"?>
<configuration>
    <appSettings>
        <add key="endpoint" value="https://fake.openai.azure.com/" />
        <add key="api-key" value="fakekey-fakekey-fakekey-fakekey" />
        <add key="gpt-deployment" value="gpt-35-turbo" />
        <add key="dalle-deployment" value="dall-e-3" />
    </appSettings>
</configuration>

NOTE: Since I cannot share the endpoint and api-key with you, I have fake values for these settings.

Currently, the Dall-E 3 model is in preview and only available in the "Sweden Central" Azure data centre according to https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#dall-e-models-preview

Let's Code

Open Program.cs and delete all its contents. Add the following using statements at the top:

using System.Configuration;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.TextToImage;

We need to read the App.config file settings into our application. We will use the ConfigurationManager from namespace System.Configuration. To read settings from App.config with ConfigurationManager, append the following code to Program.cs:

// Get configuration settings from App.config
string _endpoint = ConfigurationManager.AppSettings["endpoint"]!;
string _apiKey = ConfigurationManager.AppSettings["api-key"]!;
string _dalleDeployment = ConfigurationManager.AppSettings["dalle-deployment"]!;
string _gptDeployment = ConfigurationManager.AppSettings["gpt-deployment"]!;

Currently, we need to suppress the compiler warnings raised by experimental Semantic Kernel APIs. Add the following into the .csproj file inside the <PropertyGroup> block:

<NoWarn>SKEXP0001, SKEXP0002, SKEXP0011, SKEXP0012</NoWarn>
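For context, the project file would then look something like the sketch below; your TargetFramework and the ItemGroup holding the package references may differ on your machine:

```xml
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
    <!-- suppress warnings for experimental Semantic Kernel APIs -->
    <NoWarn>SKEXP0001, SKEXP0002, SKEXP0011, SKEXP0012</NoWarn>
  </PropertyGroup>
  <!-- package references added by the 'dotnet add package' commands appear here -->
</Project>
```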

Then, append this code to Program.cs:

// Create a kernel builder
var builder = Kernel.CreateBuilder(); 
 
// Add OpenAI services to the kernel
builder.AddAzureOpenAITextToImage(_dalleDeployment, _endpoint, _apiKey);
builder.AddAzureOpenAIChatCompletion(_gptDeployment, _endpoint, _apiKey); 
 
// Build the kernel
var kernel = builder.Build();

We created a builder object from Semantic Kernel, added the AddAzureOpenAITextToImage and AddAzureOpenAIChatCompletion services, then obtained an instance of the kernel object.

Get an instance of the "Dall-E" service from the kernel with the following code:

// Get AI service instance used to generate images
var dallE = kernel.GetRequiredService<ITextToImageService>();

Let us create a prompt that generates an image representing a phrase entered by the user. Append this code to Program.cs:

// define a prompt template; {{$input}} is replaced with the user's phrase
var prompt = @"
Think about an artificial object that represents {{$input}}.";

We then configure the prompt execution settings with:

var executionSettings = new OpenAIPromptExecutionSettings {
    MaxTokens = 256,
    Temperature = 1
};

Temperature is a measure of how creative you want the AI to be. This ranges from 0 to 1, where 0 is least creative and 1 is most creative.

We will create a semantic function from our prompt with:

// create a semantic function from the prompt
var genImgFunction = kernel.CreateFunctionFromPrompt(prompt, executionSettings);

Let us ask the user for input with:

// Get a phrase from the user
Console.WriteLine("Enter a phrase to generate an image from: ");
string? phrase = Console.ReadLine();
if (string.IsNullOrEmpty(phrase)) {
    Console.WriteLine("No phrase entered.");
    return;
}

Next, ask the kernel to combine the prompt with the input received from the user, producing an image description.

// Invoke the semantic function to generate an image description
var imageDescResult = await kernel.InvokeAsync(genImgFunction, new() { ["input"] = phrase });
var imageDesc = imageDescResult.ToString();

Finally, ask the Dall-E service to do the important work of generating an image based on the description. It returns an image URL. This is done with the following code:

// Use DALL-E 3 to generate an image. 
// In this case, OpenAI returns a URL (though you can ask to return a base64 image)
var imageUrl = await dallE.GenerateImageAsync(imageDesc.Trim(), 1024, 1024);

Let’s print the output URL so that the user can pop it into a browser to see what it looks like:

Console.WriteLine($"Image URL:\n\n{imageUrl}");
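Generated image URLs are typically short-lived, so you may want to save the image to disk. Here is a minimal sketch; the URL below is a hypothetical placeholder standing in for the imageUrl variable above:

```csharp
// Derive a local file name from the image URL.
string imageUrl = "https://example.com/generated/image.png"; // hypothetical placeholder
string fileName = Path.GetFileName(new Uri(imageUrl).LocalPath);
Console.WriteLine($"Would save as: {fileName}");

// To actually download the image before the URL expires (requires a live URL):
// using var http = new HttpClient();
// await File.WriteAllBytesAsync(fileName, await http.GetByteArrayAsync(imageUrl));
```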

Running App

Let’s try it out. Run the app in a terminal window with:

dotnet run

The user is prompted with “Enter a phrase to generate an image from:”. I entered “a lobster flying over the pyramids in giza”, and received this output:


I find it pretty fascinating how OpenAI can generate images based on text-based descriptions. I hope you do too.