Create Your Own AI Image Generator App With JavaScript and DALL-E 3
DALL-E 3 is an image generation model that excels at generating images from text prompts. It can understand and interpret complex textual descriptions and translate them into visual representations. The images generated are also high-resolution and diverse in style.
By the end of this tutorial, we will have something like this:
HTML Structure
The HTML structure will consist of the following elements:
- A small button at the top right, which, when clicked, will allow the user to add their API KEY to local storage.
- A text input where users will enter their prompt
- A button that, when clicked, will take the prompt from the user and call the DALL-E API to generate the image
Install Bootstrap
We will be using Bootstrap to build the interface. Bootstrap is a framework that allows developers to build responsive sites in a short amount of time. Either link to the relevant CSS and JS files in the head of your HTML document, or (if you’re using CodePen) you’ll find the Bootstrap dependencies under the CSS and JS settings tabs.
API Message
First, start by displaying a message that the app requires an API key and then show a link where users can get the API KEY. Here’s the markup:
Next, add the ADD API KEY button:
<button
  id="api"
  type="button"
  class="btn btn-info"
  data-bs-toggle="modal"
  data-bs-target="#KeyModal"
>
  ADD API KEY
</button>
The button uses absolute positioning to ensure it stays at the top right, and its data-bs-target attribute is set to "#KeyModal".
This attribute links the button to the element with the ID KeyModal.
KeyModal Button
When clicked, the button will trigger the modal to open. Bootstrap uses data-bs-target to reference an element by its ID, so when the button is clicked, it will look for the element with the id of KeyModal and perform the specified actions.
Let’s add the modal below the button.
[Modal markup partially lost in extraction. The surviving fragments, in order:]
Your API Key remains stored locally in your browser
<input type="text" class="form-control" id="apikey" />
<button type="button" class="btn btn-secondary" data-bs-dismiss="modal">Close</button>
The modal contains the following elements:
- A modal dialog, which ensures the modal is centered on the page.
- The modal content, which contains a text input for entering the API key, a button for saving the key, and a close button that dismisses the modal.
Our App’s Main Section
Now let’s start building the main section of the application. The main section will consist of the following elements:
- TextInput: This input field will take in the user’s prompt. The prompt will describe the image they want to generate, for example, “A cat chasing a mouse”.
- Button: This button will initiate the image generation process when clicked.
- Gallery: A display of sample images previously generated by DALL-E to showcase its capabilities.
Create a Bootstrap container which will house the elements:
Let’s start by adding a header at the top of the page with the title “AI Image Generator” and a description of the application:
[Header markup lost in extraction. The surviving description text:]
Bring your vision to life with Generative AI. Simply describe what you want to see!
The Form
Next, add a form that will contain the text input and the generate button.
[Form markup partially lost in extraction. The surviving fragments, in order:]
<input
  type="text"
  class="form-control py-2 pb-2"
  id="prompt"
  placeholder="A cartoon of a cat catching a mouse"
/>
Generate Image (the button label)
The layout ensures that the input text spans 3/4 of the entire space to provide enough space for the prompt, and the button is positioned at the right to occupy the remaining space.
Spinner
Next, add a spinner that will be shown while an image is being generated.
[Spinner markup partially lost in extraction. The surviving fragment is an element with class="visually-hidden" containing the text "Loading...".]
Image Gallery
The last section will contain a few images generated by the DALL-E model.
We will use JavaScript to display the images dynamically. The gallery container will also be used to display the image generated from a prompt.
Styling With CSS
Besides the Bootstrap framework, we will also add a few custom CSS classes:
@import url("https://fonts.googleapis.com/css2?family=DM+Mono:ital,wght@0,300;0,400;0,500;1,300;1,400;1,500&display=swap");

body {
  font-family: "DM Mono", monospace;
}
h1 {
  font-weight: 900;
}
p {
  font-weight: 500;
}
.message,
#spinner {
  display: none;
}
Here, we are using a custom font from Google Fonts, and we have also set the message element and the spinner to be hidden by default.
JavaScript Functionality
On to the behaviour! The first thing we want to do is to add the functionality for enabling users to add their API key to local storage. We will use jQuery to open and close the modal.
We already have data-bs-target="#KeyModal" on the ADD API KEY button, which opens the modal. Now, we will listen for the shown.bs.modal event. shown.bs.modal is a Bootstrap modal event that is triggered after the modal has been shown to the user.
$("#KeyModal").on("shown.bs.modal", function () {
  // get api key from user and save to local storage
});
Inside the event listener function, we will get the modal components, which include a text input and a button.
$("#KeyModal").on("shown.bs.modal", function () {
  const saveButton = document.querySelector("#KeyModal .btn-primary");
  const apiKeyInput = document.querySelector("#apikey");
});
Save Button Event Listener
Next, we will add an event listener to the save button of the modal. Inside the event listener function, we will get the value of the API KEY, save it to local storage, and then close the modal.
$("#KeyModal").on("shown.bs.modal", function () {
  const saveButton = document.querySelector("#KeyModal .btn-primary");
  const apiKeyInput = document.querySelector("#apikey");

  saveButton.addEventListener("click", function () {
    // read the key, persist it, then close the modal
    const apiKeyValue = apiKeyInput.value;
    localStorage.setItem("API_KEY", apiKeyValue);
    $("#KeyModal").modal("hide");
  });
});
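The save handler stores whatever is in the input, including an empty string or stray whitespace from a paste. Here is a minimal sketch of a stricter save step. The helper name saveApiKey and the in-memory storage stand-in are assumptions for illustration, not part of the original code; the only thing borrowed from the tutorial is the "API_KEY" storage name.

```javascript
// Hypothetical helper: trims the key and refuses to store an empty value.
// `storage` is anything with setItem/getItem, for example window.localStorage.
function saveApiKey(storage, rawValue) {
  const key = String(rawValue ?? "").trim();
  if (key === "") {
    return false; // nothing worth saving
  }
  storage.setItem("API_KEY", key);
  return true;
}

// In-memory stand-in for localStorage so the sketch also runs outside a browser.
function createMemoryStorage() {
  const items = new Map();
  return {
    setItem: (name, value) => items.set(name, String(value)),
    getItem: (name) => (items.has(name) ? items.get(name) : null),
  };
}

const demoStorage = createMemoryStorage();
console.log(saveApiKey(demoStorage, "  sk-example  ")); // true
console.log(demoStorage.getItem("API_KEY")); // sk-example
console.log(saveApiKey(demoStorage, "   ")); // false
```

In the click handler, a call such as saveApiKey(localStorage, apiKeyInput.value) could stand in for the direct setItem call, with the modal closed only when it returns true.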
DALL-E 3
OpenAI provides two models for text-to-image generation, DALL·E 3 and DALL·E 2. We are going to use DALL-E 3, the newer of the two.
DALL-E 3 is a state-of-the-art text-to-image generator which adheres closely to the text provided when generating images.
While you don’t have to be an expert in prompt engineering to use DALL-E 3, better prompts will generate better results.
Get API KEY
To obtain an API key, you need an OpenAI account. Go to the OpenAI website and create an account. Once you log in, you will see this page.
On the top left side, click on the API keys icon, and you will be redirected to a page where you can create your API KEY.
Once you create your API KEY, make sure you copy it, since it won’t be shown again.
How to use DALL-E 3
The DALL·E 3 model allows developers to generate images from text using this API endpoint.
https://api.openai.com/v1/images/generations
The API endpoint allows you to create standard and HD-quality images. If the quality is not set, standard images will be generated by default. The supported image sizes are 1024×1024, 1024×1792, and 1792×1024 pixels.
DALL-E 3 generates one image per request (n must be 1); the older DALL-E 2 model accepts up to 10. If you want more than one image from DALL-E 3, you can make parallel requests. This is how you would generate a standard image of size 1024×1024 from the prompt “a red cat”:
curl https://api.openai.com/v1/images/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "dall-e-3",
    "prompt": "a red cat",
    "n": 1,
    "size": "1024x1024"
  }'
As you can see above, the API endpoint requires you to include the following headers in your request:
- Content-Type, set to application/json
- Authorization, set to Bearer followed by your OpenAI API key
The data sent in the request will include:
- model: the model to use for generating an image
- prompt: the text description of the image you want generated
- n: an integer that specifies the number of images to generate
- size: the size of the image in pixels
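The headers and body fields listed above can be assembled in one place before calling fetch. The following is a sketch only: buildImageRequest and DALLE3_SIZES are hypothetical names, and the size check simply mirrors the three sizes this tutorial lists for DALL-E 3.

```javascript
// Sizes listed in this tutorial for DALL-E 3.
const DALLE3_SIZES = ["1024x1024", "1024x1792", "1792x1024"];

// Hypothetical helper: builds the options object passed as fetch(url, options).
function buildImageRequest(prompt, apiKey, size = "1024x1024") {
  if (!DALLE3_SIZES.includes(size)) {
    throw new RangeError(`Unsupported size: ${size}`);
  }
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    // n is fixed at 1 because DALL-E 3 returns one image per request
    body: JSON.stringify({ model: "dall-e-3", prompt, n: 1, size }),
  };
}

const request = buildImageRequest("a red cat", "sk-example");
console.log(request.headers.Authorization); // Bearer sk-example
console.log(JSON.parse(request.body).size); // 1024x1024
```

Rejecting an unsupported size before the request is sent gives the user an immediate, local error instead of a round trip to the API.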
Image Generation
The next step is to generate an image from the prompt provided by the user. To do that, we will add an event listener to the generate form. When the form is submitted, it will retrieve the prompt from the user, obtain the API key from local storage, and call another function (fetchImage()), which will in turn generate an image.
But first, let’s get the necessary elements from the DOM:
const message = document.getElementById("message");
const generateForm = document.getElementById("generate-form");
const spinner = document.getElementById("spinner");
Next, let’s add an event listener that listens for the form’s submit event.
generateForm.addEventListener("submit", function (e) {
  e.preventDefault();
  // get prompt
  // get api key
  // perform validation
  // call fetchImage() function
});
Inside the event listener function, update the code as follows:
generateForm.addEventListener("submit", function (e) {
  e.preventDefault();
  const promptInput = document.getElementById("prompt");
  const prompt = promptInput.value;
  const key = localStorage.getItem("API_KEY");

  if (!prompt) {
    displayMessage("Please enter a prompt");
    return;
  }
  if (!key) {
    displayMessage(
      "Please add your API KEY. The key will be stored locally in your browser"
    );
    return;
  } else {
    fetchImage(prompt, key);
  }
});
In the updated code, after the submit event is fired by the form, we get the prompt from the user and the API key from local storage. If the user has not provided a prompt, we display a message asking the user to enter one.
Similarly, if the API key is missing, we prompt the user to add their API key. If both the prompt and API key are present, we call the fetchImage function and pass the prompt and the API KEY values as arguments.
fetchImage() is the function that will use the DALL-E 3 API endpoint to generate an image based on the user’s prompt.
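The two checks can also be pulled out of the handler into a small function that is easy to exercise on its own. This is a sketch under assumptions: validateSubmission is a hypothetical name, and unlike the handler it also rejects a prompt made up only of whitespace.

```javascript
// Hypothetical helper: returns an error message, or null when both values are usable.
function validateSubmission(prompt, key) {
  if (!prompt || prompt.trim() === "") {
    return "Please enter a prompt";
  }
  if (!key) {
    // localStorage.getItem returns null when nothing has been saved
    return "Please add your API KEY. The key will be stored locally in your browser";
  }
  return null;
}

console.log(validateSubmission("", "sk-example")); // Please enter a prompt
console.log(validateSubmission("a red cat", null) !== null); // true
console.log(validateSubmission("a red cat", "sk-example")); // null
```

The submit handler could then show the returned message when it is not null and call fetchImage(prompt, key) otherwise.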
The displayMessage() function looks like this:
function displayMessage(msg) {
  message.textContent = msg;
  message.style.display = "block";
  // hide the message again after 3 seconds
  setTimeout(function () {
    message.style.display = "none";
  }, 3000);
}
We set the content of the message element to the text passed in by the caller. The setTimeout function ensures that the message element will be hidden after 3 seconds.
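One edge case worth noting: if displayMessage() is called twice within 3 seconds, the timer from the first call can hide the second message early. Below is a sketch of one way to handle that by cancelling the pending timer. The factory name createMessageDisplay and its schedule/cancel parameters are assumptions added for illustration; they default to the standard setTimeout and clearTimeout.

```javascript
// Hypothetical factory: returns a displayMessage function that restarts
// the hide timer on every call instead of stacking timers.
function createMessageDisplay(
  element,
  delayMs = 3000,
  schedule = (fn, ms) => setTimeout(fn, ms),
  cancel = (id) => clearTimeout(id)
) {
  let pendingTimer = null;
  return function displayMessage(msg) {
    if (pendingTimer !== null) {
      cancel(pendingTimer); // drop the earlier hide so it cannot fire early
    }
    element.textContent = msg;
    element.style.display = "block";
    pendingTimer = schedule(() => {
      element.style.display = "none";
      pendingTimer = null;
    }, delayMs);
  };
}

// Stand-in for the message element so the sketch runs outside a browser.
const fakeElement = { textContent: "", style: { display: "none" } };
const show = createMessageDisplay(fakeElement, 10);
show("Please enter a prompt");
console.log(fakeElement.style.display); // block
```

In the app, the element passed in would be the one returned by document.getElementById("message").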
fetchImage Function
Next, let’s create the fetchImage function, which will be an async function. It will take the prompt and API_KEY as parameters.
const fetchImage = async (prompt, API_KEY) => {

};
Inside the function, we define the API endpoint and store the headers and data required by the API in a variable called options.
The options object includes:
- The HTTP method.
- Headers for content type and authorization.
- The body: a JSON string containing the model, prompt, n (number of images), and image size.
const url = "https://api.openai.com/v1/images/generations";
const options = {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${API_KEY}`,
  },
  body: JSON.stringify({
    model: "dall-e-3",
    prompt: prompt,
    n: 1,
    size: "1024x1024",
  }),
};
Next, inside a try block, we perform a POST request using the fetch API, specifying the url and the options object. We display the spinner just before the request is sent so that it is visible while the fetch is in progress.
We then check the response, and if it’s not successful (!response.ok), we display an error message to the user and exit the function to prevent further execution.
const fetchImage = async (prompt, API_KEY) => {
  const url = "https://api.openai.com/v1/images/generations";
  const options = {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${API_KEY}`,
    },
    body: JSON.stringify({
      model: "dall-e-3",
      prompt: prompt,
      n: 1,
      size: "1024x1024",
    }),
  };

  try {
    // show the spinner while the request is in flight
    spinner.style.display = "block";
    const response = await fetch(url, options);

    if (!response.ok) {
      const error = await response.json();
      const message = error.error.message ? error.error.message : "Failed to fetch image";
      displayMessage(message);
      return;
    }

  } catch (error) {

  } finally {

  }
};
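The error branch reads error.error.message, which throws a TypeError if the response body has no error property. Here is a defensive sketch. getErrorMessage is a hypothetical helper; it assumes only the { error: { message } } body shape that the code above already relies on, and the sample message is made up for the demonstration.

```javascript
// Hypothetical helper: pulls a readable message out of an error body,
// falling back to a generic one when the expected fields are missing.
function getErrorMessage(errorBody, fallback = "Failed to fetch image") {
  const message = errorBody?.error?.message;
  return typeof message === "string" && message !== "" ? message : fallback;
}

// Sample body with an illustrative message (not a real API response).
console.log(getErrorMessage({ error: { message: "Sample error text" } })); // Sample error text
console.log(getErrorMessage({})); // Failed to fetch image
console.log(getErrorMessage(null)); // Failed to fetch image
```

Inside the !response.ok branch, displayMessage(getErrorMessage(error)) would replace the ternary expression.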
If the response is successful, we will asynchronously obtain the JSON data from the response object and store it in a variable called result.
const result = await response.json();
For example, the prompt “a blue cat” returns this object (the URL has been truncated):
{
  "created": 1713625375,
  "data": [
    {
      "revised_prompt": "Imagine a cat with the most unique color you can think of - a brilliant shade of dark cerulean. This is no ordinary cat. Picture this feline lounging in the midday sun, its fur shimmering in the light. The color is an almost surreal hue, rich and saturated, as if pulled straight from a painter's palette. The cat's eyes are a contrasting emerald green, watching the world with a wise but relaxed gaze. Imagine the blue cat's body shape, muscular and agile, made for speedy pursuits and stealthy approaches. Now, consider how this splendid creature would look in its natural habitat.",
      "url": "https://oaidalleapiprodscus.blob.core.windows.net/private/org-..."
    }
  ]
}
The data also includes a revised_prompt, the expanded prompt that DALL-E 3 actually used to generate the image. From the object received, we can get the url of the image and pass it to another function, displayImage(), which will display it to the user on the web page.
const imageUrl = result.data[0].url;
displayImage(imageUrl);
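Reading result.data[0].url directly assumes the response always has that shape. Here is a sketch of a more cautious lookup. getFirstImageUrl is a hypothetical helper, and the sample object uses a placeholder URL rather than a real API response.

```javascript
// Hypothetical helper: returns the first image URL, or null if the
// response does not have the { data: [{ url }] } shape.
function getFirstImageUrl(result) {
  const url = result?.data?.[0]?.url;
  return typeof url === "string" ? url : null;
}

// Placeholder data for illustration only.
const sample = {
  created: 1713625375,
  data: [{ revised_prompt: "a blue cat", url: "https://example.com/cat.png" }],
};
console.log(getFirstImageUrl(sample)); // https://example.com/cat.png
console.log(getFirstImageUrl({ data: [] })); // null
```

When the helper returns null, the app could call displayMessage() instead of passing an undefined URL to displayImage().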
In the catch block, we handle any exceptions that might occur during the fetch operation by displaying an appropriate error message to the user.
The finally block will be executed regardless of the outcome of the fetch request; therefore, it’s a good place to ensure the spinner is hidden whether or not the request succeeds.
catch (error) {
  console.error(error);
  displayMessage("There was an error, try again");
} finally {
  spinner.style.display = "none";
}
displayImage Function
The displayImage() function will look like this:
function displayImage(image) {
  // (wrapping Bootstrap column markup not shown)
  const imageMarkup = `
    <img src="${image}" class="img-fluid" alt="Placeholder Image">
  `;

  imageGallery.innerHTML = imageMarkup;
  spinner.style.display = "none";
}
Let’s break it down. First, we create HTML markup to specify a responsive Bootstrap column and set the src attribute of the img tag to the generated image URL. Then, we inject this markup into the imageGallery container.
The final step is to display some of the images generated by DALL-E 3 as a gallery so that when the users first open the app, the images will showcase the app’s capabilities.
First let’s store the images in an array:
const images = [
  "https://essykings.github.io/JavaScript/image%207.png",
  "https://essykings.github.io/JavaScript/image1.png",
  "https://essykings.github.io/JavaScript/image2.png",
  "https://essykings.github.io/JavaScript/image3.png",
  "https://essykings.github.io/JavaScript/image9.png",
  "https://essykings.github.io/JavaScript/image5.png",
  "https://essykings.github.io/JavaScript/image6.png",
  "https://essykings.github.io/JavaScript/cat.png",
];
Next, we will use the map() method to iterate over the images. For each image, we will build an img element whose src attribute is the image URL, then join the resulting markup and inject it into the image gallery container.
Finally, we will invoke the displayImages() function.
function displayImages() {
  const imageMarkup = images
    .map((image) => {
      // (wrapping Bootstrap column markup not shown)
      return `
        <img src="${image}" class="img-fluid" alt="Placeholder Image">
      `;
    })
    .join("");

  imageGallery.innerHTML = imageMarkup;
}

displayImages();
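Because the markup is built by string interpolation and assigned to innerHTML, a URL containing a double quote could break out of the src attribute. The following sketch shows the same map-and-join approach with attribute escaping. escapeAttribute and buildGalleryMarkup are hypothetical helpers, the alt text is an assumption, and the wrapping column markup is left out.

```javascript
// Escape characters that would break out of a double-quoted HTML attribute.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: one <img> per URL, joined into a single string.
function buildGalleryMarkup(urls) {
  return urls
    .map((url) => `<img src="${escapeAttribute(url)}" class="img-fluid" alt="Generated image">`)
    .join("");
}

console.log(buildGalleryMarkup(["https://example.com/a.png"]));
// <img src="https://example.com/a.png" class="img-fluid" alt="Generated image">
```

The returned string can be assigned to imageGallery.innerHTML in the same way as the markup in displayImages().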
Final Demo
We’ve done it! Our app is fully functional!
Conclusion
This tutorial has covered how to build an image-generation app with AI. This app can be applied in various fields, such as education to create illustrations, gaming to create visuals, etc. I hope you enjoyed it!