How to Build a Text-to-Voice Application With JavaScript
This tutorial covers how to convert text into speech in JavaScript using the Web Speech API. We'll build a simple interface where the user enters the text to be spoken, then clicks a button to generate the corresponding speech.
Our Text-to-Speech Demo
Here’s what we’re going to build. Type anything you want in the textarea, select the language you’ve written it in, and click the button to hear the result!
HTML Structure
Okay, let’s start building. The HTML structure will consist of the following elements:
- A <textarea> for the text to be converted.
- A <select> element, which we will populate with language options.
- A generate <button> which, when clicked, will speak the text content provided.
To keep us focused on functionality, we’ll use Bootstrap to build the interface. Make sure you add the Bootstrap CDN link inside the <head> of your page like this:

    <link
      href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/css/bootstrap.min.css"
      rel="stylesheet"
      integrity="sha384-QWTKZyjpPEjISv5WaRU9OFeRpok6YctnYmDr5pNlyT2bRjXh0JMhjY6hW+ALEwIH"
      crossorigin="anonymous"
    />

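Add the HTML structure. The class names message, content, select-voices, and convert are the ones our CSS and JavaScript will look for; the Bootstrap classes and label text shown here are one reasonable choice rather than a requirement.

    <!-- Sketch: Bootstrap classes and label text are assumptions;
         message, content, select-voices and convert are used by the CSS/JS -->
    <div class="container">
      <div class="message alert alert-warning" role="alert"></div>
      <h1>Text to Voice Converter</h1>
      <form>
        <div class="form-group">
          <label for="text">Enter your text:</label>
          <textarea id="text" name="text" class="content form-control" rows="5"></textarea>
        </div>
        <div class="form-group">
          <label for="voices">Choose your language:</label>
          <select id="voices" name="voices" class="select-voices form-select"></select>
        </div>
        <button type="button" class="convert btn btn-primary">Generate</button>
      </form>
    </div>
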
Additional Styling with CSS
Bootstrap handles pretty much all the styling for us. But let’s add some custom CSS properties to our design. These will give us a custom font, a container, some extra spacing for the elements in the form, and a rule to hide our alert message.

    @import url("https://fonts.googleapis.com/css2?family=DM+Mono:ital,wght@0,300;0,400;0,500;1,300;1,400;1,500&display=swap");

    body {
      font-family: "DM Mono", monospace;
    }

    .container {
      width: 100%;
      max-width: 600px;
      padding: 2rem 0;
    }

    .form-group {
      margin: 2rem 0;
    }

    label {
      margin-bottom: 1rem;
    }

    .message {
      display: none;
    }

We set display: none on the alert component so that it only appears when there is an error message to display.
JavaScript Functionality
The Web Speech API exposes the voices available in the browser through the speechSynthesis.getVoices() method. For this demo, let’s start by storing the voices we want to offer in an array like this:

    const voices = [
      { name: "Google Deutsch", lang: "de-DE" },
      { name: "Google US English", lang: "en-US" },
      { name: "Google UK English Female", lang: "en-GB" },
      { name: "Google UK English Male", lang: "en-GB" },
      { name: "Google español", lang: "es-ES" },
      { name: "Google español de Estados Unidos", lang: "es-US" },
      { name: "Google français", lang: "fr-FR" },
      { name: "Google हिन्दी", lang: "hi-IN" },
      { name: "Google Bahasa Indonesia", lang: "id-ID" },
      { name: "Google italiano", lang: "it-IT" },
      { name: "Google 日本語", lang: "ja-JP" },
      { name: "Google 한국의", lang: "ko-KR" },
      { name: "Google Nederlands", lang: "nl-NL" },
      { name: "Google polski", lang: "pl-PL" },
      { name: "Google português do Brasil", lang: "pt-BR" },
      { name: "Google русский", lang: "ru-RU" },
      { name: "Google 普通话(中国大陆)", lang: "zh-CN" },
      { name: "Google 粤語(香港)", lang: "zh-HK" },
      { name: "Google 國語(臺灣)", lang: "zh-TW" }
    ];

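The array above is written out by hand. As a rough sketch of the alternative, the list could be derived at runtime from whatever speechSynthesis.getVoices() returns. The helper name listVoices and its synth parameter are hypothetical (passing the synthesizer in makes the function easy to try outside a browser), and the voices reported will differ between browsers and operating systems.

```javascript
// Hypothetical helper: reduce the browser's voice objects to the
// { name, lang } shape used by the hard-coded voices array.
// `synth` is expected to be window.speechSynthesis, or any object with
// a compatible getVoices() method.
function listVoices(synth) {
  return synth.getVoices().map((voice) => ({
    name: voice.name,
    lang: voice.lang,
  }));
}

// In a browser: const voices = listVoices(window.speechSynthesis);
```

Keeping only name and lang means the rest of the code in this tutorial can stay the same whichever way the array is produced.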
Identify the Required Elements
Next, use the Document Object Model (DOM) to obtain the alert, select, and button elements.

    const optionsContainer = document.querySelector(".select-voices");
    const convertBtn = document.querySelector(".convert");
    const messageContainer = document.querySelector(".message");

Create Voices Selection
The optionsContainer represents the <select> element for the drop-down list of voices from which the user will select an option.
We want to populate it with the voices from the voices array. Create a function called addVoices().

    function addVoices() {
      // populate options with the voices from array
    }

Inside the function, use the forEach() method to loop through the voices array. For each voice object, create an <option> element, set option.value = voice.lang and option.textContent = voice.name, then append the option to the select element. The en-US voice is selected by default.

    function addVoices() {
      // Start from an empty list so repeated calls don't duplicate options
      optionsContainer.innerHTML = "";

      voices.forEach((voice) => {
        const option = document.createElement("option");
        option.value = voice.lang;
        option.textContent = voice.name;
        optionsContainer.appendChild(option);

        // Preselect US English
        if (voice.lang === "en-US") {
          option.selected = true;
        }
      });
    }

We need to invoke the addVoices() function for the options to appear. Chrome, however, loads its voices asynchronously, so there we listen for the voiceschanged event and call addVoices() from its handler. So we’ll add a condition:

    if (navigator.userAgent.indexOf("Chrome") !== -1) {
      speechSynthesis.addEventListener("voiceschanged", addVoices);
    } else {
      addVoices();
    }

The voiceschanged event is fired on speechSynthesis when the list of available speech synthesis voices changes, which includes the moment the list first becomes ready to use.
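User-agent checks can be brittle, so here is a rough sketch of an alternative that looks at behaviour instead: ask for the voices straight away, and only wait for voiceschanged if the list comes back empty. The function name whenVoicesReady and the synth parameter are hypothetical and not part of the Web Speech API.

```javascript
// Hypothetical helper: run `callback` with the voice list as soon as it
// is available. `synth` is expected to be window.speechSynthesis, or an
// object exposing getVoices() and addEventListener().
function whenVoicesReady(synth, callback) {
  const voices = synth.getVoices();
  if (voices.length > 0) {
    // Voices were already loaded, so there is nothing to wait for
    callback(voices);
    return;
  }
  // Otherwise wait for the first voiceschanged event
  synth.addEventListener(
    "voiceschanged",
    () => callback(synth.getVoices()),
    { once: true }
  );
}

// In a browser: whenVoicesReady(window.speechSynthesis, addVoices);
```

If a browser returns an empty list and never fires the event, the callback will not run, so a real application might want a timeout or fallback on top of this.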
Button Event Listener
Add a click event listener to the generate button.

    convertBtn.addEventListener("click", function () {
      // display an alert message if content is empty
      // pass the arguments to convertToSpeech()
    });

Inside the event listener function, we want to get the text from the textarea, display an alert if no content is provided, get the selected language, and pass the values to the convertToSpeech() function.
Update the event listener as follows.

    convertBtn.addEventListener("click", function () {
      // Read the text the user typed into the textarea
      const convertText = document.querySelector(".content").value;

      // Show the alert for two seconds if there is nothing to speak
      if (convertText === "") {
        messageContainer.textContent = "Please provide some text";
        messageContainer.style.display = "block";

        setTimeout(() => {
          messageContainer.textContent = "";
          messageContainer.style.display = "none";
        }, 2000);

        return;
      }

      // The value of the selected option is the voice's language code
      const selectedLang =
        optionsContainer.options[optionsContainer.selectedIndex].value;

      convertToSpeech(convertText, selectedLang);
    });

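The check above compares the text with an empty string, so input made up only of spaces or line breaks would still be passed on. As a small sketch of one way to tighten this, the hypothetical helper below (normalizeInput is a name introduced here for illustration, not part of the tutorial's code) trims the text first and reports whether anything is left.

```javascript
// Hypothetical helper: trim the raw textarea value and report whether
// there is anything left to speak.
function normalizeInput(rawText) {
  const text = rawText.trim();
  return { text, isEmpty: text.length === 0 };
}

// Possible use inside the click handler:
// const { text, isEmpty } = normalizeInput(document.querySelector(".content").value);
// if (isEmpty) { /* show the alert and return */ }
```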
Create the convertToSpeech() function and add the code below.

    function convertToSpeech(text, lang) {
      // Bail out with a message if the browser lacks speech synthesis
      if (!("speechSynthesis" in window)) {
        messageContainer.textContent =
          "Your browser is not supported, try another browser";
        messageContainer.style.display = "block";
        return;
      }

      // Describe what to say and in which language
      const utterance = new SpeechSynthesisUtterance();
      utterance.lang = lang;
      utterance.text = text;

      speechSynthesis.speak(utterance);
    }

The convertToSpeech() function takes two parameters: the text to be converted and the language the text should be spoken in.
Let’s break it down:
- First, we check if the browser supports speech synthesis; if it doesn’t, we display the message “Your browser is not supported, try another browser”.
- If speech synthesis is supported, we create a new SpeechSynthesisUtterance instance and assign it to the variable utterance.
- Then we apply the text to the speech request with utterance.text and the language with utterance.lang.
- Finally, the browser speaks the text using speechSynthesis.speak(utterance).
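Because only the language code is passed along, the two en-GB entries in the voices array (Google UK English Female and Google UK English Male) lead to the same request, and the browser decides which voice to use. As a hedged sketch, a specific voice could instead be looked up by name and assigned to utterance.voice. The helper pickVoice is hypothetical; it assumes it receives the array returned by speechSynthesis.getVoices().

```javascript
// Hypothetical helper: prefer a voice with the exact name, fall back to
// the first voice for the language, and return null if nothing matches.
function pickVoice(availableVoices, name, lang) {
  const byName = availableVoices.find((voice) => voice.name === name);
  if (byName) {
    return byName;
  }
  const byLang = availableVoices.find((voice) => voice.lang === lang);
  return byLang || null;
}

// Possible use in a browser, before calling speechSynthesis.speak():
// const voice = pickVoice(speechSynthesis.getVoices(), "Google UK English Male", "en-GB");
// if (voice) utterance.voice = voice;
```

Leaving utterance.voice unset keeps the current behaviour, where the voice is chosen from utterance.lang.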
Conclusion
I hope you enjoyed this tutorial and learned something useful! We covered everything you need to create text-to-voice apps using the Web Speech API. Adding text-to-voice functionality to your application helps cater to diverse user needs and improves its overall accessibility.
Let’s remind ourselves what we created: