Speechify.com is a service that provides text-to-speech (TTS) technology primarily geared towards educational and productivity purposes. I was looking for the highest-quality text-to-speech solution and came across this service. I really like that it can highlight the spoken text in real time. Just check it out:
Speechify.com’s combination of high-quality text-to-speech and real-time highlighting can be particularly beneficial for various use cases, such as improving accessibility for individuals with visual impairments, enhancing comprehension for language learners, or increasing productivity for users who prefer to listen to written content while multitasking.
There is also a very cool presentation video with the creator of this startup, Cliff Weitzman himself:
Very cool service, but in fact this function can be easily implemented with the Web Speech API built into Chrome and other modern browsers (the easiest way), or with cloud services such as Yandex TTS, Google Cloud Text-to-Speech, Amazon Polly, or Microsoft Azure Text-to-Speech. Moreover, there are a lot of open-source engines:
- eSpeak: eSpeak is a compact, open-source TTS engine that supports multiple languages and platforms. It’s designed to be lightweight and can be used in various applications, including embedded systems.
- Festival and Flite (Festival Lite): Festival is a general-purpose TTS system developed by the University of Edinburgh. It offers support for multiple languages and provides tools for building custom voices. Flite is a smaller and faster version of the Festival TTS system. It’s designed for resource-constrained environments and can be easily integrated into embedded systems and mobile applications.
- MaryTTS: MaryTTS is an open-source TTS platform that aims to provide high-quality, multilingual speech synthesis. It supports various languages and features customizable voice models.
- Mozilla TTS: Mozilla TTS is a deep learning-based TTS system developed by Mozilla. It uses neural network models to generate speech and offers high-quality, natural-sounding output.
- OpenTTS: OpenTTS is an open-source, community-maintained TTS server. Rather than being a single engine, it puts several open-source engines and voices (eSpeak, Festival, Flite, MaryTTS, and others) behind one HTTP API.
So, there are a lot of ready-made solutions.
And now comes the question:
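For the browser-native route mentioned above, here is a minimal sketch of speaking a string with the Web Speech API. The helper names `pickVoice` and `speak` are my own, not part of the API, and the fallback order (exact language, same base language, default voice) is just one reasonable assumption.

```javascript
// Pick a voice for a BCP 47 language tag such as "en-US".
// `voices` is expected to look like the array returned by
// speechSynthesis.getVoices(): objects with `lang`, `name`, `default`.
// (pickVoice is a hypothetical helper, not part of the Web Speech API.)
function pickVoice(voices, lang) {
  const wanted = lang.toLowerCase();
  const base = wanted.split("-")[0];
  // Some platforms report tags with an underscore ("en_US"), so normalize.
  const norm = (v) => v.lang.toLowerCase().replace("_", "-");
  return (
    voices.find((v) => norm(v) === wanted) ||              // exact match
    voices.find((v) => norm(v).split("-")[0] === base) ||  // same language
    voices.find((v) => v.default) ||                       // browser default
    voices[0] ||
    null
  );
}

// Browser-only: speak `text` with the best available voice for `lang`.
function speak(text, lang = "en-US") {
  if (typeof window === "undefined" || !("speechSynthesis" in window)) {
    throw new Error("Web Speech API (speechSynthesis) is not available");
  }
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.lang = lang;
  // getVoices() can return an empty list until the "voiceschanged" event
  // has fired; in that case the browser falls back to its default voice.
  const voice = pickVoice(window.speechSynthesis.getVoices(), lang);
  if (voice) utterance.voice = voice;
  window.speechSynthesis.speak(utterance);
  return utterance;
}
```

In a page, something like `speak("Hello from the browser")` inside a click handler should be enough to hear the result; browsers generally expect speech to start from a user gesture.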
How to make Text-to-Speech with Word Highlighting in any web app?
Well, it is quite easy. I sketched out a simple demo based on the Web Speech API.
And, by the way, such tasks can easily be solved with ChatGPT, Gemini, or another AI assistant. Just send a prompt like this:
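The core idea can be sketched in a few lines: the utterance fires a `boundary` event whose `charIndex` points into the original text, and the page re-renders the text with that word wrapped in a `<mark>`. This is a sketch under my own assumptions, not the exact code of the demo: `wordRangeAt`, `renderHighlighted`, and `speakWithHighlight` are hypothetical helper names, and words are simply split on whitespace.

```javascript
// Return the [start, end) range of the word at (or right after) charIndex.
function wordRangeAt(text, charIndex) {
  let start = Math.min(Math.max(charIndex, 0), text.length);
  // If the index landed on whitespace, move forward to the next word.
  while (start < text.length && /\s/.test(text[start])) start++;
  // If the index landed mid-word, walk back to the start of that word.
  while (start > 0 && !/\s/.test(text[start - 1])) start--;
  let end = start;
  while (end < text.length && !/\s/.test(text[end])) end++;
  return { start, end };
}

// Escape user text before putting it into innerHTML.
function escapeHtml(s) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return s.replace(/[&<>"']/g, (c) => map[c]);
}

// Build HTML for the whole text with the given range wrapped in <mark>.
function renderHighlighted(text, range) {
  return (
    escapeHtml(text.slice(0, range.start)) +
    "<mark>" + escapeHtml(text.slice(range.start, range.end)) + "</mark>" +
    escapeHtml(text.slice(range.end))
  );
}

// Browser-only: read `text` aloud and highlight the current word
// inside `container` (any DOM element).
function speakWithHighlight(text, container) {
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.onboundary = (event) => {
    if (event.name === "sentence") return; // only react to word boundaries
    container.innerHTML = renderHighlighted(text, wordRangeAt(text, event.charIndex));
  };
  // Remove the highlight when speech finishes or fails.
  utterance.onend = () => { container.textContent = text; };
  utterance.onerror = (event) => {
    container.textContent = text;
    console.error("Speech synthesis failed:", event.error);
  };
  window.speechSynthesis.cancel(); // stop anything still being spoken
  window.speechSynthesis.speak(utterance);
  return utterance;
}
```

One caveat: `boundary` events are not guaranteed for every voice, so with some voices the text may be read without any highlighting at all.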
Create a CodePen project that implements a simple Text-to-Speech (TTS) functionality with word highlighting. The application should have the following features:
- An input area where users can input the text they want to be read aloud.
- A button to initiate the TTS process.
- As the text is being read aloud, each word should be highlighted.
- The highlighting should move along with the spoken word.
- The application should use the Web Speech API for TTS functionality.
- Implement error handling for speech synthesis errors.
Tasks:
- Set up a new CodePen project.
- Create an HTML structure with an input area, a button, and a container for the highlighted text.
- Write JavaScript code to handle the TTS functionality.
- Use the Web Speech API to synthesize speech from the input text.
- Implement word highlighting as the text is being spoken.
- Handle errors that may occur during speech synthesis.
- Test the application to ensure proper functionality.
- Optionally, style the application to make it visually appealing.
- Ensure the code is properly documented and organized.
See the Pen TTS with Word Highlighting Demo by Nickolai Yegorov (@e-Nicko) on CodePen.
Pretty cool, right?
It really is that easy these days!
This is a basic example to get started. It is easy to enhance it further by adding more features, such as controlling the speed of the speech (the `rate` property), pausing and resuming speech, or handling more complex text formatting. Just like in Speechify, it would be cool to add a light highlight for the whole current sentence and a stronger highlight for the word being read. Additionally, you can style the highlighted text and its container with CSS to make it visually appealing.
Definitely a great find today!
I will surely use this functionality in my digital products and services, of course with better TTS voices and more advanced features.
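Here is a rough sketch of those enhancements: speech rate, pause and resume, and finding the current sentence so it can get a lighter highlight. `sentenceRangeAt` and `createSpeechControls` are hypothetical names of mine, and the sentence splitting is deliberately naive (a sentence ends at ".", "!" or "?" followed by whitespace or the end of the text, so abbreviations like "e.g." will confuse it).

```javascript
// Return the [start, end) range of the sentence containing charIndex.
// Naive rule: sentences end at . ! or ? followed by whitespace or end of text.
function sentenceRangeAt(text, charIndex) {
  const enders = /[.!?]+(\s+|$)/g;
  let start = 0;
  let match;
  while ((match = enders.exec(text)) !== null) {
    const next = match.index + match[0].length; // where the next sentence starts
    if (charIndex < next) {
      // Exclude the trailing whitespace from the highlighted range.
      return { start, end: next - match[1].length };
    }
    start = next;
  }
  // No terminator after charIndex: the rest of the text is the sentence.
  return { start, end: text.length };
}

// Browser-only: wrap one utterance in simple playback controls.
// `onWord(charIndex, sentenceRange)` is called on every word boundary,
// so the page can draw both the sentence and the word highlight.
function createSpeechControls(text, { rate = 1, onWord } = {}) {
  const synth = window.speechSynthesis;
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.rate = rate; // 1 is normal speed; the spec allows 0.1 to 10
  utterance.onboundary = (event) => {
    if (event.name === "sentence") return;
    if (onWord) onWord(event.charIndex, sentenceRangeAt(text, event.charIndex));
  };
  return {
    play: () => { synth.cancel(); synth.speak(utterance); },
    pause: () => synth.pause(),
    resume: () => synth.resume(),
    stop: () => synth.cancel(),
  };
}
```

With this, the sentence range can be wrapped in one element with a pale background and the current word in another with a stronger color, and buttons can be wired straight to `pause()` and `resume()`.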





