As I have covered previously, audiobook production and consumption continue to grow fiercely. I discuss the continually expanding voice options with my growing list of author clients who produce, publish and distribute books, ebooks and audiobooks through my company, TecnoTur. Those options vary depending upon personal taste and budget.
Some authors absolutely want to hire a professional human voice talent to record live and are willing to pay the associated rate. Other authors look for a middle ground: a professional human voice talent who has cloned her/his voice, which makes it available at a much lower cost. Yet other authors prefer to record the entire production with their own voice, while others want to clone their own voice for greater efficiency in their current and upcoming audiobook productions.
Whether the voice talent is a live human or an AI, it is often necessary to «direct» it. Indeed, it is possible to «direct» an AI voice talent to improve results. I recently did that when producing and directing the English-language audiobook for Why Fidel abandoned Che?, written by Alberto Müller and narrated by Alberto’s chosen AI voice, Archie. Ahead, I’ll share tools, techniques and examples for getting ideal pronunciation of proper names and terms, as well as better pacing, and I’ll explain where AI audiobooks can currently be distributed.
In this article
- Audiobook distribution using AI voices
- Advantages and disadvantages of Google AI voices
- Advantages and disadvantages of ElevenLabs AI voices
- Advantages and disadvantages of Descript AI voices
- A brief summary of seseo versus distinguished voices in Castilian
- Tools and techniques to direct an AI voice
- Conclusions
- Related articles
- Lee este artículo en buen castellano
Audiobook distribution using AI voices
Currently, the following platforms accept AI voices, although the sources from which they accept them vary:
- Amazon (in ßeta, by invitation, details ahead)
- Audible.com (in ßeta, by invitation, details ahead)
- Audiobooks.com
- Baker & Taylor
- Bibliotheca
- B&N audiobooks (Barnes & Noble)
- Everand
- Google (currently from Google AI voices only, as far as I know)
- Kobo (Rakuten Kobo)
- OverDrive
- Spotify
- TuneIn
Since I have several published books, Amazon has invited me to be a ßeta tester. However, the Castilian AI voices offered by Amazon are currently seseo-only, not distinguished, and cloning is not currently available.
Advantages and disadvantages of Google AI voices
Among the advantages of Google AI voices are:
- They can be distributed on all of the platforms mentioned above (except Amazon and Audible).
- If the book is in Castilian (aka «Spanish»), we have the choice of distinguished or seseo voices.
- Google allows us to sell audiobooks made with the Google AI voice on other platforms (beyond Google), as long as Google is not undercut in price there, i.e. on the author’s own book portal website (for direct sales) or on any of the other platforms.
- Currently, Google is not charging for the AI voices.
The only disadvantage of the Google AI voices is that currently, cloning of our own voices is not yet available.
Advantages and disadvantages of ElevenLabs AI voices
Among the advantages of ElevenLabs voices are:
- We can clone our own voice or the author’s voice, if desired.
- They can be distributed on all of the platforms mentioned above (except Amazon, Audible and Google).
- If the book is in Castilian (aka «Spanish»), we have the choice of distinguished or seseo voices.
- Audiobooks that paying users make with ElevenLabs AI voices can be sold anywhere, at any price, and are currently accepted on all of those platforms except Amazon, Audible and Google.
I have already cloned my voice in two languages (Castilian and English) with ElevenLabs and will publish another soon.
Advantages and disadvantages of Descript AI voices
Among the advantages of Descript AI voices are:
- We can clone our own voice or the author’s voice, if desired.
- Voices are available in different languages, with the limitations mentioned below.
To my knowledge, the disadvantages of Descript AI voices are:
- Castilian-language voices currently seem to be seseo only, not distinguished. It is unclear whether Descript would allow cloning a voice in Castilian with distinction (not seseo), since the other disadvantage covered below already caused me to lose interest in it for my own use.
- Although Descript allows us to use the voice anywhere at any price, the only place that seems to accept it for audiobooks is the author’s own book portal website (for direct sales). That is why I chose to clone my voice with ElevenLabs instead.
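The distribution rules described in the three sections above can be restated as a small lookup. This is only a sketch that encodes the claims made in this article at the time of writing; the function and variable names are my own illustrative choices, and the platforms may change their policies at any time.

```python
# Platforms from the list earlier in this article.
PLATFORMS = {
    "Amazon", "Audible", "Audiobooks.com", "Baker & Taylor", "Bibliotheca",
    "Barnes & Noble", "Everand", "Google", "Kobo", "OverDrive", "Spotify",
    "TuneIn",
}

# The author's own book portal website (direct sales).
DIRECT_SALES = "Author's own website"

# Platforms that, per this article, do not accept each voice source.
EXCLUDED = {
    "Google": {"Amazon", "Audible"},
    "ElevenLabs": {"Amazon", "Audible", "Google"},
}


def accepted_platforms(voice_source):
    """Return where an audiobook made with this voice source can be sold."""
    if voice_source == "Descript":
        # Per this article, only direct sales seem to be possible.
        return {DIRECT_SALES}
    if voice_source not in EXCLUDED:
        raise ValueError(f"No distribution data for {voice_source!r}")
    return (PLATFORMS - EXCLUDED[voice_source]) | {DIRECT_SALES}


if __name__ == "__main__":
    for source in ("Google", "ElevenLabs", "Descript"):
        print(source, "->", sorted(accepted_platforms(source)))
```

Keep in mind that the lookup does not capture Google's pricing condition: Google AI voices may be sold elsewhere only if Google is not undercut in price.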
A brief summary of seseo versus distinguished voices in Castilian
The word seseo describes Castilian-language voices that pronounce the soft C and the Z identically to the letter S. This is the way Castilian is pronounced by most native speakers in the Americas and in certain regions of Spain, i.e. parts of Andalusia (spelled Andalucía in Castilian) and the Canary Islands. Distinguished voices in Castilian belong to those speakers (mainly from Spain) who distinguish the sounds of the soft C and the Z from the letter S. They pronounce the soft C and the Z with the sound of the soft (unvoiced) th in English, as in «Thanksgiving». (I personally distinguish.)
The seseo is quite different from the ceceo. The ceceo is a lisp, nearly always unintentional, in which all three letters mentioned (C, S and Z) are pronounced like the soft th in English. The ceceo is rarely desired and is not taught in schools in any country.
Tools and techniques to direct an AI voice
Even when an AI voice doesn’t offer any discrete direction tools, we have always been able to add or remove commas or periods to lengthen or shorten pauses in the spoken performance. We have also been able to write a proper name or unknown term phonetically to direct the AI voice to pronounce it correctly.
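For a long manuscript, the manual technique described above can be scripted before the text is handed to the AI voice. The following is a minimal sketch, not a feature of any platform: the function names, the sample sentence and the respelling are all hypothetical, and any respelling should be tuned by ear.

```python
import re


def respell(text, respellings, apply_to_all=True):
    """Swap each listed name for its phonetic respelling (whole words only)."""
    for name, phonetic in respellings.items():
        pattern = re.compile(r"\b" + re.escape(name) + r"\b")
        # count=0 replaces every occurrence; count=1 only the first one.
        count = 0 if apply_to_all else 1
        text = pattern.sub(lambda m, p=phonetic: p, text, count=count)
    return text


def add_pause(text, phrase):
    """Add a comma after the first `phrase` that is not already punctuated."""
    pattern = re.compile(re.escape(phrase) + r"(?![,.;:!?])")
    return pattern.sub(lambda m: m.group(0) + ",", text, count=1)


if __name__ == "__main__":
    # Made-up sample sentence and respelling, for illustration only.
    sample = "Later that day Guevara spoke. Nobody answered Guevara."
    directed = respell(sample, {"Guevara": "Geh-VAH-rah"})
    directed = add_pause(directed, "Later that day")
    print(directed)
```

Working on a copy of the manuscript keeps the published text intact, since the respelled version is meant only for the voice.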
Fortunately, at least Google has recently improved upon the above in a couple of ways:
We can select a word and then type it in phonetically. Alternatively, we can click on the microphone symbol and «teach» the AI robot how we want the word pronounced. After doing either, we have the option to Apply once or Apply to all, so the voice knows to pronounce it that way each time that name or term appears in the manuscript.
Once that is done, the name or term is underlined to indicate that the voice has learned a preferred pronunciation for it, as you’ll see in the above screenshot.
Above you can listen to a sample of this audiobook, with Archie’s AI voice.
As I produced and directed the audiobook of Why Fidel abandoned Che?, I learned the following about these new tools:
- With about 88% of the names or terms, it worked well: the terms at least sound as if the speaker knows the proper pronunciation. That is not to the point of sounding like a native speaker of the foreign name, but enough to sound like an educated person.
- Archie is a British AI voice, and seemed to know in advance that certain Castilian names should be pronounced with a silent H, but did not know it about all of them. For those, I had to remove the H from the name manually.
- Archie also knew that some Castilian names with a double L should be pronounced like a Y. However, in other cases, Archie didn’t know it, so I had to type it phonetically with a Y and set it to Apply to all.
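The two manual fixes in the list above (dropping a silent H and writing a double L as a Y) can be sketched as a small helper that proposes a respelling for an English-language voice. This is a deliberately crude illustration under my own assumptions: the function names are hypothetical, the sample names are not taken from the book, and the result should only be applied to names the voice actually mispronounces.

```python
import re


def respell_word(word):
    """Propose a respelling of one Castilian word for an English voice."""
    # Drop the silent H, but keep it in the digraph CH (as in "Che").
    out = re.sub(r"(?<![cC])[hH]", "", word)
    # Write the double L as a Y.
    out = re.sub(r"[lL]{2}", "y", out)
    # Keep the initial capital of a proper name.
    if word[:1].isupper():
        out = out[:1].upper() + out[1:]
    return out


def respell_name(name):
    """Apply respell_word to each word of a full name."""
    return " ".join(respell_word(word) for word in name.split())


if __name__ == "__main__":
    # Illustrative names only.
    for name in ("Hernández", "Castillo", "Che"):
        print(name, "->", respell_name(name))
```

Each suggestion still needs a listening test, since the voice may already handle some of these names correctly without any respelling.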
I have not yet used the equivalent direction tools in ElevenLabs, but will publish about that in the future.
Conclusions
Both my own experience and Forbes magazine have shown that audiobook production and purchasing are rising. It is good that the tools and options also expand and improve with time. Even though some professional voiceover talent were initially afraid that AI voices could take their work away, they now realize that by cloning their own voice, they can sell it many more times at a lower rate each time, controlling its use as desired. I will cover that in at least one upcoming article. If you need help with your book, ebook or audiobook production, publishing or distribution, contact me via TecnoTurPublishing.com in English or EditorialTecnoTur.com in Castilian.
Related articles
- Audiobook distribution strategy in 2024 + why M4B is ideal for direct sales
- Review: Audiobook Builder, an ideal authoring tool for audiobooks for direct distribution
Lee este artículo en buen castellano
FTC disclosure
None of the platforms listed has paid for this article. Some of the manufacturers listed above have contracted Tépper and/or TecnoTur.LLC to carry out consulting and/or translations/localizations/transcreations. So far, none of the manufacturers listed above is/are sponsors of the TecnoTur, BeyondPodcasting, CapicúaFM or TuSaludSecreta programs, although they are welcome to do so, and some are, may be (or may have been) sponsors of ProVideo Coalition magazine. Some links to third parties listed in this article and/or on this web page may indirectly benefit TecnoTur.LLC via affiliate programs. Allan Tépper’s opinions are his own. Allan Tépper is not liable for misuse or misunderstanding of information he shares.
