Authors were told that Apple – which at the time was not named as the company behind the technology – would shoulder the costs of production and writers would receive royalties from sales.
While there is potential for backlash by professional voice actors, authors themselves are increasingly being asked to narrate their own books. There is a financial incentive for the writers, both in the upfront payments and the expanded availability of their work.
But producing an audiobook with a human voice can take weeks and can cost publishers thousands of dollars. The lure of AI promises to significantly cut the costs.'
Death of the narrator? Apple unveils suite of AI-voiced audiobooks
In the long run this is almost certainly a good thing. It won't end professional audiobook narration, and great performers will be able to train AI on their voices and performance styles, so many more people will have access to them at minimal cost.
OTOH when an audiobook performer sounds like they have no idea WTF they're saying it annoys me. The worst is when they don't even seem to have read ahead to understand the syntax or context and so get the flow and inflection very wrong. (For syntax at least AI might actually be an improvement over those people....)
While machine learning can, given enough examples, accurately predict how a great human narrator would be influenced by the contexts of words, it seems unlikely that this has been implemented (yet...). It probably works more like the virtual singer AI I have than like ChatGPT or diffusion-based AI art generators---trained on the singer's general singing style and the syntactic flow of singing in English, but not responding intelligently to the particular meaning of a phrase. (That can be worked around through a mix of pitch and timbre re-takes and manual editing---a time-consuming process that's unlikely to be applied to audiobooks, except perhaps for short fiction by authors who don't want to record their own voices.)
There's the question of how much recognizable emotion should be audible (which in many cases risks misinterpretation, or imposing the performer's own interpretation at the expense of other possibilities).
Obviously many people like character voices. One benefit of AI will hopefully be better cross-gender voices (provided they're not stubborn about using the same voice for the whole thing) and accents. Parameters can be tweaked and randomized to create a better and wider variety of character or creature voices.
Personally I often like listening to authors perform their own works. Even though they probably aren't great at reproducing the way they'd like it to be heard, it's usually possible to get an idea of what they were going for, what sorts of cadences and voices they were imagining. Ideally they would work with audiobook performers to better realize that. AI could in principle help them do that, at much lower cost and with much greater control over vocal parameters---though in the near term that will be very time-consuming and require substantial editing skill. Whereas this is being promoted as a time-saver....
AI voices may eventually be honed to create more ideal voices for each genre, audience, or (with enough relevant data) individual... and also optimize variation and novelty (with listeners having the option to switch voices or alter them---change the accent, be more this or less that, more emotional or less, etc.).
This post has been edited by Azath Vitr (D'ivers: 05 January 2023 - 01:42 PM