1/15/2025 at 10:11:31 AM
On the one hand, this is very convenient. Probably cool for some non-fiction.
On the other, some of my favorite audio books all stood out because the narrator was interpreting the text really well, for example by changing the pacing during chaotic moments. Or those audiobooks with multiple narrators and different voices for each character. Not to mention that sometimes the only cue you get for who's speaking during dialogue is how the voice actor changes their tone. I have mixed feelings about using this and losing some of that quality.
I would totally use this over amateur ebooks or public domain audiobooks like the ones on Project Gutenberg. As cool as it is/was for someone to contribute to free books... as a listener it was always jarring to switch to a new chapter and hear a completely different voice and microphone quality for no reason.
by laserbeam
1/15/2025 at 11:06:41 AM
> On the other, some of my favorite audio books all stood out because the narrator was interpreting the text really well
This (and everything else with AI) isn't saying "you don't need good actors any more". It's saying "if you don't have an audiobook, you can make a mediocre one automatically".
AI (text, images, videos, whatever) doesn't replace the top end, it replaces the entire bottom-to-middle end.
by stavros
1/15/2025 at 11:28:34 AM
RIP to future top-enders that would normally have started out on the bottom to middle end.
by j4coh
1/15/2025 at 11:44:13 AM
Bingo. AI is going to destroy any pathway for training and accruing experience.
An embalming tech for our dying civilization.
by aredox
1/15/2025 at 7:34:54 PM
It's kind of wild to me that the future will look like the 80s imagined it all because AI killed the creative seed corn when retro-future 80s was the aesthetic.
by _DeadFred_
1/15/2025 at 11:59:36 AM
Just like printing presses killed the profession of copying books by hand, eliminating the training pathway for illuminated manuscripts. Death of civilization itself I say, damn those printing presses.
by lupusreal
1/15/2025 at 1:12:11 PM
There's a big difference.
Printing presses produce superior products.
A mediocre audiobook is certainly better than no audiobook at all, but it is an inferior product to a well produced audiobook.
by oldgradstudent
1/15/2025 at 1:22:22 PM
> Printing presses produce superior products.
That seems like a highly dubious statement. Many hand-illuminated manuscripts are masterpieces of art. The advantage of the printing press was chiefly economic, making the cost of a copy dramatically lower, not an increase in quality (especially by the aesthetic standards of the time).
by gampleman
1/15/2025 at 3:45:53 PM
Indeed. Even Gutenberg had his Bibles touched up by artists after they were printed (illuminated capital letters and so on) because even he believed his printed copies were inferior to the hand-made ones.
by jhbadger
1/16/2025 at 2:44:09 AM
I would say it is the perfect metaphor.
I love audiobooks but at this point, most of what I want to listen to is stuff that would not sell enough to bother having someone read.
There are also many voice actors who I simply don't like the way they read.
A future that I can pick a voice that I like for any PDF is a huge upgrade.
I think one problem is that people on the young side maybe didn't expect the future to change like this.
No one I knew went on the internet when I graduated high school. Change like this is all par for the course. The only advice I got in high school from a guidance counselor was that I had a nice voice for radio. Books on tape was not exactly a career option at the time. The culture will survive the death of a career path that didn't even really exist when I was a senior in high school.
by monophonica
1/15/2025 at 8:33:08 PM
As a work of art, sure. But as books containing information, printing presses produced superior products.
by oldgradstudent
1/15/2025 at 3:09:52 PM
Many (most, if not all) hand-made copies contained errors, which printed books did not. They were much closer to 1:1 copies.
by karamanolev
1/15/2025 at 3:51:32 PM
If the mistake happened in the typesetting stage, printed books could spread errors much more efficiently, as in the infamous "wicked bible" of 1631, where a typesetting error made the ten commandments contain the amusing phrase "Thou shalt commit adultery". Surviving copies are quite the collectors' item as most were destroyed.
by jhbadger
1/15/2025 at 8:30:57 PM
Usually, though, errors are corrected and every printing has fewer errors than the previous one.
by oldgradstudent
1/15/2025 at 10:13:37 PM
What percentage of books get a second print run on a printing press? And what's the process for that? Do they have to reset each word for the second run? I genuinely don't know how a physical process like typesetting can result in increased accuracy on each print.
by kamarg
1/17/2025 at 12:40:27 PM
Any interesting book gets a second print run, unless it was deliberately a limited edition with some exceptional quirk.
by aredox
1/15/2025 at 2:50:24 PM
What we have today is early gen "practical" AI.
Even current SOTA models would almost certainly be able to handle multiple speakers and pick up on the intended tone and intonation.
Don't make the mistake of thinking what we have today is what we will still be working with in 5 or 10 years.
by Workaccount2
1/15/2025 at 9:04:57 PM
Some people will learn to use these AIs to make top-quality audiobooks (and books, movies, TV shows, comics...). It will be a more manual process than pressing a button, but still orders of magnitude less than what it took before. As a result there will be a tsunami of high-quality content.
There will be curation and specialization. Previously ignored niches will now be economically profitable. It will be a Renaissance of creativity, and millions of jobs will be created.
by fidelramos
1/15/2025 at 12:05:22 PM
If you see podcasts as useless in modern society as illuminated manuscripts, no big loss I suppose, but I do enjoy the human made ones and would be sad to see them go extinct as the manuscripts did. And the same thing is happening to other entry-level creative roles, some of which you may personally regret the loss of too.
by j4coh
1/15/2025 at 2:35:32 PM
Actually I think illuminated manuscripts had more value, insofar as they were art, than podcasts (99% of which are vapid timewasters and/or friend simulators.) The good podcasts are the few which involve interviewing interesting people, and AI isn't replacing those.
There's a lot more to be said for the value of audio books, but the accessibility gains of proliferated auto-generated audiobooks outweigh the downside of losing a small number of expertly produced audio books.
For context, I listen to audio books a lot, and for years I have listened to traditional TTS readings of books too. Better voice generation for books without audiobooks is a great win for society.
by lupusreal
1/15/2025 at 1:44:21 PM
I enjoy looking at illuminated manuscripts. Podcasts are bullshit and can die in a ditch.
by akho
1/15/2025 at 2:10:30 PM
I enjoy podcasts but I still hope illuminated manuscripts won’t die in a ditch so other people can enjoy content the way they prefer ;)
by teekert
1/15/2025 at 12:06:14 PM
Given that the printing press was the root cause of the century of religious wars that soaked Europe with blood, and was key in the revolutions that overthrew absolute monarchies all over Europe, I don't think it's as good an example as you think it is.
Death of a civilization doesn't mean disappearance of mankind or even overall regression in the long term.
by littlestymaar
1/15/2025 at 12:23:54 PM
Do you have a source for that? I don't think the printing press was the cause of religious wars any more than bullets were the cause of WWII
by megaloblasto
1/15/2025 at 1:03:18 PM
Have you heard of the Protestant Reformation and the following 120 years of war? The entire Protestant <> Catholic blow up that consumed Europe was pretty directly attributable to the printing press.
(To be clear, nothing is solely and exclusively caused by any one thing. Causality is a very fuzzy concept. But sans printing press, those wars certainly wouldn’t have happened when/where/how they did, if they ever happened at all).
by llamaimperative
1/17/2025 at 12:26:21 PM
Do you know the Hussites? [1] The Hussite Wars (1419–1434) predate the printing press, and Luther said: "We are all Hussites without knowing it."
by xkriva11
1/15/2025 at 12:35:44 PM
Easy access to the Bible text instead of being only read to, hence high literacy of the faithful, was one of the core tenets of some branches of Protestantism.
by baq
1/15/2025 at 1:17:40 PM
This is common enough knowledge that “read, like, any history” is an appropriate response. However, if you’re genuinely curious, here’s a random link:
https://ehne.fr/en/encyclopedia/themes/european-humanism/eur...
by thoroughburro
1/15/2025 at 2:39:20 PM
I blame canned food and trains for solving the logistics problems that previously prevented massive wars.
by lupusreal
1/15/2025 at 7:40:38 PM
An interesting one I read was public schools and their creation of a national identity. Before public schools there weren't really standardized languages forced upon an entire nation, etc. The countryside was more one country/people/language morphing into the next, not cleanly delineated lines where country/language switched instantly. It was also said borders were much more open/abstract before that shift.
by _DeadFred_
1/15/2025 at 3:34:12 PM
Napoleonic wars beg to differ.
by littlestymaar
1/15/2025 at 7:32:05 PM
While they didn't have trains, the Napoleonic wars did feature the first use of canned food to aid in logistical supply of armies. You could argue that the lack of trains (and can openers) probably meant that they jumped the gun on starting giant wars. We Americans fixed that in the Civil War, to great and deadly effect.
by sigilis
1/15/2025 at 11:23:40 PM
Appertization was invented in 1804 but Appert did not sell his technology to the French army before 1810, so it's fair to say that most of the Napoleonic wars were fought before canned food was even a thing. Maybe it saw mainstream use in the Grande Armée at the end of his reign, but it was definitely not a deciding factor in Napoleon's logistics for most of his campaigns.
Without trains, the logistics of canned food aren't much better than those of any bread-based food you give to your soldiers. It doesn't solve the weight problem, which is the key issue in preindustrial army logistics.
by littlestymaar
1/15/2025 at 2:30:55 PM
Those revolutions were ultimately positive. The alternative would be the continued rule by monarchs and a single powerful religion
by turnsout
1/15/2025 at 3:33:34 PM
See my second paragraph. It can be ultimately positive while still being civilization-ending.
by littlestymaar
1/15/2025 at 9:05:10 PM
No comfort to the millions who died though.
by chairmansteve
1/15/2025 at 1:18:38 PM
[dead]
by Melomomololo
1/15/2025 at 9:45:29 PM
We'll be ok lol, while it is a significant transition, it IS just a transition in the media landscape.
AI is big and significant, but we'll be ok. There is also no such "one" thing as "our civilisation". We're extremely vast and complex interconnected networks of ever-changing relationships.
AI does indeed represent the commoditisation of things we used to really value like "craftsmanship in book narration" and "intelligence". But we've had commoditisations of similar media in the past.
Paper used to be extremely expensive, but as time went on, it became more and more commoditised.
Memory used to be extremely expensive (2000-3000 years ago, we needed to encode memory in _dance_, _stories_ and _plays_. Holy shit). Now you can purchase enough memory to store a billion books for maybe two hours of labor.
Most of these things don't really matter. What is happening is that the media landscape is significantly shifting, and that is a tale as old as history.
I do think the intellectual class will be affected the most. People who understand this shift stand to benefit enormously, while those who don't _might_ end up in a super awful super low class.
And yet, all of that doesn't really matter if you just move to, I dunno, Paramaribo or whatever. The people there are pragmatic and friendly. They don't care about AI too much. Or maybe New Zealand, or Iceland, or Peru, or Nepal or I don't know.
The world isn't ending. Civilisation isn't being destroyed at our core.
The media landscape is changing, classes are shifting, power-relationships are changing. I suggest you think deeply about where you want to live, what you stand for and what is most important to you in life.
I don't need money or tech to be happy. I am fine with just my cats, my closest friends and family and healthy food.
If it happens to be the case that I need to leave tech or that extremely high-end narrated audiobooks cease to exist? Then all I have to say is "oh no, anyway".
We'll be fine. One way or another.
Just different.
by azeirah
1/16/2025 at 3:40:51 PM
That sure is some naivety ya got there. But good luck on the move. Keep your friends and family close.
by n3rv
1/15/2025 at 1:43:32 PM
> RIP to future top-enders that would normally have started out on the bottom to middle end.
This stance always reminds me of Profession, a 1957 novella by Isaac Asimov that depicts pretty much the future where there are only top performers and the ignorant crowd.
by sam_lowry_
1/15/2025 at 1:46:10 PM
He was a clear thinker.
by xyproto
1/15/2025 at 3:36:28 PM
Virtually every book I want this for has been around for 70+ years and still no high or low quality audiobook has been produced. How long do I have to wait for those aspiring top-enders before an audiobook can be made available?
by anothermathbozo
1/15/2025 at 6:36:49 PM
That has nothing to do with audiobook voice actors and everything to do with copyright and who owns the rights to the book (and whether they believe there's any money to be made selling an audiobook version).
by Arainach
1/15/2025 at 11:31:41 PM
Piracy may have made some of these accessible by ripping the US Library of Congress recordings for the blind.
by stevenwoo
1/15/2025 at 2:14:52 PM
I'm super opposed to AI, but I see this as a rare positive. As someone already said, the win here is to have an audiobook where one doesn't yet exist. Hell, maybe the tables will turn and the scrubs will do the hard work of discovering which titles are popular with an audience, then the ebook industry can capitalize on AI by hiring voice actors to produce proper titles?
by gosub100
1/15/2025 at 6:05:22 PM
Not gonna happen. Once the AI shit is out there, people will have consumed it by the time a real actor can create (and edit) the audiobook.
by DidYaWipe
1/15/2025 at 8:47:13 PM
It's common for shows to use big name actors as voices because they draw an audience, nothing will change. Just means a smaller pool of voice actors and they'll mostly be good looking.
by CuriouslyC
1/15/2025 at 8:45:11 PM
The value of distribution is increasing while the value of content and product is decreasing for all but the top end.
by cmdtab
1/15/2025 at 1:59:58 PM
Not RIP at all. "Meritocracy" was coined in a book literally warning us about how terrible such a society would be: https://en.wikipedia.org/wiki/The_Rise_of_the_Meritocracy
The "top-enders" are the privileged who need to have some of their gains for their intelligence redistributed to others. The alternative is "survival of the smartest", which is de-facto what we have today and what Young was trying to warn us about.
by Der_Einzige
1/15/2025 at 12:23:48 PM
By that time, AI will beat the toppest of the top enders. Remember the time Deep Blue barely beat Kasparov? Now no human, or group of humans can beat a chess engine, even one that runs on an iPhone.
by credit_guy
1/15/2025 at 12:52:02 PM
I don’t think chess is a good example of AI destroying the path to the top. Chess is more popular now and humans keep advancing even though it is a futile effort against computers.
by plastic3169
1/15/2025 at 1:10:50 PM
And people are better at chess now in part because of practicing with/against machines. But chess has never been something you can make a living off of unless you were at the very top.
by rcxdude
1/15/2025 at 11:58:34 AM
AI TTS has been available for quite some time. Tacotron V1 is about 8 years old. I don't think we saw much bottom end replacement.
IMGO (gut opinion), generative AI is a consumption aid, like a strong antacid. It lets us be done with $content quicker, for content = {book, art, noisy_email, coding_task}. There are obvious preconceptions forming among us all from the "generative" nomenclature, but lots of the usages that survive are rather reductive, in relevant and useful ways.
by numpad0
1/15/2025 at 1:45:38 PM
Yeah, let us not blame AI. Audible damaged the quality of audiobooks more than AI.
by sam_lowry_
1/15/2025 at 2:39:23 PM
Bottom end really. Middle end is still superior to this AI drivel.
by no_wizard
1/15/2025 at 10:43:31 AM
I wholeheartedly agree. https://en.m.wikipedia.org/wiki/Stephen_Briggs got me hooked on Terry Pratchett's Discworld series. I loved "Going Postal".
by felixhummel
1/15/2025 at 11:02:48 AM
I know someone who listened to Terry Pratchett's "Wachen! Wachen!" audiobook on Spotify while living in Germany for a few years. It was so well narrated that he also acquired some peculiarities of local dialects used by specific characters in the book. Locals in Bavaria were quite surprised by a foreigner speaking such language.
by IndrekR
1/15/2025 at 11:12:38 AM
Absolutely.
Even on the non-fiction side, the narration for Gleick's The Information adds something.
While I want this tool for all the stuff with no narration, NYT/New Yorker/etc replacing human narrators with AI ones has been so shitty. The human narrators sound good, not just average. They add something. The AI narrators are simply bad.
by dmazin
1/15/2025 at 3:02:17 PM
I agree with you, but also want to point out:
New authors, self-publishers, can't afford tens of thousands of dollars to get an audiobook recorded professionally... This can limit their distribution.
Authors might even choose not to make such version (or lack confidence to record themselves), so AI capable of making a decently passable version would be nice -- something more than reading text blandly. AI in theory could attempt to track the scene and adjust.
by ldoughty
1/15/2025 at 6:50:52 PM
By observation the current approach is for authors to narrate the book themselves if they think their readers will want it and if they feel reasonably confident in their own narration.
by plorg
1/15/2025 at 6:06:23 PM
You can get narrators to work on a royalty basis.
by DidYaWipe
1/15/2025 at 1:21:20 PM
Yes, but if the alternative is not having a book, or having to listen to one poorly read (I love Librivox, but there are some books which I just haven't been able to finish because of readers, and many more which were nixed for family vacation travel listening on that account), this may be workable.
by WillAdams
1/15/2025 at 10:58:15 AM
With this technology, one could produce high quality audio books without having access to high quality narrators by annotating the books with the voice, speed and such things.
I wonder if a standardized markup exists to do so.
by micw
1/15/2025 at 11:09:42 AM
There is SSML for speech markup to indicate various characteristics of speech like whispers, pronunciation, pace, emphasis, etc.
With LLMs proving to be very good at generating code, it may be reasonable to assume they can get good at generating SSML as well.
Not sure if there is a more direct way to channel the interpretation of the tone/context/emotion etc from prose into generated voice qualities.
If we train some models on ebooks along with their professionally produced human-narrated audiobooks, with enough variety and volume of training data, the models might capture the essence of that human-interpretation of written text? Just maybe?
Amazon with its huge collection of Audible + Kindle library -- if it can do this without violating any rights -- has a huge corpus for this. They already have "whispersync" which is a feature that syncs text in a kindle ebook with words in corresponding audible audiobook.
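For anyone wondering what such markup looks like in practice, here is a minimal sketch that turns a list of annotated segments into SSML using only the Python standard library. The element names (speak, voice, prosody, break) come from the W3C SSML spec; the `build_ssml` helper, the segment dict layout, and the voice names are made up for illustration, and real TTS engines differ in which voices and attributes they accept (some also require an xmlns declaration on the root, omitted here).

```python
import xml.etree.ElementTree as ET


def build_ssml(segments, lang="en-US"):
    """Build an SSML string from annotated segments (hypothetical layout).

    Each segment is a dict with 'voice' and 'text', plus optional
    'rate', 'pitch' (SSML prosody values) and 'pause_ms'.
    """
    speak = ET.Element("speak", {"version": "1.1", "xml:lang": lang})
    for seg in segments:
        # One <voice> element per segment; the names are placeholders.
        voice = ET.SubElement(speak, "voice", {"name": seg["voice"]})
        prosody_attrs = {k: seg[k] for k in ("rate", "pitch") if k in seg}
        if prosody_attrs:
            target = ET.SubElement(voice, "prosody", prosody_attrs)
        else:
            target = voice
        target.text = seg["text"]
        if "pause_ms" in seg:
            # Optional pause after the segment.
            ET.SubElement(speak, "break", {"time": f"{seg['pause_ms']}ms"})
    return ET.tostring(speak, encoding="unicode")


segments = [
    {"voice": "narrator", "text": "The door creaked open."},
    {"voice": "character_a", "text": "Who's there?",
     "rate": "fast", "pitch": "high", "pause_ms": 400},
    {"voice": "narrator", "text": "Nobody answered.", "rate": "slow"},
]
ssml = build_ssml(segments)
print(ssml)
```

The point is only that pace, pitch, pauses and speaker changes all have a place in the markup; whether a given engine renders them convincingly is a separate question.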
by albert_e
1/15/2025 at 11:22:40 AM
Good points, thank you! I just tested it. While ChatGPT was very good at adding generic (textual) annotations, the results for generating SSML were very poor (lack of voice names, lack of distinction between narrator and character etc).
Probably a model trained for this plus human audit could lead to very good results.
by micw
1/15/2025 at 11:09:22 AM
They still wouldn't be high quality. It's just not possible to capture the precise tone of voice in an annotation, and that precision I believe really makes a difference. My experience is that the deeper the narrator understands the text and conveys that understanding, the easier it becomes for me to absorb that information.
by pegasus
1/15/2025 at 11:52:46 AM
Have you tried those "podcast from a paper" models? They do some of the things you are saying they don't, although it's not 100% it's also miles ahead of for example human Polish TV lectors, or other monotone style narrations.
by vasco
1/15/2025 at 11:04:15 AM
Don't end to end trained models already do this to some extent? Like raising the pitch towards a question mark, like a human would.
TortoiseTTS has a few examples under prompt engineering on their demo site: https://nonint.com/static/tortoise_v2_examples.html
by KeplerBoy
1/15/2025 at 11:11:17 AM
That's a bit basic and random. Some models have the features you describe. From the better models you get a slightly different voice for text in quotes.
But the difference from good audio books is that there you have:
* different voices for the narrator and each character
* different emotions and/or speed in certain situations
I guess you could use an LLM to "understand" and annotate an existing book if there's a markup and then use TTS to create an audio book from it and so automate most of the process.
by micw
1/15/2025 at 11:14:55 AM
Edit: I actually tried this. I prompted in ChatGPT:
"Annotate the following text with speakers and emotions so that it can be turned into an audiobook via TTS", followed by a short text from "The Hobbit" (The "Good morning scene"). The result is very good.
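To make the next step concrete, here is a rough sketch of how annotated output like that could be parsed into per-speaker segments ready to hand to a TTS engine. The `[speaker|emotion] text` line format, the `parse_annotated` helper, and the sample text are all assumptions for illustration; the format an LLM actually returns depends on the prompt and would need its own parser.

```python
import re

# Hypothetical annotation format: "[speaker|emotion] text", emotion optional.
LINE_RE = re.compile(
    r"^\[(?P<speaker>[^|\]]+)(?:\|(?P<emotion>[^\]]+))?\]\s*(?P<text>.+)$"
)


def parse_annotated(text):
    """Split annotated text into dicts with speaker, emotion and text."""
    segments = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = LINE_RE.match(line)
        if m:
            segments.append({
                "speaker": m["speaker"].strip(),
                "emotion": (m["emotion"] or "neutral").strip(),
                "text": m["text"].strip(),
            })
        else:
            # Lines without an annotation fall back to the narrator.
            segments.append(
                {"speaker": "narrator", "emotion": "neutral", "text": line}
            )
    return segments


sample = """
[narrator] The old man looked up from his pipe.
[visitor|cheerful] Lovely weather today!
[old man|gruff] What do you mean by that?
He went back to his pipe.
"""
segments = parse_annotated(sample)
for seg in segments:
    print(seg)
```

Each segment could then be mapped to a voice and speaking style, with a human checking the speaker attributions before anything is synthesized.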
by micw
1/15/2025 at 10:18:23 AM
I guess this is still very useful if you are blind.
by ahoka
1/15/2025 at 10:19:10 AM
Yeah, for accessibility purposes on things that aren't already narrated, this kind of thing is huge.
by loktarogar
1/15/2025 at 11:32:16 AM
that's the thing. it's not just for accessibility. anything not already narrated is a fair target for TTS. i don't have time to sit down and read books. all reading is done on the go, while getting around or doing daily routines at home. i have a small book that i am reading now, which should take a few hours to finish, but in the time i manage to get done reading it i will probably have listened to two or three audio books.
oh, and it's also a boon for those who can't afford to buy audiobooks.
by em-bee
1/16/2025 at 2:11:08 AM
Accessibility is generally framed around providing accommodations to people with disabilities, but at its core it's about more people being able to access things they otherwise couldn't. By this metric we agree.
by loktarogar
1/15/2025 at 11:48:15 AM
You don't choose to spend your time reading books. You probably roll your eyes when someone tells you they don't have time for some activity you deem valuable. This is the 'no time to exercise' debate in a different shape.
They are also different activities, with audio it's easier to listen to more but retention is usually lower. Not casting any elitist "you need to read" bullshit by the way, but find it odd to define it in terms of lack of time, and I really like both mediums.
by vasco
1/15/2025 at 12:08:10 PM
there is not much of a choice here. sure, i could use the time i spend reading and commenting on HN to read books instead. so technically speaking it is a choice. but i want to do both and many other things besides also having to work and a family to take care of. so the result is, i can't afford the time to read without giving up other things that are also important to me. listening to books allows me to access books i would otherwise not be able to read because of these priorities.
there are other factors as well. i love reading so much that i tend to forget time around me. as a result reading would cause me to neglect other duties. i can't allow that, and therefore i am forced to avoid reading. i also don't like long form reading on electronic devices, and as a frequent traveler, printed books are simply not practical and often not even accessible.
i agree with the retention issue, but i found that a much larger factor for retention is how well i can follow the story. a good story that is easy to get into is also easier to retain. and finally, reading fiction is for entertainment. i don't have to retain it.
by em-bee
1/15/2025 at 2:03:15 PM
> You probably roll your eyes when someone tells you they don't have time for some activity you deem valuable.
There's a few categories where it makes sense to roll your eyes, like if they say they have no time to shower or have never been to one of their kid's baseball games.
But for things that aren't basic human expectations, I think you'd have to be a real jerk to roll your eyes at someone not having time. No time to cook multi-pot dishes? No time to exercise? No time to read? No time to go to museums? No time to meet at the bar for a drink? Any of them sensible.
No one can do everything, we all make our priorities and it's well within their choice not to have any one optional life thing at the top of their personal stack.
by esrauch
1/15/2025 at 2:35:11 PM
Agree completely, my point was indeed they are choices, not lack of time. I think I came across too judgy even trying not to. You did a better job of it.
by vasco
1/15/2025 at 4:25:41 PM
This is a weird comment. They are just saying why they prefer audiobooks thus why general TTS is useful for them.
Why are you trying to argue about their preference? They didn't cast any judgement on others with different preferences.
This is nothing like “no time for exercise”.
It's more like "I have no time (preference) to fire up the wood stove so I use microwave" and then you come in with "wow so you roll your eyes at us fire stove users?"
by hombre_fatal
1/15/2025 at 6:09:54 PM
Two hours before you posted this there was already an admission from me in a sister comment that I came across too judgy and someone else made the point I tried better than myself - not sure how much penitence I need to do but sorry again :)
by vasco
1/15/2025 at 1:01:01 PM
I was just thinking about automatically slapping an mp3 on every blog post, just an accessibility nicety.
Can someone with low vision tell me if this would be useful to them? It may be that specialist tools already do this better.
by flir
1/15/2025 at 1:15:41 PM
People use screen readers for accessibility. I would not expect anyone to be able to "look for and find" your mp3... I would instead expect them to use the tool they normally use for accessibility.
The real question is "what tools are they already using and how can I make sure those tools are providing higher quality output?". There are standards in browsers for these kinds of things (ways to hint navigation via accessibility tools for example).
by laserbeam
1/15/2025 at 2:12:36 PM
> I would instead expect them to use the tool they normally use for accessibility.
Yes, that was my second thought. But I'd rather ask someone than rely on my assumptions.
by flir
1/15/2025 at 8:31:03 PM
Agree with you on this.
My example, I was never a Wheel of Time fan, but the new audio editions done by Rosamund Pike are quite the performance, and make me like the story. She brings all the characters to life in a way that's different than just reading. It's a true performance.
by taude
1/16/2025 at 5:11:36 AM
I guess using different narrators is essential for both fiction and non-fiction books if you want the full experience. Personally, I love it when audiobooks have narrators who stick to the characters’ personalities; it just feels right. Some of the audiobooks I’ve listened to have narrators who switch up their voices for each character, and others even use a different narrator for every character, which gets really good. Narration Box has been doing a really great job with this lately
by Oneunscripted
1/15/2025 at 11:29:01 PM
A couple of my favorite audiobooks are Stranger in a Strange Land and Flowers for Algernon, where the performer changes the intonation and enunciation of the main character along with the character’s journey. It was a revelation and made me appreciate the stories in a way I did not get reading the printed books the first time. Perhaps that kind of consistency in the performance is just difficult to do in my imagination.
by stevenwoo
1/15/2025 at 1:56:53 PM
A GenAI model that reads audiobooks with such dramatisation is really my dream. There are so many books that I would want to listen to, but still lack such an adaptation. Also it takes months after the book release before the audiobook gets released.
Just imagine what this would do for writers. They can get instant feedback and adjust their book for the audiobook.
by whazor
1/15/2025 at 1:46:48 PM
I agree but the opposite can be true too. Sometimes the narrator seems to target some general audience that doesn’t fit me at all, in a way that makes me cringe when I listen, until I stop listening altogether. In these cases I’d rather listen to a relatively flat narration from a tool like this.
by rd11235
1/15/2025 at 5:20:51 PM
Would a "better" AI do a "better" narration with a better understanding of the text? Of course, that would imply a different (and far bigger?) model.
Anyway, even if in theory it might, in practice things may end up even worse than doing it with a monotone voice.
by gmuslera
1/17/2025 at 5:16:05 PM
On the other hand, there are a lot of narrators who are just bad, and the publisher is not going to pay for an alternate narration. These tools are a good way to re-narrate Wil Wheaton-narrated books with correct pronunciation and inflection, for example.
Computer chess took a long time to get better than the best players in the world, but it was better than most chess players for many years before that. We're seeing that a lot with these generative models.
by lern_too_spel
1/15/2025 at 11:15:57 AM
I like one speaker in one particular book.
He also narrates another scifi book series and honestly I dislike this a lot.
He became the voice of one particular character for me.
I would love variety
by Melomomololo