Better TTS on Linux
(shkspr.mobi)
Many people who are visually impaired and rely on TTS don't want it to sound "better". The ultra-robotic voices have extremely consistent sounds, which makes it possible to make out what they are saying at speeds many times faster than normal, though it seems to take some practice.
On the other hand, "more natural" sounding voices slur into each other at high speeds and become incomprehensible. They are only listenable at slower speeds.
example: https://web.archive.org/web/20220525081607/https://www.vincit.fi/en/software-development-450-words-per-minute/
The original site seems to be down, though the audio files work for me. It sounds like gibberish to me, but the author understands it easily.
I mean, consistent sound is fully in line with what I'm saying. I am fine with a robotic sound; the issue I have is that it can be grating with newer voices, which I just assumed was down to how samples are used (compared to older speech synthesis). Is the sound actually part of the design, to allow such high speeds?
Even if it were, older-style synthesis could likely have that as a parameter or option (or just... a dedicated voice).
I've seen some videos of screen readers using a somewhat fast voice (not quite as fast as your link) that does sound better, with voices similar to DECtalk Paul. They don't always give the voice name, but I've seen some mention of IBMTTS, so it might be related. Current search results turn up AI service stuff that I'm not sure traces back to those old videos (2016), but either way it might be some Paul derivative. EDIT: It might be ETI Eloquence?
It seems ETI Eloquence is both beloved in the blind community and something that has had support issues (it is proprietary abandonware). And I've seen one person on the subject: