AI is everywhere now. Some hail the positives, while others point to the negatives of job losses and increased automation of the workforce, seemingly unstoppable since the Industrial Revolution. Personally, I’d rather go to a till staffed by a friendly employee than use a self-checkout machine in a supermarket. Society needs human contact, which is essential for our emotional wellbeing, mental health, and sense of belonging. One of the positives to come out of AI is undoubtedly text-to-speech software, which converts text into spoken words, effectively reading it for you. As our awareness of neurodiversity grows, so too does our knowledge and appreciation of assistive technologies. Text-to-speech has come a long way since it was first developed in the 1930s.
Development of text-to-speech software
Text-to-speech has a surprisingly long history. The first computer-based speech-synthesis systems emerged in the 1950s, yet the earliest known electronic speech synthesiser was the Voder, developed by Bell Laboratories and demonstrated at New York’s prestigious World’s Fair in 1939. In a fascinating blog post, Grundhauser (2017) described how this first attempt at replicating the human voice apparently spoke ‘like a robot demon’ and ‘could create 20 or so different electric buzzes and chirps, which the operator would manipulate using 10 keys, a wrist plate, and a pedal’. It is even credited with inspiring Numbers by Kraftwerk, a track that transformed musical genres as diverse as techno, hip-hop, new wave, and early rap (Sanusi, 2023). The first general English text-to-speech system was developed by Noriko Umeda and colleagues in 1968 at the Electrotechnical Laboratory in Japan.
Sounds like a real human
In recent years text-to-speech has improved dramatically on the mechanical narration it used to produce. There are some exceptions to this innovation, such as eBook text-to-speech, which still needs some development. We have named some of the pros and cons in our Library Wellbeing guide. The deliciously-named IceCreamApps site provides a list of eight recommended eBook screen readers, if you are interested (Icecreamapps.com, 2024). The fundamental issue is that there is no universal screen reader that works for everything online. That aside, the revelation is Microsoft’s Speak text-to-speech feature. It reads like a dream. Or rather, like a human voice. The ‘voice’ is female and well-spoken, enunciating to give emphasis and pausing where needed, and is easy on the ear. If you are not happy with the ‘voice’ then you can go to Microsoft’s Speech Platform, which enables you to choose a different voice package. It’s a bit like choosing the voice on your satnav when you drive a car. Text-to-speech software has been humanised, the ultimate acclaim for any person-centred AI technology.
Drawbacks are minimal. Homographs (words that are spelled the same but pronounced differently) are occasionally an issue, such as the word ‘reading’ (e.g. reading text) being pronounced as ‘Reading’ (the Berkshire town located west of London). I have also caught myself anthropomorphising the ‘voice’ as a person (‘her’). There are many benefits to using text-to-speech.
Benefits for literacy
Although few studies indicate whether text-to-speech increases literacy, a study by Brunow & Cullen (2021) found that it benefited listening comprehension, although it is not comparable to the interventionist support of a human teacher. Research conducted by Svensson et al. (2021) has found that reading ability, motivation, and performance increase with the use of text-to-speech. These findings suggest that text-to-speech is supplementary, rather than comprehensive, and does not substitute for human involvement in the educational process (Wood et al., 2018).
Visual stress
One of the benefits of using text-to-speech is that it alleviates visual stress and reduces eye strain. This is particularly valuable for someone with a neurodivergent condition such as dyslexia or ADHD. Text-to-speech relies on auditory skills rather than the complexity of visually reading a page. This is a revolutionary step for dyslexic students struggling to read text on the screen.
Editing and proofreading
For editing and proofreading, the immediate benefits of text-to-speech are huge: hearing the text read aloud makes it easier to detect spelling and grammatical mistakes and awkward sentence structures, and to check for consistency and coherence. I have found it particularly useful in identifying misplaced words.
Writing style analysis
Even though I am not dyslexic, I use MS Speak. I used it repeatedly for this blog post, both in Word and in WordPress. What does my writing sound like? Are there any errors, misplaced words, gaps, or too many words? How does it flow? What is the personality of my writing voice? These are simple questions and text-to-speech, I feel, has the ready answers. Such writing style analysis identifies your writing voice using natural language processing (NLP) tools, analysing writing patterns, sentence structures and other linguistic features.
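For readers curious about what analysing writing patterns and sentence structures can look like in practice, here is a minimal Python sketch. It is an illustration only and not how MS Speak or any particular NLP tool works: the function name style_summary and the measures it reports (sentence count, average sentence length, the longest sentence, doubled words and the most frequent word) are my own assumptions, and the sentence splitting is a rough heuristic.

```python
import re
from collections import Counter

# Hypothetical helper for illustration; not part of any real text-to-speech tool.
WORD_PATTERN = r"[A-Za-z]+(?:['’-][A-Za-z]+)*"


def style_summary(text: str) -> dict:
    """Return a few simple style measures for a piece of text."""
    # Rough heuristic: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Count the words in each sentence.
    lengths = [len(re.findall(WORD_PATTERN, s)) for s in sentences]
    # All words in the text, lower-cased so 'It' and 'it' count together.
    words = re.findall(WORD_PATTERN, text.lower())
    # Words typed twice in a row, such as 'the the'.
    doubled = [a for a, b in zip(words, words[1:]) if a == b]
    longest = max(zip(lengths, sentences))[1] if sentences else ""
    return {
        "sentence_count": len(sentences),
        "average_sentence_length": sum(lengths) / len(lengths) if lengths else 0.0,
        "longest_sentence": longest,
        "doubled_words": doubled,
        "most_common_word": Counter(words).most_common(1)[0][0] if words else None,
    }


if __name__ == "__main__":
    sample = "I use it. I use it it often! Does it help?"
    for measure, value in style_summary(sample).items():
        print(f"{measure}: {value}")
```

A very long average sentence length or a list of doubled words points to the same things you tend to hear when the text is read aloud: sentences that run on and words typed twice by mistake. It is no substitute for listening, but it can flag where to listen more closely.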
The allyship of text-to-speech software
Text-to-speech software has become an indispensable ally in writing. Will you invite this accessible technology into your assignments and check your writing? Nowadays I would not write a longer piece of writing without it. Text-to-speech is a welcome friend in that regard.
References
Brunow, D.A. & Cullen, T.A. (2021). Effect of Text-to-Speech and Human Reader on Listening Comprehension for Students with Learning Disabilities. Computers in the Schools: Interdisciplinary Journal of Practice, Theory, and Applied Research, 38(3), 214–231.
Grundhauser, E. (2017). The Voder, the first machine to create human speech. Available from: The Voder, the First Machine to Create Human Speech – Atlas Obscura [Accessed 15th May 2024].
Icecreamapps.com. (2024). Best Text To Speech Book Readers 2024: Top 8 – Icecream Apps. Available from: Best Text To Speech Book Readers 2024: Top 8 – Icecream Apps [Accessed 21st May 2024].
Sanusi, T. (2023). From Hawking to Siri: The evolution of speech synthesis. Available from: From Hawking to Siri: The Evolution of Speech Synthesis | Deepgram [Accessed 15th May 2024].
Svensson, I., Nordström, T., Lindeblad, E., Gustafson, S., Björn, M., Sand, C., … Nilsson, S. (2021). Effects of assistive technology for students with reading and writing disabilities. Disability and Rehabilitation: Assistive Technology, 16(2), 196–208.
University of Lincoln. (2024). Screen Readers – Screen Readers and Accessibility – Guides at University of Lincoln. Available from: Screen Readers – Screen Readers and Accessibility – Guides at University of Lincoln [Accessed 21st May 2024].
Wood, S.G. et al. (2018). Does Use of Text-to-Speech and Related Read-Aloud Tools Improve Reading Comprehension for Students with Reading Disabilities? A Meta-Analysis. Journal of Learning Disabilities, 51(1), 73–84.