Inviting a welcome friend: using text-to-speech software to read assignments

AI is everywhere now. Some hail the positives, while others point to the negatives of job losses and increased automation of the workforce, seemingly unstoppable since the Industrial Revolution. Personally, I’d rather go to a till staffed by a friendly employee than use a self-checkout machine in a supermarket. Society needs human contact, which is essential for our emotional wellbeing, mental health, and sense of belonging. One of the positives to come out of AI is undoubtedly text-to-speech software, which converts text into spoken words, effectively reading it for you. As our awareness of neurodiversity grows, so too does our knowledge and appreciation of assistive technologies. Text-to-speech has come a long way since it was first developed in the 1930s.

Development of text-to-speech software

Text-to-speech has a surprisingly long history. The first computer-based speech-synthesis systems emerged in the 1950s, yet the earliest known text-to-speech programme was VODER, developed by Bell Laboratories in 1939 and demonstrated at New York’s prestigious World’s Fair. In a fascinating blog post, Grundhauser (2017) described how this first attempt at replicating the human voice apparently spoke ‘like a robot demon’ and ‘could create 20 or so different electric buzzes and chirps, which the operator would manipulate using 10 keys, a wrist plate, and a pedal’. It is even credited with inspiring Numbers by Kraftwerk, which transformed musical genres as diverse as techno, hip-hop, new wave, and early rap (Sanusi, 2023). A general English text-to-speech system was developed by Noriko Umeda in 1968 at the Electrotechnical Laboratory in Japan.

Sounds like a real human

In recent years text-to-speech has improved drastically on the mechanical narration it used to render. There are some exceptions to this innovation, such as eBook text-to-speech, which still needs some development. We have named some of the pros and cons in our Library Wellbeing guide. The deliciously-named IceCreamApps site provides a list of eight recommended eBook screen readers, if you are interested. The fundamental issue is that there is no universal screen reader that works for everything online. That aside, the revelation is Microsoft’s Speak text-to-speech feature. It reads like a dream. Or rather, like a human voice. The ‘voice’ is female and well-spoken, enunciating to give emphasis, pausing where needed, and easy on the ear. If you are not happy with the ‘voice’ then you can go to Microsoft’s Speech Platform, which enables you to choose a different voice package. It’s a bit like choosing the voice on a satnav when you drive a car. Text-to-speech software has been humanised, the ultimate accolade for any person-centred AI technology.

Drawbacks are minimal. Homographs are occasionally an issue, like the word ‘reading’ (e.g. reading text) pronounced as ‘Reading’ (the Berkshire town located west of London). I have also caught myself anthropomorphising the ‘voice’ as a person (‘her’). There are many benefits to using text-to-speech.

Benefits on literacy

Although few studies indicate whether text-to-speech increases literacy, a study by Brunow & Cullen (2021) found it to be beneficial for listening comprehension, although it is not comparable to the interventionist support of a human teacher. Research conducted by Svensson et al. (2021) has found that reading ability, motivation, and performance increase with the use of text-to-speech. These findings suggest that text-to-speech is supplementary, rather than comprehensive, and does not substitute for human involvement in the educational process (Wood et al., 2018).

Visual stress

One of the benefits of using text-to-speech is that it alleviates visual stress, reducing eye strain. This function is necessary when someone has a neurodiverse condition like dyslexia or ADHD. Text-to-speech relies upon auditory skills rather than the complexity of visually reading a page. This is a revolutionary step for dyslexic students struggling to read text on the screen.

Editing and proofreading

For the purposes of editing and proofreading, the immediate benefits of text-to-speech are huge: it helps you to detect spelling and grammatical mistakes and awkward sentence structures, and to check for consistency and coherence. I have found it particularly useful in identifying word misplacement.

Writing style analysis

Even though I am not dyslexic, I use MS Speak. I used it repeatedly for this blog post, both in Word and in WordPress. What does my writing sound like? Are there any errors, misplaced words, gaps, or too many words? How does it flow? What is the personality of my writing voice? These are simple questions and text-to-speech, I feel, has the ready answers. Such writing style analysis identifies your writing voice using natural language processing (NLP) tools, analysing writing patterns, sentence structures and other linguistic features.

The allyship of text-to-speech software

Text-to-speech software has become an indispensable ally in writing. Will you invite this accessible technology into your assignments and check your writing? Nowadays I would not write a longer piece without it. Text-to-speech is a welcome friend in that regard.

References

Brunow, D.A. & Cullen, T.A. (2021). Effect of Text-to-Speech and Human Reader on Listening Comprehension for Students with Learning Disabilities. Computers in the Schools: Interdisciplinary Journal of Practice, Theory, and Applied Research, 38 (3), 214–231.

Grundhauser, E. (2017). The Voder, the first machine to create human speech. Available from: The Voder, the First Machine to Create Human Speech – Atlas Obscura [Accessed 15th May 2024].

Icecreamapps.com. (2024). Best Text To Speech Book Readers 2024: Top 8 – Icecream Apps. Available from: Best Text To Speech Book Readers 2024: Top 8 – Icecream Apps [Accessed 21st May 2024].

Sanusi, T. (2023). From Hawking to Siri: The evolution of speech synthesis. Available from: From Hawking to Siri: The Evolution of Speech Synthesis | Deepgram [Accessed 15th May 2024].

Svensson, I., Nordström, T., Lindeblad, E., Gustafson, S., Björn, M., Sand, C., … Nilsson, S. (2021). Effects of assistive technology for students with reading and writing disabilities. Disability and Rehabilitation: Assistive Technology, 16 (2), 196–208.

University of Lincoln. (2024). Screen Readers – Screen Readers and Accessibility – Guides at University of Lincoln. Available from: Screen Readers – Screen Readers and Accessibility – Guides at University of Lincoln [Accessed 21st May 2024].

Wood, S.G. et al. (2018). Does Use of Text-to-Speech and Related Read-Aloud Tools Improve Reading Comprehension for Students with Reading Disabilities? A Meta-Analysis. Journal of Learning Disabilities, 51 (1), 73–84.

AI and Academic Writing

Where are we with AI and academic writing? Frankly, the situation is a little chaotic. The institutional line is still often that of ‘academic offence’, even though the institutions know this would be extremely hard to enforce. One reason is that anti-plagiarism software like Turnitin is in perpetual catch-up mode with the AI available, so unless everyone sticks with ChatGPT (free from OpenAI), which they won’t, Turnitin’s detection algorithms will be outpaced by newcomers and rephrasers. Another is that students sometimes write in styles identical to that of an AI. The common form of this is the very capable second (or more) language student; owing to the academic way the language is often learned, these students follow precise rules and sometimes do it extremely well. In turn, they follow the rules into stylistic use and end up sounding sufficiently like ChatGPT that the software (and sometimes staff) pick it up. They then end up hauled in front of some disciplinary body just for being extremely smart.

This means that enforcement is hard to achieve, as one has to combine the appearance of an AI-like style with some other evidence, e.g. a sudden alteration in style. This is possible but time-consuming for academics, and if the student uses the AI throughout from the start, then no style-change detection will be possible; rephrasing software compounds the issue. Bearing in mind we’re in the total infancy of this technology, this is a difficult situation. I say difficult with no little thought; the situation is difficult not because of a negative connotation of difficulty, but rather because it is literally difficult to know what to do from here.

The essential question is: ‘is assessment by academic writing in its current form a dead horse that we need to stop flogging?’ And if it isn’t dead yet, how long before it is dead (if indeed it will be dead at some point)? Personally I would say it isn’t dead yet, but its death is probably between two and five years away. How will we know? We’ll know because the ability of AI to construct academic writing for students (and staff) will have permanently outstripped our ability to detect it, either with software or with our minds.

That is, in both content and style, AI will produce work for students who wish to use it, which means that, at least for the written components, their engagement with the material can be pretty much nil if they don’t want to engage. Furthermore, any student who, let’s say for integrity reasons, chooses to write their own work may find themselves penalised for limiting themselves to their human writing skills. Thus their integrity will quite possibly get them a lesser grade than their AI-using colleagues.

But as we’re not there (yet), what can we do in this strange hinterland? This issue itself seems related to the future of AI and our interactions with it. That is, how guilty we feel about the interactions that we encourage turns partially on what it will become. However, since we cannot know where we are headed, we don’t know how guilty to feel. What do I mean by ‘feeling guilty’? I mean this sense that we are cheating when we get AI to do work for us. Isn’t this a kind of crucial border, this meeting place between a legitimate productive use and losing part of ourselves which we possibly need to preserve?

Maybe we can sketch out two broad trajectories. In one, AI supplants our need for writing skills as it can produce any text we need more accurately and with greater detail than we can achieve. In another, writing skills continue to be needed because AI continues to fail to capture human synthetic abilities to generate insights. Because these insights were formed from human-generated cognitive concatenations (consciously or unconsciously), the argumentative structures cannot be automatically written up by the AI and hence the ability to lay out the argument, etc., is still needed.

What is obvious is the blur of these heuristics. The former seems strange insofar as it indicates that whatever we want to write on, the AI can do it for us. This aligns this trajectory roughly with what some (mostly undergraduate) students might use it for, whilst the latter one seems more indicative of research usage.

The blur occurs because in the first case the student will still have an idea that they want the AI to write the essay on (admitting they also might not). Either way they have to engage with the AI, and unless they literally want to hand in the first thing it writes, they have to do some thinking and engaging. No one is saying this minimal engagement is a good thing; it just means that even the laziest version has to have some effort in it. The second trajectory suggests that writing is still needed. However, once the researcher has had this synthesising insight, whilst the AI may not be able to reconstruct their argument by itself, it can certainly help if you give it the different propositions and ask for paragraphs to be constructed around them. The general point is that the second trajectory, unless the academic is a kind of purist, doesn’t deny that AI could be used to help out with the writing.

It seems fairly clear that we want to avoid trajectory one, yet trajectory two could easily encompass quite a lot of AI-written input. It seems to me the crucial part here was the academic’s synthesising idea. This idea was only made possible by the reading and thinking (conscious and unconscious) that the academic did. This reminds us that, of course, what is important in the educational/research process is actually comprehension. The first option strikes us as so bad because comprehension is extremely low. I tried to highlight how the redeeming part of trajectory one is that it sits on a gradient, on which some students will at least have an idea on the topic, get the AI to write the paper, and then read it to make sure it’s good. This redeeming aspect is their thinking engagement and comprehension.

Going forward with AI we need to find ways to emphasise comprehension of subject matters. We also need to accept the potential of AI to write for us, to help us write our ideas. The danger does lie in the lack of comprehension, but arguably there is a lot of lack of comprehension already; AI is just bringing the latent lack of student integrity out of the system and exposing it.

Academic writing in the traditional sense may well ultimately be largely supplanted by AI, but academic reading (and all other forms of learning, argument formation and thinking) cannot be allowed to go the same way. Indeed, in exposing the possible lack of motivation in the system, we can use this to think of new ways to engage students in understanding their subjects and helping them want to understand their subjects. The best the AI can be for us is probably a new interlocutor. As soon as we have our new research insight, it goes into the system (the available research). From here it can be accessed by the AI to help other researchers, who must think carefully and through their own multiple inputs create new insights.

So the guilt issue should not be viewed so much as an issue with writing; it’s an issue with comprehension. We need to absolve ourselves of this nebulous guilt by the best practice of writing with AI and ensure that we remain active comprehenders, processors and producers of information, as opposed to passive receivers of AI insights. So long as we are exercising our capacities to think and comprehend to the best of our ability, then the AI becomes a partner that could be incredibly empowering. The danger lies in our handing cognition and production over to it.

Podcasts and social confidence

Podcasts. I love them. The range of subject matter available is astonishing. I listen to several on a regular basis, ranging from the esoteric to travelogues to sport. Each has its own personality: some quirky, others informative, and, if you are lucky, both.

Podcasts are soaring in popularity: in the UK alone, over forty percent of internet users listen to at least one podcast a month (Götting, 2023). UK podcast demographic data is revealing too: listeners tend to be younger males, higher earners, looking for innovation, urbanised, into sports and fitness, keen to learn new things, and politically left leaning (Götting, 2023). They also have greater informational needs and experience a heightened sense of community (Ellwood, 2022). The market is huge and accessible, with YouTube, Apple Podcasts, SoundCloud, Wondery, Stitcher, Spotify and BBC iPlayer among the main platforms. Yet their influence goes beyond consumerist tastes or self-identity.

Podcasts can also boost our social confidence. Like reading, podcasts can influence us subconsciously. Their conversational style can be infectious. When I am presenting or in a meeting, I am mindful of the way I sound, conscious of the use of my voice and the energy generated in the room. Have you ever sat in a lecture or a meeting and sensed the energy of the person who is speaking? It has an instrumental quality. Each of us has a projection of some kind. Radio, podcasts, the listening ear all have a duty to perform. Tobin & Guadagno’s (2022) illuminating study into podcast listening found positive outcomes such as parasocial relationships and social engagement, and the fulfilment of basic psychological needs for autonomy, competence, and relatedness. Their research also found that listening to podcasts improves active listening skills and may subconsciously influence the way we act socially.

Since Covid, many of us have started to present online for the first time, something that felt very alien to begin with. Podcasts are one way to think about the way we want to present ourselves. Chatty, friendly, engaging, social, adaptable, podcasts are a brilliant way of entering a space that opens up the possibility of being demonstrably authentic and comfortable in our skins before a broader public. Examine the way some presenters engage with their audience, or how they deal with someone asking a difficult or challenging question; it is often something to admire and absorb. These are the tools of the trade when presenting with social confidence, which are, perhaps, subconsciously channelled via podcast listening.

What podcasts do you listen to? Do they orientate your worldview or conversational style?

References

Ellwood, B. (2022). Listening to podcasts may help satisfy our psychological need for social connection, study finds. Available from: Listening to podcasts may help satisfy our psychological need for social connection, study finds (psypost.org) [Accessed 29th April 2024].

Götting, M. C. (2022). Leading podcast platforms in the U.S. 2020, by age group. Available from: Top podcast platforms in the U.S. 2020 | Statista (lincoln.ac.uk) [Accessed 26th April 2024].

Götting, M. C. (2023). Podcasts in the UK – statistics & facts. Available from: Podcasts in the UK – statistics & facts | Statista (lincoln.ac.uk) [Accessed 26th April 2024].

Götting, M. C. (2024). Podcast listenership: selected countries and regions worldwide 2022-2026. Available from: Podcast listeners worldwide by country and region 2022-2026 | Statista (lincoln.ac.uk) [Accessed 26th April 2024].

Tobin, S. J. & Guadagno, R. E. (2022). Why people listen: Motivations and outcomes of podcast listening. PLoS ONE, 17 (4). Available from: Why people listen: Motivations and outcomes of podcast listening | PLOS ONE [Accessed 29th April 2024].