
Decoding Speech: The Shared Struggles of Speech-to-Text Apps and Deaf Individuals

A human torso in a gray shirt with hands extended to front. In between the hands, a bot is floating with a speech bubble and other graphics.

In some situations where interpreters are not available, I’ve had the opportunity to be accompanied by a hard-of-hearing person who knows sign language. However, due to their limited hearing, they may encounter challenges in processing speech accurately. I’m sharing this because their experiences closely resemble the struggles faced by speech-to-text apps in many ways.

In this article, I delve into these shared experiences, helping you better understand the intricacies of processing speech, and shedding light on why relying solely on auto captions may not be the perfect solution.

Many assume that auto-generated text serves as an ideal fix for all deaf people, yet few have truly witnessed its performance. Some may grow annoyed with captioning errors, while others find humor in them. However, it’s vital to grasp that laughing at captioning errors is akin to chuckling at a deaf person’s misunderstanding of spoken words. Similarly, getting annoyed with errors is like being impatient with a deaf person who asks you to repeat yourself. After all, people with hearing loss who rely on lipreading or hearing devices often find themselves filling in gaps to grasp conversations or presentations.

Now, let’s explore the experiences of deaf and hard-of-hearing individuals:

As a deaf person myself, I know lipreading only provides about 30% of the visual information I need, leaving the rest to various factors. Lipreading can feel like a guessing game, similar to playing Wheel of Fortune or Hangman – a truly exhausting process. I can effectively lipread only those whom I know well and who articulate clearly. One-on-one conversations are manageable, but group settings render understanding nearly impossible, even with people I can comprehend individually. This often leads to impatience or annoyance when I request others to repeat or rephrase their words.

Even individuals with some residual hearing relying on hearing devices may not catch every word accurately. Similar to lipreading, they frequently have to fill in blanks while listening and might misinterpret or guess words based on context. Sadly, they hesitate to ask for repetition or move to quieter areas due to the impatience they encounter from hearing individuals. Following group conversations becomes particularly challenging when voices overlap and/or when they are in noisy environments.

Considerations for using speech-to-text apps:

Enter speech-to-text apps, which mirror the behavior of deaf and hard-of-hearing individuals. These apps thrive when the speaker is clear and loud. However, their accuracy wanes with poor audio quality or when someone mumbles or has a foreign accent. Specialized terminology further complicates matters. In noisy environments or when multiple people speak simultaneously, they may stop working altogether.

As a deaf person, I find using a speech-to-text app equally frustrating. I have much less ability to decipher speech than hard-of-hearing people or users who benefit from hearing devices. In noisy situations or when people talk over each other, the app stops working, in the same way hard-of-hearing people have difficulty understanding speech in noisy or multi-speaker environments. When even a speech-to-text app cannot decipher speech well or at all, my frustration compounds and my experience becomes even worse.

Therefore, even in informal situations, relying solely on auto captions falls short. Please don’t assume that just pushing a button to auto-generate text solves the problem.

To ensure optimal performance, certain prerequisites must be considered when using speech-to-text apps:

  • Internet access: Many apps rely on an internet connection, though some work offline with slightly less accuracy.
  • Battery life: Due to the intensive text generation process, these apps tend to drain battery quickly.
  • Good audio: Noisy environments, soft or quick speech, and multiple simultaneous speakers challenge speech-to-text accuracy.

Tips for Informal and Formal Situations:

Understanding these limitations is crucial for you as a hearing person:

  • Laughing at auto-generated text errors or getting annoyed without realizing its impact on those who rely on accurate speech-to-text access is unjust.
  • Recognize the frustration experienced by lipreaders and hearing device users who must fill in blanks and are occasionally laughed at when they misunderstand.
  • Most importantly, never assume speech-to-text access is a perfect solution for deaf individuals, as failing to comprehend its limitations hinders effective communication even in seemingly simple conversations.

In informal settings, consider these tips:

  • Use a speech-to-text app on your own phone to avoid burdening deaf or hard-of-hearing individuals.
  • Ensure power access for your phone and theirs.
  • Make sure audio is clear or consider moving to quieter areas.
  • Allow deaf individuals to be close to the speaker for better results.
  • Provide wifi access if requested.
  • Ask if they need any other accommodations.
  • For small events, using a microphone benefits hearing device users and speech-to-text technology.
  • Test various apps to understand how they work and avoid assuming it’s a magic solution. Periodically check auto-generated captions during conversations to avoid misunderstandings and to make sure they keep running.

For larger and more formal events, it is crucial to prioritize the involvement of consultants, like myself, to ensure the optimal accessibility of event content.

Having firsthand experience with the challenges, I can attest that following text with numerous typos and continuously guessing what should have been said is truly exhausting, akin to you, as a hearing person, trying to decipher poor speech for an extended period.

Moreover, it’s important to recognize that solely relying on speech-to-text accuracy, whether generated by humans or machines, falls short in providing a truly inclusive experience. The readability of on-screen text is equally vital, and there are numerous factors to consider for ensuring the optimal accessibility of media and events.

As an accessibility consultant, I specialize in coordinating various elements for both the physical and digital aspects of an event. Working closely with event organizers, I ensure that all accessibility elements are seamlessly integrated, allowing disabled attendees to fully enjoy the event without having to repeatedly explain their needs.

If your organization is committed to optimizing media and event accessibility, I encourage you to get in touch with me for consulting and training services. Together, we can bridge the gap and ensure seamless communication for all, fostering a more inclusive and welcoming environment.
