Accents & AI - I’ve Got Questions.

I recently had a discussion with a friend about the power of AI, the role of AI in speech therapy, and the future of communication.

What Is Communication?

Communication is, in its most basic form, a means of sending or receiving information: a way to transmit, impart, or exchange information between two parties.

As technology allows, the exact vehicle that this exchange occurs on changes.

Early on, this was visually communicated via gestures and behaviours.

This grew into a socially agreed-upon system of sound-meaning connections.

As verbal communication took off, the next stage was written communication: the means to communicate our verbal histories via pictographic representations of meaning.

The development of writing allowed humans to record information, transmit knowledge across generations, and communicate over long distances. Written communication played a crucial role in the development of civilizations, facilitating trade, governance, and cultural exchange.

My question is, as AI increasingly inserts itself into our lifestyles - making things more convenient and applying what is known about the world to our individual lives - does it have a role in our communication?

Science Fiction stories have already imagined it.

The universal translator.

The interesting thought is what goes into a translator AI product or service.

Now, let’s be clear. I don’t know what I’m talking about from a technological standpoint. That’s not where I want to take this thought exercise. What I am instead trying to do is consider the human communication implications, consider what it means to communicate, and even understand the concept of identity as it pertains to communication and accents.

Who Designs It & With What Intentionality?

We all know it’s coming. The real questions are how soon, and who is creating it? It is important that intentional effort goes into designing AI algorithms that promote inclusivity and diversity.

AI models such as large language models, even if not intentionally malicious, are shaped by the type of data they are fed. English is the primary language of the internet, so it is critical to recognize that the data fed into these models can introduce bias in training and inequality in access and representation. This means bias towards English language patterns, cultural references, and perspectives, which can lead to limitations in understanding and generating content in other languages and cultures.

If there’s less data in other languages, then models trained primarily on English data will have limited proficiency in other languages, which can result in lower accuracy and fluency when processing or generating non-English output.

Obviously, this also means a development bias toward English. Research and development efforts in AI may prioritize English-language applications and use cases, potentially neglecting the needs and preferences of non-English-speaking populations.

Cultural and non-English phrases, idioms, and sayings may not transfer or translate well into English and may subsequently be discarded or ignored.

Thinking About Use Cases.

The model use case for a universal translator device would be during travel: stumbling into a coffee shop, pulling out a translation device, and having a seamless interaction with someone through real-time translation.

The translation device would translate the user’s request for coffee into the local language and translate the local speaker’s response back into the user’s native language.

But what if the translation isn’t as good in one direction as it is in the other?

What if slang, specific idioms, or colloquialisms don’t work? What if they only work one way and not the other?

For example, take the Canadian double-double (2 cream, 2 sugar). Would a translation from Japanese to English convert a 2 cream, 2 sugar coffee order into a double-double? And would that matter? Or would it not?

Would losing that double-double slang in translation make a difference to the communication? Probably not. But would it impact the cultural communication, or slowly shift communication toward a “right” way to say things and a “wrong” way? Possibly.

What makes language and communication so interesting is the creativity and seemingly infinite possibilities of communication. Narrowing that creativity into acceptable or preferred standardizations makes communicating content easier but may in turn sacrifice some of the nuance.

Communication Is Interconnected with Identity.

When we talk about language and communication, it is absolutely interconnected with identity. One’s cultural identity is a reflection of one’s upbringing, socio-economic background, and language. It shapes one’s perception of the world and filters one’s interactions through a lens. When translation services exist, a third party and an additional lens are applied. When the translation is moderated by a human, it is this third human’s lens. When it is a program or service, it is the lens of that program or service’s designers and creators.

By being a part of that interaction, does it impact one’s identity? Now for trivial interactions, it may not matter. Asking for directions in a foreign country to the nearest bathroom? Probably an inconsequential slight on your identity.

But what about important speeches, debates, diplomatic interactions? There’s a reason why important texts have multiple interpretations - because each one can meaningfully change the tone and feel of a text.

A present-day consideration in the same vein would be translated books. Do they still get the gist of the ideas across? Sure. But whether they capture the nuance, the subtlety, the slang, the metaphor, the hyperbole, the intricacies of the original text, may be up for debate.

And that brings me to the next point.

The Burden Of Responsibility.

With something like AI coming into this space, things can foreseeably get hairy if an AI system is overlaid atop interpersonal communication, because it becomes the middleman. And in its role as middleman, it can affect whether an interaction succeeds or fails.

The decisions built into an AI’s responsibilities and duties in an act such as translation can hugely impact an interaction, and when the AI is touted as a go-to service, those decisions can become a vulnerability.

When we get to a point in which translation is widespread, would learning other languages still be important? Would it be meaningful? Would it be rude to ask someone to use a translation service rather than attempting to communicate in the learned language yourself?

What About Accents?

Where would accented speech land in this space? Is there a best accent or model accent to use? When translating for English, are we defaulting to British English, North American English, Australian English, or any other option available? If we communicate in a Texan English accent, how do we feel having our accented English converted to a General American English accent?

What about Spanish? How do we feel about a Colombian Spanish accent compared with a Peruvian Spanish accent or a Mexican Spanish accent? (And yes, I understand there are regional accents within these larger groupings.)

What Matters?

When I think about it, I think it’s an unimaginably amazing technology that is coming. The capacity for increased interconnectedness of communities and globalization is at the level of air travel. This level of interconnectedness will 100% help many people.

But just like how air travel was and still remains a luxury for those in a position of privilege, there’s an invisible cost to these types of services and a potential for biased predisposition.

Maybe individuals are fine with sacrificing the nuance if it means their content is understood. After all, it’s better than even the gist of things not being understood.

Maybe cultural identity can be dissociated from the act of communicating. After all, I can still self-identify with my culture while using technology. It’s not like Google Translate takes something away from me as a Canadian Chinese person when I travel to foreign places.

I don’t have an answer as to what is best.

If anything, this feels more like the disorganized ramblings of a person who has no idea what he’s talking about.

- A Concerned & Ignorant Thinker.
