Indigenous languages in the age of AI

Indigenous languages are at the core of the development of linguistic science, which in turn underlies the recent advances in computational language models that will change society forever. Without indigenous languages, no computational innovation connected to language would have been possible. And I think that will happen again in the AI era.

One of the main discoveries about the sounds of human languages happened when Edward Sapir was studying the sounds of Paiute at the beginning of the 20th century. He noticed that Paiute speakers would say /d/ and /t/ interchangeably in certain contexts, and that the /d/ would turn into /t/ when they spoke slowly. Thus, the sounds of a language could change according to the surrounding sounds, morphological function, or speed, but regardless of the changes, the speakers maintained that the /t/ was a /t/ even when it sometimes sounded like a /d/. After that, language experts started to find that the same phenomenon was possible in all languages. For example, in American English the /t/ and /d/ turn into a tap sound similar to the Spanish /r/, but native speakers do not realize they are saying something different. What is intriguing is that the speaker is able to trace the tap back to a /t/ or a /d/, which means that the distinction remains in the mind of the speaker. In Spanish, on the other hand, the /d/ becomes a /t/ or gets completely deleted at the end of the word, but native speakers are able to recognize that there is supposed to be a /d/ regardless of how they pronounce it.

Generative linguistics as proposed by Chomsky seemed straightforward when English was the only language being studied. Words and grammatical categories could be arranged in tree-like structures to show how they connect to form phrases. However, when studying Navajo and other Athabaskan languages, Ken Hale discovered that such a structural approach needed an intense rearrangement in order to explain indigenous languages. This showed that structures were also internal to words, and that connections among words implied myriad networks across the various layers of meaning. This provided a foundational theory to describe Asian and African languages, and expanded the theory in a way that influenced the development of language processors and Artificial Intelligence.

In the military realm, speakers of indigenous languages contributed their intricate codes in order to conceal military communication from enemy armies. The Japanese excelled at breaking the codes the American armies created, yet they were unable to break the “Navajo Code,” which was crucial to the victory against Japan in World War II.

Going further back in time, indigenous languages provided adequate taxonomies to describe plants and animals in the Americas during the Colonial period and early Republican times. Their long-term contribution to science and to the preservation and exploitation of resources is invaluable. In the 16th century a Nahuatl-speaking indigenous scholar wrote an herbal that included each plant's Nahuatl name, a description in Latin of its characteristics and medicinal properties, and detailed, colorful drawings. This herbal compendium was later reproduced several times in the Colonial period, sometimes without proper recognition of its authors, Martín de la Cruz (who wrote it in Nahuatl) and Juan Badiano (who translated it into Latin).

In the current AI revolution, language is at the heart of a new wave of social change, in which indigenous languages may bring another unexpected breakthrough. AI, like imperial Japan in WWII, is still unable to crack the code of indigenous languages. At the moment, then, indigenous languages may provide a source of resistance against machine domination. Once AI attempts to break the code, new and unexpected developments are going to surface. Whether viewed as tools for or against the talking robot invasion, indigenous languages matter now more than ever.

Indigenous languages may exhibit more complex grammars and sound systems than European languages. Describing the system when you are not a native speaker of such a language requires intense study for a skill that does not seem marketable. As many speakers of those languages are bilingual, choosing the lingua franca is always the most natural outcome in social interaction, which discourages second language immersion. Learners also often hear discouraging words about their ability to learn those languages, which increases their loneliness and lack of support. However, second language acquisition research shows that any language can be learned at any age. The adult learner may not be able to sound like a native speaker, but any adult learner is able to attain a level of proficiency in a target language.

Indigenous languages bear the power of being inscrutable to the machine, but at the same time they should take part in the AI revolution and start forming part of Large Language Models. That would accelerate the much-needed description and documentation of indigenous words and grammars. Linguists have attempted descriptions of words and grammars, but if those were put on the computer, AI could generate faster insights into how to produce speaker-appropriate sentences. AI might transform existing sources into large grammatical descriptions and pedagogical resources suited to the user.

A common misconception is that AI developers simply chose not to add indigenous languages when training their models. Yes, that was probably not a top priority, but another reason why AI is so ignorant of indigenous languages relates to one simple fact: little is written on the web using indigenous languages as the vehicle of expression. There may be a lot ABOUT indigenous languages, but little IN indigenous languages. Thus, populating the web with text written in indigenous languages is the very first step. This is one reason why we created Corpus Of Diné Bizaad: to make a small contribution to the needs of the future.

Writing for no one to read, as disheartening as it may be, is something that we need to keep doing. Writing in indigenous languages, no matter the proficiency level, is something that we must be stubborn about when looking ahead to the society we want to build for the AI era. Writing on the web, not on social media platforms, keeps the content open for everyone to find through search engines. Writing instead of making videos keeps the content easier for the reader to find. It is tempting to lose interest in writing when web pages struggle to earn traffic as they compete with AI platforms. However, this is a temporary situation of perplexity that will not last forever.

References for this article can be found in my conversation with ChatGPT:

https://chatgpt.com/share/68e1a232-00f8-8008-9679-724b2d480060

