The Race to Save Indigenous Languages, Using Automatic Speech Recognition

Michael Running Wolf still has the old TI-89 graphing calculator he used in high school, the device that helped propel his interest in technology.

“Back then, my teachers saw I was really interested in it,” says Running Wolf, clinical instructor of computer science at Northeastern University. “Actually a couple of them printed out hundreds of pages of instructions for me on how to code” the device so that it could play games. 

What Running Wolf, who grew up in a remote Cheyenne village in Birney, Montana, didn’t realize at the time, poring over the stack of printouts at home by the light of kerosene lamps, was that he was actually teaching himself basic programming.

“I thought I was just learning how to put computer games on my calculator,” Running Wolf says with a laugh. 

But that wasn’t his first encounter with technology. Growing up in the windy plains near the Northern Cheyenne Indian Reservation, Running Wolf says that although his family—part Cheyenne, part Lakota—didn’t have daily access to running water or electricity, sometimes, when the winds died down, the power would flicker on, and he’d plug in his Atari console and play games with his sisters.

These early experiences sparked a lifelong interest in computers, artificial intelligence, and software engineering, an interest Running Wolf is now harnessing to help reawaken endangered indigenous languages in North and South America, some of which are so critically at risk of extinction that their tallies of living native speakers have dwindled into the single digits.

Running Wolf’s goal is to develop methods for documenting and maintaining these endangered languages through automatic speech recognition software, helping to keep them “alive” and well documented. It would be a process, he says, that tribal and indigenous communities could use to supplement their own language reclamation efforts, which have intensified in recent years amid the threats facing these languages.
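
As a rough illustration of what such a pipeline involves, the sketch below transcribes a recording with a pretrained wav2vec2 model via the Hugging Face transformers library. The model checkpoint and file name are assumptions for illustration, not details of Running Wolf's actual work; a real reclamation project would fine-tune a model on transcribed recordings of the target language.

```python
# Minimal ASR-for-documentation sketch (illustrative only).
# Requires: pip install transformers torch
from transformers import pipeline

# English checkpoint used as a stand-in; a real project would
# fine-tune on field recordings of the endangered language.
asr = pipeline("automatic-speech-recognition",
               model="facebook/wav2vec2-base-960h")

result = asr("elder_recording.wav")  # hypothetical field recording
print(result["text"])                # machine-generated transcript
```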

“The grandiose plan, the far-off dream, is we can create technology to not only preserve, but reclaim languages,” says Running Wolf, who teaches computer science at Northeastern’s Vancouver campus. “Preservation isn’t what we want. That’s like taking something and embalming it and putting it in a museum. Languages are living things.”

The better thing to say is that they’ve “gone to sleep,” Running Wolf says. 

Read more: Northeastern University

Meet The Woman Who Created An App To Save Her Endangered Language

After the Vietnam War, Annie Vang’s parents escaped persecution in Laos and traversed the Mekong River in the dead of night to seek safety in Thailand. “My family had no choice but to flee or die,” she says.

Vang and her family are Hmong, an ethnic and cultural group who lost their land—and way of life—after siding with the U.S. in the fight against Communism. Like so many other Hmong people, Vang’s family resettled in a refugee camp before coming to the U.S. in the late ’70s. Growing up in Iowa, Vang remembers being bullied for having an accent and “looking different” than everyone else. “I was told to go back to my country every day,” she says. “I just wanted to be like everyone else and assimilate and forget about my Hmong roots.”

Yet as an adult, the 44-year-old is doing everything in her power to preserve her cultural history. For more than a decade, the app developer has been digitally documenting the Hmong language with HmongPhrases, an app she created to teach the Hmong language to English speakers. “It is critical we capture this, so that our language, legacy, and stories can live on,” she says.

Read more: Elle

Linguists predict unknown words using language comparison

Historical linguists have long used the comparative method to reconstruct earlier states of languages that are not attested in written sources. The method consists of the detailed comparison of words in related descendant languages, and it allows linguists to infer, in great detail, the ancient pronunciation of words that were never recorded in any form. It has long been known that the method can also predict how an undocumented word in a given language should sound, provided that at least some information on that language and on related languages is available, but this had never been explicitly tested.
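
To make the idea concrete, here is a toy sketch of that predictive logic, with made-up sound correspondences between two hypothetical related varieties. It is not the authors' workflow, only an illustration of how regular correspondences let a form attested in one variety predict its unattested cognate in another.

```python
# Toy comparative-method prediction: if sound correspondences between
# related varieties A and B are regular, a word attested in A lets us
# predict its unattested cognate in B. Rules below are invented.
CORRESPONDENCES = {
    "kh": "k",   # aspirated stops deaspirate in B
    "u": "o",    # high back vowel lowers in B
    "r": "l",
}

def predict_cognate(word_a: str) -> str:
    """Apply each correspondence rule left to right; longer rules
    apply first so digraphs like 'kh' are not split into 'k' + 'h'."""
    out, i = [], 0
    rules = sorted(CORRESPONDENCES, key=len, reverse=True)
    while i < len(word_a):
        for src in rules:
            if word_a.startswith(src, i):
                out.append(CORRESPONDENCES[src])
                i += len(src)
                break
        else:
            out.append(word_a[i])  # no rule: sound unchanged in B
            i += 1
    return "".join(out)

print(predict_cognate("khur"))  # -> "kol", the predicted cognate
```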

Two researchers from SOAS University of London and the Max Planck Institute for the Science of Human History have recently published a paper in Diachronica, a renowned international journal for historical linguistics. In the article, they describe the results of an experiment in which they applied the traditional comparative method to explicitly predict the pronunciation of words in eight Western Kho-Bwa linguistic varieties spoken in India. Belonging to the Trans-Himalayan family (also known as the Sino-Tibetan or Tibeto-Burman language family), these varieties have not yet been described in much detail, and many of their words had never been documented in fieldwork. The scholars started their experiment with an existing etymological dataset of Western Kho-Bwa varieties collected during fieldwork in the Indian state of Arunachal Pradesh between 2012 and 2017. Within the dataset, the authors observed multiple gaps where the word forms for certain concepts were missing.

“When conducting fieldwork, it is inevitable that you miss out on some words. It’s kind of annoying when you observe that afterwards, but in this case, we realized that this was the perfect opportunity to test how well the methods for linguistic reconstruction actually work,” says Tim Bodt, first author of the study.

Read more: EurekAlert!

In Brazil, smartphone initiative keeps indigenous languages alive

For smartphone users deep in the Amazon, sending a text message in the Nheengatu language just got easier – giving their endangered native tongue a better chance of survival in the digital age.

Nheengatu, spoken by Amazon tribes in Brazil, Venezuela and Colombia, is now available as a language option on a range of new Motorola phones and any mobiles that run the Android 11 operating system.

Kaingang, an indigenous language spoken by some 19,000 people in southern Brazil, is also being offered as part of the project financed by the smartphone manufacturer.

Both languages are at risk, as young people stop using them, but linguists say initiatives like Motorola’s not only encourage their use in daily life but help restore “prestige” to endangered tongues.

While not spoken by many people, Nheengatu and Kaingang have had an important cultural impact. Many words in Brazilian Portuguese originate from Nheengatu, and the language has given names to hundreds of species of fauna and flora in the Amazon. 

Wilmar D’Angelis, a professor at Campinas State University (Unicamp) who led the smartphone project, told the Thomson Reuters Foundation why taking indigenous languages digital could help to keep them alive.

Read more: Thomson Reuters Foundation

Can virtual reality help save endangered Pacific languages?

The Pacific is the most linguistically rich region in the world, with Papua New Guinea alone being home to a staggering 850 languages.

Yet experts fear that widespread language loss could be the future for the region.

To draw attention to the issue, and to document more Pacific languages, Australian researchers are trialling a new way of making their database of languages more exciting and accessible.

To do this, they are turning to virtual reality technology.

“We’ve got this fantastic resource — a database of a thousand endangered languages,” lead researcher Dr Nick Thieberger from the University of Melbourne said.

“But it’s not very engaging, it’s a bit dull, so we wanted to do something to change that.”

Over the past 15 years, researchers from Australian universities have been digitalising recordings of languages and storing them in the Pacific and Regional Archive for Digital Sources in Endangered Cultures (PARADISEC).

The database has documented more than 6,000 hours of recordings from over 1,000 languages.

Earlier this year, Dr Thieberger, Dr Rachel Hendry — a lecturer in digital humanities — and media artist Dr Andrew Burrell created a virtual reality experience using files from the database.

Audiences don a pair of virtual reality goggles, allowing them to “fly across” Pacific nations such as Vanuatu and Papua New Guinea.

As they do so, shards of light emerge that play clips of local languages.

The VR display is currently only exhibited in museums, but the team is working on versions that could be accessed anywhere.

Read more: ABC News

AI is inventing languages humans can’t understand. Should we stop it?

Bob: “I can can I I everything else.”

Alice: “Balls have zero to me to me to me to me to me to me to me to me to.”

To you and me, that passage looks like nonsense. But what if I told you this nonsense was the discussion of what might be the most sophisticated negotiation software on the planet? Negotiation software that had learned, and evolved, to get the best deal possible with more speed and efficiency–and perhaps hidden nuance–than you or I ever could? Because it is.

This conversation occurred between two AI agents developed inside Facebook. At first, they were speaking to each other in plain old English. But then researchers realized they’d made a mistake in programming.

“There was no reward to sticking to English language,” says Dhruv Batra, visiting research scientist from Georgia Tech at Facebook AI Research (FAIR). As these two agents competed to get the best deal–a very effective bit of AI vs. AI dogfighting researchers have dubbed a “generative adversarial network”–neither was offered any sort of incentive for speaking as a normal person would. So they began to diverge, eventually rearranging legible words into seemingly nonsensical sentences.

“Agents will drift off understandable language and invent codewords for themselves,” says Batra, speaking to a now-predictable phenomenon that Facebook has observed again, and again, and again. “Like if I say ‘the’ five times, you interpret that to mean I want five copies of this item. This isn’t so different from the way communities of humans create shorthands.”
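
Batra’s “five copies” example amounts to a degenerate but perfectly decodable protocol. Here is a toy sketch of such a repetition code, purely illustrative and unrelated to Facebook’s actual models:

```python
# Toy repetition code: with no reward for staying in English, agents
# can settle on "repeat a token N times" to mean "N copies of item".

def encode(item: str, count: int) -> str:
    return " ".join([item] * count)

def decode(message: str) -> tuple:
    tokens = message.split()
    return tokens[0], len(tokens)  # (item, how many copies)

msg = encode("the", 5)  # "the the the the the"
print(decode(msg))      # ('the', 5) -> "I want five copies"
```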

Read more: Fast Company

Glove turns sign language into text for real-time translation

Handwriting will never be the same again. A new glove developed at the University of California, San Diego, can convert the 26 letters of American Sign Language (ASL) into text on a smartphone or computer screen. Because it’s cheaper and more portable than other automatic sign language translators on the market, it could be a game changer. People in the deaf community will be able to communicate effortlessly with those who don’t understand their language. It may also one day fine-tune our control of robots.

ASL is a language all its own, but few people outside the deaf community speak it. For many, signing is their only language, as learning written English, for example, can be difficult without the corresponding sounds to go with it.

“For thousands of people in the UK, sign language is their first language,” says Jesal Vishnuram, the technology research manager at the charity Action on Hearing Loss. “Many have little or no written English. Technology like this will completely change their lives.”

When they need to communicate with people who are not versed in ASL, their options are limited. In the UK, someone who is deaf is entitled to a sign language translator at work or when visiting a hospital, but at a train station, for example, it can be incredibly difficult to communicate with people who don’t sign. In this situation a glove that can translate for them would make life much easier.

Read more: New Scientist

Elon Musk and linguists say that AI is forcing us to confront the limits of human language

In analytic philosophy, any meaning can be expressed in language. In his book Expression and Meaning (1979), UC Berkeley philosopher John Searle calls this idea “the principle of expressibility, the principle that whatever can be meant can be said”. Moreover, in the Tractatus Logico-Philosophicus (1921), Ludwig Wittgenstein suggests that “the limits of my language mean the limits of my world”.

Outside the hermetically sealed field of analytic philosophy, the limits of natural language when it comes to meaning-making have long been recognized in both the arts and sciences. Psychology and linguistics acknowledge that language is not a perfect medium. It is generally accepted that much of our thought is non-verbal, and at least some of it might be inexpressible in language. Notably, language often cannot express the concrete experiences engendered by contemporary art and fails to formulate the kind of abstract thought characteristic of much modern science. Language is not a flawless vehicle for conveying thought and feelings.

In the field of artificial intelligence, technology can be incomprehensible even to experts. In the essay “Is Artificial Intelligence Permanently Inscrutable?” Princeton neuroscientist Aaron Bornstein discusses this problem with regard to artificial neural networks (computational models): “Nobody knows quite how they work. And that means no one can predict when they might fail.” This could harm people if, for example, doctors relied on this technology to assess whether patients might develop complications.

Bornstein says organizations sometimes choose less efficient but more transparent tools for data analysis and “even governments are starting to show concern about the increasing influence of inscrutable neural-network oracles.” He suggests that “the requirement for interpretability can be seen as another set of constraints, preventing a model from a ‘pure’ solution that pays attention only to the input and output data it is given, and potentially reducing accuracy.” The mind is a limitation for artificial intelligence: “Interpretability could keep such models from reaching their full potential.” Since the work of such technology cannot be fully understood, it is virtually impossible to explain in language.

Read more: Quartz

Has Google made the first step toward general AI?

Artificial intelligence (AI) has long been a theme of sci-fi blockbusters, but as technology develops in 2017, the stuff of fiction is fast becoming a reality. As technology has made leaps and bounds in our lives, the presence of AI is something we are adapting to and incorporating into our everyday existence. A brief history of the different types of AI helps us to understand how we got where we are today, and more importantly, where we are headed.

A Brief History of AI

Narrow AI – Since the 1950s, specific technologies have been used to carry out rule-based tasks as well as, or better than, people. Good examples are the Manchester Electronic Computer playing chess, or the automated voice you speak with when you call your bank.

Machine Learning – Algorithms that use large amounts of data to ‘train’ machines to identify and separate data into subsets that can be used to make predictions have been in use since the 1990s. In essence, the large amounts of data allow machines to learn rather than follow predefined rules. Apple’s digital assistant, Siri, is one example of this. Machine translation, as used for web page translation, is also a common application.
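
A minimal sketch of that train-then-predict pattern, using scikit-learn and its bundled iris dataset (my choice of library and dataset for illustration, not the article's):

```python
# Fit a classifier on labeled examples, then predict unseen ones:
# the model learns decision boundaries from data, not hand-written rules.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = DecisionTreeClassifier().fit(X_train, y_train)  # "training"
print(model.score(X_test, y_test))  # accuracy on held-out data
```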

Read more: The London Economic

Instant Braille translator can fit in your hand

An all-woman team of six engineering undergraduate students at MIT has created an inexpensive, hand-held device prototype that provides real-time translation of printed text to Braille — which could greatly increase accessibility of written materials for the blind.

Team Tactile was one of the winners of the Lemelson-MIT Student Prize this year for their creation, which translates printed text into the raised-dot writing system.

Here’s how it works: The device has an internal camera that takes photos of the printed text, which is then converted into digital text using optical character recognition software. Next, the text is translated into Braille, and a mechanical system raises and lowers pins on the surface of the Tactile that form the characters to be read by one’s fingertips.
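
In software terms, that pipeline looks roughly like the sketch below. The libraries and the five-letter Braille table are my assumptions for illustration, not details of Team Tactile's actual implementation.

```python
# Sketch of the photo -> OCR -> Braille pipeline (illustrative only).
# Requires: pip install pillow pytesseract (plus the Tesseract binary)
from PIL import Image
import pytesseract  # optical character recognition

# Grade-1 Braille: Unicode braille patterns for a few letters;
# a full table would cover all 26 letters plus punctuation.
BRAILLE = {"a": "\u2801", "b": "\u2803", "c": "\u2809",
           "d": "\u2819", "e": "\u2811"}

def to_braille(text: str) -> str:
    # Characters outside the demo table pass through unchanged
    return "".join(BRAILLE.get(ch, ch) for ch in text.lower())

text = pytesseract.image_to_string(Image.open("page.jpg"))  # hypothetical photo
print(to_braille(text))  # each cell would drive one set of pins
```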

Though the current version is limited in the number of characters it can translate and display, the team hopes to make the device capable of scanning an entire page at a time and displaying two lines of text at once.

Read more: New Scientist

How Language Led To The Artificial Intelligence Revolution

In 2013 I had a long interview with Peter Lee, corporate vice president of Microsoft Research, about advances in machine learning and neural networks and how language would be the focal point of artificial intelligence in the coming years.

At the time the notion of artificial intelligence and machine learning seemed like a “blue sky” researcher’s fantasy. Artificial intelligence was something coming down the road … but not soon.

I wish I had taken the talk more seriously.

Language is, and will continue to be, the most important tool for the advancement of artificial intelligence. In 2017, natural language understanding engines are what drive the advancement of bots and voice-activated personal assistants like Microsoft’s Cortana, Google Assistant, Amazon’s Alexa and Apple’s Siri. Language was the starting point and the locus of all new machine learning capabilities that have come out in recent years.

Language—both written and spoken—is what is giving rise to a whole new era of human-computer interaction. When people had trouble imagining what could possibly come after smartphone apps as the pinnacle of user experience, researchers were building the tools for a whole new generation of interfaces based on language.

Read more: ARC
