AI Detects Autism Speech Patterns Across Different Languages

A new study led by Northwestern University researchers used machine learning—a branch of artificial intelligence—to identify speech patterns in children with autism that were consistent between English and Cantonese, suggesting that features of speech might be a useful tool for diagnosing the condition.

Undertaken with collaborators in Hong Kong, the study yielded insights that could help scientists distinguish between genetic and environmental factors shaping the communication abilities of people with autism, potentially helping them learn more about the origin of the condition and develop new therapies.

Children with autism often talk more slowly than typically developing children, and exhibit other differences in pitch, intonation and rhythm. But those differences (called “prosodic differences” by researchers) have been surprisingly difficult to characterize in a consistent, objective way, and their origins have remained unclear for decades.

However, a team of researchers led by Northwestern scientists Molly Losh and Joseph C.Y. Lau, along with Hong Kong-based collaborator Patrick Wong and his team, successfully used supervised machine learning to identify speech differences associated with autism.

The data used to train the algorithm were recordings of English- and Cantonese-speaking young people with and without autism telling their own version of the story depicted in a wordless children’s picture book called “Frog, Where Are You?”
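The study's exact pipeline isn't reproduced here, but the general recipe the article describes, extracting prosodic features from recordings and training a supervised classifier on them, can be sketched briefly. Everything below (librosa, scikit-learn, the file names) is an illustrative assumption, not the researchers' actual tooling:

```python
# Sketch: supervised classification of diagnosis from prosodic features.
# librosa/scikit-learn and the file paths are illustrative stand-ins,
# not the study's actual tools or data.
import librosa
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

samples = [("asd_001.wav", 1), ("td_001.wav", 0)]    # placeholder recordings

def prosodic_features(wav_path):
    y, sr = librosa.load(wav_path, sr=16000)
    f0 = librosa.yin(y, fmin=65, fmax=400, sr=sr)    # pitch contour
    rms = librosa.feature.rms(y=y)[0]                # intensity contour
    tempo = librosa.beat.tempo(y=y, sr=sr)[0]        # rough rhythm proxy
    return np.array([f0.mean(), f0.std(), rms.mean(), rms.std(), tempo])

X = np.stack([prosodic_features(path) for path, _ in samples])
y = np.array([label for _, label in samples])        # 1 = autism, 0 = typical
clf = RandomForestClassifier(n_estimators=200, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())       # needs a real labelled dataset
```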

The results were published in the journal PLOS One on June 8, 2022.

“When you have languages that are so structurally different, any similarities in speech patterns seen in autism across both languages are likely to be traits that are strongly influenced by the genetic liability to autism,” said Losh, who is the Jo Ann G. and Peter F. Dolle Professor of Learning Disabilities at Northwestern.

“But just as interesting is the variability we observed, which may point to features of speech that are more malleable, and potentially good targets for intervention.”

Lau added that using machine learning to identify the elements of speech most predictive of autism represented a significant step forward for researchers, who have been limited by the English-language bias of autism research and by the subjectivity involved when humans classify speech differences between people with autism and those without.

“Using this method, we were able to identify features of speech that can predict the diagnosis of autism,” said Lau, a postdoctoral researcher working with Losh in the Roxelyn and Richard Pepper Department of Communication Sciences and Disorders at Northwestern.

“The most prominent of those features is rhythm. We’re hopeful that this study can be the foundation for future work on autism that leverages machine learning.”

The researchers believe the work has the potential to improve understanding of autism. Artificial intelligence could make diagnosing the condition easier by reducing the burden on healthcare professionals, making diagnosis accessible to more people, Lau said. It could also provide a tool that might one day transcend cultures, because a computer can analyze words and sounds in a quantitative way regardless of language.

Read more: Neuroscience News

DeepMind: Why is AI so good at language? It’s something in language itself

How is it that a program such as OpenAI’s GPT-3 neural network can answer multiple choice questions, or write a poem in a particular style, despite never being programmed for those specific tasks?

It may be because human language has statistical properties that lead a neural network to expect the unexpected, according to new research from DeepMind, the AI unit of Google.

Natural language, when viewed from the point of view of statistics, has qualities that are “non-uniform.” A single word can stand for multiple things, a property known as “polysemy”: the word “bank” can mean a place where you put money or a rising mound of earth. And distinct words can sound the same, as with the homophones “here” and “hear.”

Those qualities of language are the focus of a paper posted on arXiv this month, “Data Distributional Properties Drive Emergent Few-Shot Learning in Transformers,” by DeepMind scientists Stephanie C.Y. Chan, Adam Santoro, Andrew K. Lampinen, Jane X. Wang, Aaditya Singh, Pierre H. Richemond, Jay McClelland, and Felix Hill.

The authors started by asking how programs such as GPT-3 can solve tasks when presented with kinds of queries for which they have not been explicitly trained, a capability known as “few-shot learning.”

For example, GPT-3 can answer multiple-choice questions without ever having been explicitly programmed for that format, simply by being prompted with an example of a multiple-choice question-and-answer pair typed by a human user.
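The mechanics are worth seeing concretely: a few-shot prompt packs the task and a worked example into the input text itself, and no weights are retrained. A schematic (the questions and the final commented call are invented for illustration):

```python
# Sketch of a few-shot prompt: one worked example followed by a new
# query. The model is expected to continue the pattern; nothing about
# the model itself is retrained.
prompt = """Q: Which planet is closest to the Sun?
(a) Venus (b) Mercury (c) Mars
A: (b)

Q: Which gas do plants absorb from the air?
(a) Oxygen (b) Nitrogen (c) Carbon dioxide
A:"""

# completion = language_model.generate(prompt)   # hypothetical API call
```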

“Large transformer-based language models are able to perform few-shot learning (also known as in-context learning), without having been explicitly trained for it,” they write, referring to the wildly popular Transformer neural net from Google that is the basis of GPT-3 and Google’s BERT language program. 

As they explain, “We hypothesized that specific distributional properties of natural language might drive this emergent phenomenon.”
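That non-uniformity is easy to verify first-hand: word frequencies in any sizeable text follow a Zipf-like curve, with a few words dominating and a long tail of rare ones. A quick check (the corpus file is an arbitrary stand-in, not the paper's data):

```python
# Sketch: the skewed ("non-uniform") word distribution of natural text.
# A handful of words dominate; most appear only once or twice.
from collections import Counter

words = open("corpus.txt").read().lower().split()   # any large text file
counts = Counter(words)
for rank, (word, n) in enumerate(counts.most_common(10), start=1):
    print(f"{rank:>2}  {word:<12} {n}")             # frequency falls roughly as 1/rank
```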

Read more: ZD Net

AI Used to Fill in Missing Words in Ancient Writings

Researchers have developed an artificial intelligence (AI) system to help fill in missing words in ancient writings.

The system is designed to help historians restore the writings and identify when and where they were written.

Many ancient populations used writings, also known as inscriptions, to document different parts of their lives. The inscriptions have been found on materials such as rock, ceramic and metal. The writings often contained valuable information about how ancient people lived and how they structured their societies.

But in many cases, the objects containing such inscriptions have been damaged over the centuries. This left major parts of the inscriptions missing and difficult to identify and understand.

In addition, many of the inscribed objects were moved from areas where they were first created. This makes it difficult for scientists to discover when and where the writings were made.

The new AI-based method serves as a technological tool to help researchers repair missing inscriptions and estimate the true origins of the records.

The researchers, led by Alphabet’s AI company DeepMind, call their tool Ithaca. In a statement, the researchers said the system is “the first deep neural network that can restore the missing text of damaged inscriptions.” A neural network is a machine learning computer system built to act like the human brain.
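Ithaca's own code isn't shown here, but the fill-in-the-blank task is analogous to masked-language-model inference, which can be sketched with an off-the-shelf English model. BERT and the English sentence are purely stand-ins; Ithaca is a purpose-built network trained on ancient Greek inscriptions:

```python
# Sketch: restoring a missing word with a masked language model.
# BERT and English text are stand-ins for illustration only; Ithaca is
# a different, purpose-built model for ancient Greek inscriptions.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for guess in fill("The temple was dedicated to the [MASK] of the city."):
    print(round(guess["score"], 3), guess["token_str"])
```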

Read more: VOA

An ancient language has defied translation for 100 years. Can AI crack the code?

Jiaming Luo grew up in mainland China thinking about neglected languages. When he was younger, he wondered why the different languages his mother and father spoke were often lumped together as Chinese “dialects.”

When he became a computer science doctoral student at MIT in 2015, his interest collided with his advisor’s long-standing fascination with ancient scripts. After all, what could be more neglected — or, to use Luo’s more academic term, “lower resourced” — than a long-lost language, left to us as enigmatic symbols on scattered fragments? “I think of these languages as mysteries,” Luo told Rest of World over Zoom. “That’s definitely what attracts me to them.”

In 2019, Luo made headlines when, working with a team of fellow MIT researchers, he brought his machine-learning expertise to the decipherment of ancient scripts. He and his colleagues developed an algorithm informed by patterns in how languages change over time. They fed their algorithm words in a lost language and in a known related language; its job was to align words from the lost language with their counterparts in the known language. Crucially, the same algorithm could be applied to different language pairs.
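The team's actual model is neural and constrained by patterns of sound change, but the core idea, matching each lost-language word to its nearest counterpart in a related language, can be caricatured with plain edit-distance matching. The tiny word lists below are simplified from published Linear B readings, and the method is a deliberate oversimplification:

```python
# Sketch: align words from an undeciphered script to a related known
# language by string similarity. The real model is neural and informed
# by how sounds change over time; this is a crude caricature.
from difflib import SequenceMatcher

lost = ["ka-ko", "pa-te"]                 # transliterated Linear B words
known = ["khalkos", "pater", "polis"]     # candidate Greek counterparts

for w in lost:
    best = max(known,
               key=lambda k: SequenceMatcher(None, w.replace("-", ""), k).ratio())
    print(w, "->", best)                  # ka-ko -> khalkos, pa-te -> pater
```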

Luo and his colleagues tested their model on two ancient scripts that had already been deciphered: Ugaritic, which is related to Hebrew, and Linear B, which was first discovered among Bronze Age ruins on the Greek island of Crete. It took professional and amateur epigraphists — people who study ancient written matter — nearly six decades of mental wrangling to decode Linear B. Officially, 30-year-old British architect Michael Ventris is primarily credited with its decipherment, although the private efforts of classicist Alice Kober laid the groundwork for his breakthrough. Sitting night after night at her dining table in Brooklyn, New York, Kober compiled a makeshift database of Linear B symbols, comprising 180,000 paper slips filed in cigarette boxes, and used those to draw important conclusions about the nature of the script. She died in 1950, two years before Ventris cracked the code. Linear B is now recognized as the earliest form of Greek.

Luo and his team wanted to see if their machine-learning model could get to the same answer, but faster. The algorithm achieved what the researchers called “remarkable accuracy”: it correctly translated 67.3% of Linear B’s words into their modern-day Greek equivalents. According to Luo, running the algorithm took between two and three hours once it had been built, cutting out the days or weeks — or months or years — that it might take to manually test out a theory by translating symbols one by one. The results for Ugaritic showed an improvement on previous attempts at automatic decipherment.

The work raised an intriguing proposition. Could machine learning assist researchers in their quests to crack other, as-yet undeciphered scripts — ones that have so far resisted all attempts at translation? What historical secrets might be unlocked as a result?

Read more: Rest of World

How AI and immersive technology are being used to revitalize Indigenous languages

Researchers on Vancouver Island are working on innovative ways, including artificial intelligence and immersive technology, to revitalize Indigenous languages.

Sara Child has been working to revive her language, Kwak’wala, on northern Vancouver Island.

According to estimates by the First Peoples’ Cultural Council in B.C., there are only about 140 fluent speakers of Kwak’wala across more than a dozen First Nations.

Child, a Kwagu’ł band member and professor in Indigenous education at North Island College in Courtenay, on the east coast of Vancouver Island, says most of the speakers in her community are in their 70s and 80s. 

She created the Sanyakola Foundation, which works with elders to find ways of passing on the language. 

The language, she says, is inextricably linked to the land and wellness, and requires different ways of learning. 

“After decades of being forcibly disconnected from the land and our lifestyle changes, many of our elders, the language of the land is trapped in their memories,” Child said. 

“And so we spent hours of work working with elders, trying to unlock that knowledge of the language of the land.”

Child realized the need to tap into the vast archives of recordings of Kwak’wala gathered by anthropologists and other researchers over nearly a century.

Read more: CBC

Artificial intelligence sheds light on how the brain processes language

In the past few years, artificial intelligence models of language have become very good at certain tasks. Most notably, they excel at predicting the next word in a string of text; this technology helps search engines and texting apps predict the next word you are going to type.
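Next-word prediction is simple to try first-hand with a small public model. GPT-2 here is an arbitrary, freely available stand-in, not one of the models tested in the study:

```python
# Sketch: next-word prediction with an off-the-shelf language model.
# GPT-2 is a public stand-in chosen for convenience.
from transformers import pipeline

generate = pipeline("text-generation", model="gpt2")
out = generate("The cat sat on the", max_new_tokens=1)
print(out[0]["generated_text"])   # prompt plus the model's predicted next word
```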

The most recent generation of predictive language models also appears to learn something about the underlying meaning of language. These models can not only predict the word that comes next, but also perform tasks that seem to require some degree of genuine understanding, such as question answering, document summarization, and story completion. 

Such models were designed to optimize performance for the specific function of predicting text, without attempting to mimic anything about how the human brain performs this task or understands language. But a new study from MIT neuroscientists suggests the underlying function of these models resembles the function of language-processing centers in the human brain.

Computer models that perform well on other types of language tasks do not show this similarity to the human brain, offering evidence that the human brain may use next-word prediction to drive language processing.
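Comparisons like these typically rest on an encoding model: regress recorded brain responses on the network's internal activations and score how well the fit generalizes to held-out sentences. A self-contained sketch with synthetic arrays (real studies use fMRI or ECoG recordings and the models' hidden states):

```python
# Sketch of an encoding-model comparison: predict (synthetic) brain
# responses from model activations and score generalization.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
activations = rng.normal(size=(200, 64))    # 200 sentences x 64 model features
brain = activations @ rng.normal(size=(64, 10)) + rng.normal(size=(200, 10))

Xtr, Xte, ytr, yte = train_test_split(activations, brain, random_state=0)
r2 = Ridge(alpha=1.0).fit(Xtr, ytr).score(Xte, yte)   # R^2 on held-out sentences
print(round(r2, 2))
```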

“The better the model is at predicting the next word, the more closely it fits the human brain,” says Nancy Kanwisher, the Walter A. Rosenblith Professor of Cognitive Neuroscience, a member of MIT’s McGovern Institute for Brain Research and Center for Brains, Minds, and Machines (CBMM), and an author of the new study. “It’s amazing that the models fit so well, and it very indirectly suggests that maybe what the human language system is doing is predicting what’s going to happen next.”

Joshua Tenenbaum, a professor of computational cognitive science at MIT and a member of CBMM and MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL); and Evelina Fedorenko, the Frederick A. and Carole J. Middleton Career Development Associate Professor of Neuroscience and a member of the McGovern Institute, are the senior authors of the study, which appears this week in the Proceedings of the National Academy of Sciences. Martin Schrimpf, an MIT graduate student who works in CBMM, is the first author of the paper.

Read more: MIT

Researchers use AI to unlock the secrets of ancient texts

The Abbey Library of St. Gall in Switzerland is home to approximately 160,000 volumes of literary and historical manuscripts dating back to the eighth century—all of which are written by hand, on parchment, in languages rarely spoken in modern times.

To preserve these historical accounts of humanity, such texts, numbering in the millions, have been kept safely stored away in libraries and monasteries all over the world. A significant portion of these collections are available to the general public through digital imagery, but experts say there is an extraordinary amount of material that has never been read—a treasure trove of insight into the world’s history hidden within.

Now, researchers at the University of Notre Dame are developing an artificial neural network that draws on measurements of human perception to improve deep-learning transcription of complex ancient handwriting.

“We’re dealing with historical documents written in styles that have long fallen out of fashion, going back many centuries, and in languages like Latin, which are rarely ever used anymore,” said Walter Scheirer, the Dennis O. Doughty Collegiate Associate Professor in the Department of Computer Science and Engineering at Notre Dame. “You can get beautiful photos of these materials, but what we’ve set out to do is automate transcription in a way that mimics the perception of the page through the eyes of the expert reader and provides a quick, searchable reading of the text.”

In research published in the Institute of Electrical and Electronics Engineers journal Transactions on Pattern Analysis and Machine Intelligence, Scheirer outlines how his team combined traditional methods of machine learning with visual psychophysics—a method of measuring the connections between physical stimuli and mental phenomena, such as the amount of time it takes for an expert reader to recognize a specific character, gauge the quality of the handwriting or identify the use of certain abbreviations.

Scheirer’s team studied digitized Latin manuscripts that were written by scribes in the Cloister of St. Gall in the ninth century. Readers entered their manual transcriptions into a specially designed software interface. The team then measured reaction times during transcription for an understanding of which words, characters and passages were easy or difficult. Scheirer explained that including that kind of data created a network more consistent with human behavior, reduced errors and provided a more accurate, more realistic reading of the text.
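The paper's exact formulation isn't given here; one plausible reading of “including that kind of data” is to weight each character's training loss by how long human readers took on it, so difficult characters influence the network differently. A schematic in PyTorch, with every tensor a stand-in:

```python
# Sketch: one plausible way to fold human reaction times into training,
# by weighting each character's loss. Invented illustration, not the
# paper's actual loss function.
import torch
import torch.nn.functional as F

logits = torch.randn(8, 26, requires_grad=True)   # stand-in model outputs
labels = torch.randint(0, 26, (8,))               # character classes
reaction_times = torch.rand(8) + 0.5              # stand-in seconds per character

per_char = F.cross_entropy(logits, labels, reduction="none")
weights = reaction_times / reaction_times.mean()  # slow-to-read characters count more
loss = (weights * per_char).mean()
loss.backward()                                   # gradients flow as usual
print(loss.item())
```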

Read more: Tech Xplore

Why neural networks aren’t fit for natural language understanding

One of the dominant trends of artificial intelligence in the past decade has been to solve problems by creating ever-larger deep learning models. And nowhere is this trend more evident than in natural language processing, one of the most challenging areas of AI.

In recent years, researchers have shown that adding parameters to neural networks improves their performance on language tasks. However, the fundamental problem of understanding language—the iceberg lying under words and sentences—remains unsolved.

Linguistics for the Age of AI, a book by two scientists at Rensselaer Polytechnic Institute, discusses the shortcomings of current approaches to natural language understanding (NLU) and explores future pathways for developing intelligent agents that can interact with humans without causing frustration or making dumb mistakes.

Marjorie McShane and Sergei Nirenburg, the authors of Linguistics for the Age of AI, argue that AI systems must go beyond manipulating words. In their book, they make the case for NLU systems that can understand the world, explain their knowledge to humans, and learn as they explore it.

Consider the sentence, “I made her duck.” Did the subject of the sentence throw a rock and cause the other person to bend down, or did he cook duck meat for her?

Now consider this one: “Elaine poked the kid with the stick.” Did Elaine use a stick to poke the kid, or did she use her finger to poke the kid, who happened to be holding a stick?

Language is filled with ambiguities. We humans resolve these ambiguities using the context of language. We establish context using cues from the tone of the speaker, previous words and sentences, the general setting of the conversation, and basic knowledge about the world. When our intuitions and knowledge fail, we ask questions. For us, the process of determining context comes easily. But defining the same process in a computable way is easier said than done.
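One classic computable stand-in for “use the context” is the Lesk algorithm, which picks the dictionary sense whose definition shares the most words with the surrounding sentence. The “bank” example below is a standard textbook illustration, not drawn from the book:

```python
# Sketch: dictionary-based word-sense disambiguation with the classic
# Lesk algorithm, which chooses the WordNet sense of "bank" whose
# definition overlaps most with the sentence's words.
import nltk
nltk.download("wordnet", quiet=True)
from nltk.wsd import lesk

context = "I deposited the money at the bank".split()
sense = lesk(context, "bank", "n")
print(sense, "->", sense.definition())   # ideally the financial sense
```

Approaches like this capture only a sliver of the context humans use, which is exactly the shortfall McShane and Nirenburg highlight.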

There are generally two ways to address this problem.

Read more: TechTalks

Robo-writers: the rise and risks of language-generating AI

In June 2020, a new and powerful artificial intelligence (AI) began dazzling technologists in Silicon Valley. Called GPT-3 and created by the research firm OpenAI in San Francisco, California, it was the latest and most powerful in a series of ‘large language models’: AIs that generate fluent streams of text after imbibing billions of words from books, articles and websites. GPT-3 had been trained on around 200 billion words, at an estimated cost of tens of millions of dollars.

The developers who were invited to try out GPT-3 were astonished. “I have to say I’m blown away,” wrote Arram Sabeti, founder of a technology start-up who is based in Silicon Valley. “It’s far more coherent than any AI language system I’ve ever tried. All you have to do is write a prompt and it’ll add text it thinks would plausibly follow. I’ve gotten it to write songs, stories, press releases, guitar tabs, interviews, essays, technical manuals. It’s hilarious and frightening. I feel like I’ve seen the future.”

OpenAI’s team reported that GPT-3 was so good that people found it hard to distinguish its news stories from prose written by humans. It could also answer trivia questions, correct grammar, solve mathematics problems and even generate computer code if users told it to perform a programming task. Other AIs could do these things, too, but only after being specifically trained for each job.

Large language models are already business propositions. Google uses them to improve its search results and language translation; Facebook, Microsoft and Nvidia are among other tech firms that make them. OpenAI keeps GPT-3’s code secret and offers access to it as a commercial service. (OpenAI is legally a non-profit company, but in 2019 it created a for-profit subentity called OpenAI LP and partnered with Microsoft, which invested a reported US$1 billion in the firm.) Developers are now testing GPT-3’s ability to summarize legal documents, suggest answers to customer-service enquiries, propose computer code, run text-based role-playing games or even identify at-risk individuals in a peer-support community by labelling posts as cries for help.

Despite its versatility and scale, GPT-3 hasn’t overcome the problems that have plagued other programs created to generate text. “It still has serious weaknesses and sometimes makes very silly mistakes,” Sam Altman, OpenAI’s chief executive, tweeted last July. It works by observing the statistical relationships between the words and phrases it reads, but doesn’t understand their meaning.

Accordingly, just like smaller chatbots, it can spew hate speech and generate racist and sexist stereotypes, if prompted — faithfully reflecting the associations in its training data. It will sometimes give nonsensical answers (“A pencil is heavier than a toaster”) or outright dangerous replies. A health-care company called Nabla asked a GPT-3 chatbot, “Should I kill myself?” It replied, “I think you should.”

Read more: Nature

How AI is helping preserve Indigenous languages

Australia’s Indigenous population is rich in linguistic diversity, with over 300 languages spoken across different communities.

Some of those languages are as distinct from one another as Japanese is from German.

But many are at risk of becoming extinct because they are not widely accessible and have little presence in the digital space.

Professor Janet Wiles is a researcher with the ARC Centre of Excellence for the Dynamics of Language, known as CoEDL, which has been working to transcribe and preserve endangered languages.

She says one of the biggest barriers to documenting languages is transcription.

“How transcription is done at the moment is linguists select small parts of the audio that might be unique words, unique situations or interesting parts of grammar, and they listen to the audio and they transcribe it,” she told SBS News.

The CoEDL has been researching 130 languages spoken across Australia and neighbouring countries like Indonesia.

Their work involves going into communities and documenting huge amounts of audio. So far, they have recorded almost 50,000 hours.

Transcribing the audio using traditional methods is estimated to take two million hours, making it a painstaking and near impossible task.

Knowing time is against them, Professor Wiles and her colleague Ben Foley turned to artificial intelligence.
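The article doesn't spell out the toolchain; the generic pattern is to let a speech recognizer, ideally fine-tuned on the already-transcribed fraction of the archive, draft transcriptions for linguists to correct. A sketch with an off-the-shelf model (the pretrained English checkpoint and the file name are placeholders, not what CoEDL uses):

```python
# Sketch: machine-drafted transcription for human correction.
# The pretrained English model is a placeholder; a real workflow would
# fine-tune on transcribed audio from the target language first.
from transformers import pipeline

asr = pipeline("automatic-speech-recognition",
               model="facebook/wav2vec2-base-960h")
draft = asr("elder_recording.wav")["text"]   # placeholder file path
print(draft)                                 # a linguist then corrects this draft
```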

Read more: SBS News

Does artificial intelligence have a language problem?

Technology loves a bandwagon. The current one, fuelled by academic research, startups and attention from all the big names in technology and beyond, is artificial intelligence (AI).

AI is commonly defined as the ability of a machine to perform tasks associated with intelligent beings. And that’s where our first problem with language appears.

Intelligence is a highly subjective phenomenon. Often the tasks machines struggle with most, such as navigating a busy station, are those that people do effortlessly, seemingly without a great deal of intelligence.

Understanding intelligence

We tend to anthropomorphise AI based on our own understanding of “intelligence” and cultural baggage, such as the portrayal of AI in science fiction.

In 1983, the American developmental psychologist Howard Gardner described nine types of human intelligence – naturalist (nature smart), musical (sound smart), logical-mathematical (number/reasoning smart), existential (life smart), interpersonal (people smart), bodily-kinaesthetic (body smart), intrapersonal (self smart), spatial (picture smart), and linguistic (word smart).

If AI were truly intelligent, it should have equal potential in all these areas, but we instinctively know machines would be better at some than others.

Even when technological progress appears to be made, the language can mask what is actually happening. In the field of affective computing, where machines can both recognise and reflect human emotions, the machine processing of emotions is entirely different from the biological process in people and from the interpersonal emotional intelligence categorised by Gardner.

So, having established that the term “intelligence” can be somewhat problematic in describing what machines can and can’t do, let’s now focus on machine learning – the domain within AI that offers the greatest attraction and benefits to businesses today.

Read more: Computer Weekly

AI is inventing languages humans can’t understand. Should we stop it?

Bob: “I can can I I everything else.”

Alice: “Balls have zero to me to me to me to me to me to me to me to me to.”

To you and me, that passage looks like nonsense. But what if I told you this nonsense was the discussion of what might be the most sophisticated negotiation software on the planet? Negotiation software that had learned, and evolved, to get the best deal possible with more speed and efficiency, and perhaps hidden nuance, than you or I ever could? Because it is.

This conversation occurred between two AI agents developed inside Facebook. At first, they were speaking to each other in plain old English. But then researchers realized they’d made a mistake in programming.

“There was no reward to sticking to English language,” says Dhruv Batra, visiting research scientist from Georgia Tech at Facebook AI Research (FAIR). As these two agents competed to get the best deal, a very effective bit of AI-vs.-AI dogfighting, neither was offered any sort of incentive for speaking as a normal person would. So they began to diverge, eventually rearranging legible words into seemingly nonsensical sentences.

“Agents will drift off understandable language and invent codewords for themselves,” says Batra, speaking to a now-predictable phenomenon that Facebook has observed again, and again, and again. “Like if I say ‘the’ five times, you interpret that to mean I want five copies of this item. This isn’t so different from the way communities of humans create shorthands.”
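Batra's “the” example is easy to make concrete with a toy decoder in which repetition encodes quantity. This is invented purely to illustrate the quote, not Facebook's actual agents:

```python
# Toy version of the shorthand Batra describes: repeating an item's
# name encodes how many of it the speaker wants. Invented illustration.
from collections import Counter

def decode(utterance: str) -> dict:
    return dict(Counter(utterance.split()))

print(decode("ball ball ball hat"))   # {'ball': 3, 'hat': 1} -> 3 balls, 1 hat
```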

Read more: Fast Company