Puzzle Monday: Secrets of the Original Code-Talkers

June 27th, 2022 by In autumn 1918, in the Allied trenches, a U.S. military captain walked past two Native American soldiers chatting in a language he didn’t understand. They were speaking Choctaw. With about 7,000 speakers, it is one of the 10 most-spoken Native American languages in the United States. The unnamed captain had an idea: Why not use this language for sending secret military messages? The Germans had managed to tap the Allies’ phone lines and were deciphering their codes. The captain reasoned that the Germans may not be able to decipher a message if it was spoken in a language they had no knowledge of or access to. Within hours, a group of eight Choctaw soldiers were dispatched to strategic positions. The Choctaw Telephone Squad began communicating in their mother tongue, and, say historians, their messages were instrumental in helping the Allies win key battles in the final weeks of World War I. The Choctaw were the first Native Americans to be used by the U.S. military as “code-talkers.” More famously, during World War II, the military repeated the idea, but with a group of Navajo. Choctaw was a good choice, linguistically speaking, for a military code because the language is notoriously complicated and unlike other languages. Indeed, it ranked as one of the world’s most “unusual” languages in a 2013 survey of the World Atlas of Language Structures (WALS), a database of the linguistic properties of almost 3,000 languages kept by the Max Planck Institute for the Science of Human History in Germany. The survey ranked languages on how dissimilar they are to each other. Chalcatongo Mixtec, with about 6,000 speakers in Oaxaca, Mexico took the top spot, followed by Nenets, which has about 20,000 speakers in Siberia. Third was Choctaw. Read more: Atlas Obscura

AI Detects Autism Speech Patterns Across Different Languages

June 20th, 2022 by A new study led by Northwestern University researchers used machine learning—a branch of artificial intelligence—to identify speech patterns in children with autism that were consistent between English and Cantonese, suggesting that features of speech might be a useful tool for diagnosing the condition. Undertaken with collaborators in Hong Kong, the study yielded insights that could help scientists distinguish between genetic and environmental factors shaping the communication abilities of people with autism, potentially helping them learn more about the origin of the condition and develop new therapies. Children with autism often talk more slowly than typically developing children, and exhibit other differences in pitch, intonation and rhythm. But those differences (called “prosodic differences” by researchers) have been surprisingly difficult to characterize in a consistent, objective way, and their origins have remained unclear for decades. However, a team of researchers led by Northwestern scientists Molly Losh and Joseph C.Y. Lau, along with Hong Kong-based collaborator Patrick Wong and his team, successfully used supervised machine learning to identify speech differences associated with autism. The data used to train the algorithm were recordings of English- and Cantonese-speaking young people with and without autism telling their own version of the story depicted in a wordless children’s picture book called “Frog, Where Are You?” The results were published in the journal PLOS One on June 8, 2022. “When you have languages that are so structurally different, any similarities in speech patterns seen in autism across both languages are likely to be traits that are strongly influenced by the genetic liability to autism,” said Losh, who is the Jo Ann G. and Peter F. Dolle Professor of Learning Disabilities at Northwestern.
“But just as interesting is the variability we observed, which may point to features of speech that are more malleable, and potentially good targets for intervention.” Lau added that the use of machine learning to identify the key elements of speech that were predictive of autism represented a significant step forward for researchers, who have been limited by English language bias in autism research and humans’ subjectivity when it came to classifying speech differences between people with autism and those without. “Using this method, we were able to identify features of speech that can predict the diagnosis of autism,” said Lau, a postdoctoral researcher working with Losh in the Roxelyn and Richard Pepper Department of Communication Sciences and Disorders at Northwestern. “The most prominent of those features is rhythm. We’re hopeful that this study can be the foundation for future work on autism that leverages machine learning.” The researchers believe that their work has the potential to contribute to improved understanding of autism. Artificial intelligence has the potential to make diagnosing autism easier by helping to reduce the burden on healthcare professionals, making autism diagnosis accessible to more people, Lau said. It could also provide a tool that might one day transcend cultures, because of the computer’s ability to analyze words and sounds in a quantitative way regardless of language. Read more: Neuroscience News

Crow Nation celebrates culture, language as new dictionary is published

June 17th, 2022 by CROW AGENCY – A drum circle sang songs of victory. A smudging ceremony wiped away the tears. And Crow tribal elders spoke in Apsáalooke (Crow language) about the next generation that has yet to be born. Friday’s celebration at Little Big Horn College wasn’t just the culmination of a years-long project to capture the words and culture of the Crow people, it was also a testament to saving the words that had been buried deep in many tribal members’ memories, preserving them and making them live again. On Friday, at a three-hour ceremony, The Language Conservancy, an Indiana-based group focused on preserving languages, especially indigenous tongues, unveiled the “Crow Dictionary,” a massive collection of nearly 850 pages that documents the language and is the first major collection of the language published since 1975. Not only is the dictionary more user-friendly and modern, it doubles the number of collected words from 5,500 to more than 10,000 – a huge accomplishment for saving a language that had been on the decline, but has recently seen a turnaround as language immersion programs grow on the reservation and a popular phone app has digitized the dictionary. In many ways, the songs and speeches weren’t just a celebration of the dictionary’s arrival, they were a victory against time itself. “For other languages, you can go somewhere else in the world to still hear them being spoken,” said Jacob Brien, whose Crow name is Ishkoochìia Chiiakaamnáah. “But this is the only place in the world where you can learn about this and hear it.” Estimates range on how many people speak Apsáalooke, but many peg the number around 2,000. Read more: Missoula Current

DeepMind: Why is AI so good at language? It’s something in language itself

May 31st, 2022 by How is it that a program such as OpenAI's GPT-3 neural network can answer multiple choice questions, or write a poem in a particular style, despite never being programmed for those specific tasks? It may be because the human language has statistical properties that lead a neural network to expect the unexpected, according to new research by DeepMind, the AI unit of Google. Natural language, when viewed from the point of view of statistics, has qualities that are "non-uniform," such as words that can stand for multiple things, known as "polysemy," like the word "bank," meaning a place where you put money or a rising mound of earth. And words that sound the same can stand for different things, known as homonyms, like "here" and "hear." Those qualities of language are the focus of a paper posted on arXiv this month, "Data Distributional Properties Drive Emergent Few-Shot Learning in Transformers," by DeepMind scientists Stephanie C.Y. Chan, Adam Santoro, Andrew K. Lampinen, Jane X. Wang, Aaditya Singh, Pierre H. Richemond, Jay McClelland, and Felix Hill. The authors started by asking how programs such as GPT-3 can solve tasks where they are presented with kinds of queries for which they have not been explicitly trained, what is known as "few-shot learning." For example, GPT-3 can answer multiple choice questions without ever having been explicitly programmed to answer such a form of a question, simply by being prompted by a human user typing an example of a multiple choice question and answer pair. "Large transformer-based language models are able to perform few-shot learning (also known as in-context learning), without having been explicitly trained for it," they write, referring to the wildly popular Transformer neural net from Google that is the basis of GPT-3 and Google's BERT language program. As they explain, "We hypothesized that specific distributional properties of natural language might drive this emergent phenomenon."
Read more: ZDNet

How Abstract Concepts Are Represented in the Brain Across Cultures and Languages

May 27th, 2022 by Researchers at Carnegie Mellon University have explored the regions of the brain where concrete and abstract concepts materialize. A new study now explores if people who grow up in different cultures and speak different languages form these concepts in the same regions of the brain. “We wanted to look across languages to see if our cultural backgrounds influence how we understand, how we perceive abstract ideas like justice,” said Roberto Vargas, a doctoral candidate in psychology at the Dietrich College of Humanities and Social Sciences and lead author on the study. Vargas is continuing fundamental research in neural and semantic organization initiated by Marcel Just, the D.O. Hebb University Professor of Psychology. Just began this process more than 30 years ago by scanning the brains of participants using a functional magnetic resonance imaging (fMRI) machine. His research team began by identifying the regions of the brain that light up for concrete objects, like an apple, and later moved to abstract concepts from physics like force and gravity. The latest study took the evaluation of abstract concepts one step further by exploring the regions of the brain that fire for abstract objects based on language. In this case, the researchers studied people whose first language is Mandarin or English. “The lab’s research is progressing to study universalities of not only single concept representations, but also representations of larger bodies of knowledge such as scientific and technical knowledge,” Just said. “Cultures and languages can give us a particular perspective of the world, but our mental filing cabinets are all very similar.” According to Vargas, there is a fairly generalizable set of hardware, or network of brain regions, that people leverage when thinking about abstract information, but how people use these tools varies depending on culture and the meaning of the word. Read more: Neuroscience News

Where Did ‘Jazz,’ the Word, Come From?

May 15th, 2022 by When it comes to the origin of the word “jazz,” it seems that each person simply believes what she or he wants to. Some would like the word to come from Africa, so they firmly believe the stories that support that. Others want it to be an African-American word, so they look for that. The venerable saxophonist, composer and educator Archie Shepp lived in Paris for many years, and he has not the slightest doubt that “jazz” is a French word. But professional linguists (scholars of languages and their history), etymologists (researchers of word origins) and lexicographers (dictionary researchers) have been on the case for decades, and the real story is far less simple. Let’s take a look. The word “jazz” probably derives from the slang word “jasm,” which originally meant energy, vitality, spirit, pep. The Oxford English Dictionary, the most reliable and complete record of the English language, traces “jasm” back to at least 1860: J. G. Holland Miss Gilbert's Career xix. 350 ‘She's just like her mother... Oh! she's just as full of jasm!’.. ‘Now tell me what “jasm” is.’.. ‘If you'll take thunder and lightening, and a steamboat and a buzz-saw, and mix 'em up, and put 'em into a woman, that's jasm.’ Note the discussion of what “jasm” means, which suggests that it was fairly new, not in widespread use at the time. Some have suggested that it originated as a variant of “gism,” which has the same meaning and can be traced back a little further, to 1842. By the end of the 1800s, “gism” meant not only “vitality” but also “virility,” leading to the word being used as slang for “semen.” But — and this is significant — although a similar evolution happened to the word “jazz,” which became slang for the act of sex, that did not happen until 1918 at the earliest. That is, the sexual connotation was not part of the origin of the word, but something added later.
According to the etymologist Professor Gerald Cohen, the leading researcher of the word “jazz” (and author of a study summarizing his work to date; see below), it’s not even certain that “gism” and “jasm” are related. The research is still ongoing, and it’s quite possible that they are two independent words. In short, “jazz” probably comes from “jasm,” and let’s leave “gism” out of it. “Jazz” seems to have originated among white Americans, and the earliest printed uses are in California baseball writing, where it means “lively, energetic.”  (The word still carries this meaning, as in “Let’s jazz this up!”) The earliest known usage occurs on April 2, 1912, in an article discovered by researcher George A. Thompson, and sent to me courtesy of Dr. Cohen. Read more: WBGO

Regina podcast aims to revitalize Indigenous languages

May 14th, 2022 by People can learn different Indigenous languages through a recently launched podcast called pîkiskwêwin which means ‘language’ in Cree. The Indigenous Communication Arts (INCA) program, at the First Nations University of Canada (FNUniv) has worked on the pîkiskwêwin project, which is an Indigenous and community-led initiative to preserve, protect and interpret the history, language, culture, and artistic heritage of First Nations. The pîkiskwêwin’s family of podcasts are produced in Indigenous languages. “We have joined an amazing circle of language teachers and language keepers,” said Shannon Avison, project supervisor and FNUniv INCA Assistant Professor, in a media release. “For most of them, podcasting is a new format to use. Some of our podcasters are fluent but some pîkiskwêwin podcasters are language learners. It’s so exciting to give them training and technology to do interviews in their ancestral languages for the first time.” The project is funded through Heritage Canada on a $600,000 grant for two years. The podcast team ran their first episodes in January 2022. Podcasters who participate in the project are Knowledge Keepers, language teachers, and students. Three episodes are released every week and funding for the project will continue until March 2023. Read more: Global News

Magic: The Gathering cards have a secret language that only a few can translate

May 12th, 2022 by As complicated as Magic: The Gathering is — with all its various and ever-changing keywords, strategies, products, storylines, even entire gameplay formats — would you believe it also has its own secret language? Phyrexian was first introduced in 2010. It’s a language spoken in-fiction by a loathsome species of cybernetic monsters on the plane of Phyrexia. Publisher Wizards of the Coast has never completely explained how it works. So, for more than a dozen years now, a dedicated group of amateurs has been working to translate it, going card by card with only a few lines of new text occasionally delivered with new sets of cards. What they’ve discovered is a tongue that’s simultaneously alien and also very much a part of our world. Fernando Franco Félix, a science advisor for PBS’ Space Time, is perhaps the foremost expert in Phyrexian outside of Wizards. A polyglot — that is, a master of multiple languages, in this case including English, Spanish, and Esperanto — he’s been fascinated with Phyrexian for years now, and maintains a small but dedicated following on YouTube. “I’ve always liked languages,” Félix told Polygon from his home in Aguascalientes, Mexico. “What I always say is that a language is like an art gallery, and every aspect of the language is like an art piece. I see languages as the greatest collaborative work of art in the history of humanity. You have millions of people and, without even realizing it, they are creating this system, which is beautiful.” Of course, Phyrexian wasn’t created over thousands of years across multiple cultures. It’s a constructed language, also called a conlang. That makes it similar to the languages found in Tolkien’s The Lord of the Rings books, or modern conlangs like Star Trek’s Klingon and Game of Thrones’ Dothraki and Valyrian. Of course, you can easily find documents online that will teach you how to speak like an elf or a Klingon. But not so with Phyrexian. Read more: Polygon

Latin Is Dead, but Not Extinct

May 9th, 2022 by No community has claimed Latin as its native tongue since the collapse of the empire that sowed its grammar and lexicon across the ancient world. For a language that officially died more than a thousand years ago, however, it clings to life with all the tenacity of a Roman legion. From the Renaissance through the 18th century, Latin served as the lingua franca for a monumental wave of intellectual progress — to the extent that its hold on the scholarly world is apparent even today. In the courtroom, defendants challenge unlawful imprisonment by applying for habeas corpus. In the laboratory, scientists assign names like Homo sapiens to each newly discovered species. And many who attended high school in recent decades have memories (fond or otherwise) of parsing sentences by the Roman writers Seneca, Ovid and Cicero. On the religious front, Latin rode out the Middle Ages in the mouths and pens of the Roman Catholic Church, which preserved it in its “ecclesiastical” form. This dialect remains an official language of Vatican City: the church still employs it in papal documents and Catholics there enjoy its solemn intonations at Sunday Mass. Interwoven as Latin is with contemporary culture, its pulse seems steady (if a bit fainter than 1,500 years ago). In what sense, then, is it truly dead? Read more: Discover Magazine

Six English words borrowed from the Romany language

May 6th, 2022 by Gypsy, Roma and Traveller communities have been part of the UK’s regional populations for centuries. Roma communities are documented to have migrated to the UK during the early 15th century and evidence is found among a variety of official legal documentation and formal correspondence. As part of a wider community referred to as Gypsy, Roma and Traveller, Roma have often faced hostility and inequality. It may be surprising then to hear that Romany, an unwritten language spoken by Roma communities, is used in everyday English. Romany is a language spoken by communities who live largely across Europe. The Romany language and culture have been associated with central and northern India, and the language inherits a significant part of its linguistic heritage from Sanskrit alongside modern Indian languages such as Hindi, Urdu and Gujarati. In this sense, it is considered the only Indo-Aryan-derived European language. While there are large communities of Romany speakers across Europe and beyond, only a small number of people in the UK speak a fully grammatical version. Within the UK, the majority of speakers use what is referred to as Anglo-Romany. This is a language unique to the Anglo-Roma of the UK and with a historical and linguistic connection to Romany culture. You may be surprised by some of the words that have been incorrectly labelled as colloquial or slang in English, which are in fact words that have crossed over from Anglo-Romany. Here are six such words including their meaning found in regional dialects in England with their Romany historical links explained. Read more: The Conversation

The life and death of the Spanish language in the Philippines

April 25th, 2022 by I remember that a Latin American scholar — I do not know from which country — angrily claimed at an academic congress that because of the Spanish conquest, many indigenous languages went into extinction. That kept me thinking a lot about the strong connection between language and identity and why the scholar — using plain Spanish — was so angry about the issue. Maybe he thought his identity was blurred given his incapacity to speak Quechua, Aymara, Guarani or any other language. I answered that if he really felt that the loss was so big, he was still in time to learn the indigenous language of his birthplace and then give it to his children. But clearly he was not willing to do that: depriving his children of the chance to connect with a community of more than 500 million speakers would undoubtedly affect their chances to prosper. The Philippines has the opposite situation. After 333 years of Spanish presence, the language is almost totally gone. When I first asked some Filipinos why that happened, I was told "Spaniards did not want us to learn it." The answer did not satisfy me: that would be quite useless even from the point of view of the colonizers. Other people gave me a more elaborate answer: "The only Spanish figure in many provinces was the Spanish friar and he did not want the people to learn so he could keep power being the middle person between the government and the natives." That sounded more logical, but it actually ignores one very important factor: how languages are learned and spread. Read more: Manila Times

Lost in translation: is research into species being missed because of a language barrier?

April 24th, 2022 by Valeria Ramírez Castañeda, a Colombian biologist, spends her time in the Amazon studying how snakes eat poisonous frogs without getting ill. Although her findings come in many shapes and sizes, in her years as a researcher, she and her colleagues have struggled to get their biological discoveries out to the wider scientific community. With Spanish as her mother tongue, her research had to be translated into English to be published. That wasn’t always possible because of budget or time constraints – and it means that some of her findings were never published. “It’s not that I’m a bad scientist,” she says. “It’s just because of the language.” Ramírez Castañeda is not alone. There is a plethora of research in non-English-language papers that gets lost in translation, or is never translated, creating a gap in the global community’s scientific knowledge. As the amount of scientific research grows, so does the gap. This is especially true for conservation and biodiversity. Research about native traditions and knowledge tied to biodiversity is often conducted in the domestic non-colonial language and isn’t translated. A study published in the journal PLOS Biology found that paying more attention to non-English-language research could expand the geographical coverage of biodiversity scientific evidence by 12% to 25% and the number of species covered by 5% to 32%. There is research on nine amphibian species, 217 bird species and 64 mammal species not covered in English-language studies. “We are essentially not using scientific evidence published in non-English languages at the international level, but if we could make a better use of [it], we might be able to fill the existing gaps in the variability of current scientific evidence,” says Tatsuya Amano, a Japanese biodiversity researcher at the University of Queensland and the paper’s lead researcher. Read more: The Guardian