Machine learning reveals role of culture in shaping meanings of words

What do we mean by the word beautiful? It depends not only on whom you ask, but also on the language in which you ask. According to a machine learning analysis of dozens of languages conducted at Princeton University, the meaning of words does not necessarily refer to an intrinsic, essential constant. Instead, it is significantly shaped by culture, history and geography. This finding held true even for some concepts that would seem to be universal, such as emotions, landscape features and body parts.

“Even for everyday words that you would think mean the same thing to everybody, there’s all this variability out there,” said William Thompson, a postdoctoral researcher in computer science at Princeton University and lead author of the findings, published in Nature Human Behaviour Aug. 10. “We’ve provided the first data-driven evidence that the way we interpret the world through words is part of our cultural inheritance.”

Language is the prism through which we conceptualize and understand the world, and linguists and anthropologists have long sought to untangle the complex forces that shape these critical communication systems. But studies attempting to address those questions can be difficult and time-consuming to conduct, often involving long, careful interviews with bilingual speakers who evaluate the quality of translations. “It might take years and years to document a specific pair of languages and the differences between them,” Thompson said. “But machine learning models have recently emerged that allow us to ask these questions with a new level of precision.”

In their new paper, Thompson and his colleagues Seán Roberts of the University of Bristol, U.K., and Gary Lupyan of the University of Wisconsin-Madison harnessed the power of those models to analyze over 1,000 words in 41 languages.

Instead of attempting to define the words directly, the large-scale method relies on “semantic associations”: words that have a meaningful relationship to each other. Linguists consider such associations one of the best ways to define a word and compare it with another. Semantic associates of “beautiful,” for example, include “colorful,” “love,” “precious” and “delicate.”

The researchers built an algorithm that examined neural networks trained on various languages to compare millions of semantic associations. The algorithm translated the semantic associates of a particular word into another language, and then repeated the process the other way around. For example, the algorithm translated the semantic associates of “beautiful” into French and then translated the semantic associates of beau into English. The algorithm’s final similarity score for a word’s meaning came from quantifying how closely the semantics aligned in both directions of the translation.
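
The two-way comparison described above can be illustrated with a toy sketch in Python. This is not the researchers' code. It assumes small hand-written association lists and translation dictionaries in place of trained neural networks, and it uses a simple set-overlap (Jaccard) measure as a stand-in for the paper's similarity score. The English associates come from the example above; the French associates, the translation tables and all function names are invented for illustration.

```python
def jaccard(a, b):
    """Overlap between two collections of words, from 0.0 to 1.0."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def directional_alignment(word, translation, src_assoc, tgt_assoc, src_to_tgt):
    """Translate the associates of `word` and compare them with the
    associates of its translation in the target language."""
    translated = {src_to_tgt[w] for w in src_assoc[word] if w in src_to_tgt}
    return jaccard(translated, tgt_assoc[translation])


def alignment_score(word, translation, assoc_a, assoc_b, a_to_b, b_to_a):
    """Average the alignment measured in both translation directions."""
    forward = directional_alignment(word, translation, assoc_a, assoc_b, a_to_b)
    backward = directional_alignment(translation, word, assoc_b, assoc_a, b_to_a)
    return (forward + backward) / 2


# Toy data (assumption): English associates are from the article's example,
# French associates and both translation tables are invented.
EN_ASSOC = {"beautiful": ["colorful", "love", "precious", "delicate"]}
FR_ASSOC = {"beau": ["coloré", "amour", "élégant", "délicat"]}
EN_TO_FR = {"colorful": "coloré", "love": "amour",
            "precious": "précieux", "delicate": "délicat"}
FR_TO_EN = {"coloré": "colorful", "amour": "love",
            "élégant": "elegant", "délicat": "delicate"}

score = alignment_score("beautiful", "beau",
                        EN_ASSOC, FR_ASSOC, EN_TO_FR, FR_TO_EN)
print(f"alignment of 'beautiful' and 'beau': {score:.2f}")
```

In this toy data, three of the four associates match in each direction, so the score falls between 0 (no shared associations) and 1 (identical associations). A real analysis would derive the associates from models trained on large text collections rather than from hand-written lists.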
