This Is How Google Wants To Make The Internet Speak Everyone’s Language

JAKARTA, Indonesia — When Nurhaida Sirait-Go curses, she curses in her mother tongue.

The 60-year-old grandmother does everything emphatically, and Bahasa, the official language of Indonesia, just doesn’t allow for the same fury of swearing as Batak, the language that Sirait-Go grew up speaking on the Indonesian island of Sumatra.

“On Facebook, on WhatsApp, they speak only Bahasa. So I can’t speak the way I want,” said Sirait-Go, who giggles uncontrollably and covers her mouth with both hands when asked to repeat one of her favorite curse words in Batak. “I can’t, I can’t! People don’t use these words anymore. … They aren’t on the internet so they don’t exist.”

Batak is one of over 700 languages spoken in Indonesia. But only one language, Bahasa, is currently taught in public schools and widely used online. For language preservationists, it’s just one more example of how the internet’s growing global influence is leaving some languages in the dust. Linguists warn that 90% of the world’s approximately 7,000 languages will become extinct in the next 100 years. Or, as one prominent group of linguists ominously put it, every 14 days another language goes extinct.

Read more: BuzzFeed News

Wikipedia ‘facts’ depend on which language you read them in

Like Facebook and Twitter, Wikipedia could have its own filter bubbles.

A new website lets you uncover geographical biases in Wikipedia articles by tracking down where editors of different languages source their information. Insert the URL of any Wikipedia page into Wikiwhere and the site’s algorithm trawls the web to find out where the references cited in the entry originate from.

Martin Körner at the University of Koblenz-Landau, Germany, and his colleagues made the tool to compare how Wikipedia articles about the same topic but in a different language might be influenced by different sources.

In the English-language version of an article on Russia’s annexation of Crimea, for example, they found that 24 per cent of linked references came from Ukrainian news sources while nearly 20 per cent came from Russian sources. In the German version of the same article, however, the balance tipped, with Russian sources making up ten per cent of the total citations and Ukrainian sources only representing three per cent.
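To make the idea of a per-country breakdown concrete, here is a minimal sketch that tallies a list of reference URLs by the last label of their hostname (the top-level domain). It is an illustration only and not Wikiwhere’s actual method, which the article does not describe in detail; the function name `tld_shares` and the sample URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def tld_shares(reference_urls):
    """Return {tld: percentage of references}, using each URL's final hostname label."""
    tlds = Counter()
    for url in reference_urls:
        host = urlparse(url).hostname
        if host:  # skip entries that have no parseable hostname
            tlds[host.rsplit(".", 1)[-1]] += 1
    total = sum(tlds.values())
    if total == 0:
        return {}
    return {tld: 100 * count / total for tld, count in tlds.items()}


# Hypothetical reference list, for illustration only.
refs = [
    "https://news.example.ua/a",
    "https://example.ru/b",
    "https://example.ua/c",
    "https://example.com/d",
]
shares = tld_shares(refs)
for tld, pct in sorted(shares.items()):
    print(f".{tld}: {pct:.0f}%")
```

A top-level domain is only a rough proxy for where a source comes from: a .com site could be published anywhere, so a real tool would need further signals.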

Read more: New Scientist

On The Internet, The World’s Diversity Of Languages Is Completely Absent

There are two ways of looking at the fact that just 2% of the world’s 6,000-odd languages are thriving online.

One is that 98% of our human communication heritage is doomed, as more people switch to more global languages to communicate. The other, says Priceonomics’ Alex Mayyasi, is that these “few languages becoming the language of the web could unite people more closely than they’ve been since the fall of the Tower of Babel.”

There’s a fundamental difference between how a language dies in the real world and how it fails online. A living language becomes extinct when there aren’t enough people around to speak it any more. But a “digital” language faces the opposite problem. It has to establish itself from nothing to become viable. And this isn’t easy in a world of apps that need to be localized for each language. Small developers will stick to English, Spanish, German, maybe French, and so on, because they don’t have the resources to do any more. And even huge resource-rich behemoths like Microsoft can drag their heels when it comes to adding, say, spell-check dictionaries to their software. And that’s before we get to things like Unicode (the international standard for symbols and letters on computers) support for languages using non-standard script or the need for huge databases of language pairs to drive tools like Google Translate.

Read more: Co.Exist

China Tests First Tibetan Language Search Engine

BEIJING: China has begun the trial of its first Tibetan language search engine, putting it on course for release in the second half of 2016, the developer said today.

“Cloud Tibet” has news, pictures, video and audio search options, state-run Xinhua news agency reported.

Development head Tselo said the database and the semantic unit function were both up and running.

The project was launched in April 2013.

A team of more than 150 people from a Tibetan language research center in Hainan Tibetan Autonomous Prefecture in Qinghai Province, northwest China, led the project.

“The recognition rate of the system is over 95 per cent,” Tselo said, adding that around 1.2 million people would use the search engine.

Source: NDTV

English is losing its status as the universal language of the Internet

More than half of the internet is in English.

But that percentage may decline in the future, according to research by Álvaro Blanco from Funredes, a nonprofit that studies technology usage in the developing world.

In 1996, Blanco’s research estimated that 80% of online content was in English. Less than a decade later, he said it fell to 45%. These estimations don’t even take into account activity on social networks like Facebook and Twitter, since search engines only index about 30% of the web, Blanco told Quartz.

But even though English’s presence online is declining, the current lack of language diversity is a huge problem on the web.

Even people who speak the most popular languages have a hard time reading online. Chinese, the most widely spoken language, makes up just 2.1% of the internet. The world’s second most widely spoken language, Spanish, encompasses 4.8% of the web. Hindi, spoken by 260 million people, makes up less than 0.1% of the internet.

Read more: Business Insider

24 words that mean totally different things now than they did before the Internet

Technological change, as we know very well, tends to provoke linguistic and cultural change, too. It’s the reason why, several times a year, dictionaries trumpet the addition of new and typically very trendy words.

But more interesting than the new words, I think, are the old words that have gotten new meanings: words such as “cloud” and “tablet” and “catfish,” with very long pre-Internet histories. The reappropriation is rarely random; in most cases, the original meaning of the word is a metaphor for the new one. Our data is as remote as a cloud, for instance; catfish are just as tricky and unpredictable as an online love interest.

Anyway, this is all a very long way of saying that Dictionary.com’s 20th birthday is more interesting than most: To mark the occasion, the online dictionary has compiled a list of words whose meanings have changed since it launched two decades ago. To that list, we have added a few tech terms of our own: such as “troll” and “firehose.”

Read more: The Independent

This is what .com looks like in different languages

This week, Verisign, the registry for domains ending in .com and .net, launched the first internationalized version of .com. It’s in Japanese script, and it looks like this: .コム.

Previously, the only way to load a .com website was to type the Latin characters that are common to English, French, Spanish and German speakers but unfamiliar to billions of people around the globe. Approximately 45% of all websites in the world are in a language other than English, according to W3Techs.

Since most countries in Asia have very low to moderate proficiency in English, users may not be able to find their way to a website simply because they are unable to enter a domain name.

“By enabling more end users to navigate the Internet in scripts representing their native language, and giving more companies the ability to maintain a common brand identity across many scripts, Internationalized Domain Names have the potential to make the Internet more accessible and thus more usable,” said Manish Dalal, vice president of Verisign Naming Services Asia, in a statement.
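For readers curious how a non-Latin domain name reaches the existing domain name system, the sketch below uses Python’s built-in `idna` codec to convert a name into its ASCII-compatible form (labels prefixed with `xn--`) and back. The domain `example.コム` is a made-up name for illustration, and the codec implements an older version of the IDNA standard, so treat this as an approximation of what browsers and registries do rather than a description of Verisign’s systems.

```python
# Hypothetical domain name using the Japanese-script ending mentioned above.
unicode_domain = "example.コム"

# The built-in "idna" codec converts each non-ASCII label to an ASCII form
# that starts with "xn--"; labels that are already ASCII pass through as-is.
ascii_domain = unicode_domain.encode("idna").decode("ascii")

# Decoding reverses the conversion and recovers the original script.
round_trip = ascii_domain.encode("ascii").decode("idna")

print(ascii_domain)
print(round_trip)
```

The user types and sees the native script, while the underlying lookup uses the ASCII form.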

Read more: CNN

Regional languages are the lynchpin to India’s Internet boom

India is expected to see an unprecedented boom in the number of Internet users over the next few years but for a host of Internet companies it means a wholesale change in the language in which they engage with their potential new consumers.

According to a November report by the Internet and Mobile Association of India (IAMAI), India is expected to have the second largest Internet user base in the world by the middle of next year with about 460 million users. The numbers have grown by 49 per cent over the past year and about three-quarters of these new users are accessing the net through mobile phones.

Behind these numbers though, is a more interesting trend, namely that the Indian Internet consumer is now a very different individual than he or she was a few years ago.

Read more: The Hindu

OMG! The Hyperbole of Internet-Speak

The text exchange was unspectacular: a friend explaining a video that had been posted by a classmate to his Snapchat feed. Jordana Narin, my 20-year-old research assistant, was half paying attention, sitting in my living room working on a project, texting between breaks.

“Omg literally dying,” she typed back, not missing a beat. She turned back to her computer.

But Jordana wasn’t literally dying. She wasn’t figuratively dying, either. In fact, she didn’t even crack a smile.

“I don’t even know what she’s talking about,” she told me when I asked. “I want to be like, ‘I don’t care.’”

But instead, she typed what to some may seem like the most dramatic response imaginable. Except that it wasn’t.

“It’s almost like ‘dying’ has become a filler for anytime anyone says anything remotely entertaining,” she said. “Like, if what you’re saying won’t legitimately put me to sleep, I respond with, ‘OMG dying.’”

R.I.P. to the understatement. Welcome to death by Internet hyperbole, the latest example of the overly dramatic, forcibly emotive, truncated, simplistic and frequently absurd ways chosen to express emotion in the Internet age (or sometimes feign it).

Read more: NY Times

Languages are dying, but is the internet to blame?

Languages everywhere are dying; a recent UN report showed that nearly 900 languages have been driven to extinction in the last three years — and that’s despite an increase in the number of languages supported by the internet. But is the lack of language diversity online accelerating language death or simply reflecting what’s happening in the offline world?

There are currently around 7,100 languages in use, but 90 percent of these are used by fewer than 100,000 people. Some are only spoken by the inhabitants of remote villages, while others, such as the Peruvian language Taushiro, are thought to be spoken by only one person.

History suggests that language loss, much like language change, is inevitable. But understanding the cause of this loss is complicated. According to Ethnologue, a catalogue of all the world’s known living languages, 1,519 currently living languages are at risk of death, with a further 915 said to be dying. At the current rate, we’re set to lose six languages every year.

Read more: Wired