The Perils of Machine Translation

Years ago, on a flight from Amsterdam to Boston, two American nuns seated to my right listened to a voluble young Dutchman who was out to discover the US. He asked the nuns where they were from. Alas, Framingham, Massachusetts was not on his itinerary, but, he noted, he had ‘shitloads of time and would be visiting shitloads of other places’.

The jovial young Dutchman had apparently gathered that ‘shitloads’ was a colourful synonym for the bland ‘lots’. He had mastered the syntax of English and a rather extensive vocabulary, but he lacked a feel for which words suit which social contexts.

This memory sprang to mind with the recent news that the Google Translate engine would move from a phrase-based system to a neural network. (The technical differences are described here.) Both methods rely on training the machine with a ‘corpus’ consisting of sentence pairs: an original and a translation. The computer then generates rules for inferring, from the sequence of words in the original text, the most likely sequence of words in the target language.
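
To make the idea of learning from sentence pairs concrete, here is a deliberately tiny sketch. It is not how Google Translate works in either its phrase-based or its neural form; it only counts which target words turn up alongside each source word and picks the most frequent one. The four-pair English-Dutch corpus and the names `learn_table` and `translate` are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus of (English, Dutch) sentence pairs.
CORPUS = [
    ("the house", "het huis"),
    ("the book", "het boek"),
    ("a house", "een huis"),
    ("a book", "een boek"),
]


def learn_table(pairs):
    """Count how often each target word appears alongside each source word."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        for source_word in source.split():
            counts[source_word].update(target.split())
    return counts


def translate(sentence, table):
    """Replace each word with its most frequent companion; keep unknown words."""
    output = []
    for word in sentence.split():
        if word in table:
            output.append(table[word].most_common(1)[0][0])
        else:
            output.append(word)
    return " ".join(output)


table = learn_table(CORPUS)
print(translate("a house", table))
```

Nothing in the table records what a house is, let alone whether a word is fit for the ears of nuns. It records only which strings tend to keep company.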

The procedure is an exercise in pattern matching. Similar pattern-matching algorithms are at work when your smartphone interprets the syllables you utter in asking it to ‘navigate to Brookline’, or when a photo app tags your friend’s face. The machine doesn’t ‘understand’ faces or destinations; it reduces them to vectors of numbers and processes them.
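
What ‘reducing to vectors of numbers’ can look like is easy to show in miniature. The sketch below assumes the simplest possible encoding, a count of each vocabulary word, which is far cruder than anything a real speech or translation system uses; the vocabulary and the function names are made up for the example.

```python
import math

# Hypothetical five-word vocabulary; real systems use far richer encodings.
VOCABULARY = ["boston", "brookline", "call", "navigate", "to"]


def vectorize(text, vocabulary):
    """Turn a phrase into a list of word counts, one slot per vocabulary word."""
    words = text.lower().split()
    return [words.count(term) for term in vocabulary]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


brookline = vectorize("navigate to Brookline", VOCABULARY)
boston = vectorize("navigate to Boston", VOCABULARY)
print(brookline, boston, cosine(brookline, boston))
```

To the program the two requests are simply two lists of numbers that happen to overlap in two positions out of three. ‘Similar’ here means nothing more than that.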

Read more: The Wire
