New Scientist reports: Luis von Ahn claims he can translate Wikipedia – all 2 billion words of it – from English into Spanish in just 80 hours. What’s more, he will not have to pay anyone to do the work.
His secret weapon is Duolingo, a free language tutorial website that doubles as a paid-for translation service. The deal is that users get to learn a language while simultaneously helping to translate website content.
“The crazy thing about this method is that it works,” says von Ahn, a computer scientist at Carnegie Mellon University in Pittsburgh, Pennsylvania.
This is not the first time von Ahn has recycled user input in this way. His reCAPTCHA system, which many websites employ when signing up new members, displays snippets of distorted text to confound the automated software used by spammers. Unbeknownst to many users, however, the text that they are deciphering is material that the computers used in book-digitisation projects have tried and failed to understand. The system was bought by Google in 2009 for an undisclosed sum and is now used in the company’s book-scanning project.
On the surface, Duolingo, which opened its doors to testers last month, looks like any other language tutorial system. Users receive lessons in either Spanish or German and then practise on their own. Tasks include translating written sentences and rating the accuracy of translations made by others. The lessons combine standard content with material pulled from web pages written in the language the user is learning. After the web material has been translated it is slotted into an English-language version of the original web page, which is built up sentence by sentence as learners plough through Duolingo worksheets.
Learners inevitably make mistakes, so the system ensures a number of people work on and check each sentence before declaring the translation correct. It also routes complex sentences to more advanced learners and provides tools, such as easy access to language dictionaries, to aid in translation.
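The checking and routing described above can be illustrated with a small sketch. It is not Duolingo's actual method, which the article does not detail: the function names, the agreement thresholds (`min_votes`, `min_share`), the word-count test for "complex" sentences and the learner levels are all assumptions made for illustration.

```python
from collections import Counter

def accepted_translation(submissions, min_votes=3, min_share=0.6):
    """Return the most common submission if enough learners agree, else None.

    Hypothetical rule: a translation is accepted only when at least
    `min_votes` learners gave it and they make up `min_share` of all answers.
    """
    if not submissions:
        return None
    # Ignore differences in case and surrounding whitespace.
    normalised = [s.strip().lower() for s in submissions]
    best, count = Counter(normalised).most_common(1)[0]
    if count >= min_votes and count / len(normalised) >= min_share:
        return best
    return None

def eligible_learners(sentence, learner_levels, max_simple_words=12):
    """Route long sentences to advanced learners only (assumed heuristic)."""
    is_complex = len(sentence.split()) > max_simple_words
    return [
        name
        for name, level in learner_levels.items()
        if level == "advanced" or not is_complex
    ]

answers = [
    "You can't please everyone",
    "you can't please everyone ",
    "It never rains to everyone's taste",
    "You can't please everyone",
]
print(accepted_translation(answers))
print(eligible_learners("Nunca llueve a gusto de todos",
                        {"ana": "beginner", "bo": "advanced"}))
```

Requiring both a minimum number of matching answers and a minimum share guards against accepting a sentence that only a couple of learners have attempted; a real system would presumably also weigh the ratings learners give to each other's translations.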
But will website owners be prepared to pay for a service performed by students? Pricing has not yet been set but von Ahn insists Duolingo can match professional translators for quality – a claim that has attracted some scepticism.
“Anything that relies solely on learners will limit the number of experts who will participate,” says Mark Chatow, vice-president at Servio, a San Francisco-based company that offers translation and content-generation services. Chatow says that Duolingo will work fine for simple sentences, but notes that some material requires a grasp of nuanced meanings, which learners will struggle with. Nuance is a particularly common problem in Chinese-to-English translations, he adds.
Idiomatic expressions, the bane of automated translation services, may also cause problems. Google, for example, translates the Spanish idiom “nunca llueve a gusto de todos” as “it never rains to everyone’s taste”, whereas a professional translator would provide something like “you can’t please everyone”. It remains to be seen whether Duolingo’s learners can do as good a job.
Also uncertain is the speed at which translations can be completed, which obviously depends on the number of language students using the software. Von Ahn estimates that it would take a million students to translate Wikipedia in 80 hours, while 100,000 learners would take five weeks. However, language tutorials are already popular – more than 5 million people in the US alone have paid for language-learning software – and von Ahn says there are already 200,000 people on the waiting list for Duolingo.
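Von Ahn's two estimates can be checked against each other with some simple arithmetic. The sketch below assumes that the work divides evenly among learners, so that time scales inversely with their number, and that "five weeks" means calendar weeks of round-the-clock activity across the user base. Neither assumption is stated in the article, and the function names are illustrative.

```python
WIKIPEDIA_WORDS = 2_000_000_000   # word count quoted in the article
HOURS_PER_WEEK = 24 * 7           # assumes round-the-clock calendar weeks

def hours_needed(learners, ref_learners=1_000_000, ref_hours=80):
    """Scale the reference estimate, assuming time is inversely
    proportional to the number of learners."""
    return ref_learners * ref_hours / learners

def words_per_learner_hour(total_words=WIKIPEDIA_WORDS,
                           ref_learners=1_000_000, ref_hours=80):
    """Finished words each learner would need to contribute per hour."""
    return total_words / (ref_learners * ref_hours)

hours = hours_needed(100_000)
print(hours)                              # 800.0 hours
print(round(hours / HOURS_PER_WEEK, 2))   # 4.76 weeks
print(words_per_learner_hour())           # 25.0 words per learner-hour
```

Under these assumptions 100,000 learners would need 800 hours, or a little under five weeks, which matches the quoted figure. The implied rate of 25 finished words per learner per hour is low because each sentence is translated and checked by several people.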
To keep the translations flowing, users will also have to find the learning process “mildly addictive”, says Philip Resnik at the University of Maryland in College Park. In tests, New Scientist found the site easy to use and its reward system of points and stars compelling. Students can also compete against their friends.
“Given von Ahn’s record I imagine he’ll be successful,” says Resnik.