Tom Morris

A pungent mix of programming, philosophy, pedanticism, procrastination, perplexity, peripheral political polemic, and platters of preposterousness.

Duolingo: the future of translation for Wikipedia?

Yesterday, Frances very kindly gave me an invite to Duolingo, the latest project of Luis von Ahn, the creator of reCAPTCHA. You can read a bit about Duolingo on English Wikipedia. Anyway, the current language choices are to learn French, Spanish or German, or if you speak Spanish, to learn English. English to French is currently in beta. I chose French, but if you are invited, you can add more languages after you go through the registration process.

The reason I chose French? Well, firstly because when in school, I just about scraped through a GCSE in French, so I might at least have some chance. They also attempted to teach us Spanish, but I just couldn’t get on with it. Secondly, it would be quite useful: I’d like to be able to read some of the French philosophers without translation to see if they make as little sense in their native language as they do in English. Also, you know, being able to get along in that rather large nation just across the Channel is rather a nice thing. Unlike people in other countries I’ve visited, the French do tend to expect you to at least make a token effort: the Dutch, the Danish and the Israelis don’t tend to get too pissy if you can’t speak Dutch, Danish or Hebrew (my Hebrew extends as far as תודה and שלום).

Duolingo is one of the first places I’ve found that makes language learning actually reasonably fun. Through a series of multisensory, multimedia drills, you get exposed to the basic grammar: obviously, introducing gender by using the words homme, femme, garçon and fille, then building up with a bit more vocabulary, then a bit more drilling. Interspersed with this are translation tasks, which I’ll get on to shortly. I quickly turned the speaking tasks off, because the Flash audio thing was very ropey and didn’t actually hear most of what I said. It also feels kind of silly sitting there repeatedly saying Je suis une femme! because some piece of shit computer can’t understand you.

As you do the drills, and repeat the practices, you get points, and points unlock levels. Yes, RPG-style levels. Only, unlike being a level 70 mage in World of Warcraft, levelling up at foreign languages is actually useful.

Those translation tasks are where it gets more interesting. They are taken from actual documents on the web: in the case of French, they are from the French version of Vikidia, a Wikipedia fork that’s basically a simple version of French Wikipedia intended for 8–13 year old children. There are also Vikidia versions in Spanish and Italian.

Sadly, Duolingo advertises these as being “Wikipedia” tasks rather than Vikidia tasks. There’s nothing wrong with translating non-Wikipedia articles, and it’s great that there is a more school-focussed version of French and Spanish Wikipedias, but it’s slightly deceptive to tell Duolingo users that they are doing a Wikipedia task when they aren’t. I’ve left feedback on the Duolingo site and via Twitter saying that they probably ought not be advertising Wikipedia tasks when they are in fact Vikidia tasks.

Now, here’s an equally interesting question: could Duolingo be used to help crowdsource translations of Wikipedia articles? Yes… but with some restrictions. Obviously, there are lots of articles that exist on French, Spanish and German Wikipedias that could be translated into English, but we’d need to work out some way of identifying reasonably reliably that there doesn’t already exist an English version. Checking interwiki links alone isn’t enough: articles often get created and nobody actually kicks the interwiki link process into action. I’m certainly guilty of it: I’ve created articles on English Wikipedia and not checked to make sure that there aren’t versions on other language sites and linked them together.

The other thing one would need to check is that there aren’t already deleted versions of a proposed article topic on English Wikipedia. Finally, I’d be against having a system where Duolingo users translate an article from, say, French, and then a bot automatically puts it into English Wikipedia. Much better would be some kind of holding bay, because of potential referencing issues. From what I’ve seen, Duolingo users are doing a pretty good job of translating some sentences from French to English, although sometimes they are a bit word-for-word literal and don’t get the English grammar right. But I’m pretty sure they aren’t going to be fixing references and section headings and so on. The person putting the article up really needs to have a look and do a manual diff of the Duolingo translation and the original language version simply for the mechanical wikification process.
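To make those checks a bit more concrete, here’s a rough sketch in Python of how the triage might work. Everything in it is made up for illustration: the `SourceArticle` shape, the `triage` function and the three outcome labels are my own assumptions, not anything Duolingo or Wikipedia actually provides, and the sets of existing and deleted English titles are assumed to have been fetched some other way.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class SourceArticle:
    """A foreign-language article proposed for translation (hypothetical shape)."""
    title: str
    # English title taken from an interwiki link, if anyone ever added one.
    interwiki_en: Optional[str] = None
    # Guessed English titles to check, since interwiki links are often missing.
    candidate_en_titles: List[str] = field(default_factory=list)


def _normalise(titles) -> Set[str]:
    """Compare titles case-insensitively and ignoring stray whitespace."""
    return {t.strip().lower() for t in titles}


def triage(article: SourceArticle, existing_en: Set[str], deleted_en: Set[str]) -> str:
    """Decide what to do with a proposed translation.

    Returns one of three illustrative labels:
      "skip"           - an English article already seems to exist
      "check-deletion" - the topic was deleted from English Wikipedia before
      "holding-bay"    - fine to translate, but a human reviews it before it goes live
    """
    titles = list(article.candidate_en_titles)
    if article.interwiki_en:
        titles.append(article.interwiki_en)
    wanted = _normalise(titles)

    if wanted & _normalise(existing_en):
        return "skip"
    if wanted & _normalise(deleted_en):
        return "check-deletion"
    # Never publish automatically: a person still has to fix references and headings.
    return "holding-bay"
```

Note that nothing here ever returns "publish": the best outcome is the holding bay, where a human still does the wikification by hand.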

There’s one other area where Duolingo could be really useful: picture descriptions and template messages on Commons. The issues here are different from those on Wikipedia, and the translations are potentially less valuable than article translations.

But aside from these issues, I’d say Duolingo looks like a really interesting process for learning and translating, and if Wikipedians can be bold, there is a huge potential here for improving Wikipedia. This is the sort of gamification I can actually get behind: the sort of gamification that increases the amount of knowledge in the world and opens up communication and collaboration between people of different languages. That’s actually useful, unlike the other stuff I’ve condemned in the past.

Anyway, I’ve got to go. There is grinding to be done. I’m only level two…
