
The Serious Limits of Google Translate

People often ask, "What is the difference between Kamusi and other projects that deal with words in many languages?" Here, we answer that question with regard to Google Translate (hereafter "GT"). GT has more than 200 million active users each month, and their chief scientist asserts that they provide "most of the translation on the planet". It is therefore important to examine why GT translations are so often wrong, and why we believe that Kamusi's approach can make translation systems such as Google's much better.

Note: We have invited Google to comment on this article, and will revise it if they raise concerns that portions are inaccurate or unfair.

As the service describes itself:

Google Translate is a free translation service that provides instant translations between dozens of different languages. It can translate words, sentences and web pages between any combination of our supported languages. With Google Translate, we hope to make information universally accessible and useful, regardless of the language in which it’s written.

When Google Translate generates a translation, it looks for patterns in hundreds of millions of documents to help decide on the best translation for you. By detecting patterns in documents that have already been translated by human translators, Google Translate can make intelligent guesses as to what an appropriate translation should be. This process of seeking patterns in large amounts of text is called "statistical machine translation". Since the translations are generated by machines, not all translation will be perfect. The more human-translated documents that Google Translate can analyse in a specific language, the better the translation quality will be. This is why translation accuracy will sometimes vary across languages.

GT's goals are substantially different from Kamusi's - GT is all about translation, whereas Kamusi is about documentation. GT wants to be able to take something that is written in one language and tell you how to say the same thing in another language, while Kamusi wants to give you all the information we can about the words in one language and then show you the equivalents in other languages. Where GT is a tool intended to serve the needs of immediate communication, Kamusi is intended to become a complete repository of how people express themselves through language. And, while GT grows by finding troves of already digitized data, Kamusi's main strategy is to create lexicons by eliciting new information from each language's speakers that does not yet exist in codified digital form.

Nevertheless, there are overlaps between the things that GT does and what we are doing at Kamusi. First, many people try to use GT like a dictionary, looking for translations for single words, and Google has responded by providing a dictionary-esque list of single word translations. Second, both GT and Kamusi share the ambition of being able to convert from one language to any other in the system. Third, both set vastly improved machine translation (MT) as part of their project mission - we also recommend you read this brief summary of 10 Ways that Kamusi can Revolutionize Machine Translation.

We admire the effort that has gone into GT. In fact, we use it all the time for such tasks as writing emails in French, as one arrow in the translation quiver. (Conversely, Google has in the past made use of Kamusi's Swahili data, and could use our multilingual data in the future if licensing arrangements can be worked out.) However, we are often asked, "Doesn't Google Translate already do what you want to do?", and for that, the answer is a resounding "no" - it doesn't, and it can't. GT is a statistical machine translation (SMT) system, not a dictionary. As such, GT is geared toward one task, translating documents, and is locked into an automated methodological approach that cannot rise above mediocrity or worse for most of the world's 25 million language pairs (discussed below). Kamusi, on the other hand, is oriented toward using human intelligence to create nuanced and highly accurate links among concepts in different languages, with one intended outcome being data that MT programs like Google can use to achieve near-certainty in the vocabulary used in translation. That is, GT and Kamusi are currently attempting to do quite different things in quite different ways, but we can envision the day when Google or its competitors use the data Kamusi creates as a way of making MT much more reliable.

Kamusi - Google Translate Comparisons:

Design: Parallel minds (Kamusi) versus parallel texts (Google)

The reason that translation is so difficult is that languages do not align neatly from one to another, in several distinct ways. First, different languages have different sets of ideas, so a word may exist in one language that has no equivalent in another. Second, one language may use the same word in varying ways (run a race, run a factory, run a traffic light), but the other language may use different terms to express those different ideas. Third, languages structure themselves in unique and complicated ways, which change words and word order, and these grammars take great effort to analyze for one language and even more effort to bridge across languages. Fourth, multiword expressions often have meanings that cannot be understood by looking at the individual components, and that might be spread out in different parts of a sentence, e.g., an African fish eagle is neither an African nor a fish, and knowing that add up denotes "make a mathematical total" does not make it easy to find that meaning in the sentence "Please add all of your food and lodging costs up and submit the receipts".

It takes people years to learn languages well enough to translate between them, and even the best human translators will still make mistakes based on misunderstanding or unknown vocabulary. A human making use of her eyeballs and her experience will be much more capable than a computer of determining, in the examples above, that an African fish eagle is a single bird, or that add and up compose a single idea. She will nonetheless be defeated by this headline, Romney Appeals to Women, unless she reads the accompanying article to know whether he is attractive to women, is asking them to reconsider a negative decision, or is asking for their support.

MT attempts to accomplish its translation tasks through computer power, which relies on data. Data can be employed in various ways: the computer can compare individual words, longer portions of a sentence, or entire grammatical structures. Comparing words or longer phrases relies on some combination of a lexicon (set of word pairs, e.g. English clock = Swahili saa) and a body of parallel text (such as the millions of words in official documents professionally translated by the European Union). Making the ideas from the "source" language understandable in the "target" language relies on implementing either rules or patterns.

Google Translate looks for patterns. With millions of lines of parallel text at its disposal, GT scans the texts to seek statistically significant patterns. In principle, MT learns from repeated appearances of "add the costs up" in English source material and a corresponding recurring pattern on the target side, and develops the facility to provide the right target words, in the right grammatical order, through infinite variations ("add the costs of the trip up", "add the expenses up", "add all of the unexpected costs from the flooding up"). In practice, GT fails on a regular basis. GT translates the original example about adding up food and lodging to French as, "S'il vous plaît ajoutez tous vos aliments et frais de séjour et de soumettre les reçus," which back-translates as, "Please add all your foods and costs of lodging and of submit the receipts" - you get the gist, although a native French speaker would grimace, and then suggest, "Merci d'ajouter tous vos frais de nourriture et de logement et de fournir les reçus." For languages with less source material to English, the results are much worse. Here is a Facebook status posted from Egypt, with typically incomprehensible results from SMT:

Original: منى بعتت رسالة بتقول انهم كويسين و مستقرين في شاليه مرتفع. بس مفهمتش العاصفة خلصت ولا لسه شغالة و معرفش الطرق مفتوحة ولا ايه الوضع.
Facebook via Bing: I sent an letter bet'oul eih they sin and stable in the chalet. BAS mfhmch storm and not working and I do not know how to SSH open and no situation.
GT: Mona Bataat message Bet'oul Kwesen and they settled in a chalet high. Mvhmich storm, but found no Lessa maid and I do not know the roads are open or what the situation.
Human: Mona has sent a message saying  that they are fine and settled in a high chalet. I didn't understand though whether the storm has ended or not, and I don't know whether the roads are open or how the situation is.

Rather than looking for patterns, Kamusi looks for words. More specifically, Kamusi is interested in finding all the combinations of letters in a language (e.g., s-p-r-i-n-g), then finding all of the different senses of those words (spring the season, spring the mechanical device, spring the water source), and finally linking those different concepts to equivalent expressions in other languages. In principle, by having people with the appropriate linguistic skills examine each vocabulary pair, our ability to get nuanced and accurate translations between individual terms should be very high. In practice, our ability to get nuanced and accurate translations between individual terms IS very high - but also slow and expensive, because the work needs to be done by people, not by algorithms. Given the budget to pay the one-time documentation cost for each word, our system could rapidly produce parallel terms among thousands of languages, at a level of confidence that parallel corpus comparisons will never be able to achieve.

Methods - Confirmed human translation (Kamusi) versus statistical machine translation (Google)

If you think of GT as a tool that can help a knowledgeable user get partway toward a translation, then it is reasonably successful in achieving its mission. One can use the service to understand the gist of a web page, or, for those who know enough of the target language to battle through the inevitable mistakes in vocabulary and syntax, as an aid in writing a business letter. (Don't try to write an informal letter in many languages using GT - because the translations are based on formal documents, it is apparently not possible to get output in the familiar register.) However, if you need translations that you can rely on, such as producing information about your business, you had best run far, far away. GT destroys even a basic sentence like, "I will be unavailable tomorrow", delivering the message in French, Spanish, and Catalan that tomorrow is a good day to meet. And a phrase that students would learn in their first day of Swahili class, "See you tomorrow" (tutaonana), is rendered as "Angalia kesho" - look out tomorrow.

[Image: Example of Google Translate giving a result that is exactly opposite to the original meaning]

Why is GT so consistently inaccurate? We argue that the problems are inherent to statistical machine translation, and we also propose that those problems can be resolved by Kamusi methods overlaid on SMT and combined with rules that integrate with Kamusi's unique data structure.

The first problem is the S in SMT: statistics. Statistics indicate a calculated likelihood, rather than a formal analysis. That is to say, when comparing parallel texts between English and French, perhaps 30% of the time a sentence where spring is clearly a noun in English will match to a sentence that contains printemps in French (referring to the season), around 30% of the English occurrences of spring will match to ressort (the mechanical device), and the rest of the time there will be some other match. In Google's words, "Typically, when we produce a translation, our system searches through millions of possible translations, selecting the best -- that is, the most statistically likely -- translation."

This picture shows GT's probability index for spring as a noun.

[Image: Google Translate probabilities for senses of spring]

The grey bars on the left show the likelihood that GT predicts for each sense of spring that it identifies. The result is that GT will almost always guess that spring refers to printemps or ressort, and that it is extremely difficult to construct a sentence that the program will return with source (water from the ground) or saut (jump). We managed to get source by way of the phrase "the spring flowed" but could not find a way to make GT suggest saut. In short, statistics only give you the right result in the percentage of times that they indicate - if the plurality sense of a word appears 30% of the time in the corpus, your translations are likely to be right 30% of the time, and wrong the other 70%.
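The plurality-sense ceiling can be sketched in a few lines of Python. The sense frequencies below are invented for illustration; the point is that a system which always picks the statistically most likely sense is right only as often as that sense occurs in the corpus:

```python
# Invented corpus frequencies for the French senses of the noun "spring".
sense_frequency = {
    "printemps": 0.30,  # the season
    "ressort":   0.28,  # the mechanical device
    "source":    0.27,  # the water source
    "saut":      0.15,  # the jump
}

def pick_most_likely(frequencies):
    """Mimic SMT's default: always return the highest-frequency sense."""
    return max(frequencies, key=frequencies.get)

best = pick_most_likely(sense_frequency)
print(best, sense_frequency[best])  # printemps 0.3 -> right about 30% of the time
```

A guesser locked to the plurality sense can never do better than that sense's share of the corpus, which is exactly the ceiling described above.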

Of course, SMT is smarter than merely comparing single words. Advanced programs like GT will, for example, extend their analysis to clusters of words (called n-grams). In this way, GT recognizes that "the spring flowed" is likely to relate to water - with the unfortunate effect that "The spring flowed into summer" is translated to French with source. Similarly, SMT can often detect the difference between different parts of speech, in theory recognizing "the spring" as a noun and "will spring" as a verb. (Mysteriously, "Sara will spring at the offer" gives the right verb and conjugates it correctly in French, and "She will spring for lunch" gives the wrong verb with the correct conjugation, but "She will spring at the offer" gives the noun ressort and "Sara will spring for lunch" guesses the noun printemps. 4 sentences, 25% success.) Still, the range of possible expressions in a language is infinite, and SMT will always be a science of estimation.
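A minimal sketch of the word clusters involved: n-grams are simply contiguous runs of n tokens, which is what gives SMT local context like "the spring flowed" without any understanding of the sentence as a whole:

```python
def ngrams(tokens, n):
    """Return every contiguous n-word cluster in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

sentence = "the spring flowed into summer".split()
print(ngrams(sentence, 3))
# [('the', 'spring', 'flowed'), ('spring', 'flowed', 'into'), ('flowed', 'into', 'summer')]
```

The trigram ('the', 'spring', 'flowed') is a strong statistical cue for the water sense - which is precisely why GT carries source into "The spring flowed into summer".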

Some of the problems with SMT stem from the limitations of matching parallel text. It may sound impressive that millions of pages of parallel data are available for comparison between certain languages, but those pages are restricted to a particular range of topics for particular language pairs. As an example, the Bible is a favorite text for comparing languages because it has been meticulously translated into innumerable languages, cross-referenced to chapter and verse. Yet, translations based on the Bible would be restricted to terms that arose in the lives of a pre-industrial agro-pastoral society; draw appears dozens of times in the sense of drawing water (from a well, not as a ship), drawing swords, or drawing near, and three times in drawing lines on a map, but not once in the sense of drawing a picture.

There are many more modern sources of parallel text, but they generally have restricted range. Older texts that are widely translated, such as Shakespeare, are likely to have antiquated language, and modern translations are locked by copyright or intellectual property restrictions. Newer texts that appear in many languages, such as Harry Potter, will be under copyright protection until long after you're dead. Some web sources that appear in parallel form can be scraped, but true parallel texts online are rare - for example, BBC News publishes news in 31 languages, but there is no guarantee that the languages are aligned, and no easy way for an external system to know whether the stories in the different languages are similar texts, exemplified by these three simultaneous stories in English, Swahili, and French that discuss exactly the same topic but neither align nor link (they seem to be composed from the same notes but written from scratch by a different author for each language). The best-paired and most accessible parallel documents are the translated texts of official documents of the European Union and the United Nations. However, those documents exhibit constraints similar to the Bible's; they generally deal in formal language with topics related to governance.

The documents on which SMT is trained, then, are often quite different from the texts that people need translated. Communications between two people, such as emails or instant messages, tend to be written in the first or second person, both of which are quite rare in the corpus. An unscientific question posted in an online forum, asking people how they use online translation tools, drew these answers: emails related to work, emails related to cultural exchanges, school homework assignments, business research, academic research, postings on Facebook from foreign friends, unknown words in a book, reading foreign-language newspapers and web pages, recipes, product manuals, travel planning. Most of these uses have only marginal overlap with the topics and language style of the corpora on which GT is built.

Now consider that the set of languages for which a large collection of available parallel texts exists is very small, and heavily weighted toward English and a few other major languages. The official languages of the UN are English, French, Spanish, Arabic, Chinese, and Russian, while the EU set has 24 official languages that include the first three from the UN set and 21 others from around Europe. While official documents could in theory be prepared in any of those languages and then translated to all the other languages of their set, the great bias is toward English, French, and Spanish as the source. That is, EU agencies in Brussels are not writing original documents in Estonian that they then translate to English, much less neighboring Latvian - they are writing largely in one of the big languages that are then translated to the other 23. Any attempt to use an English text that has been aligned with, say, both Arabic and Chinese as the basis for an Arabic-Chinese translation would generate absurd results. In our test above with the verb spring, GT failed 75% of the time for English-French, one of the pairs best represented in the parallel corpus. Results decay rapidly from English to less common languages within the official sets, are even worse when those languages are the source and English is the target, and are complete disasters when following GT's flow from one lesser-represented language to English to another lesser-represented language. The other languages in the GT set of 80 do not have nearly the amount of parallel text available, and thus start with little training material, generally to English, French, or Spanish, and basically zero data outside of the direct connections.

Let us estimate charitably that GT has 100 direct translation pairs: 79 languages paired with English, and a number of major-language couplings such as French-German. These 100 pairs essentially represent the outer limit of what GT can do with parallel text for these languages, because additional corpus matches will not become available. For the same 80 languages, the Kamusi system has the capacity to produce reliable data for all 3160 pairs in the set.
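The pair counts are straightforward combinatorics - 80 languages yield 80 × 79 / 2 unordered pairs:

```python
import math

languages = 80
print(math.comb(languages, 2))  # 3160 possible pairs in an 80-language set
print(languages - 1)            # 79 of those pairs run through English
```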

For many languages, it would be mathematically unimaginable to have parallel texts that referenced even a small fraction of valid terms. Bantu languages form their verbs through agglutination - that is, gluing together lots of little pieces that indicate the subject, object, tense, and other important information. Most verbs in Kinyarwanda, not a GT language, have close to a billion valid forms, only a small fraction of which have ever been written, much less digitized, much less translated into parallel text, much less made available for corpus comparison. Swahili has somewhat fewer agglutinative combinations (it does not have the "pre-prefix" that appears in Kinyarwanda), but more parallel text. The treatment of the Swahili verb in GT is abysmal, failing with the simplest constructions from English to Swahili, such as "I sit". Swahili to English is slightly better, because GT recognizes a few patterns, such as "hawa-" being a negative prefix and "tuna-" signaling "we" in the present tense, for some verbs in their database. If you could convince someone to take the other side, you could make a lot of money by betting against GT's handling of the Swahili verb.

Kamusi, unlike GT, relies on humans as the ultimate arbiters of term translations, and believes that employing a vocabulary-based approach in combination with certain rules can greatly improve the efficacy of MT. In the near term, Kamusi is working to document relationships between languages, one word at a time. In the longer term, Kamusi methods make it possible to envision a system that produces near-perfect vocabulary matches between any two languages.

Kamusi is programming some neat computational tricks that will enable us to derive initial data for many languages. For example, we can piece apart existing bilingual dictionaries and determine that a term in one language matches to some sense of an English word like spring. We can also show relationships between languages that have never been paired - if we know that a term in Language A is equivalent to one in Language B, and the term in Language B relates to one in Language C, and Language C to Language D, then we can postulate a relationship between A and D. However, we see this sort of computed data as merely a starting point on the path toward real translation.
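The transitive postulation just described amounts to walking a graph of known sense-level links. A minimal sketch, with invented sense identifiers: any sense reachable from a starting sense becomes a hypothesized equivalent, pending human review.

```python
from collections import defaultdict

# Hypothetical sense-level equivalences contributed for individual pairs.
known_links = [
    ("en:spring#season", "fr:printemps"),
    ("fr:printemps", "de:Frühling"),
    ("de:Frühling", "sw:majira_ya_kuchipua"),
]

graph = defaultdict(set)
for a, b in known_links:
    graph[a].add(b)
    graph[b].add(a)

def hypothesized_equivalents(start):
    """Collect every sense reachable from `start` through known links."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen - {start}

print(sorted(hypothesized_equivalents("en:spring#season")))
```

English and Swahili were never directly paired here, yet the chain through French and German proposes a candidate translation that a human can then confirm or reject.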

For Kamusi, data for translation can only be considered valid after it has passed through a human review process. We have a long-established online editing system through which people contribute and refine dictionary entries, and we are developing crowdsourcing systems to generate reliable data through games and mobile apps that engage the public. The human review process will also be used to match imported data to the correct translation sense, and to confirm or reject predicted language pairings.

We have also developed a system to analyze Bantu verbs for their component parts. Through this system, one need only know a core set of language-specific rules in order to construct or parse verbs in Swahili, Kinyarwanda, or a few hundred other languages; instead of needing a database to store a billion potential combinations for every verb, a small program can spin through about 300 rules (if/then statements) and identify the verb root and each of the other significant elements.
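As a sketch of what such an if/then rule cascade looks like, here is a toy parser for a handful of affirmative Swahili verb forms. The affix tables are a tiny illustrative subset, not Kamusi's actual rule set:

```python
# Illustrative subset of Swahili subject prefixes and tense markers.
SUBJECT_PREFIXES = {"ni": "I", "u": "you", "a": "he/she", "tu": "we", "wa": "they"}
TENSE_MARKERS = {"na": "present", "li": "past", "ta": "future"}

def parse_verb(verb):
    """Peel off subject and tense prefixes, leaving the verb root."""
    for subj, subj_gloss in SUBJECT_PREFIXES.items():
        if verb.startswith(subj):
            rest = verb[len(subj):]
            for tense, tense_gloss in TENSE_MARKERS.items():
                if rest.startswith(tense):
                    return {"subject": subj_gloss,
                            "tense": tense_gloss,
                            "root": rest[len(tense):]}
    return None  # no rule matched

print(parse_verb("tunasoma"))
# {'subject': 'we', 'tense': 'present', 'root': 'soma'}  (soma = read)
```

A real rule set must also handle object markers, negatives, relative infixes, and final-vowel changes, which is why the full cascade runs to hundreds of rules rather than five dictionary entries.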

Rules can be identified in many languages that can be prepared for in the Kamusi data structure. In English, for example, we know that "have" or "has" followed by the past participle results in the present perfect tense (I have eaten, she has arrived), so we can store the past participle within the entry for each English verb. This data is then available for MT to recognize that eaten in "the lion has eaten" matches the verb eat for vocabulary selection, while has eaten should be treated as a single unit for grammatical purposes. Identifying these rules is a substantial challenge that must be accomplished by linguists, not machines.
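A sketch of how storing the participle in each verb entry pays off at analysis time; the three-verb lexicon is obviously illustrative:

```python
# Toy lexicon: each verb entry stores its past participle.
VERBS = {"eat": "eaten", "arrive": "arrived", "see": "seen"}
PARTICIPLE_TO_LEMMA = {part: lemma for lemma, part in VERBS.items()}

def analyze(tokens):
    """Find 'have/has + past participle' and treat it as one grammatical unit."""
    units = []
    i = 0
    while i < len(tokens):
        if (tokens[i] in ("have", "has")
                and i + 1 < len(tokens)
                and tokens[i + 1] in PARTICIPLE_TO_LEMMA):
            units.append({"lemma": PARTICIPLE_TO_LEMMA[tokens[i + 1]],
                          "tense": "present perfect",
                          "span": tokens[i:i + 2]})
            i += 2
        else:
            units.append({"lemma": tokens[i], "tense": None, "span": [tokens[i]]})
            i += 1
    return units

print(analyze("the lion has eaten".split()))
```

The unit for "has eaten" carries both pieces of information the paragraph calls for: the lemma eat for vocabulary selection, and the tense label for grammatical processing.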

In combination, these Kamusi methods mean that we can know a tremendous amount about a source document before we send it to a translation system. We can find all of the terms that have ambiguous meanings, and provide a tool for the user to mark the intended sense. We can find the bits and pieces of separable terms like "add up" and staple them back to their dictionary entry. We can identify terms or senses that are not yet in our database, create new dictionary entries, and seek translations from our contributors for dozens of languages that then become instantaneously available - even for idiomatic expressions like "drive [something] through the roof" that have unique meanings that are not usually available in reference resources, and are completely missed by GT. In conjunction with the work that others have done on grammatical analysis, we can locate inflected forms of words and associate them with the entries for their canonical forms. And here is where we can rock the MT world: with the document already analyzed to determine the meanings and word variations on the source side, and with words joined from one language to the next in Kamusi at the level of the concept, we can send the text to a translation service with near certainty about the choice of vocabulary, and a fair amount of additional source grammatical information that can be used for processing the conversion to the target language.

Additionally, many dictionary details can be used to improve MT on the target side. With each new pair of languages, SMT must discover afresh from parallel texts not only vocabulary (if GT does manage to find a decent translation for "drive through the roof" for French, that is not a harbinger of getting it right for German or Thai), but also such details as the gender of nouns. Why waste so much effort, when the same word in one language will be masculine or feminine no matter what language it is translated from? Kamusi can easily move from "the millstone" in English to masculine "der Mühlstein" in German to feminine "la meule" in French to class 7 "kijaa" in Swahili to feminine "piedra de molino" in Spanish. Google Translate cannot. By including many lexical details in the data container for a term, Kamusi is creating a repository of information that is available for constructing grammatically accurate text on the target side, without needing to have encountered a particular pattern within a limited or non-existent parallel corpus.
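The millstone example can be sketched as a concept-level record. The field names are illustrative, but the idea is that grammatical details ride along with each language's term, regardless of which language the translation starts from:

```python
# One concept, with per-language terms and their lexical details attached.
MILLSTONE = {
    "en": {"term": "millstone"},
    "de": {"term": "Mühlstein", "gender": "masculine"},
    "fr": {"term": "meule", "gender": "feminine"},
    "sw": {"term": "kijaa", "noun_class": 7},
    "es": {"term": "piedra de molino", "gender": "feminine"},
}

def render(concept, target_lang):
    """Return the target-language term together with its stored grammar."""
    return concept[target_lang]

print(render(MILLSTONE, "de"))  # {'term': 'Mühlstein', 'gender': 'masculine'}
```

Because the gender or noun class is stored once at the concept's term, no parallel corpus is needed to rediscover it for each new source language.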

GT offers users the option to highlight sections of their translated texts and choose alternate translations. Post-processing is a bizarre time to be asking for human input. "Did we get it right?", GT asks their users, and for most people the answer will be, "If I knew, I wouldn't be here." If the user has enough knowledge of the source language to gauge original intent, it is most appropriate to narrow down the options on the source side; currently, an English speaker using GT to write a letter in a language they speak at an intermediate level would be advised to repeatedly adjust their English vocabulary and word order until the output looks right, and to consult an external dictionary to confirm unfamiliar target vocabulary senses. If the user does not have an a priori window to understanding the source document, they currently must accept a translation on blind faith.

GT also enables users to submit their translations to improve the overall system. This data presumably builds the corpus of parallel text, and is also used in "translation memory" - a collection of direct translations that the system can access instead of computing the same phrase in the future. Without independent human oversight as occurs in Kamusi through moderation, the usefulness of these crowd translations is dubious, as they could easily be confirmations of bad GT efforts, or intentional spam. Nevertheless, translation memory is a departure from SMT that recognizes the superior results attainable when the data set has been curated by people instead of processes.

With Kamusi techniques, professional organizations would pre-translate their documents, marking them for meaning on the source side, before distributing them to multilingual MT; an amateur user knowledgeable in the source language would use an on-the-fly pre-translation tool, and a novice would at least be able to see what the full range of options could be. SMT is useful to elevate the most likely translations into view, but that tool only gets you to the point where application of real human knowledge, encapsulated in a hand-polished lexicon and confirmed in manual pre-translation, can finish the job.

Instead of trying to develop models between language pairs based on the brute comparison of such parallel texts as may be available, the Kamusi method makes it possible to begin the translation process from a position of strength - dictionary senses that have been written by humans, human clarification of the meanings in the source document through dictionary-assisted pre-translation, grammatical information that can offer MT shortcuts on both the source and target side, and curated vocabulary links that can match a concept across hundreds or thousands of languages.

Outcomes - Accuracy versus Acceptability

Writing this comparison has necessitated a close inspection of GT. As we tested the service, we found it to be good for casual translations between well-documented language pairs to be consumed by the person requesting the translation, but unserviceable for as-is translations to be read by someone else. It is very easy to get wrong information, and very difficult to get translations on which you can rely. For the average user seeking a quick overview of an unimportant document, the quality of the translation does not much matter - they pay nothing, expect little, can use their own intelligence to paper over the shortcomings, and are therefore happy with whatever they get. Were the service called "Google Approximate" instead of "Google Translate", the best guesses afforded by SMT would be perfectly adequate.

German to English meets the test of overall comprehensibility when given randomly selected newspaper articles, although a copy editor who submitted work of such caliber would be fired on the spot. French to English also passes the test with newspaper articles (the major flaw in the linked article from a Swiss newspaper being "while the cheese and cheese semi-hard cheese" as a translation of "tandis que le fromage frais et le fromage à pâte mi-dure"). The French-Swahili translation of the same newspaper article clearly travels via English, but gives results that are good enough for a Swahili speaker to be able to pick up the main points. Arabic-English is incomprehensible - the linked article is surely about a political prisoner, not a bacterium. Chinese-English is similar, with the linked article making indecipherable points about a student movement and the clothing trade. By reverse-engineering, we determined that Arabic-Chinese and Chinese-Arabic also both take the tactic of translating from source to English to target, rather than attempting direct translations as could be possible based on the UN corpus.

Mathematically, the likelihood of a translation in a non-matched pair like Arabic to Chinese being correct can be calculated by multiplying its probability from Arabic to English times its probability from English to Chinese. If we use GT's own estimate for spring from English to French above as a demonstration value, then a 30% conversion from Arabic to English times a 30% conversion from English to Chinese yields only a 9% chance of SMT getting it right via the English bridge. On the other hand, with a hand-curated system like Kamusi's, the same equation yields much more enticing results. Assuming the user verifies their sense on the source document side, and a 95% likelihood that humans produced the right translation in the data set, 0.95 * 0.95 = 0.9025, or roughly a 90% chance of getting a good translation for vocabulary between two languages that have never been paired. As users hone the data within Kamusi, direct translations will approach 100% reliability, meaning the mathematical probability of accurate second-degree transitive links will also approach 100%.
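The bridge arithmetic in this paragraph, spelled out:

```python
def bridged_accuracy(p_source_to_pivot, p_pivot_to_target):
    """Chaining two independent translation steps multiplies their accuracies."""
    return p_source_to_pivot * p_pivot_to_target

# SMT via the English bridge, using GT's own ~30% sense estimate per step:
print(round(bridged_accuracy(0.30, 0.30), 2))   # 0.09
# Human-curated links at 95% confidence per step:
print(round(bridged_accuracy(0.95, 0.95), 4))   # 0.9025
```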

In all cases, if the goal is to be able to write documents of a quality that would bring business to a company rather than drive customers away, GT fails to deliver for most language pairs, and delivers sporadically in the best cases.

[Image: Rainy day, sunny forecast]

When you are reading an approximate translation that you have requested, you are of a mindset to figure out the parts of the message that the translation misses. When you are preparing a document for someone who has not signed on to the challenge, on the other hand, it must be fully intelligible before you go to press. As an analogy, you will gamely suffer through something lousy that you cook at home, but you would send back the same dish if it were served in a restaurant. GT cannot be used for mission-critical emails without the author manipulating each sentence on the source side to force correct target translations - and even then, it is advisable to apologize in the email for any language problems, and append the source language text as a reference. In a real-world test case, in which an English speaker used GT during the course of purchasing and registering a used car and procuring auto insurance in French-speaking Switzerland, there was not a single instance where the original Google translation would have been adequate for purposes of the business at hand.

The role of psychological expectations can be seen with a little thought experiment. Imagine you have to send an email to Oscar, who speaks a language in which you have no competency. In the first instance, you run it through GT and send him the unedited translation. In the second instance, you send him the email in English, and he runs it through GT himself. Of course, the translations will be identical. However, in the first case Oscar will assume that the text he reads is the message you meant to send, no matter whether the translation expresses exactly the wrong sentiment (you are selling an "antique" car that might be valuable rather than an "old" car) or includes gibberish like "and who the most important role to me bacterium" from the Arabic article above. If he asks for a picture of your valuable antique and you send him a photo of a decrepit jalopy, or if he puzzles over the bacterium, he's going to think you're an idiot. However, if he does the same translation himself, you won't look bad when he sees your old car or reads about the bacterium - he'll chalk up the misunderstanding to a translation that didn't work, and put in some extra effort into figuring out what you really meant. We posit that Oscar is less likely to understand the exact same translation in the first case than in the second. (If there are any psychologists who would be interested in conducting a real-world experiment to test this hypothesis, please contact us.)

Readers at a farther remove from the translation process will not have the context, or perhaps the resources, to make any sense of GT output. For example, one respondent to our informal survey about uses of GT wrote, "I attempted to create a Portuguese-language nutrition assessment questionnaire for my dialysis patients, with limited success. One patient gamely attempted to figure out what the questions were; another just handed it back to me with a shake of her head."

We contend that Kamusi will result in MT that makes highly accurate vocabulary choices between virtually any pair of languages, with additional grammatical information that can inform both the source and target sides of the translation. The goal is to contribute to systems that are not just good enough for government work, but good enough for surveys of dialysis patients, good enough for business, good enough to rival the best human translations.

Google Translate lays down the challenge on their blog:

Since we first launched the Website Translator plugin back in September 2009, more than a million websites have added the plugin. While we’ve kept improving our machine translation system since then, we may not reach perfection until someone invents full-blown Artificial Intelligence. In other words, you’ll still sometimes run into translations we didn’t get quite right.

Our analysis shows that, instead of "sometimes" not getting things "quite right", GT is often wide of the mark, and far too frequently gets things quite wrong, especially for non-major language pairs, for mission-critical applications, and for translations intended to be read by a third party. We will not proclaim that the Kamusi system will introduce artificial intelligence to MT, but we assert that it will introduce a large measure of human intelligence. Of course, at this point all we have is a model, and pilot data that proves the concept for more than twenty languages, but with a fraction of the budget that Google spends on soap in their washrooms, we would be ready to build a multilingual lexicon that could revolutionize MT in short order. We extend the invitation to GT or other projects to work together in developing a lexical platform for the MT system of the future.