Say What You Mean
We want you to find the right word for translation, every time. To help you, we are developing Pre:D, a complicated system that is described in this working paper for the European Association of e-Lexicography, presented in Brno, Czech Republic, 16 September 2016.
Full article: Kamusi Pre:D – Lexicon-based source-side predisambiguation for MT and other text processing applications
Abstract
Kamusi has been developing a system to analyze texts on the source side and present users with sense-specified dictionary options. As with spellcheck, the user selects the intended meaning. We then use a multilingual lexical database to bridge to matching vocabulary in other languages. When paired with FreeLing, additional pre-processing is possible for several languages. Integration with MT via Moses and Apertium is planned but not yet undertaken. The treatment of multiword expressions (MWEs) is important. An MWE is lexicalized in the Kamusi database and marked for separability, with a definition and translation equivalents (one or more words) in other languages. When the initial term of an MWE appears in the source text, Pre:D queries the database and scans the sentence for all MWEs that could follow. The user can select the relevant MWE rather than the component words. A user can submit a missing sense or MWE for inclusion in the lexicon. Named entities can also be identified from data sources or by users and rendered appropriately across languages. When users agree, we will also use sense-tagged sentences for machine learning. A prototype of the core system is already functional.
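The MWE lookup described in the abstract can be illustrated with a small sketch. This is not the Pre:D implementation: the `MWE` class, the toy `LEXICON`, and the functions `build_index` and `find_candidates` are hypothetical names invented for illustration, and the sketch assumes simple whitespace tokenization with no lemmatization. It shows one plausible reading of the idea: index MWEs by their first word, and when that word appears, scan the rest of the sentence for every MWE that could follow, allowing gaps only for entries marked separable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MWE:
    """Hypothetical lexicon entry; the real Kamusi schema is not shown in the source."""
    words: tuple          # component words, in order
    separable: bool       # may other words intervene between the components?
    definition: str


# Toy lexicon, invented for illustration only.
LEXICON = [
    MWE(("look", "up"), True, "search for information"),
    MWE(("look", "up", "to"), False, "admire"),
    MWE(("kick", "the", "bucket"), False, "die"),
]


def build_index(lexicon):
    """Map each initial word to the MWEs that start with it."""
    index = {}
    for mwe in lexicon:
        index.setdefault(mwe.words[0], []).append(mwe)
    return index


def _matches(mwe, tokens, start):
    """Check whether `mwe` occurs in `tokens` beginning at position `start`."""
    if not mwe.separable:
        # Non-separable: components must be contiguous.
        return tuple(tokens[start:start + len(mwe.words)]) == mwe.words
    # Separable: remaining components must appear later, in order.
    pos = start + 1
    for word in mwe.words[1:]:
        try:
            pos = tokens.index(word, pos) + 1
        except ValueError:
            return False
    return True


def find_candidates(sentence, index):
    """Return (token position, MWE) pairs the user could choose from."""
    tokens = sentence.lower().split()
    found = []
    for i, tok in enumerate(tokens):
        for mwe in index.get(tok, []):
            if _matches(mwe, tokens, i):
                found.append((i, mwe))
    return found


if __name__ == "__main__":
    idx = build_index(LEXICON)
    for pos, mwe in find_candidates("I look up to her", idx):
        print(pos, " ".join(mwe.words), "=", mwe.definition)
```

In this sketch, a sentence such as "I look up to her" yields both "look up" and "look up to" as candidates, which mirrors the abstract's point that the user, not the system, selects the relevant MWE.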
These are the languages for which we have datasets that we are actively working toward putting online. Languages that are Active for you to search are marked with "A" in the list below.
• A = Active language, aligned and searchable
• c = Data 🔢 elicited through the Comparative African Word List
• d = Data from independent sources that Kamusi participants align playing 🐥📊 DUCKS
• e = Data from the 🎮 games you can play on 😂🌎🤖 EmojiWorldBot
• P = Pending language, data in queue for alignment
• w = Data from 🔠🕸 WordNet teams
We are actively creating new software for you to make use of and contribute to the 🎓 knowledge we are bringing together. Learn about software that is ready for you to download or in development, and the unique data systems we are putting in place for advanced language learning and technology:
Our biggest struggle is keeping Kamusi online and keeping it free. We cannot charge money for our services because that would block access to the very people we most want to benefit: the students and speakers of languages around the world who are almost always excluded from information technology. So we ask, request, beseech, beg you to please support our work by donating as generously as you can to help build and maintain this unique public resource.
Answers to general questions you might have about Kamusi services.
We are building this page around real questions from members of the Kamusi community. Send us a question that you think will help other visitors to the site, and we will often post the answer here.