This is a page from the Kamusi archives. The information below may be out of date, and the links may no longer be valid. Please visit kamusi.org for current information. If you know of links or information on this page that can be updated, please let us know.
Abdel-Karim writes from Egypt, "Regarding the content, is this [PALDO] planned to be a wikipedia-style dictionary (i.e. Entries are generated by users and some sort of community ranking/approval mechanism is in place with an optional discussion page for each entry?)."
The answer: Not exactly. There will be an editor for each language, or a team, who will be responsible for approving each entry. Remote users will be able to submit entries or edit existing entries, but all of their work must go through the official editor for that language before going live.
That is the theory. In practice, at this point we have full editors only for Swahili and Kinyarwanda. The other languages will be restricted at first so that only the official editors can work on them, since the IDRC funding we have covers only the basic terminology vocabularies. Once we are able to find true funding for each component language, we will be able to open it up to a give-and-take between the official editors and the remote participants.
But the fully open Wikipedia model won't work for these purposes, since there is too much data that requires highly specialized knowledge and too many opportunities for mistakes, vandalism, etc. If we find that a particular community is large enough to open up the process more widely, we could consider modifying the submission model for that language. But even then, the model would need to be partially restricted: people would apply to be part of the group that approves entries, and would only be accepted to that group if they complete a training process that puts them in sync with the rest of the editorial board. Otherwise, chaos would quickly ensue.
The objective is a highly structured database for multiple languages that will withstand scholarly scrutiny. Too much wiki-ness and the structure and accuracy will collapse. So we're trying to build something that will strike a happy balance: anyone can contribute, but every contribution will be vetted.
Answers to general questions you might have about Kamusi services.
We are building this page around real questions from members of the Kamusi community. Send us a question that you think will help other visitors to the site, and we will often post the answer here.