How can we document detailed 🔢 data about all the 🌍 world's languages in a consistent, unified source, in a way that can serve 🎓 knowledge and technology needs for 👪 people and their machines around the globe? Dictionaries have historically presented selective information about words and their meanings within a language, or translation equivalents between languages, in idiosyncratic, incommensurable formats with little basis in data science. The Kamusi Project introduces a new approach, conceiving of language as a matrix of interrelated data elements. By documenting these elements within each language, and linking elements at conceptual and functional nodes across languages, Kamusi aims toward an elusive Big Data goal: "every word in every language." If successful, the results will run the gamut from preserving the human heritage embedded in 👅🔫 endangered languages, to providing international vocabularies for students to succeed in science, to a Star Trek-like universal translator embedded in your smart watch ⌚. In this talk, the project's founder discusses the nefarious complexities working against the creation of a universal language data platform, and the systems Kamusi has designed to collect, codify, and deploy quantum-level 👅🔢 linguistic data within one massive 🌎 global dictionary.
These are the languages for which we have datasets that we are actively working toward putting online. Languages that are Active for you to search are marked with "A" in the list below.
• A = Active language, aligned and searchable
• c = Data 🔢 elicited through the Comparative African Word List
• d = Data from independent sources that Kamusi participants align playing 🐥📊 DUCKS
• e = Data from the 🎮 games you can play on 😂🌎🤖 EmojiWorldBot
• P = Pending language, data in queue for alignment
• w = Data from 🔠🕸 WordNet teams
We are actively creating new software for you to make use of and contribute to the 🎓 knowledge we are bringing together. Learn about software that is ready for you to download or in development, and the unique data systems we are putting in place for advanced language learning and technology:
Our biggest struggle is keeping Kamusi online and keeping it free. We cannot charge money for our services because that would block access to the very people we most want to benefit, the students and speakers of languages around the world that are almost always excluded from information technology. So we ask, request, beseech, and beg you to please support our work by donating as generously as you can to help build and maintain this unique public resource.
Answers to general questions you might have about Kamusi services.
We are building this page around real questions from members of the Kamusi community. Send us a question that you think will help other visitors to the site, and we will often post the answer here.