This is a page from the Kamusi archives. The information below may be out of date, and the links may no longer be valid. Please visit kamusi.org for current information. If you know of links or information on this page that can be updated, please let us know.
Dictionary Data Structures
Before implementing the editorial tools for each language in PALDO, we need to build the data models for each component language. Languages differ in more than words, so the database that contains information about multiple languages will have to account for those differences.
Consider, for example, one difference between English and the Romance languages: English nouns do not have grammatical gender, while nouns in Romance languages like French and Spanish do. So the database should not contain a gender field for English nouns, but it should contain one for nouns in each of the Romance languages.
Swahili nouns each belong to a noun class instead of a gender, meaning that the PALDO database needs to include a field for noun class for Swahili words. Noun classes are a defining feature of Bantu languages in general, so many African languages will need noun class data in their entries.
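To make the idea concrete, here is a minimal sketch of how per-language fields might be modeled. It is not the actual PALDO schema: the `NounEntry` class, the `NOUN_FIELDS_BY_LANGUAGE` table, the field names, and the use of three-letter language codes as keys are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical table: which grammatical fields a noun entry may carry
# in each language. The real PALDO data models are not shown here.
NOUN_FIELDS_BY_LANGUAGE = {
    "eng": frozenset(),                # English: no gender field
    "fra": frozenset({"gender"}),      # French: gender
    "spa": frozenset({"gender"}),      # Spanish: gender
    "swa": frozenset({"noun_class"}),  # Swahili: noun class, not gender
}


@dataclass
class NounEntry:
    language: str
    headword: str
    attributes: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject languages that have no data model yet.
        if self.language not in NOUN_FIELDS_BY_LANGUAGE:
            raise ValueError(f"no data model defined for {self.language!r}")
        # Reject fields that do not belong to this language's model.
        allowed = NOUN_FIELDS_BY_LANGUAGE[self.language]
        unexpected = set(self.attributes) - allowed
        if unexpected:
            raise ValueError(
                f"{sorted(unexpected)} not allowed for {self.language!r}"
            )


house = NounEntry("eng", "house")
maison = NounEntry("fra", "maison", {"gender": "feminine"})
kitabu = NounEntry("swa", "kitabu", {"noun_class": "7/8"})
```

In this sketch, trying to attach a gender to an English noun raises a `ValueError`, so an editing tool built on the model cannot store a field the language does not have.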
Some languages have tones that need to be marked in a pronunciation field, but recorded separately from the standard spelling. Others have multiple spelling systems. The complexities are never-ending, and sometimes extremely subtle.
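One possible way to hold tone and multiple spellings side by side is sketched below. Again, this is only an illustration under our own assumptions: the class names, the `tone_pattern` notation, and the sample values are placeholders, not real dictionary data or the actual PALDO structure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Spelling:
    system: str  # hypothetical label for a spelling system
    form: str    # the word as written in that system


@dataclass
class Pronunciation:
    transcription: str
    # Tone is kept in its own field, apart from any spelling.
    # None means no tone information is recorded.
    tone_pattern: Optional[str] = None


@dataclass
class Entry:
    language: str
    spellings: List[Spelling] = field(default_factory=list)
    pronunciation: Optional[Pronunciation] = None

    def form_in(self, system: str) -> Optional[str]:
        """Return the spelling in the given system, or None if absent."""
        for spelling in self.spellings:
            if spelling.system == system:
                return spelling.form
        return None


# Placeholder values only; not taken from any real language.
entry = Entry(
    language="xx",
    spellings=[
        Spelling("system-a", "form-a"),
        Spelling("system-b", "form-b"),
    ],
    pronunciation=Pronunciation("placeholder", tone_pattern="HL"),
)
```

Keeping the list of spellings open-ended means a language with one spelling system and a language with several can share the same entry shape, while tone stays out of the written form entirely.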
We are currently developing the data structures for the initial languages that will go into PALDO. Once we finish the structure for a language, we will program the tools for editing dictionary entries in that language. Because those tools are built on top of the structure, it is extremely important that we get it right the first time.
The working versions of the dictionary data structures are online at http://www.kasahorow.org/content/pan-african-living-dictionaries-online-paldo (and changing minute by minute). We are consulting with experts in each language, but we could use more input from a variety of perspectives. If you have any insights, please share them in the comments section for this blog entry.