This is a page from the Kamusi archives. The information below may be out of date, and the links may no longer be valid. Please visit kamusi.org for current information. If you know of links or information on this page that can be updated, please let us know.
In Africa, a billion people speak 2000 languages. This linguistic diversity is a rich contribution to our human heritage, but also a persistent challenge to the continent's prosperity. Africans have long had to address the world, and even their own governments, in foreign languages such as English, French, and Portuguese. Most Africans cannot go to secondary school in their mother tongues, and they must use one of a few second languages to work or trade beyond their home regions. Few resources exist to ease the communications barriers between local and international languages, much less among the continent's many tongues. With a history that places a low value on most of the languages Africans learn from birth, the majority of Africa's people are excluded from participating equitably in global knowledge and the global economy.
The Kamusi Project is an international NGO that is dedicated to producing communications resources such as dictionaries and glossaries for African languages; “kamusi” is the Swahili word for “dictionary.” With the ambition of documenting every word in Africa, the project has developed PALDO, the Pan-African Living Dictionary Online. Built on a unique hybrid model of scholarly and community contributions, PALDO uses innovative least-cost technology to extend its results as widely as possible, free to all its users. PALDO is intended as a tool to promote language equity across Africa while preserving the cultural knowledge that its languages contain.
PALDO is designed so that new languages can be added quickly, easily, and relatively cheaply. Once a language is added to the system, it is linked to all of the other languages therein. In one step, dictionaries come into being between a given language and numerous other languages spoken all around Africa. After each data item is created, it becomes permanently and freely available to the public in a variety of formats. Adding data to PALDO is similar to installing a solar panel – all the work happens at the beginning, while the benefits continue to flow for years to come.
The Kamusi Project seeks out language specialists throughout Africa and scholars at universities around the world who can take the lead for their languages. These experts commit to completing a certain number of entries, and to editing entries that are submitted by the public. Partners are trained in the specially designed PALDO software, and receive continuous technical support along the way. Work on a language begins when funds are secured. Partners are remunerated on a per-term basis, which provides an incentive structure toward steady completion. Because each language advances independently, lexicons for many languages can be developed simultaneously.
Dictionary entries can come from many sources. The primary path for PALDO is the log records of tens of millions of lookups on PALDO's predecessor, the Internet Living Swahili Dictionary. These logs show which terms dictionary users seek most often, providing a ranked list that guides partners in producing parallel entries for the most important concepts. In addition, the logs reveal which items are missing from the database and should be slated for addition. Existing data sets, such as out-of-print dictionaries, are sometimes available without copyright restriction, though making such works compatible with PALDO can take considerable effort, particularly in producing definitions and matching concepts. The Kamusi Project has also planned methods, not yet funded for development, to build preliminary data sets using SMS search and response, and to harvest terms from translators using computer-assisted translation tools. Finally, the database is open to public contributions that are vetted by the language editor.
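The log-driven prioritization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `rank_lookups`, the sample log, and the set of known terms are all hypothetical, and nothing here reflects the actual format of the Internet Living Swahili Dictionary logs or the PALDO software.

```python
from collections import Counter


def rank_lookups(lookups, known_terms):
    """Rank looked-up terms by frequency and list those absent from the database.

    `lookups` is an iterable of raw query strings (assumed format);
    `known_terms` is the set of terms that already have entries.
    """
    # Normalize case and whitespace so "Maji" and "maji " count as one term.
    counts = Counter(q.strip().lower() for q in lookups if q.strip())
    ranked = counts.most_common()  # most frequent first
    # Terms users searched for that have no entry yet, in priority order.
    missing = [term for term, _ in ranked if term not in known_terms]
    return ranked, missing


# Made-up sample data, for illustration only.
log = ["maji", "Maji", "nyumba", "maji ", "simu", "nyumba", "kompyuta"]
known = {"maji", "nyumba"}

ranked, missing = rank_lookups(log, known)
print(ranked)   # frequency-ranked list that could guide editors
print(missing)  # candidates to add to the database
```

The ranked list would tell partners which concepts to cover first, while the missing list flags gaps in the database.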
Languages are linked at the level of the concept. For example, the English word “fork” has several senses: cutlery for eating, a road that splits in two, a tool for tuning musical instruments, or software code that starts in a new direction from an existing project. The Swahili term “uma” is equivalent to the first concept, “njiapanda” to the second, “chuma cha noti” to the third, and “tawi la programu” to the last. Editors align the concepts within the PALDO software. When the term for the cutlery concept is added in a third language, all the links for that concept are attached. If the Zulu term for fork as cutlery, “imfologo,” is added in reference to English, a link is automatically created to “uma” in Swahili, and to any equivalents in other languages already in the database, based on existing confirmed language pairings. Such automatic links are shown as computer-predicted until a human editor provides confirmation. Through the linking tool, the one-time task of adding an entry for a concept in one language multiplies into a reference chain for many.
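A minimal Python sketch of concept-level linking, using the fork example above, might look like the following. The `Concept` class, its method names, and the language codes used as keys are assumptions made for illustration; they are not PALDO's actual data model. The sketch treats a pairing as confirmed when an editor adds a term in reference to another language, and marks every other link on the same concept as predicted.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """One sense shared across languages, e.g. 'fork' as cutlery (hypothetical model)."""
    gloss: str
    terms: dict[str, str] = field(default_factory=dict)     # language -> term
    confirmed: set[frozenset] = field(default_factory=set)  # editor-confirmed language pairs

    def add_term(self, lang, term, reference_lang=None):
        """Attach a term; the pairing with reference_lang counts as confirmed."""
        if reference_lang is not None and reference_lang not in self.terms:
            raise KeyError(f"no term for {reference_lang!r} on this concept")
        self.terms[lang] = term
        if reference_lang is not None:
            self.confirmed.add(frozenset((lang, reference_lang)))

    def confirm(self, lang_a, lang_b):
        """Record that a human editor has confirmed a predicted pairing."""
        self.confirmed.add(frozenset((lang_a, lang_b)))

    def links(self, lang):
        """Return {other_language: (term, status)} for every other term on this concept."""
        result = {}
        for other, term in self.terms.items():
            if other == lang:
                continue
            pair = frozenset((lang, other))
            status = "confirmed" if pair in self.confirmed else "predicted"
            result[other] = (term, status)
        return result


cutlery = Concept("fork: cutlery for eating")
cutlery.add_term("eng", "fork")
cutlery.add_term("swa", "uma", reference_lang="eng")
cutlery.add_term("zul", "imfologo", reference_lang="eng")

zulu_links = cutlery.links("zul")
print(zulu_links)
```

In this sketch the Zulu entry was added only in reference to English, yet it gains a link to Swahili automatically. That link stays marked as predicted until `confirm("zul", "swa")` is called.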
Through this system, good monolingual dictionaries come into existence for languages that usually do not have dictionaries of their own. Very few dictionaries exist for African languages that include definitions in those languages; for most African students, using a dictionary for their language (if one is available) is similar to looking up “fork” in English and finding the definition only in French or Russian. At the same time, dictionaries are created for language pairs that would be highly unlikely to be united otherwise. A bridge is built from one language to many others. This bridge can be used by multinational peacekeeping forces, by traders, by intergovernmental commissions, by students studying away from home, by audiences for music and movies and satellite news, or in any other circumstance where a speaker of one PALDO language interacts with a speaker of another.
The Kamusi Project is working on innovative systems to ensure the greatest possible reach of PALDO data. Outputs are currently available via the web and web-enabled phones, and can be printed out as paper dictionaries. A stand-alone program has been developed to use the dictionaries on offline electronic devices, with all of the tools available to web users. A prototype SMS system has proven successful and is awaiting a telecom partner for implementation. With mobile devices as the future of communications in Africa, applications will also be developed for smartphones and tablets, and for new technologies as they emerge. As Africa becomes increasingly wired, PALDO will be at the ready for people to learn about their own languages and those of their neighbors.
With each language added to PALDO, Africa's linguistic richness is documented, preserved, and put to work. Educational prospects are improved, communications among people are enhanced, and opportunities open up for jobs and trade. Reminiscent of other technological tools that once seemed impossible and now seem indispensable, PALDO stands poised to open up deep knowledge about a vibrant portion of the world's linguistic heritage. With the Pan-African Living Dictionary Online, the Kamusi Project and its partners will help harness Africa's language diversity for greater prosperity and mutual understanding.