The first step was for Benjamin to enter about three thousand terms into a spreadsheet, copied with permission from existing learners' glossaries. He then divided those terms into packs of 100 and put those files on a “gopher” server that people could access via a command line interface and dial-up modem. The intent was for volunteers to each expand one pack with new terms, and to keep subdividing the packs as contributions rolled in. That idea never really worked, however, because the process was too cumbersome and the number of Swahili enthusiasts using computers was too small. Instead, the project received copyright permission for a large out-of-print dictionary by Charles Rechenbach, was awarded a larger grant from the full CLTL, and concentrated on data entry and the development of a website (Yale's first in the social sciences or humanities) to distribute the results to the public.
In 1996, Dr. Biersteker was awarded funding for the project from the United States Department of Education's International Research and Studies program (IRS). This grant enabled the development of the “Edit Engine,” a tool that makes it possible for anyone to help edit dictionary entries. The Edit Engine went live in 2000, a year before Wikipedia began with a similar model (and with the important difference that all Kamusi changes must be approved by an editor before becoming public). At the same time, data became available through a searchable online database, rather than having to be downloaded as text or Excel files.
A second IRS grant in 2003 supported many additional features, such as a photo uploader for users to illustrate dictionary entries with appropriate images, a parser to return usable dictionary entries from conjugated verbs, and a grouping tool to organize entries according to priority and sense. By 2006, the Kamusi Project was being used about a million times a month by 60,000 unique visitors.
2007 marked a major transition for the project, which had run out of funding. Benjamin had left Yale, where the project was still housed, and moved to Lausanne, Switzerland for family reasons. Several interesting potential partnerships were emerging around the idea of expanding the Kamusi model to other African languages. However, these projects for the international public were better housed at an institution devoted specifically to the cause of language development. It was decided to move the project to the care of the non-profit World Language Documentation Centre, based in Wales, as an interim home while steps were taken to incorporate Kamusi independently. The online presence was established as kamusiproject.org, and then kamusi.org when that name was donated by its original registrant.
Incorporating Kamusi was completed in 2010. The organization is actually two legally independent non-profit entities: Kamusi Project USA for American-based activities and Swiss-based Kamusi Project International for projects with the rest of the world. Our US status makes it possible for Americans, historically Kamusi's most generous supporters, to continue contributing to our work. At the same time, Swiss incorporation facilitates work with partners throughout Africa, due to Switzerland's special open relations with most of the world. The two organizations have independent boards and completely separate accounting. Dr. Benjamin now serves as Executive Director of both NGOs.
Between 2007 and 2013, the Kamusi Project embarked on several exciting new initiatives:
The history of the Kamusi Project has been one of both innovation and struggle. Funding resources for "exotic" languages are few and far between, and the project has found that it is very difficult to make progress unless key partners can be remunerated for their time. Nonetheless, the Kamusi Project has pressed forward and is now in a technical and regulatory position to provide advanced services for a great many languages spoken around the world. Many new and innovative projects are now in the pipeline, with partners from countries on every continent. The next chapters of this history are poised to be written.
Here are some annual highlights:
1993: Project conceived as a way to use collective resources to create new tools for learning Swahili.
1994: First proposal submitted, November. First glossary (3,000 words) begun, December.
1995: Gopher site established, January. Website established, April - first website in the social sciences or humanities at Yale. Wordlists incorporated from many remote contributors. 21,000 entry dictionary posted, September.
1996: Data entry to incorporate Rechenbach's Swahili-English Dictionary.
1997: Data editing.
1998: Programming work begins on Edit Engine. Swahili-Russian dictionary posted.
1999: 56,000 entry dictionary posted, Discussion Forum established, Africa Guide established.
2000: Revised dictionary posted, Edit Engine launched, April.
2001 - 2002: Project has no funding. Development work slows to a crawl, though Edit Engine submissions regularly incorporated into Kamusi lexicon.
2003: Renewed funding begins late July. Development work begins on Learning Guide.
2004: Move to faster, more secure server completed, March. Photo Upload feature introduced, May. Enabled search of plural forms, June. Begin formal collaboration with University of Dar es Salaam Department of Computer Science to establish a mirror server in Tanzania and incorporate computer terminology into the Kamusi lexicon, October. Launch complete site redesign, November. Introduce specialized vocabulary features, November. Continue work on Learning Center.
2005: Introduce the Grouping Tool to arrange dictionary entries. Add new data fields for terminology, dialect, taxonomy, derivation, related words, English definitions, and alternate spellings. Migrate to a more stable and flexible software platform. Improve search and display features. Add user conveniences, including more direct access to the Edit Engine.
2006: Funding runs out in January, project staff furloughed. Work continues with the help of private donations, including a generous grant from the Negaunee Foundation. The Kamusi Parser is introduced, allowing users to search and evaluate conjugated Swahili verbs directly within the search engine.
2007: Project is moved from Yale to the World Language Documentation Centre and development work continues with the support of private donations.
2008: National Endowment for the Humanities grant to Grambling University to begin work within Kamusi for expanding the model to multiple languages, with a focus on Kinyarwanda. This grant was subsequently transferred directly to Kamusi after we completed our incorporation as a US legal non-profit corporation.
2009: Incorporation of Kamusi Project USA as a 501(c)(3) non-profit organization registered in Delaware, and Kamusi Project International as a non-governmental organization with the equivalent status registered in Geneva.
2010: Development of KamusiTERMS participatory terminology system and production of localization terminologies in 12 African languages, with the African Network for Localization, IT46, Translate House, and the support of IDRC in Canada. New logo unveiled.
2011: Begin work with University of Ngozi in Burundi on Kirundi language, in association with Universidad Politécnica de Madrid, with students receiving stipends in exchange for working on Kirundi entries.
2012: Programming of multilingual platform with Telamenta in South Africa.
2013: Launch of multilingual pilot, with 100 parallel terms defined in 20 languages, demonstrates that the new multilingual system works and has the potential to scale for unlimited additional languages. However, with no funding for continued language work, linguistic development grinds to a halt. In September, Kamusi joins the Distributed Information Systems Laboratory (LSIR) at EPFL in Switzerland, with support for certain technical development. In November, Kamusi is recognized as a launch partner in the White House Big Data Initiative.
2014: Focus on technical development, including games and mobile apps for engaging the public in the production of linguistic data.
2015: Our Big Data Beta introduces 1.2 million new interlinked records in more than 20 languages, proving Kamusi's capacity to scale. Work is launched on Vietnamese. Server crash in September knocks the site offline to the public for about a year.
2016: Public access moved to kamusigold.org while resources sought to restore full services on the main Kamusi site. Introduction of DUCKS shows the way Kamusi will align data across hundreds or thousands of languages. Launch of Kamusi Here! puts the world's most advanced multilingual dictionary search in the hands of users worldwide.