This is a page from the Kamusi archives. The information below may be out of date, and the links may no longer be valid. Please visit kamusi.org for current information. If you know of links or information on this page that can be updated, please let us know.
When we talk about "morphemes", we are talking about the different forms that a word can take in different situations. For example, the word "big" has the morphemes "bigger" and "biggest". "Bigger" and "biggest" are not concepts of their own that need their own dictionary entries. They are simply "big" in different clothing.
While we were designing Kamusi, we realized that morphemes are a very complicated aspect of most languages. We struggled for a long time with how to document these complexities, until we realized that all morphemes share a common feature: they exist as part of the predictable grammatical structure of their language.
Because morphemes are a predictable feature of a language, we can provide a predictable space for them in the Kamusi data structure. We know that English can have as many as three forms for adjectives - the basic form (big), the comparative form (bigger), and the superlative form (biggest) - so, for the "adjectives" part of speech category in English, we create two extra spaces for morphemes in addition to the basic "lemma" form, and we label them "comparative" and "superlative". We know that Spanish nouns can have as many as four forms (alumno, alumna, alumnos, alumnas), so we create three extra spaces in the Spanish noun category, and give them the appropriate labels. We know that Luganda adjectives can have 21 different forms, so we create and label the morpheme input boxes accordingly.
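To make the idea of labeled spaces concrete, here is a minimal sketch in Python. It is not the actual Kamusi data structure: the names `FORM_SLOTS`, `Entry`, and `set_form` are hypothetical, and the Spanish labels are our own guess at what "appropriate labels" might look like. It only shows how a fixed set of slots per language and part of speech could be declared once and then filled in for each entry.

```python
from dataclasses import dataclass, field

# Hypothetical slot labels for each (language, part of speech) pair.
# The English labels come from the text; the Spanish labels are assumed.
FORM_SLOTS = {
    ("English", "adjective"): ["comparative", "superlative"],
    ("Spanish", "noun"): [
        "feminine singular",
        "masculine plural",
        "feminine plural",
    ],
}


@dataclass
class Entry:
    """One dictionary entry: a lemma plus its labeled extra forms."""
    language: str
    part_of_speech: str
    lemma: str
    forms: dict = field(default_factory=dict)

    def set_form(self, label, value):
        # Only labels declared for this language and category are accepted.
        allowed = FORM_SLOTS.get((self.language, self.part_of_speech), [])
        if label not in allowed:
            raise ValueError(f"no slot {label!r} for this category")
        self.forms[label] = value


big = Entry("English", "adjective", "big")
big.set_form("comparative", "bigger")
big.set_form("superlative", "biggest")

alumno = Entry("Spanish", "noun", "alumno")
alumno.set_form("feminine singular", "alumna")
alumno.set_form("masculine plural", "alumnos")
alumno.set_form("feminine plural", "alumnas")
```

Because the slots are declared per category rather than per word, adding a new language in this sketch means adding one line to the table, not changing every entry.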
English verbs are easy. The language has just a few verb morphemes, so we are easily able to collect all those forms: see / sees / saw / seen / seeing. Once the morphemes are entered, they become available to search, which means you can search for "seeing" and you'll be brought to the entry for "see".
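The search behavior described above can be sketched as a simple lookup table from every stored form back to its entry. This is an illustration only, assuming forms are stored as plain strings; `build_form_index` and `lookup` are hypothetical names, not Kamusi functions.

```python
def build_form_index(entries):
    """Map every form, including the lemma itself, to the lemmas it belongs to."""
    index = {}
    for lemma, forms in entries.items():
        for form in [lemma, *forms]:
            # A set is used because one spelling may belong to several entries.
            index.setdefault(form, set()).add(lemma)
    return index


# The forms listed in the text for the English verb "see".
entries = {"see": ["sees", "saw", "seen", "seeing"]}
index = build_form_index(entries)


def lookup(word):
    """Return the entries a searched word leads to, or an empty list."""
    return sorted(index.get(word, ()))


print(lookup("seeing"))  # ['see']
```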
Verbs in Romance languages are harder, because they have dozens of different conjugations, and a painful number of irregular verbs. If you are reading this paragraph, an efficient way of capturing and presenting conjugation data is still on our programming task list - we know what we need to do, so please stay tuned.
Verbs in Bantu languages can have as many as 900 million morphemes. It would be impossible, and ridiculously inefficient, to list each of those morphemes for every verb in every Bantu language in Kamusi. Fortunately, the rules for making those morphemes are extremely regular. Therefore, our task is to gather the linguistic rules for verbs for each Bantu language and convert those into computer code. We've done it for Swahili, and will do it for more languages as our programming resources allow.
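To give a feel for what "converting rules into code" means, here is a deliberately tiny sketch for Swahili. It is not the Kamusi implementation. It covers only affirmative forms built from a subject prefix, a tense marker, and a regular stem such as "soma" (read), and it leaves out negation, object markers, relatives, verb extensions, and the special handling that one-syllable stems need. The function and table names are hypothetical.

```python
# Subject prefixes for people (first, second, third person; singular, plural).
SUBJECT_PREFIXES = {
    "1sg": "ni", "2sg": "u", "3sg": "a",
    "1pl": "tu", "2pl": "m", "3pl": "wa",
}

# Four common tense and aspect markers.
TENSE_MARKERS = {
    "present": "na",
    "past": "li",
    "future": "ta",
    "perfect": "me",
}


def conjugate(stem, subject, tense):
    """Build one form by rule: subject prefix + tense marker + stem."""
    return SUBJECT_PREFIXES[subject] + TENSE_MARKERS[tense] + stem


def all_forms(stem):
    """Generate every form this small rule set covers for a stem."""
    return {
        (subject, tense): conjugate(stem, subject, tense)
        for subject in SUBJECT_PREFIXES
        for tense in TENSE_MARKERS
    }


print(conjugate("soma", "1sg", "present"))  # ninasoma
print(conjugate("soma", "3pl", "past"))     # walisoma
print(len(all_forms("soma")))               # 24
```

Even this toy rule set yields 24 forms from two short tables and one stem. Each additional slot in the verb multiplies the count again, which is why storing rules scales where storing lists cannot.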
Happily, Mandarin Chinese does not have any morphemes in its written form, because Chinese characters refer to ideas, not sounds or grammar. This makes the language that much more difficult to learn, but makes life a little easier from the standpoint of dictionary design. So far, this is the only language we've added to Kamusi that does not need the morphemes feature. For all other languages, all we need is some grammatical guidance, and our morphemes system can handle any word form you send our way.
One final note: Kamusi uses the word "morpheme" in a particular way, for a particular purpose. Other people might use the term "inflection", or "form", or "shape". For us, what is important is not the term, but the ability that the Kamusi morpheme system gives us to accurately document every word in every language.