Corpus-driven Bantu Lexicography, Part 1: Organic Corpus Building for Lusoga
This article is the first in a trilogy on corpus-driven Bantu lexicography, illustrated for Lusoga. The focus here is on building a so-called 'organic corpus' from scratch; the next two instalments deal with the use of that corpus on the macrostructural and microstructural levels, respectively. Because few detailed descriptions of corpus-building efforts exist for Bantu languages, every step is discussed in detail, with particular attention to the parameters that have to be taken into account and to the need to log the metadata.
Keywords: Bantu, Lusoga, corpus building, organic corpus, oral, written, source, period, genre, topic, metadata