Containing overgeneration in Zulu computational morphology


Laurette Pretorius
Sonja E Bosch

Abstract

The development of a large-coverage computational morphological analyser for Zulu requires modelling not only the regular phenomena often associated with word formation, but also the idiosyncratic behaviour that may occur in Zulu morphology. This paper discusses the application of ZulMorph, an existing rule-based, finite-state morphological analyser prototype, to semi-automate the mining of available Zulu language corpora for idiosyncratic behaviour. The semi-automated procedure makes provision for bootstrapping the morphological analyser with information newly extracted from the corpora. Of particular interest is the central role played by the machine-readable lexicon. The procedure is applied to a Zulu development corpus of 30 000 types, and the results are presented and discussed.

Southern African Linguistics and Applied Language Studies 2008, 26(2): 209–216

Journal Identifiers


eISSN: 1727-9461
print ISSN: 1607-3614