Computational aids for Zulu natural language processing


Laurette Pretorius
Sonja E Bosch

Abstract

This paper discusses the development of two basic computational
aids for Zulu natural language processing: a morphological analyser
built with the Xerox finite-state tools (Beesley & Karttunen, 2003) and a
machine-readable lexicon in the form of an XML document. We briefly consider
the linguistic characteristics of an agglutinating language such as Zulu, with
specific reference to the noun and noun-based words. The issues of
computational morphology and the challenges involved in building a
morphological analyser for Zulu are addressed, and a brief explanation is
given of how the Xerox finite-state tools may be used for this purpose. Then the
development of the morphological analyser and the lexicon is outlined, and
finally we discuss the integration and use of these two computational aids to
reflect the dynamic nature of natural language by focusing on a variant of the
morphological analyser, namely the so-called ‘guesser’. By applying this
guesser to the available language corpora, new word roots may be identified and
systematically included in the XML lexicon and the morphological analyser.

Southern African Linguistics and Applied Language Studies 2003, 21(4): 267–282

Journal Identifiers


eISSN: 1727-9461
print ISSN: 1607-3614