Using the Corpus of Spoken Afrikaans to generate an Afrikaans chatbot


Bayan Abu Shawar
Eric Atwell

Abstract

This paper presents two chatbot systems, ALICE and Elizabeth, illustrating
the dialogue knowledge representation and pattern matching techniques of each.
We discuss the problems that arise when using the Corpus of Spoken Afrikaans
(Korpus Gesproke Afrikaans) to retrain the ALICE chatbot system with human
dialogue examples. A Java program that converts dialogue transcripts into the
AIML linguistic knowledge representation formalism provides a basic
implementation of corpus-based chatbot training. This program was applied to
the Afrikaans dialogue corpus texts to generate two versions of the Afrikaans
chatbot.
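
The transcript-to-AIML conversion described in the abstract can be pictured with a minimal sketch. It is not the authors' program: the class name TranscriptToAiml, its methods, the normalisation rules, and the sample utterances are all hypothetical, and it assumes the simplest possible scheme, in which each turn becomes an AIML pattern and the turn that follows it becomes the template.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not the program described in the paper.
public class TranscriptToAiml {

    // Normalise an utterance into an AIML-style pattern:
    // upper case, punctuation replaced by spaces, whitespace collapsed.
    public static String toPattern(String utterance) {
        return utterance
                .toUpperCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{Nd} ]", " ")
                .trim()
                .replaceAll("\\s+", " ");
    }

    // Escape the characters that would break the XML template body.
    public static String escapeXml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Build one AIML category from a prompt turn and the reply turn.
    public static String toCategory(String prompt, String reply) {
        return "<category>\n"
             + "  <pattern>" + toPattern(prompt) + "</pattern>\n"
             + "  <template>" + escapeXml(reply.trim()) + "</template>\n"
             + "</category>";
    }

    // Pair every turn with the turn that follows it (assumed pairing rule).
    public static List<String> convert(List<String> turns) {
        List<String> categories = new ArrayList<>();
        for (int i = 0; i + 1 < turns.size(); i++) {
            if (toPattern(turns.get(i)).isEmpty()) {
                continue; // skip turns with no usable words, e.g. "..."
            }
            categories.add(toCategory(turns.get(i), turns.get(i + 1)));
        }
        return categories;
    }

    public static void main(String[] args) {
        // Invented sample turns, not taken from the corpus.
        List<String> turns = List.of(
                "Hoe gaan dit?", "Goed dankie, en met jou?", "Ook goed.");
        System.out.println("<aiml version=\"1.0\">");
        convert(turns).forEach(System.out::println);
        System.out.println("</aiml>");
    }
}
```

Pairing adjacent turns is the most direct reading of "dialogue transcripts to AIML"; a real corpus would also need decisions about overlapping speech, speaker labels, and transcription markup, which this sketch leaves out.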

Southern African Linguistics and Applied Language Studies 2003, 21(4): 283–294

Journal Identifiers


eISSN: 1727-9461
print ISSN: 1607-3614