Using the Corpus of Spoken Afrikaans to generate an Afrikaans chatbot

  • Bayan Abu Shawar, School of Computing, University of Leeds, Leeds LS2 9JT, England
  • Eric Atwell, School of Computing, University of Leeds, Leeds LS2 9JT, England

Abstract

This paper presents two chatbot systems, ALICE and Elizabeth, and illustrates the dialogue knowledge representation and pattern-matching techniques of each. We discuss the problems that arise when using the Corpus of Spoken Afrikaans (Korpus Gesproke Afrikaans) to retrain the ALICE chatbot system with human dialogue examples. A Java program that converts dialogue transcripts to the AIML linguistic knowledge representation formalism provides a basic implementation of corpus-based chatbot training. We used this program with the Afrikaans dialogue corpus texts to generate two versions of the Afrikaans chatbot.
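
To make the transcript-to-AIML idea concrete, the following is a minimal sketch and not the authors' program. It assumes a transcript is already available as an ordered list of utterances, and it treats each utterance as an AIML pattern with the following utterance as its template. The class name `TranscriptToAiml`, its method names, the pairing rule, and the normalisation rule (upper-case, punctuation stripped) are all hypothetical choices made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: names and rules here are illustrative assumptions,
// not taken from the paper.
final class TranscriptToAiml {

    // Escape the characters that are special in XML text content.
    static String escapeXml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Normalise an utterance into a pattern: replace anything that is not a
    // letter, digit, or whitespace with a space, collapse runs of whitespace,
    // and upper-case the result (assumed normalisation rule).
    static String toPattern(String utterance) {
        String cleaned = utterance.replaceAll("[^\\p{L}\\p{N}\\s]", " ");
        return cleaned.trim().replaceAll("\\s+", " ").toUpperCase(Locale.ROOT);
    }

    // Pair each utterance with the one that follows it and emit one
    // <category> per pair (assumed pairing rule).
    static String toAiml(List<String> turns) {
        StringBuilder out = new StringBuilder("<aiml version=\"1.0.1\">\n");
        for (int i = 0; i + 1 < turns.size(); i++) {
            String pattern = toPattern(turns.get(i));
            String reply = turns.get(i + 1).trim();
            if (pattern.isEmpty() || reply.isEmpty()) {
                continue; // skip pairs with nothing usable on either side
            }
            out.append("  <category>\n")
               .append("    <pattern>").append(pattern).append("</pattern>\n")
               .append("    <template>").append(escapeXml(reply)).append("</template>\n")
               .append("  </category>\n");
        }
        return out.append("</aiml>\n").toString();
    }

    public static void main(String[] args) {
        // Invented sample turns, not drawn from the Corpus of Spoken Afrikaans.
        List<String> turns = Arrays.asList(
            "Hoe gaan dit?", "Goed, dankie. En met jou?", "Ook goed.");
        System.out.print(toAiml(turns));
    }
}
```

A sketch like this ignores speaker labels, overlapping speech, and transcription markup, which are the kinds of problems that arise when real dialogue transcripts are used as training data.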

Southern African Linguistics and Applied Language Studies 2003, 21(4): 283–294
Published: 2004-05-25
Section: Articles

Journal Identifiers


eISSN: 1727-9461
print ISSN: 1607-3614