Towards a Corpus of Black South African English
Abstract
This paper describes the proposed structure and design for a corpus of Xhosa English, which should ultimately form part of a larger corpus of Black South African English (BSAE). The planned corpus (which already comprises 100 000 transcribed words) is based exclusively on spontaneous spoken Xhosa English, and full justification for this decision is provided in the paper. To keep this corpus compatible with similar corpora elsewhere, the guidelines of the Wellington Corpus of Spoken New Zealand English (based on the International Corpus of English (ICE)) have been closely followed, both in the transcription and mark-up conventions and in the referencing system used. Where there are differences, these have been carefully motivated. It is hoped that researchers in other parts of South Africa will collaborate in creating additional corpora of other "indigenous" varieties of Black English, following the guidelines provided here, so that ultimately all such corpora will be compatible and can be combined to form a large and comprehensive corpus of BSAE.
(Southern African Linguistics and Applied Language Studies 2002, 20(1&2): 25-35)