
Semi-automatic retrieval of definitional information: a Northern Sotho case study

E Taljard


Corpus-based terminology is gaining ground internationally. Terminologists working on the South African Bantu languages should therefore not only take note of this development but also follow the trend, even if they do not have the same degree of access to highly sophisticated software. The aim of this article is to establish whether definitional information on key concepts can be retrieved from untagged running text using affordable and easily accessible software such as WordSmith Tools. To answer this question, a case study is conducted in Northern Sotho, using textual material on linguistics as the basis for a special-field corpus. Syntactic and lexical patterns that serve as textual markers of definitional information are identified, and the success rate of the computational retrieval of definitional information is analysed and evaluated. Attention is also paid to the retrieval of specifically conceptual information, which turned out to be a fortunate by-product of the semi-automatic retrieval of definitional information. Finally, the article illustrates how the retrieved definitional information can be used in writing a formal terminological definition.
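
The general idea of retrieving definitional contexts through textual markers can be sketched as a small keyword-in-context (KWIC) search. This is only an illustration under stated assumptions, not the procedure used in the study: the study used WordSmith Tools, whereas the sketch uses Python's standard `re` module, and the corpus, the English marker patterns and the function name `kwic` are all hypothetical stand-ins for the Northern Sotho material and markers.

```python
import re

def kwic(text, marker, width=40):
    """Return (left context, match, right context) for each match of `marker`.

    `marker` is a regular expression standing in for a lexical pattern that
    signals definitional information; `width` is the context size in characters.
    """
    hits = []
    for m in re.finditer(marker, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((left.strip(), m.group(0), right.strip()))
    return hits

# Toy English corpus (assumption): the real corpus is untagged Northern Sotho text.
corpus = (
    "A noun is a word that names a person or thing. "
    "Verbs carry tense. "
    "A morpheme is defined as the smallest meaningful unit of language."
)

# Hypothetical definitional markers; the actual markers are language-specific.
markers = r"\bis defined as\b|\bis a\b"

hits = kwic(corpus, markers)
for left, key, right in hits:
    print(f"{left:>40} | {key} | {right}")
```

Each hit is a candidate definitional context that still has to be inspected manually, which is what makes retrieval of this kind semi-automatic rather than fully automatic.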

Keywords: terminology, South African Bantu languages, definitional information, semi-automatic information retrieval, terminological definitions, conceptual relationships, lexical patterns, syntactic patterns, textual markers, keyword-in-context (KWIC), WordSmith Tools

Journal Identifiers

eISSN: 2224-0039
print ISSN: 1684-4904