Lexikos


PEDANT: parallel texts in Göteborg

Daniel Ridings

Abstract


The article presents the status of the PEDANT project on parallel corpora at the Language Bank at Göteborg University. It describes the solutions for accessing the corpus data: access is provided over the internet through standard applications and SGML-aware programming tools. The SGML format for encoding translation pairs is outlined. The methods allow working with everything from plain text to texts densely encoded with linguistic information.

Keywords: SGML, parallel corpora, morphosyntactic encoding, lemmatization, multiword units, compound words, internet access




http://dx.doi.org/10.5788/8-1-956
AJOL African Journals Online