Standard Yorùbá context dependent tone identification using Multi-Class Support Vector Machine (MSVM)

A.A. Sosimi; T. Adegbola; O.A. Fakinlede

Published: Jun 18, 2019

DOI: 10.4314/jasem.v23i5.20

Keywords: Syllabification, Standard Yorùbá, Context Dependent Tone, Tri-tone Recognition

Issue: Vol. 23 No. 5 (2019)

Section: Articles

JASEM has joined the Creative Commons Attribution License (CCAL). Therefore articles in JASEM are open access articles distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Most state-of-the-art large-vocabulary continuous speech recognition systems employ context-dependent (CD) phone units; however, CD phone units are not efficient at capturing the long-term spectral dependencies of tone in most tone languages. Standard Yorùbá (SY) is a language composed of syllables with tones and requires a different method for acoustic modeling. In this paper, a context-dependent tone acoustic model was developed. The tone unit is assumed to be the syllable. The average magnitude difference function (AMDF) was used to derive the utterance-wide F0 contour, followed by automatic syllabification and tri-syllable forced alignment with the SPPAS (Speech Phonetization Alignment and Syllabification) tool. For classification of the CD tones, the slope and intercept of the F0 values were extracted from each segmented unit. A supervised clustering scheme was used to partition the CD tri-tones by category; these were then normalized based on some statistics to derive the acoustic feature vectors. A multi-class support vector machine (MSVM) was used for tri-tone training. The experimental results showed that the word recognition accuracy of the MSVM tri-tone system, based on dynamic programming tone embedded features, was comparable with that obtained from phone features. The best parameter tuning was obtained with 10-fold cross-validation, and the overall accuracy was 97.5678%. In terms of word error rate (WER), the MSVM CD tri-tone system outperformed the hidden Markov model tri-phone system, with a WER of 44.47%.

Keywords: Syllabification, Standard Yorùbá, Context Dependent Tone, Tri-tone Recognition
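The abstract names two feature-extraction steps without giving implementation details: an AMDF-based F0 estimate, and a slope and intercept fitted to the F0 values of each segmented unit. The following is a minimal sketch of both, not the authors' implementation. The function names `amdf_f0` and `slope_intercept`, the 8 kHz sample rate, the 30 ms frame, and the 120 to 400 Hz search range are all assumptions chosen for illustration.

```python
import math


def amdf_f0(frame, sample_rate, f0_min, f0_max):
    """Estimate F0 (Hz) of one frame as the lag that minimises the AMDF.

    Hypothetical helper: the search range [f0_min, f0_max] is an assumption.
    """
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(frame) - 1)
    best_lag, best_val = None, float("inf")
    for lag in range(lag_min, lag_max + 1):
        n = len(frame) - lag
        # Average magnitude difference between the frame and its shifted copy.
        val = sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n
        if val < best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag if best_lag else 0.0


def slope_intercept(f0_values):
    """Least-squares line through (frame index, F0); returns (slope, intercept)."""
    n = len(f0_values)
    mean_x = (n - 1) / 2
    mean_y = sum(f0_values) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(f0_values))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


# Illustrative input: a synthetic 200 Hz sine, 30 ms at 8 kHz (assumed values).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(240)]
f0 = amdf_f0(frame, sample_rate, f0_min=120.0, f0_max=400.0)

# Illustrative F0 track for one syllable (made-up numbers, in Hz).
slope, intercept = slope_intercept([100.0, 110.0, 120.0, 130.0])
print(f0, slope, intercept)
```

In a pipeline like the one the abstract describes, the (slope, intercept) pairs for a syllable and its two neighbours would presumably be combined and normalized into the tri-tone feature vector passed to the classifier; how the authors do this is not stated here. The search range is kept narrower than one octave around the test tone because the AMDF also dips at multiples of the pitch period.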

Journal of Applied Sciences and Environmental Management