The first Malay language storytelling text-to-speech (TTS) corpus for humanoid robot storytellers


I. Ramli
N. Jamil
N. Seman
N. Ardi

Abstract

This paper describes the process undertaken and the criteria considered in acquiring a Malay
language storytelling speech corpus for the development of a humanoid storyteller. The
speech corpus contains 464 speech sentences, 4,656 words and 9,584 syllables. Three
children’s short stories were recorded by 3 female storytellers, 1 male professional speaker, 2
female speakers and 2 male speakers. The equipment specifications, recording procedures and
speech annotations are described in detail, in accordance with baseline work. The stories were
recorded in two speaking styles: neutral and storytelling. The first
Malay language storytelling corpus is not only necessary for the development of storytelling
text-to-speech (TTS) synthesis; it is also valuable for natural language processing and
speech recognition of Malay, an under-resourced language.

Keywords: storytelling speech corpus; humanoid storyteller; storytelling TTS; Malay
language.


Journal Identifiers


eISSN:
print ISSN: 1112-9867