Sinhala Grapheme-to-Phoneme Conversion and Rules for Schwa Epenthesis

Asanka Wasala, Ruvan Weerasinghe and Kumudu Gamage
Language Technology Research Laboratory
University of Colombo School of Computing
35 Reid Avenue, Colombo 07, Sri Lanka
{awasala, kgamage}@webmail.cmb.ac.lk, arw@ucsc.cmb.ac.lk

Abstract

This paper describes an architecture to convert Sinhala Unicode text into a phonemic specification of pronunciation. The study focused mainly on disambiguating schwa /ə/ and /a/ vowel epenthesis for consonants, which is one of the significant problems found in Sinhala. This problem has been addressed by formulating a set of rules. The proposed set of rules was tested using 30,000 distinct words obtained from a corpus and compared with the same words manually transcribed to phonemes by an expert. The Grapheme-to-Phoneme (G2P) conversion model achieves 98% accuracy.

1 Introduction

Text-to-Speech (TTS) conversion involves many important processes. These processes can be divided into three main parts: text analysis, linguistic analysis and waveform generation (Black and Lenzo, 2003). The text analysis process is responsible for converting non-textual content into text. This process also involves tokenization and normalization of the text. The identification of words or chunks of text is called text tokenization.
Text normalization establishes the correct interpretation of the input text by expanding abbreviations and acronyms. This is done by replacing non-alphabetic characters, numbers and punctuation with appropriate text strings, depending on the context. The linguistic analysis process involves finding the correct pronunciation of words and assigning prosodic features (e.g., phrasing, intonation, stress) to the phonemic string to be spoken. The final process of a TTS system is waveform generation, which involves the production of an acoustic digital signal using a particular synthesis approach, such as formant synthesis, articulatory synthesis or waveform concatenation.
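
To make the text analysis stage concrete, the tokenization and normalization steps described above can be sketched roughly as follows. This is only an illustrative sketch, not the system described in this paper: the abbreviation table, the digit-by-digit reading of numbers, and the function names `tokenize`, `normalize_token` and `normalize` are all assumptions introduced for the example, and English text is used purely for readability.

```python
import re

# Hypothetical lookup tables, for illustration only (not taken from the paper).
ABBREVIATIONS = {"dr.": "doctor", "st.": "street", "etc.": "et cetera"}
DIGIT_NAMES = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def tokenize(text):
    """Text tokenization: split raw text into whitespace-delimited chunks."""
    return text.split()


def normalize_token(token):
    """Expand a single token into a list of plain lowercase words."""
    lowered = token.lower()
    # Expand known abbreviations before any punctuation is stripped.
    if lowered in ABBREVIATIONS:
        return ABBREVIATIONS[lowered].split()
    # Remove punctuation attached to either end of the token.
    core = re.sub(r"^\W+|\W+$", "", lowered)
    if re.fullmatch(r"[0-9]+", core):
        # Assumption: read numbers digit by digit. A real system would
        # choose the reading (cardinal, year, ...) from the context.
        return [DIGIT_NAMES[int(digit)] for digit in core]
    return [core] if core else []


def normalize(text):
    """Tokenize the text and replace each token with speakable words."""
    words = []
    for token in tokenize(text):
        words.extend(normalize_token(token))
    return words


if __name__ == "__main__":
    print(normalize("Dr. Silva lives at 35 Reid Avenue, etc."))
```

The output of such a step is a sequence of ordinary words, which is the form the linguistic analysis stage needs before it can assign pronunciations and prosodic features.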