Scientific report: "Sinhala Grapheme-to-Phoneme Conversion and Rules for Schwa Epenthesis"

Asanka Wasala, Ruvan Weerasinghe, and Kumudu Gamage
Language Technology Research Laboratory, University of Colombo School of Computing, 35 Reid Avenue, Colombo 07, Sri Lanka

Abstract

This paper describes an architecture for converting Sinhala Unicode text into a phonemic specification of pronunciation. The study focuses mainly on disambiguating schwa /ə/ and /a/ vowel epenthesis for consonants, one of the significant problems in Sinhala. The problem is addressed by formulating a set of rules. The proposed rules were tested on 30,000 distinct words obtained from a corpus, and the output was compared with the same words manually transcribed to phonemes by an expert. The Grapheme-to-Phoneme (G2P) conversion model achieves 98% accuracy.

1 Introduction

Text-to-Speech (TTS) conversion involves many important processes, which can be divided into three main parts: text analysis, linguistic analysis, and waveform generation (Black and Lenzo, 2003). The text analysis process is responsible for converting non-textual content into text. It also involves tokenization and normalization of the text. Identifying words or chunks of text is called text tokenization. Text normalization establishes the correct interpretation of the input text by expanding abbreviations and acronyms. This is done by replacing non-alphabetic characters, numbers, and punctuation with appropriate text strings, depending on the context.

The linguistic analysis process involves finding the correct pronunciation of words and assigning prosodic features (e.g., phrasing, intonation, stress) to the phonemic string to be spoken. The final process of a TTS system is waveform generation, which involves producing an acoustic digital signal using a particular synthesis approach such as formant synthesis, articulatory synthesis, or waveform concatenation.
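The tokenization and normalization steps of text analysis described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abbreviation table, the function names (`tokenize`, `normalize_token`, `normalize`), and the digit-by-digit reading of numbers are all assumptions made for illustration, and the sketch uses English examples and ignores the context sensitivity that a real normalizer would need.

```python
import re

# Hypothetical abbreviation table (an assumption for illustration);
# a real system would use a much larger, language-specific lexicon.
ABBREVIATIONS = {"Dr.": "Doctor", "Ave.": "Avenue", "No.": "Number"}

DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def tokenize(text):
    """Text tokenization: split the input into whitespace-delimited tokens."""
    return text.split()


def normalize_token(token):
    """Expand a single token into a list of plain words."""
    # Known abbreviations are replaced by their full form.
    if token in ABBREVIATIONS:
        return [ABBREVIATIONS[token]]
    # Otherwise drop surrounding punctuation.
    core = token.strip(".,;:!?\"'()")
    # Simplification: numbers are read digit by digit, without context.
    if re.fullmatch(r"[0-9]+", core):
        return [DIGIT_WORDS[int(d)] for d in core]
    return [core] if core else []


def normalize(text):
    """Tokenize the text and normalize every token."""
    words = []
    for token in tokenize(text):
        words.extend(normalize_token(token))
    return " ".join(words)


print(normalize("Dr. Silva lives at No. 35, Reid Ave."))
```

A production normalizer would choose the expansion of a number or symbol from its context (for example, a year versus a quantity), which this sketch deliberately leaves out.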
