Scientific report: "Data-Oriented Methods for Grapheme-to-Phoneme Conversion"


Antal van den Bosch and Walter Daelemans
ITK (Institute for Language Technology and AI), Tilburg University
P.O. Box 90153, NL-5000 LE Tilburg
Tel: +31 13 663070
Email: antalb@kub.nl, walter@kub.nl

Abstract

It is traditionally assumed that various sources of linguistic knowledge and their interaction should be formalised in order to be able to convert words into their phonemic representations with reasonable accuracy. We show that using supervised learning techniques, based on a corpus of transcribed words, the same and even better performance can be achieved, without explicit modeling of linguistic knowledge. In this paper we present two instances of this approach. A first model implements a variant of instance-based learning, in which a weighted similarity metric and a database of prototypical exemplars are used to predict new mappings. In the second model, grapheme-to-phoneme mappings are looked up in a compressed text-to-speech lexicon (table lookup) enriched with default mappings. We compare performance and accuracy of these approaches to a connectionist (backpropagation) approach and to the linguistic knowledge-based approach.

1 Introduction

Grapheme-to-phoneme conversion is a central task in any text-to-speech (reading aloud) system. Given an alphabet of spelling symbols (graphemes) and an alphabet of phonetic symbols, a mapping should be achieved transliterating strings of graphemes into strings of phonetic symbols.
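
To make the idea of table lookup enriched with default mappings concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the model described in the paper: the one-character context window, the padding symbol, the function names, and the tiny training set with its phoneme symbols are all invented for this example.

```python
from collections import Counter, defaultdict

PAD = "_"  # word-boundary marker (an assumption of this sketch)

# Toy aligned data, invented for illustration: one phoneme symbol per
# grapheme, with "-" standing for a grapheme that is not realised.
TRAINING = {
    "cat": ["k", "a", "t"],
    "cot": ["k", "o", "t"],
    "city": ["s", "i", "t", "i"],
    "ice": ["ai", "s", "-"],
}


def windows(word):
    """Yield (left, focus, right) for each grapheme, padded at the edges."""
    padded = PAD + word + PAD
    for i in range(1, len(padded) - 1):
        yield padded[i - 1], padded[i], padded[i + 1]


def build_tables(training):
    """Count phonemes per context window and per bare grapheme."""
    context_table = defaultdict(Counter)  # (left, focus, right) -> counts
    default_table = defaultdict(Counter)  # focus -> counts
    for word, phonemes in training.items():
        for window, phoneme in zip(windows(word), phonemes):
            context_table[window][phoneme] += 1
            default_table[window[1]][phoneme] += 1
    return context_table, default_table


def convert(word, context_table, default_table):
    """Look up each grapheme in context; fall back to its default mapping."""
    result = []
    for window in windows(word):
        focus = window[1]
        if window in context_table:
            counts = context_table[window]
        elif focus in default_table:
            counts = default_table[focus]
        else:
            result.append("?")  # grapheme never seen in training
            continue
        result.append(counts.most_common(1)[0][0])
    return result


CONTEXT_TABLE, DEFAULT_TABLE = build_tables(TRAINING)
print(convert("cit", CONTEXT_TABLE, DEFAULT_TABLE))  # ['s', 'i', 't']
```

In the unseen string "cit", the first two graphemes are found with their exact context from "city", while the final "t" has no matching context entry and falls back to the most frequent mapping recorded for "t".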
It is well known that this mapping is difficult because, in general, not all graphemes are realised in the phonetic transcription, and the same grapheme may correspond to different phonetic symbols depending on context. It is traditionally assumed that various sources of linguistic knowledge and their interaction should be formalised in order to be able to convert words into their phonemic representations with reasonable accuracy. Although different researchers propose different knowledge structures, consensus seems
