Scientific report: "Joint Processing and Discriminative Training for Letter-to-Phoneme Conversion"

Sittichai Jiampojamarn, Colin Cherry, Grzegorz Kondrak
†Department of Computing Science, University of Alberta, Edmonton, AB, T6G 2E8, Canada
sj kondrak @
‡Microsoft Research, One Microsoft Way, Redmond, WA, 98052
colinc@

Abstract

We present a discriminative structure-prediction model for the letter-to-phoneme task, a crucial step in text-to-speech processing. Our method encompasses three tasks that have been previously handled separately: input segmentation, phoneme prediction, and sequence modeling. The key idea is online discriminative training, which updates parameters according to a comparison of the current system output to the desired output, allowing us to train all of our components together. By folding the three steps of a pipeline approach into a unified dynamic programming framework, we are able to achieve substantial performance gains. Our results surpass the current state-of-the-art on six publicly available data sets representing four different languages.

1 Introduction

Letter-to-phoneme (L2P) conversion is the task of predicting the pronunciation of a word, represented as a sequence of phonemes, from its orthographic form, represented as a sequence of letters. The L2P task plays a crucial role in speech synthesis systems (Schroeter et al., 2002), and is an important part of other applications, including spelling correction (Toutanova and Moore, 2001) and speech-to-speech machine translation (Engelbrecht and Schultz, 2005).
Converting a word into its phoneme representation is not a trivial task. Dictionary-based approaches cannot achieve this goal reliably, due to unseen words and proper names. Furthermore, the construction of even a modestly sized pronunciation dictionary requires substantial human effort for each new language. Effective rule-based approaches can be designed for some languages, such as Spanish. However, Kominek and Black (2006) show that in languages with a less ...
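To make the abstract's description of online discriminative training more concrete, the following is a minimal sketch of an update loop that compares the current system output with the desired output and adjusts the parameters only when they differ. It rests on several simplifying assumptions that are not taken from the paper: a structured-perceptron update stands in for whatever update rule the authors use, each letter is assumed to map to exactly one phoneme (so there is no input segmentation step), and the names `features`, `decode`, `train`, and `TOY_DATA` are hypothetical, as is the toy data itself.

```python
from collections import defaultdict

def features(letters, phonemes):
    """Count emission (letter, phoneme) and transition (prev, phoneme) features."""
    feats = defaultdict(int)
    prev = "<s>"  # hypothetical start-of-word symbol
    for letter, ph in zip(letters, phonemes):
        feats[("emit", letter, ph)] += 1
        feats[("trans", prev, ph)] += 1
        prev = ph
    return feats

def decode(letters, weights, phoneme_set):
    """Viterbi search for the highest-scoring phoneme sequence (one phoneme per letter)."""
    # best[ph] = (score, path) of the best partial sequence ending in ph
    best = {"<s>": (0.0, [])}
    for letter in letters:
        nxt = {}
        for ph in phoneme_set:
            emit = weights[("emit", letter, ph)]
            score, path = max(
                ((s + emit + weights[("trans", prev, ph)], p)
                 for prev, (s, p) in best.items()),
                key=lambda t: t[0],
            )
            nxt[ph] = (score, path + [ph])
        best = nxt
    return max(best.values(), key=lambda t: t[0])[1]

def train(data, epochs=10):
    """Online training: decode each word, then update only when the output is wrong."""
    phoneme_set = sorted({ph for _, gold in data for ph in gold})
    weights = defaultdict(float)
    for _ in range(epochs):
        for letters, gold in data:
            guess = decode(letters, weights, phoneme_set)
            if guess != gold:
                # Perceptron-style update (an assumption, not the paper's rule):
                # reward features of the desired output, penalize those of the guess.
                for f, c in features(letters, gold).items():
                    weights[f] += c
                for f, c in features(letters, guess).items():
                    weights[f] -= c
    return weights, phoneme_set

# Invented toy data with a one-to-one letter/phoneme alignment.
TOY_DATA = [("cat", ["k", "a", "t"]), ("tac", ["t", "a", "k"])]
weights, phoneme_set = train(TOY_DATA)
print(decode("cat", weights, phoneme_set))
```

Because the emission and transition weights are updated from the same decoded output, both components are trained together rather than in separate pipeline stages, which is the property the abstract emphasizes. A fuller system would also search over input segmentations inside the decoder.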
