Scientific report: "Discriminative Pronunciation Modeling: A Large-Margin, Feature-Rich Approach"

Hao Tang, Joseph Keshet, and Karen Livescu
Toyota Technological Institute at Chicago, Chicago, IL, USA
{haotang, jkeshet, klivescu}@

Abstract

We address the problem of learning the mapping between words and their possible pronunciations in terms of sub-word units. Most previous approaches have involved generative modeling of the distribution of pronunciations, usually trained to maximize likelihood. We propose a discriminative, feature-rich approach using large-margin learning. This approach allows us to optimize an objective closely related to a discriminative task, to incorporate a large number of complex features, and still do inference efficiently. We test the approach on the task of lexical access; that is, the prediction of a word given a phonetic transcription. In experiments on a subset of the Switchboard conversational speech corpus, our models thus far improve classification error rates from a previously published result to about 15%. We find that large-margin approaches outperform conditional random field learning, and that the Passive-Aggressive algorithm for large-margin learning is faster to converge than the Pegasos algorithm.

1 Introduction

One of the problems faced by automatic speech recognition, especially of conversational speech, is that of modeling the mapping between words and their possible pronunciations in terms of sub-word units such as phones. While pronouncing dictionaries provide each word's canonical pronunciation(s) in terms of phoneme strings, running speech often includes pronunciations that differ greatly from the dictionary. For example, some pronunciations of "probably" in the Switchboard conversational speech database are [p r aa b iy], [p r aa l iy], [p r ay], and [p ow ih] (Greenberg et al., 1996). While some words (e.g., common words) are more prone to such variation than others, the effect is extremely general: In the phonetically transcribed portion of Switchboard, fewer
