Scientific report: "A Probabilistic Context-free Grammar for Disambiguation in Morphological Parsing"

Josee S. Heemskerk
Institute of Language Technology and Artificial Intelligence, Tilburg University
P.O. Box 90153, 5000 LE Tilburg, The Netherlands
E-mail: joseeh@

Abstract

One of the major problems one is faced with when decomposing words into their constituent parts is ambiguity: the generation of multiple analyses for one input word, many of which are implausible. In order to deal with ambiguity, the MORphological PArser MORPA is provided with a probabilistic context-free grammar (PCFG), i.e. it combines a "conventional" context-free morphological grammar to filter out ungrammatical segmentations with a probability-based scoring function which determines the likelihood of each successful parse. Consequently, remaining analyses can be ordered along a scale of plausibility. Test performance data will show that a PCFG yields good results in morphological parsing. MORPA is a fully implemented parser developed for use in a text-to-speech conversion system.

1 Introduction

MORPA is a MORphological PArser developed for use in the text-to-speech conversion system for Dutch, SPRAAKMAKER (van Leeuwen and te Lindert, 1993). An important step in text-to-speech conversion is the generation of the correct phonemic representation on the basis of the input text.
As is well known, phonemic transcriptions cannot be derived directly from orthographic input in Dutch, as there is no one-to-one correspondence between graphemes and phonemes. Also, stress and the effects of most phonological rules are not reflected in orthography. A text-to-speech system therefore requires an intelligent method to convert the spelled words of the input sentence into a phonemic representation. As far as the pronunciation of .

This work was carried out at the Phonetics Laboratory at Leiden University and supported by the Speech Technology Foundation, which is funded by the Netherlands Stimulation Project for Information Sciences (SPIN).
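
The two-stage idea described in the abstract (a grammar that filters out ungrammatical segmentations, followed by a probability-based score that orders the remaining parses) can be illustrated with a minimal sketch. It assumes the textbook PCFG definition, in which the probability of a parse is the product of the probabilities of the rules used in it; the excerpt does not state how MORPA actually computes its scores. The rule set, the probabilities, the candidate analyses, and the names `RULE_PROB`, `parse_probability`, and `rank_analyses` are all hypothetical and chosen only for illustration.

```python
from math import prod

# Hypothetical morphological rules with probabilities P(rule | left-hand side).
# Probabilities for each left-hand side sum to 1. Illustrative values only.
RULE_PROB = {
    ("N", ("N", "N")): 0.3,        # noun compound
    ("N", ("V", "nsuffix")): 0.2,  # deverbal noun
    ("N", ("stem_n",)): 0.5,       # simplex noun stem
    ("V", ("stem_v",)): 1.0,       # simplex verb stem
}

# Hypothetical candidate analyses of one input word, each given as the list
# of rules used in its parse tree.
CANDIDATES = {
    "compound": [("N", ("N", "N")), ("N", ("stem_n",)), ("N", ("stem_n",))],
    "derivation": [("N", ("V", "nsuffix")), ("V", ("stem_v",))],
    "ungrammatical": [("V", ("N", "nsuffix"))],  # rule not in the grammar
}


def parse_probability(rules):
    """Probability of a parse under the standard PCFG assumption:
    the product of the probabilities of all rules it uses."""
    return prod(RULE_PROB[rule] for rule in rules)


def rank_analyses(candidates):
    """Drop analyses that use a rule outside the grammar, then sort the
    remaining ones from most to least probable."""
    grammatical = {
        name: rules
        for name, rules in candidates.items()
        if all(rule in RULE_PROB for rule in rules)
    }
    scored = [(name, parse_probability(rules)) for name, rules in grammatical.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, probability in rank_analyses(CANDIDATES):
        print(f"{name}: {probability:.3f}")
```

In this toy setting the analysis that uses an unknown rule is discarded by the grammar, and the two remaining analyses are ordered by their scores. A fuller model would typically also include probabilities for the individual morphemes, which this sketch leaves out.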
