TAILIEUCHUNG - Scientific report: "Acquiring a Lexicon from Unsegmented Speech"

Acquiring a Lexicon from Unsegmented Speech

Carl de Marcken
MIT Artificial Intelligence Laboratory
545 Technology Square, NE43-804
Cambridge, MA 02139, USA
cgdemarc@

Abstract

We present work-in-progress on the machine acquisition of a lexicon from sentences that are each an unsegmented phone sequence paired with a primitive representation of meaning. A simple exploratory algorithm is described, along with the direction of current work and a discussion of the relevance of the problem for child language acquisition and computer speech recognition.

1 Introduction

We are interested in how a lexicon of discrete words can be acquired from continuous speech, a problem fundamental both to child language acquisition and to the automated induction of computer speech recognition systems; see (Olivier, 1968; Wolff, 1982; Cartwright and Brent, 1994) for previous computational work in this area. For the time being, we approximate the problem as induction from phone sequences rather than acoustic pressure, and assume that learning takes place in an environment where simple semantic representations of the speech intent are available to the acquisition mechanism.

For example, we approximate the greater problem as that of learning from inputs like

Phon. Input: 3araebltsineyb3wt
Sem. Input: BOAT A IN RABBIT THE BE
(The rabbit's in a boat.)

where the semantic input is an unordered set of identifiers corresponding to word paradigms.
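As an illustration of the input format just described, one training pair could be represented as below. This is a minimal sketch and not the paper's implementation: the names `Utterance` and `make_utterance` are hypothetical, and the phone string is copied verbatim from the text above, which may not reproduce the original phonetic symbols faithfully.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Utterance:
    """One training pair (hypothetical name, for illustration only)."""
    phones: str                 # unsegmented phone sequence, no word boundaries
    semantics: FrozenSet[str]   # unordered set of word-paradigm identifiers


def make_utterance(phones: str, identifiers: str) -> Utterance:
    """Build a pair from a phone string and space-separated identifiers."""
    return Utterance(phones=phones, semantics=frozenset(identifiers.split()))


# The example pair from the text.
example = make_utterance("3araebltsineyb3wt", "BOAT A IN RABBIT THE BE")

# Because the semantic input is a set, identifier order carries no information.
reordered = make_utterance("3araebltsineyb3wt", "THE RABBIT BE IN A BOAT")
```

Storing the semantic input as a set rather than a list makes the absence of ordering explicit: the learner sees which identifiers are present, but gets no hint about where in the phone sequence each one is realized.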
Obviously the artificial pseudo-semantic representations make the problem much easier: we experiment with them as a first step, somewhere between learning language from a radio and providing an unambiguous textual transcription, as might be used for training a speech recognition system.

Our goal is to create a program that, after training on many such pairs, can segment a new phonetic utterance into a sequence of morpheme identifiers. Such output could be used as input to many grammar acquisition programs.

2 A Simple Prototype

We have implemented a simple algorithm as an
