Scientific report: "A TAG-based noisy channel model of speech repairs"

Mark Johnson, Brown University, Providence, RI 02912, mj@
Eugene Charniak, Brown University, Providence, RI 02912, ec@

Abstract

This paper describes a noisy channel model of speech repairs, which can identify and correct repairs in speech transcripts. A syntactic parser is used as the source model, and a novel type of TAG-based transducer is the channel model. The use of TAG is motivated by the intuition that the reparandum is a “rough copy” of the repair. The model is trained and tested on the Switchboard disfluency-annotated corpus.

1 Introduction

Most spontaneous speech contains disfluencies such as partial words, filled pauses (e.g., “uh”, “um”, “huh”), explicit editing terms (e.g., “I mean”), parenthetical asides and repairs. Of these, repairs pose particularly difficult problems for parsing and related NLP tasks. This paper presents an explicit generative model of speech repairs and shows how it can eliminate this kind of disfluency.

While speech repairs have been studied by psycholinguists for some time, as far as we know this is the first time a probabilistic model of speech repairs based on a model of syntactic structure has been described in the literature. Probabilistic models have the advantage over other kinds of models that they can, in principle, be integrated with other probabilistic models to produce a combined model that uses all available evidence to select the globally optimal analysis.
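The noisy channel idea named in the abstract can be illustrated with a small sketch. It picks the fluent source string X that maximizes log P(X) + log P(Y | X) for an observed transcript Y. Everything in it is an assumption made for illustration: the toy bigram model stands in for the syntactic parser used as the source model, the hand-set channel scores stand in for the TAG-based transducer, and the example sentence, the names `EDIT_TERMS`, `FLUENT_CORPUS`, `train_bigrams`, `source_logprob`, `channel_logprob` and `decode`, and all numeric constants are hypothetical. The sketch only considers deleting a single contiguous span.

```python
import math

# Toy editing-term vocabulary (assumption, not taken from the paper).
EDIT_TERMS = {"uh", "um", "I", "mean"}

# Tiny hypothetical corpus of fluent sentences for the toy source model.
FLUENT_CORPUS = [
    "a flight to Denver on Friday",
    "a flight to Boston on Monday",
]


def train_bigrams(sentences):
    """Collect bigram and context counts plus the vocabulary size."""
    counts, context, vocab = {}, {}, set()
    for sentence in sentences:
        toks = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(toks)
        for a, b in zip(toks, toks[1:]):
            counts[(a, b)] = counts.get((a, b), 0) + 1
            context[a] = context.get(a, 0) + 1
    return counts, context, len(vocab)


def source_logprob(words, model):
    """Add-one smoothed bigram log-probability of a candidate source string."""
    counts, context, v = model
    toks = ["<s>"] + list(words) + ["</s>"]
    lp = 0.0
    for a, b in zip(toks, toks[1:]):
        # v + 1 leaves one extra slot for unseen words.
        lp += math.log((counts.get((a, b), 0) + 1) / (context.get(a, 0) + v + 1))
    return lp


def channel_logprob(observed, start, end):
    """Toy score for the channel inserting observed[start:end] into the source.

    Words that are editing terms, or that reappear after the deleted span
    (a crude stand-in for the "rough copy" intuition), are cheap to insert.
    """
    if start == end:
        return math.log(0.9)  # no repair inserted
    lp = math.log(0.1)  # a repair was inserted
    following = set(observed[end:])
    for w in observed[start:end]:
        lp += math.log(0.5 if (w in EDIT_TERMS or w in following) else 0.05)
    return lp


def decode(observed, model):
    """Return the source string with the best combined score."""
    best_score, best_source = -math.inf, list(observed)
    n = len(observed)
    for start in range(n + 1):
        for end in range(start, n + 1):
            if start == end and start > 0:
                continue  # the empty deletion only needs to be scored once
            source = observed[:start] + observed[end:]
            score = source_logprob(source, model) + channel_logprob(observed, start, end)
            if score > best_score:
                best_score, best_source = score, source
    return best_source


MODEL = train_bigrams(FLUENT_CORPUS)
transcript = "a flight to Boston uh I mean to Denver on Friday".split()
print(" ".join(decode(transcript, MODEL)))
```

Because both components are log-probabilities, they combine by simple addition, which is the sense in which probabilistic models can be integrated into one combined model. A more realistic source or channel model could replace either function without changing `decode`.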
Shriberg and Stolcke (1998) studied the location and distribution of repairs in the Switchboard corpus, but did not propose an actual model of repairs. Heeman and Allen (1999) describe a noisy channel model of speech repairs, but leave “extending the model to incorporate higher level syntactic . . . processing” to future work. The previous work most closely related to the current work is Charniak and Johnson (2001), who used a boosted decision stub classifier to classify words as edited or not on a word-by-word basis, but do not ...
