Scientific report: "Question Detection in Spoken Conversations Using Textual Conversations"

Anna Margolis and Mari Ostendorf
Department of Electrical Engineering, University of Washington, Seattle, WA, USA
amargoli mo @

Abstract

We investigate the use of textual Internet conversations for detecting questions in spoken conversations. We compare the text-trained model with models trained on manually-labeled, domain-matched spoken utterances with and without prosodic features. Overall, the text-trained model achieves over 90% of the performance (measured in Area Under the Curve) of the domain-matched model including prosodic features, but does especially poorly on declarative questions. We describe efforts to utilize unlabeled spoken utterances and prosodic features via domain adaptation.

1 Introduction

Automatic speech recognition systems, which transcribe words, are often augmented by subsequent processing for inserting punctuation or labeling speech acts. Both prosodic features (extracted from the acoustic signal) and lexical features (extracted from the word sequence) have been shown to be useful for these tasks (Shriberg et al., 1998; Kim and Woodland, 2003; Ang et al., 2005). However, access to labeled speech training data is generally required in order to use prosodic features. On the other hand, the Internet contains large quantities of textual data that is already labeled with punctuation and which can be used to train a system using lexical features.
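To make the idea of "text already labeled with punctuation" concrete, the following is a minimal sketch of how a question label and lexical features might be derived from a punctuated text sentence. The function names, the tokenization rule, and the choice of unigram and bigram counts are assumptions made for illustration; this excerpt does not specify the authors' actual feature set or preprocessing.

```python
import re
from collections import Counter


def label_and_strip(sentence):
    """Return (tokens, is_question) for one punctuated text sentence.

    Assumption: a sentence counts as a question if it ends with '?'.
    Punctuation is then removed so that the tokens resemble an
    unpunctuated speech transcript.
    """
    text = sentence.strip()
    is_question = text.endswith("?")
    # Lowercase and keep only letters, digits and apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return tokens, is_question


def ngram_features(tokens, max_n=2):
    """Count n-grams up to length max_n, with sentence boundary markers."""
    padded = ["<s>"] + list(tokens) + ["</s>"]
    feats = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(padded) - n + 1):
            feats[" ".join(padded[i:i + n])] += 1
    return feats


if __name__ == "__main__":
    toks, is_q = label_and_strip("Do you agree with this edit?")
    print(is_q, toks)
    print(ngram_features(toks).most_common(3))
```

The boundary marker `<s>` lets a bigram such as `<s> do` capture sentence-initial words, which is one plausible way for a purely lexical model to pick up on interrogative word order once the question mark itself has been removed.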
In this work, we focus on question detection in the Meeting Recorder Dialog Act corpus (MRDA) (Shriberg et al., 2004), using text sentences with question marks in Wikipedia talk pages. We compare the performance of a question detector trained on the text domain using lexical features with one trained on MRDA using lexical features and/or prosodic features. In addition, we experiment with two unsupervised domain adaptation methods to incorporate unlabeled MRDA utterances into the text-based question detector. The goal is to use the unlabeled ...
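The comparison described above is reported in terms of Area Under the Curve (AUC). As a rough sketch of what that measurement involves, the code below computes ROC AUC as the probability that a randomly chosen question receives a higher score than a randomly chosen non-question, and expresses one model's AUC as a fraction of another's. The function names and the toy scores are hypothetical and are not results from the paper.

```python
def roc_auc(scores, labels):
    """ROC AUC via pairwise comparison of positive and negative scores.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def relative_auc(auc_model, auc_reference):
    """Express one model's AUC as a fraction of a reference model's AUC."""
    return auc_model / auc_reference


if __name__ == "__main__":
    # Toy data only: 1 = question, 0 = non-question.
    labels = [1, 0, 1, 0]
    text_trained_scores = [0.9, 0.8, 0.3, 0.1]
    domain_matched_scores = [0.9, 0.2, 0.7, 0.1]
    auc_text = roc_auc(text_trained_scores, labels)
    auc_matched = roc_auc(domain_matched_scores, labels)
    print(auc_text, auc_matched, relative_auc(auc_text, auc_matched))
```

The pairwise form is quadratic in the number of examples, so it is only suitable for small illustrations; a rank-based computation would be the usual choice for a corpus-sized evaluation.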
