Scientific report: "Automatic Segmentation of Multiparty Dialogue"

Pei-Yun Hsueh, Johanna D. Moore, Steve Renals
School of Informatics, University of Edinburgh, Edinburgh EH8 9LW, GB

Abstract

In this paper, we investigate the problem of automatically predicting segment boundaries in spoken multiparty dialogue. We extend prior work in two ways. We first apply approaches that have been proposed for predicting top-level topic shifts to the problem of identifying subtopic boundaries. We then explore the impact on performance of using ASR output as opposed to human transcription. Examination of the effect of features shows that predicting top-level and predicting subtopic boundaries are two distinct tasks: (1) for predicting subtopic boundaries, the lexical cohesion-based approach alone can achieve competitive results; (2) for predicting top-level boundaries, the machine learning approach that combines lexical-cohesion and conversational features performs best; and (3) conversational cues, such as cue phrases and overlapping speech, are better indicators for the top-level prediction task. We also find that the transcription errors inevitable in ASR output have a negative impact on models that combine lexical-cohesion and conversational features, but do not change the general preference of approach for the two tasks.

1 Introduction

Text segmentation, i.e., determining the points at which the topic changes in a stream of text, plays an important role in applications such as topic detection and tracking, summarization, automatic genre detection, and information retrieval and extraction (Pevzner and Hearst, 2002). In recent work, researchers have applied these techniques to corpora such as newswire feeds, transcripts of radio broadcasts, and spoken dialogues, in order to facilitate browsing, information retrieval, and topic detection Allan et al.
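
To make the idea of a lexical cohesion-based segmenter concrete, the following is a minimal sketch in the spirit of TextTiling-style block comparison. It is not the authors' system: the stopword list, the function names, and the `window` and `threshold` parameters are all illustrative assumptions. Each gap between adjacent utterances is scored by the cosine similarity of the word counts on either side, and a boundary is hypothesized where that score is a low local minimum.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "for", "is", "we", "should", "be", "to", "of", "and", "no", "now"}


def tokenize(utterance):
    """Lowercase, split into word tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z']+", utterance.lower()) if t not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two Counter term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cohesion_scores(utterances, window=2):
    """Score the gap before each utterance i (i >= 1) by comparing the
    `window` utterances to its left with the `window` utterances from i on."""
    bags = [Counter(tokenize(u)) for u in utterances]
    scores = []
    for gap in range(1, len(bags)):
        left = sum(bags[max(0, gap - window):gap], Counter())
        right = sum(bags[gap:gap + window], Counter())
        scores.append(cosine(left, right))
    return scores


def predict_boundaries(utterances, window=2, threshold=0.1):
    """Return indices i where a boundary is hypothesized just before utterance i.
    A gap qualifies if its cohesion score is a local minimum below `threshold`."""
    scores = cohesion_scores(utterances, window)
    boundaries = []
    for k, s in enumerate(scores):
        prev_s = scores[k - 1] if k > 0 else 1.0
        next_s = scores[k + 1] if k + 1 < len(scores) else 1.0
        if s < threshold and s <= prev_s and s <= next_s:
            boundaries.append(k + 1)
    return boundaries


if __name__ == "__main__":
    # Made-up toy dialogue: three utterances on one topic, then three on another.
    dialogue = [
        "the budget for the remote control is too high",
        "we should cut the budget for the remote",
        "the remote budget leaves no margin",
        "now the battery design needs a new case",
        "the battery case design should be yellow",
        "a yellow case fits the battery design",
    ]
    print(predict_boundaries(dialogue))  # prints [3]
```

A sketch like this uses lexical evidence only. The conversational features named in the abstract, such as cue phrases and overlapping speech, would have to be supplied as additional inputs to a trained classifier, which is outside the scope of this example.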
