TAILIEUCHUNG - Scientific report: "You talking to me? A Corpus and Algorithm for Conversation Disentanglement"

Micha Elsner and Eugene Charniak
Brown Laboratory for Linguistic Information Processing (BLLIP)
Brown University, Providence, RI 02912
{melsner,ec}@[...]

Abstract

When multiple conversations occur simultaneously, a listener must decide which conversation each utterance is part of in order to interpret and respond to it appropriately. We refer to this task as disentanglement. We present a corpus of Internet Relay Chat (IRC) dialogue in which the various conversations have been manually disentangled, and evaluate annotator reliability. This is, to our knowledge, the first such corpus for internet chat. We propose a graph-theoretic model for disentanglement, using discourse-based features which have not been previously applied to this task. The model's predicted disentanglements are highly correlated with manual annotations.

1 Motivation

Simultaneous conversations seem to arise naturally in both informal social interactions and multi-party typed chat. Aoki et al. (2006)'s study of voice conversations among 8-10 people found an average of [...] conversations (floors) active at a time, and a maximum of four. In our chat corpus, the average is even higher, at [...]. The typical conversation, therefore, is one which is interrupted frequently.

Disentanglement is the clustering task of dividing a transcript into a set of distinct conversations.
It is an essential prerequisite for any kind of higher-level dialogue analysis: for instance, consider the multi-party exchange in figure 1. Contextually, it is clear that this corresponds to two conversations, and Felicia's [1] response excel[...]

[1] Real user nicknames are replaced with randomly selected [...]

Figure 1:
Chanel: Felicia: google works
Gale: Arlie: you guys have never worked in a factory before have you
Gale: Arlie: there's some real unethical stuff that goes on
Regine: hands Chanel a trophy
Arlie: Gale, of course. thats how they make money
Gale: and people lose limbs or get killed
Felicia: [...]
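
The clustering view of disentanglement can be made concrete with a small sketch. The Python below is not the authors' graph-theoretic model and does not use their discourse-based features. It assumes a hypothetical rule, `linked`, that connects two utterances when they share a speaker or when one names the other's speaker, and then reads conversations off as the connected components of the resulting graph. The utterances are those of the figure 1 exchange; the names `UTTERANCES`, `linked`, and `disentangle` are illustrative only.

```python
import re

# Utterances from the figure 1 exchange as (speaker, text) pairs.
UTTERANCES = [
    ("Chanel", "Felicia: google works"),
    ("Gale", "Arlie: you guys have never worked in a factory before have you"),
    ("Gale", "Arlie: there's some real unethical stuff that goes on"),
    ("Regine", "hands Chanel a trophy"),
    ("Arlie", "Gale, of course. thats how they make money"),
    ("Gale", "and people lose limbs or get killed"),
    ("Felicia", ""),  # the text of this utterance is truncated in the source
]


def linked(a, b):
    """Hypothetical pairwise rule (an assumption, not the paper's classifier):
    same speaker, or either utterance names the other's speaker."""
    (speaker_a, text_a), (speaker_b, text_b) = a, b
    words_a = set(re.findall(r"[A-Za-z]+", text_a))
    words_b = set(re.findall(r"[A-Za-z]+", text_b))
    return speaker_a == speaker_b or speaker_a in words_b or speaker_b in words_a


def disentangle(utterances):
    """Cluster utterance indices into conversations via connected components."""
    parent = list(range(len(utterances)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of utterances that the rule links.
    for j in range(len(utterances)):
        for i in range(j):
            if linked(utterances[i], utterances[j]):
                parent[find(j)] = find(i)

    # Collect utterance indices by the root of their component.
    clusters = {}
    for idx in range(len(utterances)):
        clusters.setdefault(find(idx), []).append(idx)
    return sorted(clusters.values())


if __name__ == "__main__":
    for conversation in disentangle(UTTERANCES):
        print([UTTERANCES[i][0] for i in conversation])
```

On this toy input the rule yields two clusters, one around Chanel, Regine, and Felicia and one between Gale and Arlie, which agrees with the two conversations the text describes. A name-matching rule like this is far too brittle for real chat, where most utterances name nobody.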
