Scientific report: "Unsupervised Discourse Segmentation of Documents with Inherently Parallel Structure"

Minwoo Jeong and Ivan Titov
Saarland University, Saarbrücken, Germany
titov @

Abstract

Documents often have inherently parallel structure: they may consist of a text and commentaries, or an abstract and a body, or parts presenting alternative views on the same problem. Revealing relations between the parts, by jointly segmenting and predicting links between the segments, would help to visualize such documents and construct friendlier user interfaces. To address this problem, we propose an unsupervised Bayesian model for joint discourse segmentation and alignment. We apply our method to the "English as a second language" podcast dataset, where each episode is composed of two parallel parts: a story and an explanatory lecture. The predicted topical links uncover hidden relations between the stories and the lectures. In this domain, our method achieves competitive results, rivaling those of a previously proposed supervised technique.

1 Introduction

Many documents consist of parts exhibiting a high degree of parallelism, e.g., abstract and body of academic publications, summaries and detailed news stories, etc. This is especially common with the emergence of Web technologies: many texts on the web are now accompanied by comments and discussions. Segmentation of these parallel parts into coherent fragments and discovery of hidden relations between them would facilitate the development of better user interfaces and improve the performance of summarization and information retrieval systems.
Discourse segmentation of documents composed of parallel parts is a novel and challenging problem, as previous research has mostly focused on the linear segmentation of isolated texts (e.g., Hearst, 1994). The most straightforward approach would be to use a pipeline strategy, where an existing segmentation algorithm finds discourse boundaries of each part independently, and then the ...
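
To make the first stage of such a pipeline concrete, here is a minimal sketch of segmenting each part independently. It is a simplified lexical-cohesion segmenter loosely in the spirit of linear segmentation work such as Hearst (1994); it is not the Bayesian model proposed in the paper. The function names, the `window` and `threshold` parameters, and the toy sentences are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(sentences):
    """Lower-cased word counts over a list of sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(re.findall(r"[a-z']+", sentence.lower()))
    return counts


def cosine(a, b):
    """Cosine similarity between two Counter word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def segment(sentences, window=2, threshold=0.1):
    """Return indices i such that a boundary is placed before sentence i.

    A boundary is hypothesized when the word overlap between the `window`
    sentences before i and the `window` sentences starting at i is low.
    Both parameters are illustrative defaults, not values from the paper.
    """
    boundaries = []
    for i in range(1, len(sentences)):
        left = bag_of_words(sentences[max(0, i - window):i])
        right = bag_of_words(sentences[i:i + window])
        if cosine(left, right) < threshold:
            boundaries.append(i)
    return boundaries


# Toy "part" of a document (hypothetical data): two topics, two sentences each.
part = [
    "The cat sat on the mat.",
    "The cat chased the mouse.",
    "Stock prices fell sharply today.",
    "Investors sold stock as prices fell.",
]

# In a pipeline, each parallel part would be segmented on its own like this.
part_boundaries = segment(part)
```

Because each part is segmented in isolation, a segmenter like this cannot use evidence from the other part, which is the limitation that motivates segmenting and linking jointly.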




