Scientific report: "The utility of parse-derived features for automatic discourse segmentation"

Seeger Fisher and Brian Roark
Center for Spoken Language Understanding, OGI School of Science & Engineering
Oregon Health & Science University, Beaverton, Oregon 97006 USA
{fishers,roark}@cslu.ogi.edu

Abstract

We investigate different feature sets for performing automatic sentence-level discourse segmentation within a general machine learning approach, including features derived from either finite-state or context-free annotations. We achieve the best reported performance on this task, and demonstrate that our SPADE-inspired context-free features are critical to achieving this level of accuracy. This counters recent results suggesting that purely finite-state approaches can perform competitively.

1 Introduction

Discourse structure annotations have been demonstrated to be of high utility for a number of NLP applications, including automatic text summarization (Marcu, 1998; Marcu, 1999; Cristea et al., 2005), sentence compression (Sporleder and Lapata, 2005), natural language generation (Prasad et al., 2005), and question answering (Verberne et al., 2006). These annotations include sentence segmentation into discourse units, along with the linking of discourse units, both within and across sentence boundaries, into a labeled hierarchical structure.
For example, the tree in Figure 1 shows a sentence-level discourse tree for the string "Prices have dropped but remain quite high according to CEO Smith", which has three discourse segments, each labeled with either Nucleus or Satellite, depending on how central the segment is to the coherence of the text. There are a number of corpora annotated with discourse structure, including the well-known RST Treebank (Carlson et al., 2002), the Discourse GraphBank (Wolf and Gibson, 2005), and the Penn Discourse Treebank (Miltsakaki et al., 2004). While the annotation approaches differ across these corpora, the requirement of sentence segmentation into ...

[Figure 1: Example Nucleus/Satellite labeled ...]
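
To make the notion of sentence-level segmentation concrete, the following minimal Python sketch stores the example sentence as three token spans and recovers the boundary positions between them. It is an illustration only, not the method of the paper: the split points are an assumption based on reading the sentence (the figure itself is not reproduced here), the Nucleus/Satellite labels are left out because they cannot be recovered from the text, and the helper names `boundary_indices` and `split_at` are hypothetical.

```python
def boundary_indices(segments):
    """Return the token positions at which a new segment begins (0 excluded)."""
    indices = []
    position = 0
    for segment in segments[:-1]:  # no boundary after the final segment
        position += len(segment)
        indices.append(position)
    return indices


def split_at(tokens, indices):
    """Cut a token list into segments at the given boundary positions."""
    starts = [0] + list(indices)
    ends = list(indices) + [len(tokens)]
    return [tokens[start:end] for start, end in zip(starts, ends)]


tokens = "Prices have dropped but remain quite high according to CEO Smith".split()

# Assumed segmentation into three discourse segments (not taken from Figure 1).
segments = [
    ["Prices", "have", "dropped"],
    ["but", "remain", "quite", "high"],
    ["according", "to", "CEO", "Smith"],
]

boundaries = boundary_indices(segments)
print(boundaries)  # [3, 7]
print(split_at(tokens, boundaries) == segments)  # True
```

Under this representation, a segmenter only has to decide, for each position between two words, whether a segment boundary falls there; the segments follow from those decisions.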
