Scientific report: "Constraint-based Sentence Compression: An Integer Programming Approach"



James Clarke and Mirella Lapata
School of Informatics, University of Edinburgh
2 Buccleuch Place, Edinburgh EH8 9LW, UK
jclarke@ed.ac.uk, mlap@inf.ed.ac.uk

Abstract

The ability to compress sentences while preserving their grammaticality and most of their meaning has recently received much attention. Our work views sentence compression as an optimisation problem. We develop an integer programming formulation and infer globally optimal compressions in the face of linguistically motivated constraints. We show that such a formulation allows for relatively simple and knowledge-lean compression models that do not require parallel corpora or large-scale resources. The proposed approach yields results comparable, and in some cases superior, to the state of the art.

1 Introduction

A mechanism for automatically compressing sentences while preserving their grammaticality and most important information would greatly benefit a wide range of applications. Examples include text summarisation (Jing 2000), subtitle generation from spoken transcripts (Vandeghinste and Pan 2004) and information retrieval (Olivers and Dolan 1999). Sentence compression is a complex paraphrasing task with information loss, involving substitution, deletion, insertion and reordering operations.
Recent years have witnessed increased interest in a simpler instantiation of the compression problem, namely word deletion (Knight and Marcu 2002; Riezler et al. 2003; Turner and Charniak 2005). More formally, given an input sentence of words $W = w_1, w_2, \ldots, w_n$, a compression is formed by removing any subset of these words. Sentence compression has received both generative and discriminative formulations in the literature. Generative approaches (Knight and Marcu 2002; Turner and Charniak 2005) are instantiations of the noisy-channel model: given a long sentence $l$, the aim is to find the corresponding short sentence $s$ which maximises the conditional probability $P(s \mid l)$. In a ...
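The word-deletion view of compression can be made concrete with a small sketch. Each candidate compression corresponds to a binary vector over the input words (1 = keep, 0 = delete), so a sentence of n words has 2^n candidates. The code below simply enumerates them and picks the best one under a scoring function. The names `compressions`, `best_compression` and `toy_score` are illustrative assumptions, and `toy_score` is a made-up scorer, not the model from the paper. Exhaustive enumeration is exponential and is shown only to illustrate the search space that an integer programming formulation would explore with a solver instead.

```python
from itertools import product


def compressions(words):
    """Yield every compression of `words` obtained by deleting a subset of them.

    Each candidate is described by a binary vector `keep`, where keep[i] == 1
    means word i is retained. Word order is never changed.
    """
    for keep in product((0, 1), repeat=len(words)):
        yield [w for w, k in zip(words, keep) if k]


def best_compression(words, score):
    """Exhaustively return the candidate with the highest `score` value."""
    return max(compressions(words), key=score)


def toy_score(candidate):
    # Hypothetical scorer, for illustration only: reward a small set of
    # "important" words and charge a fixed cost per retained word.
    important = {"cat", "sat", "mat"}
    reward = sum(2 for w in candidate if w in important)
    return reward - 0.5 * len(candidate)


sentence = "the cat sat on the mat".split()
candidates = list(compressions(sentence))
print(len(candidates))  # one candidate per binary vector of length 6
print(best_compression(sentence, toy_score))
```

Under this toy scorer, every retained important word adds 1.5 and every other retained word subtracts 0.5, so the search keeps only the important words. A realistic system would replace `toy_score` with a language-model or learned score and add linguistic constraints on which binary vectors are admissible.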
