Scientific report: "A Grammatical Approach to Understanding Textual Tables using Two-Dimensional SCFGs"

Dekai Wu¹ and Ken Wing Kuen Lee
Human Language Technology Center, HKUST
Department of Computer Science and Engineering
University of Science and Technology, Clear Water Bay, Hong Kong
dekai cswkl @

Abstract

We present an elegant and extensible model that is capable of providing semantic interpretations for an unusually wide range of textual tables in documents. Unlike the few existing table analysis models, which largely rely on relatively ad hoc heuristics, our linguistically oriented approach is systematic and grammar based. This allows our model (1) to be concise, yet (2) to recognize a wider range of data models than others, and (3) to disambiguate, to a significantly finer extent, the underlying semantic interpretation of the table in terms of data models drawn from relational database theory. To accomplish this, the model introduces Viterbi parsing under two-dimensional stochastic CFGs. The cleaner grammatical approach facilitates not only greater coverage but also grammar extension and maintenance, as well as a more direct and declarative link to semantic interpretation, for which we also introduce a new, cleaner data model. In disambiguation experiments on recognizing relevant data models of unseen web tables from different domains, a blind evaluation of the model showed 60% precision and 80% recall.
1 Introduction

Natural language processing has historically tended to emphasize understanding of linear strings: sentences, paragraphs, discourse structure. The vast body of work that focuses on text understanding is often seen as an approximation of spoken language understanding. Yet real-life text is actually heavily dependent on visual layout and formatting, which compensate for cues normally found in spoken language but are absent in

¹ The authors would like to thank the Hong Kong Research Grants Council (RGC) for supporting this research in part through grants RGC6083 99E, RGC6256 00E, and DAG03 .
