Segment-based Hidden Markov Models for Information Extraction

Zhenmei Gu
David R. Cheriton School of Computer Science
University of Waterloo
Waterloo, Ontario, Canada N2L 3G1
z2gu@

Nick Cercone
Faculty of Computer Science
Dalhousie University
Halifax, Nova Scotia, Canada B3H 1W5
nick@

Abstract

Hidden Markov models (HMMs) are powerful statistical models that have found successful applications in Information Extraction (IE). In current approaches to applying HMMs to IE, an HMM is used to model text at the document level. This modelling might cause undesired redundancy in extraction, in the sense that more than one filler is identified and extracted. We propose to use HMMs to model text at the segment level, in which the extraction process consists of two steps: a segment retrieval step followed by an extraction step. In order to retrieve extraction-relevant segments from documents, we introduce a method to use HMMs to model and retrieve segments. Our experimental results show that the resulting segment HMM IE system not only achieves near-zero extraction redundancy, but also has better overall extraction performance than traditional document HMM IE systems.

1 Introduction

A Hidden Markov Model (HMM) is a finite state automaton with stochastic state transitions and symbol emissions (Rabiner, 1989).
The automaton models a random process that can produce a sequence of symbols by starting from some state and transferring from one state to another, with a symbol being emitted at each state, until a final state is reached. Formally, a hidden Markov model (HMM) is specified by a five-tuple $(S, K, \Pi, A, B)$, where $S$ is a set of states; $K$ is the alphabet of observation symbols; $\Pi$ is the initial state distribution; $A$ is the probability distribution of state transitions; and $B$ is the probability distribution of symbol emissions. When the structure of an HMM is determined, the complete model parameters can be represented as $\lambda = (A, B, \Pi)$. HMMs are particularly useful in modelling sequential data.
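
The generative process described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the class name `HMM`, the helper functions, the toy states and words, and all probabilities are made up for illustration, and the sketch assumes a single designated final state that emits no symbol.

```python
import random
from dataclasses import dataclass


@dataclass
class HMM:
    states: list        # S: set of states
    alphabet: list      # K: alphabet of observation symbols
    initial: dict       # Pi: state -> initial probability
    transitions: dict   # A: state -> {next_state: probability}
    emissions: dict     # B: state -> {symbol: probability}
    final_state: str    # assumption: one designated, non-emitting final state


def sample(dist, rng):
    """Draw one outcome from a {outcome: probability} dictionary."""
    outcomes = list(dist)
    weights = [dist[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


def generate(hmm, rng, max_len=50):
    """Walk the automaton, emitting one symbol per visited state.

    Stops when the final state is reached or after max_len symbols
    (the cap is a safety assumption of this sketch).
    """
    symbols, path = [], []
    state = sample(hmm.initial, rng)
    while state != hmm.final_state and len(symbols) < max_len:
        path.append(state)
        symbols.append(sample(hmm.emissions[state], rng))
        state = sample(hmm.transitions[state], rng)
    return symbols, path


# Toy model with invented probabilities (illustration only).
toy = HMM(
    states=["background", "target", "end"],
    alphabet=["the", "seminar", "at", "3pm"],
    initial={"background": 1.0},
    transitions={
        "background": {"background": 0.6, "target": 0.3, "end": 0.1},
        "target": {"background": 0.5, "target": 0.2, "end": 0.3},
    },
    emissions={
        "background": {"the": 0.4, "seminar": 0.3, "at": 0.3},
        "target": {"3pm": 1.0},
    },
    final_state="end",
)

symbols, path = generate(toy, random.Random(0))
print(list(zip(path, symbols)))
```

In this sketch the state path is returned alongside the symbols only for inspection; in an actual HMM application the states are hidden and only the symbol sequence is observed.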
