Scientific paper: "PART-OF-SPEECH TAGGING USING A VARIABLE MEMORY MARKOV MODEL"

PART-OF-SPEECH TAGGING USING A VARIABLE MEMORY MARKOV MODEL

Hinrich Schütze, Center for the Study of Language and Information, Stanford, CA 94305-4115, Internet: schuetze@
Yoram Singer, Institute of Computer Science and Center for Neural Computation, Hebrew University, Jerusalem 91904, Internet: singer@

Abstract

We present a new approach to disambiguating syntactically ambiguous words in context, based on Variable Memory Markov (VMM) models. In contrast to fixed-length Markov models, which predict based on fixed-length histories, variable memory Markov models dynamically adapt their history length based on the training data, and hence may use fewer parameters. In a test of a VMM-based tagger on the Brown corpus, of tokens are correctly classified.

INTRODUCTION

Many words in English have several parts of speech (POS). For example, "book" is used as a noun in "She read a book." and as a verb in "She didn't book a trip." Part-of-speech tagging is the problem of determining the syntactic part of speech of an occurrence of a word in context. In any given English text, most tokens are syntactically ambiguous, since most of the high-frequency English words have several parts of speech. Therefore, a correct syntactic classification of words in context is important for most syntactic and other higher-level processing of natural language text. Two stochastic methods have been widely used for POS tagging: fixed order Markov models and Hidden Markov models. Fixed order Markov models are used in (Church, 1989) and (Charniak et al., 1993).
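To make the idea of a variable-length history concrete, here is a minimal, illustrative sketch in Python. It is not the paper's actual learning procedure (which builds a prediction suffix tree with statistical criteria for growing contexts); it simply predicts the next tag from the longest history suffix that is well attested in training, falling back to shorter suffixes otherwise. The tag sequence, `MAX_ORDER`, and `MIN_COUNT` threshold are all made-up assumptions for the example.

```python
from collections import Counter, defaultdict

# Toy tag sequence standing in for a tagged corpus (hypothetical data).
tags = ["DT", "NN", "VB", "DT", "NN", "VB", "DT", "JJ", "NN", "VB"]

MAX_ORDER = 3   # longest history considered
MIN_COUNT = 2   # use a context only if it was seen at least this often

# Count next-tag distributions for every history suffix up to MAX_ORDER.
ctx_counts = defaultdict(Counter)
for i, t in enumerate(tags):
    for k in range(0, MAX_ORDER + 1):
        if i - k < 0:
            break
        hist = tuple(tags[i - k:i])
        ctx_counts[hist][t] += 1

def predict(history):
    """Predict the next tag from the longest well-attested suffix of `history`.

    Returns (predicted_tag, history_length_used); the empty context ()
    acts as a unigram fallback, so some prediction is always available.
    """
    for k in range(min(MAX_ORDER, len(history)), -1, -1):
        hist = tuple(history[-k:]) if k else ()
        total = sum(ctx_counts[hist].values())
        if total >= MIN_COUNT:
            tag, _ = ctx_counts[hist].most_common(1)[0]
            return tag, k
    return None, 0

print(predict(["DT", "NN"]))  # uses the full 2-tag history here
```

A fixed-order model would always condition on exactly `MAX_ORDER` previous tags; the point of the variable-memory approach is that rare long contexts fall back to shorter, better-estimated ones, which is how the model saves parameters.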
Since the order of the model is assumed to be fixed, a short memory (small order) is typically used, since the number of possible combinations grows exponentially. For example, assuming there are 184 different tags, as in the Brown corpus, there are 184^3 = 6,229,504 different order-3 combinations of tags (of course, not all of these will actually occur; see Weischedel et al., 1993). Because of the large number of parameters, higher-order fixed-length .
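The exponential growth the paragraph describes is easy to verify: with a tagset of size T, an order-k model must in principle estimate a parameter for each of T^k tag histories. A two-line check, using the Brown corpus figure of 184 tags cited above:

```python
# Parameter blow-up for fixed-order tag models: T^k histories for tagset size T.
TAGSET_SIZE = 184  # number of distinct tags in the Brown corpus, per the text

for k in (1, 2, 3):
    print(f"order {k}: {TAGSET_SIZE ** k:,} possible tag histories")
```

Order 3 already yields 184^3 = 6,229,504 histories, matching the figure in the text, which is why fixed-order taggers in practice stay at low orders.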
