Scientific report: "Exploiting Morphology in Turkish Named Entity Recognition System"

Reyyan Yeniterzi
Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
reyyan@

Abstract

Turkish is an agglutinative language with complex morphological structures; therefore, using only word forms is not enough for many computational tasks. In this paper we analyze the effect of morphology in a Named Entity Recognition system for Turkish. We start with the standard word-level representation and incrementally explore the effect of capturing syntactic and contextual properties of tokens. Furthermore, we explore a new representation in which roots and morphological features are represented as separate tokens, instead of representing only words as tokens. Using syntactic and contextual properties with the new representation provides a relative improvement over the baseline.

1 Introduction

One of the main tasks of information extraction is Named Entity Recognition (NER), which aims to locate and classify the named entities of an unstructured text. State-of-the-art NER systems have been produced for several languages, but despite all these recent improvements, developing a NER system for Turkish is still a challenging task due to the structure of the language.

Turkish is a morphologically complex language with very productive inflectional and derivational processes.
Many local and non-local syntactic structures are represented as morphemes, which in the end produces Turkish words with complex morphological structures. For instance, the English phrase "if we are going to be able to make something acquire flavor", which contains the necessary function words to represent the meaning, can be translated into Turkish with only one token, tatlandirabileceksek, which is produced from the root tat (flavor) with additional morphemes lan (acquire), dir ...

Footnote: The author is also affiliated with iLab and the Center for the Future of Work of Heinz College, Carnegie Mellon University.
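
The idea of representing roots and morphological features as separate tokens, rather than whole words, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the analysis table is hand-written, the names ANALYSES and expand_tokens and the "+" prefix on morpheme tokens are hypothetical, and only the root and the morphemes named in the text above are listed for the example word.

```python
from typing import Dict, List, Tuple

# Hypothetical hand-written analyses: surface word -> (root, [morphemes]).
# Only the morphemes named in the text are listed for the example word;
# the remaining ones are deliberately left out rather than guessed.
ANALYSES: Dict[str, Tuple[str, List[str]]] = {
    "tatlandirabileceksek": ("tat", ["lan", "dir"]),
}


def expand_tokens(words: List[str],
                  analyses: Dict[str, Tuple[str, List[str]]]) -> List[str]:
    """Replace each word by its root followed by one token per morpheme.

    Words without a known analysis are kept as a single token.
    """
    tokens: List[str] = []
    for word in words:
        root, morphemes = analyses.get(word, (word, []))
        tokens.append(root)
        # Mark morpheme tokens with "+" so they stay distinguishable from roots.
        tokens.extend("+" + m for m in morphemes)
    return tokens


print(expand_tokens(["tatlandirabileceksek"], ANALYSES))
```

In a word-level representation the example would be a single token; here it becomes a root token followed by morpheme tokens, so a sequence labeler would see each piece separately. A real system would obtain the analyses from a morphological analyzer rather than from a fixed table.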
