Scientific report: "Poliqarp: An open source corpus indexer and search engine with syntactic extensions"

Daniel Janus
Sentivision Polska Sp. z .
Marynarska 19a, 02-674 Warsaw, Poland
nathell@

Adam Przepiórkowski
Institute of Computer Science, Polish Academy of Sciences
Ordona 21, 01-237 Warsaw, Poland
adamp@

Abstract

This paper presents recent extensions to Poliqarp, an open source tool for indexing and searching morphosyntactically annotated corpora, which turn it into a tool for indexing and searching certain kinds of treebanks, complementary to existing treebank search engines. In particular, the paper discusses the motivation for such a new tool, the extended query syntax of Poliqarp, and implementation and efficiency issues.

1 Introduction

The aim of this paper is to present extensions to Poliqarp¹, an efficient open source indexer and search tool for morphosyntactically annotated XCES-encoded (Ide et al., 2000) corpora, with a query syntax based on that of CQP (Christ, 1994) but extending it in interesting ways. Poliqarp has been in constant development since 2003 (Przepiórkowski et al., 2004), and it is currently employed as the search engine of the IPI PAN Corpus of Polish (Przepiórkowski, 2004) and the Lisbon corpus of Portuguese (Barreto et al., 2006), as well as in other projects.

Poliqarp has a typical server-client architecture. The clients developed so far include GUI clients for a variety of operating systems (Linux, Windows, MacOS, Solaris) and architectures (big-endian and little-endian), as well as a PHP client.

Since March 2006, the first stable version of Poliqarp (Janus and Przepiórkowski, 2006) has been available under [...]. A version of Poliqarp that implements various statistical extensions is at the beta-testing stage.

Although Poliqarp was designed as a tool for corpora linguistically annotated at word level only, the extensions described in this paper turn it into an indexing and search tool for certain [...]

¹ Polyinterpretation Indexing Query And Retrieval Processor
