Scientific report: "Entity Type Variation across Two Biomedical Subdomains"

What's in a Name? Entity Type Variation across Two Biomedical Subdomains

Claudiu Mihăilă and Riza Theresa Batista-Navarro
National Centre for Text Mining, School of Computer Science, University of Manchester
Manchester Interdisciplinary Biocentre, 131 Princess Street, M1 7DN, Manchester, UK

Abstract

There are lexical, syntactic, semantic and discourse variations amongst the languages used in various biomedical subdomains. It is important to recognise such differences and understand that biomedical tools that work well on some subdomains may not work as well on others. We report here on the semantic variations that occur in the sublanguages of two biomedical subdomains, i.e. cell biology and pharmacology, at the level of named entity information. By building a classifier using ratios of named entities as features, we show that named entity information can discriminate between documents from each subdomain. More specifically, our classifier can distinguish between documents belonging to each subdomain with an accuracy of [...] F-score.

1 Introduction

Biomedical information extraction efforts in the past decade have focussed on fundamental tasks needed to create intelligent systems capable of improving search engine results and easing the work of biologists.
More specifically, researchers have concentrated mainly on named entity recognition, mapping entities to concepts in curated databases (Krallinger et al., 2008) and extracting simple binary relations between entities. Recently, an increasing number of resources that facilitate the training of systems to extract more detailed information have become available, e.g. PennBioIE (Kulick et al., 2004), GENETAG (Tanabe et al., 2005), BioInfer (Pyysalo et al., 2007), GENIA (Kim et al., 2008), GREC (Thompson et al., 2009) and Metaknowledge GENIA (Thompson et al., 2011). Moreover, several other annotated corpora have been developed for shared task purposes, such as BioCreative I, II, III (Arighi et al., 2011) and
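
The abstract's idea of using ratios of named entity types as document features can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the entity type inventory, the definition of a ratio as a type's share of all entities in a document, the toy training data, and the nearest-centroid classifier are not taken from the paper, whose actual features and learning algorithm are not given in this excerpt.

```python
from collections import Counter
from math import dist

# Hypothetical entity type inventory; the excerpt does not list the types used.
ENTITY_TYPES = ["Protein", "Cell", "Drug", "Disease"]


def ratio_features(entity_labels):
    """Map a document's entity type labels to per-type proportions."""
    counts = Counter(entity_labels)
    total = sum(counts[t] for t in ENTITY_TYPES)
    if total == 0:
        # A document without any known entity gets an all-zero vector.
        return [0.0] * len(ENTITY_TYPES)
    return [counts[t] / total for t in ENTITY_TYPES]


def train_centroids(labelled_docs):
    """Average the feature vectors of each subdomain's training documents."""
    grouped = {}
    for entity_labels, subdomain in labelled_docs:
        grouped.setdefault(subdomain, []).append(ratio_features(entity_labels))
    return {
        subdomain: [sum(column) / len(vectors) for column in zip(*vectors)]
        for subdomain, vectors in grouped.items()
    }


def classify(entity_labels, centroids):
    """Return the subdomain whose centroid is closest in Euclidean distance."""
    features = ratio_features(entity_labels)
    return min(centroids, key=lambda name: dist(features, centroids[name]))


# Made-up training data: (entity type labels found in a document, subdomain).
TRAINING = [
    (["Protein", "Protein", "Cell", "Cell", "Protein"], "cell biology"),
    (["Cell", "Protein", "Cell", "Disease"], "cell biology"),
    (["Drug", "Drug", "Disease", "Protein"], "pharmacology"),
    (["Drug", "Disease", "Drug", "Drug"], "pharmacology"),
]

centroids = train_centroids(TRAINING)
prediction = classify(["Drug", "Drug", "Protein", "Disease"], centroids)
print(prediction)
```

Proportions rather than raw counts are used here so that long and short documents are comparable. A real experiment would presumably use automatically recognised entities and a stronger learner, but the shape of the feature vector would be similar.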
