Scientific report: "Extracting Regulatory Gene Expression Networks from PubMed"

Extracting Regulatory Gene Expression Networks from PubMed

Jasmin Saric, EML Research gGmbH, Heidelberg, Germany, saric@
Lars J. Jensen, EMBL, Heidelberg, Germany, jensen@
Rossitza Ouzounova, EMBL, Heidelberg, Germany, ouzounov@
Isabel Rojas, EML Research gGmbH, Heidelberg, Germany, rojas@
Peer Bork, EMBL, Heidelberg, Germany, bork@

Abstract

We present an approach using syntacto-semantic rules for the extraction of relational information from biomedical abstracts. The results show that by overcoming the hurdle of technical terminology, high-precision results can be achieved. From abstracts related to baker's yeast, we manage to extract a regulatory network comprised of 441 pairwise relations from 58,664 abstracts with an accuracy of 83–90%. To achieve this, we made use of a resource of gene/protein names considerably larger than those used in most other biology-related information extraction approaches. This list of names was included in the lexicon of our retrained part-of-speech tagger for use on molecular biology abstracts. For the domain in question, an accuracy of ... was attained on POS tags. The method is easily adapted to organisms other than yeast, allowing us to extract many more biologically relevant relations.

1 Introduction and related work

A massive amount of information is buried in scientific publications (more than 500,000 publications per year). Therefore, the need for information extraction (IE) and text mining in the life sciences is increasing drastically. Most of the ongoing work deals with PubMed abstracts. The technical terminology of biomedicine presents the main challenge of applying IE to such a corpus (Hobbs 2003).

The goal of our work is to extract from biological abstracts which proteins are responsible for regulating the expression (transcription or translation) of which genes. This means extracting a specific type of pairwise relation between biological entities. This differs from the ...
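To make the idea of a pairwise regulatory relation concrete, here is a minimal sketch of lexicon-driven relation extraction. It is not the authors' rule system: it swaps in simple keyword matching for the POS-tag-based syntacto-semantic rules described in the abstract. The names `LEXICON`, `TRIGGERS`, and `extract_relations`, the tiny name list, and the trigger verbs are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical, tiny stand-in for the large gene/protein name resource
# mentioned in the abstract (illustrative subset only).
LEXICON = {"GAL4", "GAL1", "GAL80", "MIG1", "SUC2"}

# Hypothetical trigger verbs mapped to a relation label (an assumption,
# not the rule set used in the paper).
TRIGGERS = {
    "activates": "activation",
    "induces": "activation",
    "represses": "repression",
    "inhibits": "repression",
}


def extract_relations(sentence):
    """Return (regulator, relation, target) triples found in one sentence."""
    tokens = re.findall(r"[A-Za-z0-9]+", sentence)
    relations = []
    for i, token in enumerate(tokens):
        label = TRIGGERS.get(token.lower())
        if label is None:
            continue
        # Known names before the trigger are candidate regulators,
        # known names after it are candidate targets.
        left = [t for t in tokens[:i] if t.upper() in LEXICON]
        right = [t for t in tokens[i + 1:] if t.upper() in LEXICON]
        if left and right:
            # Pair the name closest to the trigger on each side.
            relations.append((left[-1].upper(), label, right[0].upper()))
    return relations


if __name__ == "__main__":
    for text in (
        "Gal4 activates the expression of GAL1.",
        "Mig1 represses transcription of SUC2 in glucose.",
    ):
        print(extract_relations(text))
```

A sketch like this shows why the name lexicon matters: a sentence yields a relation only when both entities are recognized, so a larger name resource directly increases what can be extracted. Real rules would also need syntactic context to handle passives and negation, which plain keyword matching ignores.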
