Scientific report: "Effect of Utilizing Terminology on Extraction of Protein-Protein Interaction Information from Biomedical Literature"

Junko Hosaka, Genomic Sciences Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, Japan, jhosaka@
Judice L.Y. Koh, Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore 119611, judice@
Akihiko Konagaya, Genomic Sciences Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, Japan, konagaya@

Abstract

As the amount of on-line scientific literature in the biomedical domain increases, automatic processing has become a promising approach for accelerating research. We are applying syntactic parsing trained on the general domain to identify protein-protein interactions. One of the main difficulties obstructing the use of language processing is the prevalence of specialized terminology. Accordingly, we have created a specialized dictionary by compiling on-line glossaries, and have applied it to information extraction. We conducted preliminary experiments on one hundred sentences and compared the extraction performance when (a) using only a general dictionary and (b) using the general dictionary plus our specialized dictionary. Contrary to our expectation, using only the general dictionary resulted in better recall and precision than the terminology-based approach.

1 Introduction

With the increasing amount of on-line literature in the biomedical domain, research can be greatly accelerated by extracting information automatically from text resources. Approaches to automatic extraction have used co-occurrence (Jenssen, 2001), full parsing (Yakushiji, 2001), manually built templates (Blaschke, 2001), and a natural language system developed for a neighboring domain, with modifications regarding semantic categories (Friedman, 2001). To extract information such as protein-protein interactions from scientific text, it is insufficient to check only co-occurrences.
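The limitation of co-occurrence can be illustrated with a minimal sketch. It is not the method of any of the cited systems: the lexicon, the placeholder protein names (ProtA, ProtB, ProtC), the example sentences and the function name cooccurring_pairs are all assumptions made up for illustration.

```python
from itertools import combinations

# Hypothetical lexicon with placeholder names; not taken from the paper.
PROTEIN_NAMES = {"ProtA", "ProtB", "ProtC"}


def cooccurring_pairs(sentence, lexicon=PROTEIN_NAMES):
    """Return every unordered pair of lexicon names that appear in one sentence."""
    # Naive whitespace tokenization with surrounding punctuation stripped.
    tokens = [tok.strip(".,;()") for tok in sentence.split()]
    found = sorted({tok for tok in tokens if tok in lexicon})
    return list(combinations(found, 2))


# Invented sentences: one asserts an interaction, the other denies one.
positive = "ProtA binds ProtB in vitro."
negative = "ProtA does not interact with ProtC."

print(cooccurring_pairs(positive))
print(cooccurring_pairs(negative))
```

Both sentences yield a candidate pair, although only the first one states an interaction. A pure co-occurrence check cannot see the negation or the verb that links the two names, which is the kind of information a syntactic analysis is meant to recover.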
