An Unsupervised Vector Approach to Biomedical Term Disambiguation: Integrating UMLS and Medline

Bridget T. McInnes
Computer Science Department
University of Minnesota Twin Cities
Minneapolis, MN 55155, USA
bthomson@

Abstract

This paper introduces an unsupervised vector approach to disambiguate words in biomedical text that can be applied to all-word disambiguation. We explore using contextual information from the Unified Medical Language System (UMLS) to describe the possible senses of a word. We experiment with automatically creating individualized stoplists to help reduce the noise in our dataset. We compare our results to SenseClusters and Humphrey et al. (2006) using the NLM-WSD dataset and with SenseClusters using conflated data from the 2005 Medline Baseline.

1 Introduction

Some words have multiple senses. For example, the word cold could refer to a viral infection or the temperature. As humans, we find it easy to determine the appropriate sense (concept) given the context in which the word is used. For a computer, though, this is a difficult problem, which negatively impacts the accuracy of biomedical applications such as medical coding and indexing. The goal of our research is to explore using information from biomedical knowledge sources such as the Unified Medical Language System (UMLS) and Medline to help distinguish between different possible concepts of a word.
In the UMLS, concepts associated with words and terms are enumerated via Concept Unique Identifiers (CUIs). For example, two possible senses of cold are C0009264 (Cold Temperature) and C0009443 (Common Cold) in the UMLS release 2008AA. The UMLS is also encoded with different semantic and syntactic structures. Some such information includes related concepts and semantic types. A semantic type (ST) is a broad subject categorization assigned to a CUI. For example, the ST of C0009264 (Cold Temperature) is Idea or Concept, while the ST for C0009443 (Common Cold) is Disease or Syndrome. Currently there exists ...
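
As a minimal illustration of the concept inventory described above, the following Python sketch stores the two senses of cold together with their CUIs and semantic types and looks up the candidate senses of a term. The names Concept, SENSE_INVENTORY, and candidate_senses are hypothetical and chosen only for this example; they are not part of the UMLS or of the approach in this paper, and the inventory holds only the two concepts quoted in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """One possible sense of a term (hypothetical structure for illustration)."""
    cui: str            # Concept Unique Identifier
    name: str           # concept name as quoted in the text
    semantic_type: str  # broad subject category assigned to the CUI


# Toy sense inventory containing only the example from the text
# (values as given for UMLS release 2008AA).
SENSE_INVENTORY = {
    "cold": [
        Concept("C0009264", "Cold Temperature", "Idea or Concept"),
        Concept("C0009443", "Common Cold", "Disease or Syndrome"),
    ],
}


def candidate_senses(term):
    """Return the list of possible concepts for a term (empty if unknown)."""
    return SENSE_INVENTORY.get(term.lower(), [])


# List every candidate sense of the ambiguous word "cold".
for concept in candidate_senses("cold"):
    print(concept.cui, concept.name, "->", concept.semantic_type)
```

A disambiguation system has to choose one entry from such a candidate list for each occurrence of the term, using the context in which the term appears.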
