Scientific report: "Semi-Supervised Maximum Entropy Based Approach to Acronym and Abbreviation Normalization in Medical Texts"

Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 160-167.

Serguei Pakhomov
Mayo Foundation, Rochester, MN

Abstract

Text normalization is an important aspect of successful information retrieval from medical documents such as clinical notes, radiology reports, and discharge summaries. In the medical domain, a significant part of the general problem of text normalization is abbreviation and acronym disambiguation. Numerous abbreviations are used routinely throughout such texts, and knowing their meaning is critical to data retrieval from the document. In this paper I will demonstrate a method of automatically generating training data for Maximum Entropy (ME) modeling of abbreviations and acronyms and will show that using ME modeling is a promising technique for abbreviation and acronym normalization. I report on the results of an experiment involving training a number of ME models used to normalize abbreviations and acronyms on a sample of 10,000 rheumatology notes with 89% accuracy.

1 Introduction and Background

Text normalization is an important aspect of successful information retrieval from medical documents such as clinical notes, radiology reports, and discharge summaries, to name a few. In the medical domain, a significant part of the general problem of text normalization is abbreviation and acronym disambiguation. Numerous abbreviations are used routinely throughout such texts, and identifying their meaning is critical to understanding of the document. The problem is that abbreviations are highly ambiguous with respect to their meaning. For example, according to UMLS (2001), RA may stand for "rheumatoid arthritis", "renal artery", "right atrium", "right atrial", "refractory anemia", "radioactive", "right arm", "rheumatic arthritis", etc. Liu et al. (2001) show that 33% of abbreviations listed in UMLS are ambiguous.
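
To make the idea of Maximum Entropy disambiguation concrete, the following is a minimal sketch of how an ambiguous abbreviation such as RA might be expanded from its surrounding words. Everything in it is an assumption made for illustration: the toy training contexts, the bag-of-words indicator features, the plain gradient-ascent training loop, and the function names `train_maxent`, `predict_proba`, and `disambiguate` are hypothetical and are not taken from the paper, whose feature set and training-data generation method are not described in this excerpt.

```python
import math
from collections import defaultdict

# Hypothetical toy data: (context words near "RA", intended expansion).
# These examples are invented for illustration only.
TRAIN = [
    (["joint", "pain", "methotrexate"], "rheumatoid arthritis"),
    (["swollen", "joint", "stiffness"], "rheumatoid arthritis"),
    (["stenosis", "kidney", "doppler"], "renal artery"),
    (["kidney", "angiogram", "occlusion"], "renal artery"),
    (["echocardiogram", "enlarged", "pressure"], "right atrium"),
    (["catheter", "echocardiogram", "pressure"], "right atrium"),
]


def predict_proba(weights, senses, words):
    """Return p(sense | context) as a normalized exponential of summed feature weights."""
    scores = {s: sum(weights.get((w, s), 0.0) for w in set(words)) for s in senses}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {s: math.exp(v - top) for s, v in scores.items()}
    z = sum(exps.values())
    return {s: e / z for s, e in exps.items()}


def train_maxent(data, epochs=200, lr=0.5):
    """Fit one weight per (word, sense) indicator feature by gradient ascent
    on the conditional log-likelihood (no regularization, for brevity)."""
    senses = sorted({sense for _, sense in data})
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for words, gold in data:
            probs = predict_proba(weights, senses, words)
            for w in set(words):
                for s in senses:
                    # Gradient = observed feature count minus expected count.
                    grad[(w, s)] += (1.0 if s == gold else 0.0) - probs[s]
        for key, g in grad.items():
            weights[key] += lr * g / len(data)
    return weights, senses


def disambiguate(weights, senses, words):
    """Pick the most probable expansion for the given context words."""
    probs = predict_proba(weights, senses, words)
    return max(probs, key=probs.get)


weights, senses = train_maxent(TRAIN)
print(disambiguate(weights, senses, ["joint", "pain"]))
```

In this sketch a context made up entirely of unseen words yields a uniform distribution over the known expansions, so a practical system would need a fallback such as the most frequent sense.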
