Scientific report: "Annotating and Recognising Named Entities in Clinical Notes"

Yefeng Wang
School of Information Technology
The University of Sydney
Australia 2006
ywang1@

Abstract

This paper presents ongoing research in clinical information extraction. The work introduces a new genre of text which is not well written, noise-prone and ungrammatical, and which contains much cryptic content. A corpus of clinical progress notes drawn from an Intensive Care Service has been manually annotated with more than 15,000 clinical named entities in 11 entity types. This paper reports on the challenges involved in creating the annotation schema, and in recognising and annotating clinical named entities. The information extraction task has initially used two approaches: a rule-based system and a machine learning system using Conditional Random Fields (CRF). Different features are investigated to assess the interaction of feature sets and the supervised learning approaches, in order to establish the combination best suited to this data set. The rule-based and CRF systems achieved F-scores of [...] and [...] respectively.

1 Introduction

A substantial amount of clinical data is locked away in a non-standardised form of clinical language which, if standardised, could be usefully mined to improve processes in the work of clinical wards and to gain a greater understanding of patient care as well as the progression of diseases.
However, in some clinical contexts these clinical notes, as written by clinicians, are in a less structured and often minimally grammatical form, with idiosyncratic and cryptic shorthand. Whilst there is increasing interest in the automatic extraction of the contents of clinical text, this particular type of note causes significant difficulties for automatic extraction processes that are not present for well-written prose notes. The first step in the extraction of structured information from these clinical notes is to achieve accurate identification of clinical concepts or named entities. An entity may refer to a
