Lexically-Triggered Hidden Markov Models for Clinical Document Coding

Svetlana Kiritchenko, Colin Cherry
Institute for Information Technology
National Research Council Canada
{Svetlana.Kiritchenko,Colin.Cherry}@nrc-cnrc.gc.ca

Abstract

The automatic coding of clinical documents is an important task for today's healthcare providers. Though it can be viewed as multi-label document classification, the coding problem has the interesting property that most code assignments can be supported by a single phrase found in the input document. We propose a Lexically-Triggered Hidden Markov Model (LT-HMM) that leverages these phrases to improve coding accuracy. The LT-HMM works in two stages: first, a lexical match is performed against a term dictionary to collect a set of candidate codes for a document. Next, a discriminative HMM selects the best subset of codes to assign to the document by tagging candidates as present or absent. By confirming codes proposed by a dictionary, the LT-HMM can share features across codes, enabling strong performance even on rare codes. In fact, we are able to recover codes that do not occur in the training set at all. Our approach achieves the best-ever performance on the 2007 Medical NLP Challenge test set, with an F-measure of 89.84.

1 Introduction

The clinical domain presents a number of interesting challenges for natural language processing. Conventionally, most clinical documentation, such as doctor's notes, discharge summaries, and referrals, is written in a free-text form.
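The two-stage procedure described in the abstract (dictionary matching to propose candidate codes, then tagging each candidate as present or absent) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy dictionary `TERM_DICT`, the single-token triggers, the negation feature, and the hand-set weights in `emission_score` and `transition_score` stand in for a real term dictionary and for weights that a discriminative HMM would learn from data. It is not the authors' implementation.

```python
from typing import Dict, List, Tuple

# Hypothetical toy dictionary: trigger word -> code (illustrative entries only).
TERM_DICT: Dict[str, str] = {"cough": "786.2", "fever": "780.6", "pneumonia": "486"}
TAGS = ("absent", "present")


def find_candidates(tokens: List[str]) -> List[Tuple[int, str]]:
    """Stage 1: lexical match; return (token position, code) per trigger."""
    return [(i, TERM_DICT[t]) for i, t in enumerate(tokens) if t in TERM_DICT]


def emission_score(tokens: List[str], pos: int, tag: str) -> float:
    """Assumed hand-set weights; one feature: is the trigger negated?"""
    negated = pos > 0 and tokens[pos - 1] in {"no", "without"}
    if tag == "present":
        return -2.0 if negated else 1.0
    return 1.0 if negated else 0.0


def transition_score(prev_tag: str, tag: str) -> float:
    """Assumed small bonus for keeping the same tag as the previous candidate."""
    return 0.2 if prev_tag == tag else 0.0


def viterbi(tokens: List[str], candidates: List[Tuple[int, str]]) -> List[str]:
    """Stage 2: best present/absent tag sequence over the candidates."""
    if not candidates:
        return []
    scores = [{t: emission_score(tokens, candidates[0][0], t) for t in TAGS}]
    back: List[Dict[str, str]] = []
    for pos, _ in candidates[1:]:
        row, ptr = {}, {}
        for tag in TAGS:
            prev = max(TAGS, key=lambda p: scores[-1][p] + transition_score(p, tag))
            row[tag] = (scores[-1][prev] + transition_score(prev, tag)
                        + emission_score(tokens, pos, tag))
            ptr[tag] = prev
        scores.append(row)
        back.append(ptr)
    # Follow back-pointers from the best final tag.
    tag = max(TAGS, key=lambda t: scores[-1][t])
    path = [tag]
    for ptr in reversed(back):
        tag = ptr[tag]
        path.append(tag)
    return path[::-1]


def code_document(text: str) -> List[str]:
    """Run both stages and return the codes tagged as present."""
    tokens = text.lower().replace(".", " ").replace(",", " ").split()
    candidates = find_candidates(tokens)
    tags = viterbi(tokens, candidates)
    return sorted({code for (_, code), tag in zip(candidates, tags) if tag == "present"})


print(code_document("Patient has cough and fever, no pneumonia."))
```

Because the tagger scores features of the trigger's context rather than of the code itself, the same weights apply to any code the dictionary proposes, which is the property the abstract credits for good behaviour on rare codes.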
This narrative form is flexible, allowing healthcare professionals to express any kind of concept or event, but it is not particularly suited for large-scale analysis, search, or decision support. Converting clinical narratives into a structured form would support essential activities such as administrative reporting, quality control, biosurveillance, and biomedical research (Meystre et al., 2008). One way of representing a document is to code the patient's conditions and the performed .