Automatic Acronym Recognition

Dana Dannells
Computational Linguistics, Department of Linguistics and Department of Swedish Language
Göteborg University, Göteborg, Sweden
cl2ddoyt@cling.gu.se

Abstract

This paper deals with the problem of recognizing and extracting acronym-definition pairs in Swedish medical texts. This project applies a rule-based method to solve the acronym recognition task and compares and evaluates the results of different machine learning algorithms on the same task. The method proposed is based on the approach that acronym-definition pairs follow a set of patterns and other regularities that can be usefully applied for the acronym identification task. Supervised machine learning was applied to monitor the performance of the rule-based method, using Memory Based Learning (MBL). The rule-based algorithm was evaluated on a hand-tagged acronym corpus, and performance was measured using the standard measures recall, precision and f-score. The results show that performance could further improve by increasing the training set and modifying the input settings for the machine learning algorithms. An analysis of the errors produced indicates that further improvement of the rule-based method requires the use of syntactic information and textual pre-processing.
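To make the idea in the abstract concrete, the following is a minimal sketch of one possible pattern rule and of the evaluation measures. It is not the paper's actual rule set: the regular expression `PAIR_PATTERN`, the first-letter matching heuristic, the function names `extract_pairs` and `evaluate`, and the example sentence are all assumptions introduced here for illustration. The sketch only handles the "definition (ACRONYM)" pattern in which each acronym character is the initial of one preceding word.

```python
import re

# Hypothetical rule: up to eight words followed by a short parenthesized token,
# e.g. "kroniskt obstruktiv lungsjukdom (KOL)". Not the paper's rule set.
PAIR_PATTERN = re.compile(r"\b((?:[\w-]+\s+){1,8})\(([\w-]{2,10})\)")


def extract_pairs(text):
    """Return (acronym, definition) pairs whose word initials spell the acronym."""
    pairs = []
    for match in PAIR_PATTERN.finditer(text):
        words = match.group(1).split()
        acronym = match.group(2)
        letters = [c for c in acronym if c.isalnum()]
        # Take as many preceding words as the acronym has characters.
        candidate = words[-len(letters):] if letters else []
        if len(candidate) == len(letters) and all(
            w[0].lower() == c.lower() for w, c in zip(candidate, letters)
        ):
            pairs.append((acronym, " ".join(candidate)))
    return pairs


def evaluate(predicted, gold):
    """Compute precision, recall and f-score against hand-tagged gold pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


if __name__ == "__main__":
    sentence = "Patienten har kroniskt obstruktiv lungsjukdom (KOL) sedan 2001."
    found = extract_pairs(sentence)
    gold = [
        ("KOL", "kroniskt obstruktiv lungsjukdom"),
        ("HRT", "hormone replacement therapy"),
    ]
    print(found)
    print(evaluate(found, gold))
```

A rule this simple misses compounds, where one word contributes several acronym letters, and acronyms that precede their definition. Those are the kinds of cases for which the abstract suggests syntactic information and textual pre-processing would be needed.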
1 Introduction

There are many on-line documents which contain important information that we want to understand; thus the need to extract glossaries of domain-specific names and terms increases, especially in technical fields such as biomedicine, where the vocabulary is quickly expanding. One known phenomenon in biomedical literature is the growth of new acronyms. Acronyms are a subset of abbreviations and are generally formed with capital letters from the original word or phrase; however, many acronyms are realized in different surface forms, i.e. use of Arabic numbers, mixed alpha-numeric forms, lower-case acronyms, etc. Several approaches have been proposed for automatic acronym extraction, with the most common tools ...
