Scientific report: "Effective Measures of Domain Similarity for Parsing"



Barbara Plank, University of Groningen, The Netherlands, b.plank@rug.nl
Gertjan van Noord, University of Groningen, The Netherlands, G.J.M.van.Noord@rug.nl

Abstract

It is well known that parsing accuracy suffers when a model is applied to out-of-domain data. It is also known that the most beneficial data to parse a given domain is data that matches the domain (Sekine, 1997; Gildea, 2001). Hence, an important task is to select appropriate domains. However, most previous work on domain adaptation relied on the implicit assumption that domains are somehow given. As more and more data becomes available, automatic ways to select data that is beneficial for a new (unknown) target domain are becoming attractive. This paper evaluates various ways to automatically acquire related training data for a given test set. The results show that an unsupervised technique based on topic models is effective: it outperforms random data selection on both languages examined, English and Dutch. Moreover, the technique works better than manually assigned labels gathered from meta-data that is available for English.

1 Introduction and Motivation

Previous research on domain adaptation has focused on the task of adapting a system trained on one domain, say newspaper text, to a particular new domain, say biomedical data. Usually, some amount of labeled or unlabeled data from the new domain was given, which had been determined by a human.

However, with the growth of the web, more and more data is becoming available, where each document is potentially its own domain (McClosky et al., 2010). It is not straightforward to determine which data or model (in case we have several source domain models) will perform best on a new, unknown target domain. Therefore, an important issue that arises is how to measure domain similarity, i.e. whether we can find a simple yet effective method to determine which model or data is most beneficial for an arbitrary piece of new text.
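To make the question of measuring domain similarity concrete, here is a minimal sketch that ranks candidate training texts by how close their word distributions are to a target text. It is an illustration under stated assumptions, not the method evaluated in the paper: the choice of Jensen-Shannon divergence over plain relative word frequencies, the whitespace tokenization, and the function names `word_distribution`, `js_divergence` and `rank_by_similarity` are all assumptions made for this example. The abstract reports results for a topic-model-based technique, which this sketch does not implement.

```python
import math
from collections import Counter


def word_distribution(text):
    """Relative word frequencies of a lowercased, whitespace-tokenized text."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two word distributions.

    The value is 0 for identical distributions and 1 for distributions
    that share no words.
    """
    divergence = 0.0
    for word in set(p) | set(q):
        pw = p.get(word, 0.0)
        qw = q.get(word, 0.0)
        mean = 0.5 * (pw + qw)
        if pw > 0:
            divergence += 0.5 * pw * math.log2(pw / mean)
        if qw > 0:
            divergence += 0.5 * qw * math.log2(qw / mean)
    return divergence


def rank_by_similarity(target_text, candidates):
    """Return candidate names ordered from most to least similar to the target.

    `candidates` maps a (hypothetical) domain name to its text.
    """
    target = word_distribution(target_text)
    scored = [
        (js_divergence(target, word_distribution(text)), name)
        for name, text in candidates.items()
    ]
    return [name for _, name in sorted(scored)]


# Toy data, invented for illustration only.
candidates = {
    "news": "the minister said the budget vote is today",
    "bio": "the protein binds the receptor in the cell",
}
ranking = rank_by_similarity(
    "the receptor protein is expressed in the cell", candidates
)
```

In a data selection setting, the top-ranked candidates would be the ones chosen as training data for the parser. A real system would work with far larger texts and a proper tokenizer.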

