Scientific report: "Lost in Translation: Authorship Attribution using Frame Semantics"

Steffen Hedegaard
Department of Computer Science, University of Copenhagen
Njalsgade 128, 2300 Copenhagen S, Denmark
steffenh@

Jakob Grue Simonsen
Department of Computer Science, University of Copenhagen
Njalsgade 128, 2300 Copenhagen S, Denmark
simonsen@

Abstract

We investigate authorship attribution using classifiers based on frame semantics. The purpose is to discover whether adding semantic information to lexical and syntactic methods for authorship attribution will improve them, specifically to address the difficult problem of authorship attribution of translated texts. Our results suggest (i) that frame-based classifiers are usable for author attribution of both translated and untranslated texts; (ii) that frame-based classifiers generally perform worse than the baseline classifiers for untranslated texts, but (iii) perform as well as, or better than, the baseline classifiers on translated texts; (iv) that, contrary to current belief, naïve classifiers based on lexical markers may perform tolerably on translated texts if the combination of author and translator is present in the training set of a classifier.
1 Introduction

Authorship attribution is the following problem: for a given text, determine the author of said text among a list of candidate authors. Determining authorship is difficult, and a host of methods have been proposed: as of 1998, Rudman estimated the number of metrics used in such methods to be at least 1000 (Rudman, 1997). For comprehensive recent surveys see Juola (2006), Koppel et al. (2008), and Stamatatos (2009). The process of authorship attribution consists of selecting markers (features that provide an indication of the author) and classifying a text by assigning it to an author using some appropriate machine learning technique.

Attribution of translated texts

In contrast to the general authorship attribution problem, the specific problem of attributing translated texts to their original ...
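The general two-step process described in the introduction (select markers, then classify) can be illustrated with a minimal sketch. This is not the frame-based method of the paper: the marker set (a handful of English function words), the nearest-centroid classifier, and the toy training texts are all assumptions chosen only to keep the example self-contained.

```python
from collections import Counter
import math

# Hypothetical marker set: a few English function words (an assumption,
# not the markers used in the paper).
MARKERS = ["the", "and", "of", "to", "in", "that", "it", "but"]

def features(text):
    """Step 1: map a text to the relative frequency of each marker."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = max(len(tokens), 1)  # avoid division by zero on empty input
    return [counts[m] / total for m in MARKERS]

def centroid(vectors):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def train(labelled_texts):
    """Build one centroid per author from (author, text) pairs."""
    by_author = {}
    for author, text in labelled_texts:
        by_author.setdefault(author, []).append(features(text))
    return {author: centroid(vs) for author, vs in by_author.items()}

def attribute(model, text):
    """Step 2: assign the text to the author with the nearest centroid."""
    x = features(text)
    return min(model, key=lambda author: math.dist(model[author], x))

# Invented toy data, for illustration only.
training = [
    ("A", "the cat and the dog sat in the sun and the rain"),
    ("A", "the king and the queen of the land"),
    ("B", "it seems that it was so but it is not"),
    ("B", "that it rains is true but it matters little"),
]
model = train(training)
print(attribute(model, "the bird and the fish in the sea"))
```

Any other marker set or learning technique could be substituted in `features` and `attribute` without changing the overall shape of the pipeline.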
