Scientific report: "Online Plagiarism Detection Through Exploiting Lexical, Syntactic, and Semantic Information"

Wan-Yu Lin, Graduate Institute of Networking and Multimedia, National Taiwan University, r99944016@csie
Nanyun Peng, Institute of Computational Linguistics, Peking University, pengnanyun@pku
Chun-Chao Yen, Graduate Institute of Networking and Multimedia, National Taiwan University, r96944016@csie
Shou-de Lin, Graduate Institute of Networking and Multimedia, National Taiwan University, sdlin@

Abstract

In this paper, we introduce a framework that identifies online plagiarism by exploiting lexical, syntactic, and semantic features, including duplication-gram, reordering and alignment of words, POS and phrase tags, and semantic similarity of sentences. We establish an ensemble framework to combine the predictions of each model. Results demonstrate that our system not only finds a considerable number of real-world online plagiarism cases but also outperforms several state-of-the-art algorithms and commercial software.

Keywords: Plagiarism Detection, Lexical, Syntactic, Semantic

1. Introduction

Online plagiarism, the act of trying to create a new piece of writing by copying, reorganizing, or rewriting others' work identified through search engines, is one of the most common misuses of highly mature web technologies.
As implied by the experiment conducted by Braumoeller and Gaines (2001), a powerful plagiarism detection system can effectively discourage people from plagiarizing others' work. A common strategy for online plagiarism detection is as follows. First, reviewers identify several suspicious sentences in the write-up and feed them one by one as queries to a search engine to obtain a set of documents. Human reviewers then manually examine whether these documents are truly the sources of the suspicious sentences. While this strategy is straightforward and effective, its limitations are obvious. First, since the length of ...
