TAILIEUCHUNG - Scientific report: "Measuring Syntactic Difference in British English"

Nathan C. Sanders, Department of Linguistics, Indiana University, Bloomington, IN 47405, USA. ncsander@

Abstract

Recent work by Nerbonne and Wiersma (2006) has provided a foundation for measuring syntactic differences between corpora. It uses part-of-speech trigrams as an approximation to syntactic structure, comparing the trigrams of two corpora for statistically significant differences. This paper extends the method and its application. It extends the method by replacing trigrams with the leaf-ancestor paths of Sampson (2000), which capture internal syntactic structure: every leaf in a parse tree records the path back to the root. The corpus used for testing is the International Corpus of English, Great Britain (Nelson et al., 2002), which contains syntactically annotated speech of Great Britain. The speakers are grouped into geographical regions based on place of birth. This grouping differs in both nature and number from previous experiments, which found differences between two groups of Norwegian L2 learners of English. We show that dialectal variation in eleven British regions from the ICE-GB is detectable by our algorithm, using both leaf-ancestor paths and trigrams.
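The two feature types named in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the nested-tuple tree format, the function names, and the Penn-Treebank-style labels are all hypothetical (ICE-GB uses its own annotation scheme), and the paths here are simplified to a plain root-to-leaf label sequence ending at the part-of-speech tag.

```python
from collections import Counter

# Assumed tree format (hypothetical): an internal node is
# (label, child, child, ...); a leaf is (pos_tag, word).

def is_leaf(node):
    """A leaf has exactly one child, and that child is the word string."""
    return len(node) == 2 and isinstance(node[1], str)

def leaf_ancestor_paths(tree, prefix=()):
    """Return one root-to-leaf label path per leaf, in left-to-right order."""
    path = prefix + (tree[0],)
    if is_leaf(tree):
        return ["-".join(path)]
    paths = []
    for child in tree[1:]:
        paths.extend(leaf_ancestor_paths(child, path))
    return paths

def leaf_tags(tree):
    """Return the part-of-speech tags of the leaves, in left-to-right order."""
    if is_leaf(tree):
        return [tree[0]]
    return [tag for child in tree[1:] for tag in leaf_tags(child)]

def pos_trigrams(tags):
    """Return every window of three consecutive part-of-speech tags."""
    return [tuple(tags[i:i + 3]) for i in range(len(tags) - 2)]

# Toy sentence: "the dog barked"
tree = ("S",
        ("NP", ("DT", "the"), ("NN", "dog")),
        ("VP", ("VBD", "barked")))

paths = leaf_ancestor_paths(tree)
trigrams = pos_trigrams(leaf_tags(tree))

# Feature counts like these, collected per corpus, are what a comparison
# between two corpora would operate on.
path_counts = Counter(paths)
trigram_counts = Counter(trigrams)
```

Each leaf contributes one path that records its ancestors up to the root, whereas a trigram only records the tags of neighbouring words; this is the sense in which paths capture internal syntactic structure.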
1 Introduction

In the measurement of linguistic distance, older work such as Seguy (1973) was able to measure distance in most areas of linguistics, such as phonology, morphology, and syntax. The features used for comparison were hand-picked based on linguistic knowledge of the area being surveyed. These features, while probably lacking in completeness of coverage, certainly allowed a rough comparison of distance in all linguistic domains. In contrast, computational methods have focused on a single area of language. For example, a method for determining phonetic distance is given by Heeringa (2004). Heeringa and others have also done related work on phonological distance in Nerbonne and Heeringa (1997) and Gooskens and Heeringa (2004). A measure of syntactic distance is the .
