Scientific report: "Information structure and pauses in a corpus of spoken Danish"

Patrizia Paggio
Centre for Language Technology, University of Copenhagen, Denmark
patrizia@

Abstract

This paper describes a study in which a corpus of spoken Danish annotated with focus and topic tags was used to investigate the relation between information structure and pauses. The results show that intra-clausal pauses in the focus domain tend to precede those words that express the property or semantic type whereby the object in focus is distinguished from other ones in the domain.

1 Introduction

Interest in corpora annotated with information structure has recently been raised by several authors. Kruijff-Korbayová and Kruijff (2004) describe a method where a rich discourse-level annotation is used to investigate information structure, while both Postolache (2005) and Diderichsen and Elming (2005) study the application of machine learning to the problem of automatic identification of topic and focus. In this study, by contrast, information structure is annotated manually, and the annotation is used to investigate the correlation between information structure tags and intra-clausal pauses.

2 Annotating information structure

The starting point for this study was the corpus of spoken Danish DanPass (Grønnum, 2005), a collection of 54 monologues produced by 18 different subjects dealing with three well-defined tasks, following the methodology established in Terken (1985). In the first task the subjects describe a geometrical network, in the second the process of assembling the drawing of a house out of existing pieces, and in the third they solve a map task. The corpus has been annotated with several annotation tiers, including orthography, phonetic transcription, pauses and PoS-tags. Two independent annotators then added tags for focus and topic, based on a set of simple guidelines and using the Praat tool to carry out the annotation. The annotation reflects the assumption that a sentence can be divided into an .
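
As a rough illustration of how a corpus with these tiers could be queried for pauses inside the focus domain, the following Python sketch counts pauses that immediately precede focus-tagged words, grouped by PoS. Everything in it is an assumption made for illustration: the `Token` structure, the labels "F" and "T", the PoS names and the toy clause are hypothetical and do not reproduce the DanPass format, the annotation guidelines or any data from the study.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Token:
    """One word with values from several annotation tiers (hypothetical layout)."""
    word: str             # orthography tier
    pos: str              # PoS tier
    info: Optional[str]   # "F" = focus, "T" = topic, None = untagged (assumed labels)
    pause_before: bool    # True if a pause immediately precedes the word in the clause


def pauses_in_focus_by_pos(tokens):
    """Count pauses that immediately precede focus-tagged words, per PoS."""
    counts = Counter()
    for tok in tokens:
        if tok.info == "F" and tok.pause_before:
            counts[tok.pos] += 1
    return counts


# Invented toy clause, not taken from the corpus.
toy_clause = [
    Token("then", "ADV", None, False),
    Token("you", "PRON", "T", False),
    Token("reach", "V", None, False),
    Token("a", "DET", "F", False),
    Token("green", "ADJ", "F", True),   # pause before the distinguishing property
    Token("circle", "N", "F", False),
]

counts = pauses_in_focus_by_pos(toy_clause)
print(dict(counts))
```

Run over a real set of annotated clauses, a tally of this kind would show which word classes in the focus domain tend to be preceded by a pause.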
