TAILIEUCHUNG - Scientific report: "Japanese Dependency Parsing Using Sequential Labeling for Semi-spoken Language"

Japanese Dependency Parsing Using Sequential Labeling for Semi-spoken Language

Kenji Imamura and Genichiro Kikui
NTT Cyber Space Laboratories, NTT Corporation
1-1 Hikarinooka, Yokosuka-shi, Kanagawa 239-0847, Japan

Norihito Yasuda
NTT Communication Science Laboratories, NTT Corporation
2-4 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0237, Japan
n-yasuda@

Abstract

The number of documents published directly by end users is increasing along with the growth of the Web. Such documents often contain spoken-style expressions, which are difficult to analyze using conventional parsers. This paper presents a dependency parsing method whose goal is to analyze Japanese semi-spoken expressions. One characteristic of our method is that it can parse self-dependent (independent) segments using sequential labeling.

1 Introduction

Dependency parsing is a way of structurally analyzing a sentence from the viewpoint of modification. In Japanese, relationships of modification between phrasal units called bunsetsu segments are analyzed. A number of studies have focused on parsing of Japanese as well as of other languages. Popular parsers are CaboCha (Kudo and Matsumoto, 2002) and KNP (Kurohashi and Nagao, 1994), which were developed to analyze formal written language such as that found in newspaper articles.
Generally, the syntactic structure of a sentence is represented as a tree, and parsing is carried out by maximizing the likelihood of the tree (Charniak, 2000; Uchimoto et al., 1999). Units that do not modify any other units, such as fillers, are difficult to place in the tree structure. Conventional parsers have forced such independent units to modify other units.

Documents published by end users (e.g., blogs) are increasing on the Internet along with the growth of the Web. Such documents do not use controlled written language and contain fillers and emoticons. This implies that analyzing such documents is difficult for conventional parsers.
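
The difference between an independent unit and a forced attachment can be made concrete with a small sketch. It is not the authors' implementation. It assumes a hypothetical encoding in which each bunsetsu segment stores the index of the segment it modifies, the sentence-final segment is marked with -1, and a self-dependent (independent) segment points to itself. The example sentence, the function names, and the choice to attach a forced unit to the next segment are all illustrative assumptions.

```python
from typing import List

ROOT = -1  # assumed marker for the sentence-final segment, which modifies nothing


def independent_segments(heads: List[int]) -> List[int]:
    """Return indices of segments whose head is themselves (self-dependent)."""
    return [i for i, h in enumerate(heads) if h == i]


def force_attachment(heads: List[int]) -> List[int]:
    """Imitate a conventional tree-only parser: every self-dependent segment
    is made to modify the following segment (an illustrative choice).
    A self-dependent segment in final position has no following segment,
    so it is marked as ROOT instead."""
    last = len(heads) - 1
    forced = []
    for i, h in enumerate(heads):
        if h == i:
            forced.append(i + 1 if i < last else ROOT)
        else:
            forced.append(h)
    return forced


# Hypothetical romanized example: a filler followed by a three-segment clause.
segments = ["eeto", "kyou-wa", "eiga-o", "mita"]
# "eeto" (a filler) is self-dependent; the next two segments modify "mita".
heads = [0, 3, 3, ROOT]

if __name__ == "__main__":
    for i in independent_segments(heads):
        print(f"independent segment: {segments[i]}")
    print("forced heads:", force_attachment(heads))
```

Under this encoding the filler remains outside the tree, whereas the forced version attaches it to a segment it does not actually modify.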
