Scientific report: "Edit Machines for Robust Multimodal Language Processing"

Srinivas Bangalore, AT&T Labs-Research, 180 Park Ave, Florham Park, NJ 07932, srini@
Michael Johnston, AT&T Labs-Research, 180 Park Ave, Florham Park, NJ 07932, johnston@

Abstract

Multimodal grammars provide an expressive formalism for multimodal integration and understanding. However, handcrafted multimodal grammars can be brittle with respect to unexpected, erroneous, or disfluent inputs. Spoken language (speech-only) understanding systems have addressed this lack of robustness in hand-crafted grammars by exploiting classification techniques to extract fillers of a frame representation. In this paper, we illustrate the limitations of such classification approaches for multimodal integration and understanding, and present an approach based on edit machines that combine the expressiveness of multimodal grammars with the robustness of stochastic language models of speech recognition. We also present an approach where the edit operations are trained from data using a noisy channel model paradigm. We evaluate and compare the performance of the hand-crafted and learned edit machines in the context of a multimodal conversational system (MATCH).

1 Introduction

Over the years, there have been several multimodal systems that allow input and/or output to be conveyed over multiple channels such as speech, graphics, and gesture, for example, "put that there" (Bolt, 1980), CUBRICON (Neal and Shapiro, 1991), QuickSet (Cohen et al., 1998), SmartKom (Wahlster, 2002), and Match (Johnston et al., 2002).
Multimodal integration and interpretation for such interfaces is elegantly expressed using multimodal grammars (Johnston and Bangalore, 2000). These grammars support composite multimodal inputs by aligning speech input (words) and gesture input (represented as sequences of gesture symbols) while expressing the relation between the speech and gesture input and their combined semantic representation. In Bangalore and Johnston 2000 Johnston ...
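
The idea behind an edit machine, as summarized in the abstract, is to map an input that the grammar does not accept onto the closest string that it does accept. The Python sketch below is only a rough illustration of that idea and not the authors' implementation. It assumes unit costs for insertion, deletion, and substitution, and it stands in for the grammar with a small enumerated list of word sequences (the hypothetical TOY_GRAMMAR), whereas the paper works with multimodal grammars and edit operations that can be hand-crafted or learned from data. The function names are likewise assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def edit_distance(source: Sequence[str], target: Sequence[str]) -> int:
    """Word-level Levenshtein distance with unit costs (an assumption)."""
    # previous[j] holds the cost of editing the source prefix seen so far
    # into the first j words of the target.
    previous = list(range(len(target) + 1))
    for i, src_word in enumerate(source, start=1):
        current = [i]  # deleting all i source words yields an empty target
        for j, tgt_word in enumerate(target, start=1):
            substitute = previous[j - 1] + (src_word != tgt_word)
            delete = previous[j] + 1
            insert = current[j - 1] + 1
            current.append(min(substitute, delete, insert))
        previous = current
    return previous[-1]


def closest_in_grammar(
    words: Sequence[str], grammar: Sequence[Sequence[str]]
) -> Tuple[int, List[str]]:
    """Return (cost, string) for the in-grammar string nearest to `words`."""
    scored = [(edit_distance(words, candidate), list(candidate))
              for candidate in grammar]
    # On ties, min() keeps the first candidate in grammar order.
    return min(scored, key=lambda pair: pair[0])


# Hypothetical stand-in for the language accepted by a grammar.
TOY_GRAMMAR = [
    "show cheap italian restaurants in chelsea".split(),
    "phone for this restaurant".split(),
    "zoom in here".split(),
]

if __name__ == "__main__":
    spoken = "show me uh cheap italian places in chelsea".split()
    cost, repaired = closest_in_grammar(spoken, TOY_GRAMMAR)
    print(cost, " ".join(repaired))
```

In this toy case the disfluent input is mapped to the first grammar string by deleting "me" and "uh" and substituting "restaurants" for "places". Enumerating candidates only works for a tiny finite language. It is used here to keep the sketch self-contained.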
