Scientific report: "Unification-based Multimodal Parsing"

Michael Johnston
Center for Human Computer Communication
Department of Computer Science and Engineering
Oregon Graduate Institute
P.O. Box 91000, Portland, OR 97291-1000
johnston @

Abstract

In order to realize their full potential, multimodal systems need to support not just input from multiple modes, but also synchronized integration of modes. Johnston et al. (1997) model this integration using a unification operation over typed feature structures. This is an effective solution for a broad class of systems, but it limits multimodal utterances to combinations of a single spoken phrase with a single gesture. We show how the unification-based approach can be scaled up to provide a full multimodal grammar formalism. In conjunction with a multidimensional chart parser, this approach supports integration of multiple elements distributed across the spatial, temporal, and acoustic dimensions of multimodal interaction. Integration strategies are stated in a high-level unification-based rule formalism, supporting rapid prototyping and iterative development of multimodal systems.

1 Introduction

Multimodal interfaces enable more natural and efficient interaction between humans and machines by providing multiple channels through which input or output may pass.
Our concern here is with multimodal input, such as interfaces which support simultaneous input from speech and pen. Such interfaces have clear task-performance and user-preference advantages over speech-only interfaces, in particular for spatial tasks such as those involving maps (Oviatt, 1996). Our focus here is on the integration of input from multiple modes and the role this plays in the segmentation and parsing of natural human input. In the examples given here, the modes are speech and pen, but the architecture described is more general in that it can support more than two input modes and modes of other types, such as 3D gestural input. Our multimodal interface technology is implemented in
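To make the central operation concrete, the sketch below illustrates unification of two feature structures, as used to combine a spoken phrase's interpretation with a gesture's interpretation. This is a minimal illustration only: it represents feature structures as plain nested dictionaries, and all field names (`cat`, `object`, `coords`) are hypothetical. The system described in the paper uses *typed* feature structures with a type hierarchy, which this toy version omits.

```python
def unify(a, b):
    """Recursively unify two feature structures (nested dicts).

    Returns the merged structure, or None if the structures carry
    conflicting values for the same feature (unification failure).
    """
    if a == b:
        return a
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, value in b.items():
            if key in result:
                sub = unify(result[key], value)
                if sub is None:  # conflicting values: unification fails
                    return None
                result[key] = sub
            else:                # feature only in b: just carry it over
                result[key] = value
        return result
    return None  # differing atomic values cannot unify


# Hypothetical example: a spoken command underspecifies a location,
# and a pen gesture contributes the missing coordinates.
speech = {"cat": "locate", "object": {"type": "point"}}
gesture = {"object": {"type": "point", "coords": (3, 4)}}

merged = unify(speech, gesture)
# merged combines both contributions:
# {"cat": "locate", "object": {"type": "point", "coords": (3, 4)}}

# Incompatible structures fail to unify:
clash = unify({"object": {"type": "point"}}, {"object": {"type": "line"}})
# clash is None
```

The key property this illustrates is that integration is symmetric and declarative: neither mode "wins"; both contribute partial information, and incompatible combinations are ruled out by unification failure rather than by mode-specific code.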
