Scientific report: "Hierarchical Reinforcement Learning and Hidden Markov Models for Task-Oriented Natural Language Generation"

Hierarchical Reinforcement Learning and Hidden Markov Models for Task-Oriented Natural Language Generation

Nina Dethlefs (Department of Linguistics, University of Bremen) and Heriberto Cuayahuitl (German Research Centre for Artificial Intelligence, DFKI Saarbrücken), dethlefs@

Abstract

Surface realisation decisions in language generation can be sensitive to a language model, but also to decisions of content selection. We therefore propose the joint optimisation of content selection and surface realisation using Hierarchical Reinforcement Learning (HRL). To this end, we suggest a novel reward function that is induced from human data and is especially suited for surface realisation. It is based on a generation space in the form of a Hidden Markov Model (HMM). Results in terms of task success and human-likeness suggest that our unified approach performs better than greedy or random baselines.

1 Introduction

Surface realisation decisions in a Natural Language Generation (NLG) system are often made according to a language model of the domain (Langkilde and Knight, 1998; Bangalore and Rambow, 2000; Oh and Rudnicky, 2000; White, 2004; Belz, 2008). However, there are other linguistic phenomena, such as alignment (Pickering and Garrod, 2004), consistency (Halliday and Hasan, 1976), and variation, which influence people's assessment of discourse (Levelt and Kelter, 1982) and generated output (Belz and Reiter, 2006; Foster and Oberlander, 2006).
Also, in dialogue, the most likely surface form may not always be appropriate: it may not correspond to the user's information need, the user may be confused, or the most likely sequence may be infelicitous with respect to the dialogue history. In such cases it is important to optimise surface realisation in a unified fashion with content selection. We suggest using Hierarchical Reinforcement Learning (HRL) to achieve this. Reinforcement Learning (RL) is an attractive framework for optimising a sequence of decisions given incomplete …
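The reward function proposed above scores generated output by its likelihood under an HMM induced from human data. A minimal sketch of the general idea (not the paper's actual model: the hidden states, vocabulary, and all probabilities below are invented for illustration) is to score a candidate word sequence with the forward algorithm and use its length-normalised log-likelihood as an RL reward:

```python
# Illustrative sketch only: a toy HMM "generation space" whose states,
# vocabulary, and probabilities are made up, not taken from the paper.
import math

states = ["route", "landmark"]                 # hypothetical hidden states
start_p = {"route": 0.6, "landmark": 0.4}
trans_p = {"route": {"route": 0.7, "landmark": 0.3},
           "landmark": {"route": 0.4, "landmark": 0.6}}
emit_p = {"route": {"turn": 0.5, "left": 0.4, "church": 0.1},
          "landmark": {"turn": 0.1, "left": 0.2, "church": 0.7}}

def hmm_log_likelihood(words):
    """Forward algorithm: log P(words) under the toy HMM above."""
    # Initialise with start probabilities; unseen words get a small floor.
    alpha = {s: start_p[s] * emit_p[s].get(words[0], 1e-6) for s in states}
    for w in words[1:]:
        alpha = {s: sum(alpha[p] * trans_p[p][s] for p in states)
                    * emit_p[s].get(w, 1e-6)
                 for s in states}
    return math.log(sum(alpha.values()))

def reward(words):
    """Length-normalised log-likelihood as a human-likeness reward."""
    return hmm_log_likelihood(words) / len(words)

print(reward(["turn", "left"]))    # higher: sequence fits the model well
print(reward(["church", "turn"]))  # lower: less likely under the model
```

Normalising by length keeps the reward comparable across realisations of different lengths, so the learner is not trivially biased toward shorter utterances; in a full HRL setup this score would be combined with a task-success component.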
