68 Robot Learning

context of RL is provided by Dearden et al. (1998; 1999), who applied Q-learning in a Bayesian framework with an application to the exploration-exploitation trade-off. Poupart et al. (2006) present an approach for efficient online learning and exploration in a Bayesian context; they ascribe Bayesian RL to POMDPs. Apart from that, statistical uncertainty consideration is similar to, but strictly demarcated from, other issues that deal with uncertainty and risk. Consider the work of Heger (1994) and of Geibel (2001), which deals with risk in the context of undesirable states. Mihatsch & Neuneier (2002) developed a method to incorporate the inherent stochasticity of the MDP. Most closely related to our approach is the recent independent work by Delage & Mannor (2007), who solved the percentile optimisation problem by convex optimisation and applied it to the exploration-exploitation trade-off. They suppose special priors on the MDP's parameters, whereas the present work has no such requirements and can be applied in the more general context of RL methods.

2. Bellman iteration and uncertainty propagation

Our concept of incorporating uncertainty into RL consists in applying UP to the Bellman iteration (Schneegass et al., 2008),

$$Q^m(s_i, a_j) := (T Q^{m-1})(s_i, a_j) \qquad (5)$$
$$= \sum_{k=1}^{|S|} P(s_k \mid s_i, a_j) \left( R(s_i, a_j, s_k) + \gamma V^{m-1}(s_k) \right), \qquad (6)$$

here for discrete MDPs. For policy evaluation we have $V^m(s) := Q^m(s, \pi(s))$, with $\pi$ the used policy, and for policy iteration $V^m(s) := \max_{a \in A} Q^m(s, a)$ (section 1.1). Thereby we assume a finite number of states $s_i$, $i \in \{1, \ldots, |S|\}$, and actions $a_j$, $j \in \{1, \ldots, |A|\}$. The Bellman iteration converges, with $m \to \infty$, to the optimal Q-function, which is appropriate to the estimators $P$ and $R$. In the general stochastic case, which will be important later, we set $V^m(s) := \sum_{a \in A} \pi(s, a) Q^m(s, a)$, with $\pi(s, a)$ the probability of choosing $a$ in $s$.
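As an illustration of the iteration in equations (5) and (6), the following sketch implements the policy-iteration case ($V^{m-1}(s) = \max_a Q^{m-1}(s,a)$) for a small discrete MDP. The array shapes, the toy transition model, and the function name are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

def bellman_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Iterate Q^m(s_i,a_j) = sum_k P(s_k|s_i,a_j) (R(s_i,a_j,s_k) + gamma V^{m-1}(s_k)),
    with V^{m-1}(s) = max_a Q^{m-1}(s,a) (the policy-iteration case).

    P[i, j, k] = P(s_k | s_i, a_j), R[i, j, k] = R(s_i, a_j, s_k)."""
    n_s, n_a, _ = P.shape
    Q = np.zeros((n_s, n_a))
    for _ in range(max_iter):
        V = Q.max(axis=1)  # V^{m-1}(s_k)
        # Expected reward-plus-discounted-value under P, per (s_i, a_j):
        Q_new = np.einsum('ijk,ijk->ij', P, R + gamma * V[None, None, :])
        if np.max(np.abs(Q_new - Q)) < tol:
            return Q_new
        Q = Q_new
    return Q

# Toy 2-state, 2-action MDP (made up for illustration); reward depends
# only on the successor state here.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.0, 1.0], [0.5, 0.5]]])
R = np.ones((2, 2, 1)) * np.array([0.0, 1.0])
Q_star = bellman_iteration(P, R, gamma=0.9)
```

At convergence, `Q_star` is a fixed point of the operator $T$, i.e. applying one more backup leaves it (numerically) unchanged.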
To obtain the uncertainty of the approached Q-function, the technique of UP is applied in parallel to the Bellman iteration. With given covariance matrices $\mathrm{Cov}(P)$, $\mathrm{Cov}(R)$, and $\mathrm{Cov}(P, R)$ for the transition
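The core of first-order uncertainty propagation is the Gaussian error-propagation rule: for $y = f(x)$, $\mathrm{Cov}(y) \approx J\,\mathrm{Cov}(x)\,J^\top$ with $J$ the Jacobian of $f$. The sketch below applies this rule to a single Bellman backup $Q(s,a) = \sum_k P_k (R_k + \gamma V_k)$ for one fixed state-action pair, treating the stacked vector $(P, R)$ as the uncertain input; the variable names, shapes, and toy numbers are assumptions for illustration, not the chapter's implementation.

```python
import numpy as np

def backup(P, R, V, gamma=0.9):
    """One Bellman backup for a fixed (s, a): sum_k P_k (R_k + gamma V_k)."""
    return np.dot(P, R + gamma * V)

def propagate_uncertainty(P, R, V, cov_PR, gamma=0.9):
    """First-order UP: variance of the backup given the joint covariance
    cov_PR of the stacked input vector (P_1..P_k, R_1..R_k)."""
    # Jacobian of the backup w.r.t. (P, R):
    #   d/dP_k = R_k + gamma V_k,   d/dR_k = P_k
    J = np.concatenate([R + gamma * V, P])[None, :]  # shape (1, 2k)
    return J @ cov_PR @ J.T                          # 1x1 variance of Q(s,a)

P = np.array([0.7, 0.3])                 # estimated transition probabilities
R = np.array([1.0, 0.0])                 # estimated rewards
V = np.array([2.0, 1.0])                 # current value estimates (held fixed)
cov_PR = np.diag([0.01, 0.01, 0.04, 0.04])  # independent input uncertainties
var_Q = propagate_uncertainty(P, R, V, cov_PR)
```

Running UP in parallel with the Bellman iteration amounts to applying this Jacobian step at every iteration, so the input covariances of $P$ and $R$ are carried through to a covariance of the Q-function itself.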