Project Details
Robot learning to perceive, plan, and act under uncertainty
Applicant
Professor Jan Reinhard Peters, Ph.D., since 11/2019
Subject Area
Automation, Mechatronics, Control Systems, Intelligent Technical Systems, Robotics
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term
from 2018 to 2022
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 398611747
Future robots will need to plan their actions so that they can learn about the environment in order to accomplish their tasks. This kind of planning is especially important in the unstructured, partially observable real-world environments found in household robotics, adaptive manufacturing, elderly care, the handling of dangerous materials, or even disaster scenarios such as Fukushima. In such applications, the robot has to rely on multiple modalities such as camera images, laser range finders, or even tactile and acoustic feedback, and even with perfect visual sensors the robot cannot see through occlusions. Robots that can operate in such environments and interactively perceive the world need data-driven reinforcement learning methods that incorporate uncertainty into the decision-making process in order to proactively take information-gathering actions. To do so, the robot also needs to develop a basic understanding of the physical world and incorporate this information into its reasoning process. This project aims to make model-free reinforcement learning feasible in partially observable robotic tasks through the following innovations: (i) We will investigate new probabilistic structured memory representations which allow us to efficiently reuse experience with different kinds of policies. (ii) Policy learning under partial observability requires information-gathering actions, which in turn require propagation of values over long horizons and exploration that can uncover those values. To enable long-term action selection, we will draw on ideas from model-based methods for efficient exploration and value propagation. (iii) In partially observable settings the credit assignment problem is amplified. We will follow a guided reinforcement learning approach: we use additional side information during offline policy learning, but only local sensory information during online operation. We will evaluate these methodological advances by endowing a robot with the ability to play Mikado (pick-up sticks).
Playing Mikado is a challenging robotic manipulation problem that exhibits all the difficulties connected to partial observability described above. The robot has to deal with occlusions and partial information. It has to proactively probe physical properties, determine which contacts are active for particular sticks, and integrate this knowledge into its fine manipulation skills to remove sticks from the heap.
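The partially observable setting described above rests on a standard primitive: maintaining a belief (a probability distribution over hidden states) and updating it after each action and observation. The following is a minimal illustrative sketch of a discrete Bayes filter belief update, not the project's actual method; the transition model, observation model, and all numbers are toy assumptions.

```python
import numpy as np

def belief_update(belief, T, Z, action, obs):
    """One Bayes filter step for a discrete POMDP (illustrative sketch).

    belief : (S,) prior probability over hidden states
    T : (A, S, S) transition model, T[a, s, s'] = P(s' | s, a)
    Z : (A, S, O) observation model, Z[a, s', o] = P(o | s', a)
    """
    predicted = belief @ T[action]            # predict: marginalize over prior states
    updated = predicted * Z[action][:, obs]   # correct: weight by observation likelihood
    return updated / updated.sum()            # normalize to a valid distribution

# Toy two-state, one-action, two-observation example (assumed numbers):
# e.g. hidden state = "stick is loose" vs. "stick is load-bearing".
T = np.array([[[0.9, 0.1],
               [0.2, 0.8]]])
Z = np.array([[[0.8, 0.2],
               [0.3, 0.7]]])
b = np.array([0.5, 0.5])       # maximally uncertain prior
b_new = belief_update(b, T, Z, action=0, obs=0)
print(b_new)                   # observation 0 shifts mass toward state 0
```

An information-gathering action in this framing is one whose expected observation sharpens the belief, which is exactly why value propagation over such belief updates is needed for long-horizon action selection.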
DFG Programme
Research Grants
Former Applicant
Dr. Joni Pajarinen, until 11/2019