Meta-Reinforcement Learning Via Language Instructions

Zhenshan Bing, Alexander Koch, Xiangtong Yao, Kai Huang, Alois Knoll

Abstract

Although deep reinforcement learning has recently been very successful at learning complex behaviors, it requires a tremendous amount of data to learn a task. One fundamental reason for this limitation lies in the trial-and-error learning paradigm of reinforcement learning, in which the agent interacts with the environment and progresses in learning by relying only on the reward signal. This signal is implicit and rather insufficient for learning a task well. In contrast, humans are usually taught new skills via natural language instructions. Using language instructions in robotic motion control to improve adaptability is a recently emerged and challenging topic. In this paper, we present a meta-RL algorithm that addresses the challenge of learning skills with language instructions in multiple manipulation tasks. On the one hand, our algorithm uses the language instructions to shape its interpretation of the task; on the other hand, it still learns to solve the task through a trial-and-error process. We evaluate our algorithm on the robotic manipulation benchmark Meta-World, where it significantly outperforms state-of-the-art methods in terms of training and testing task success rates. Code is available at https://tumi6robot.wixsite.com/million.

Index terms

Reinforcement Learning, Deep Learning in Grasping and Manipulation