Co-Learning Planning and Control Policies Constrained by Differentiable Logic Specifications

Zikang Xiong, Daniel Lawson, Joe Kurian Eappen, Ahmed H. Qureshi, Suresh Jagannathan

Abstract

Synthesizing planning and control policies in robotics is a fundamental task, further complicated by factors such as complex logic specifications and high-dimensional robot dynamics. This paper presents a novel reinforcement learning approach to solving high-dimensional robot navigation tasks with complex logic specifications by co-learning planning and control policies. Notably, this approach significantly reduces the sample complexity of training, allowing us to train high-quality policies with far fewer samples than existing reinforcement learning algorithms. In addition, our methodology streamlines the extraction of complex specifications from map images and enables the efficient generation of long-horizon robot motion paths across different map layouts. Our approach also handles high-dimensional control and avoids suboptimal policies via policy alignment. The efficacy of our approach is demonstrated through experiments involving simulated high-dimensional quadruped robot dynamics and a real-world differential drive robot (TurtleBot3) under different types of task specifications.

Index terms

Reinforcement Learning; Integrated Planning and Control; Deep Learning Methods