Domain Adaptation in Visual Reinforcement Learning Via Self-Expert Imitation with Purifying Latent Feature

Lin Chen, Jianan Huang, Zhen Zhou, Yaonan Wang, Yang Mo, Zhiqiang Miao, Kai Zeng, Mingtao Feng, Danwei Wang

Abstract

Generalizing visual reinforcement learning is fundamental to robot visual navigation, involving the acquisition of a policy from interactions with source environments to facilitate adaptation to analogous, yet unfamiliar target environments. Recent advancements capitalize on data augmentation techniques, self-supervised learning methods, and the generative adversarial network framework to train policy neural networks with enhanced generalizability. However, current methods, upon extracting domain-general latent features, further utilize these features to train the reinforcement learning policy, resulting in a decline in the performance of the learned policy guiding the agent to accomplish tasks. To tackle these challenges, a framework of self-expert imitation with purifying latent features was devised, empowering the policy to achieve robust and stable zero-shot generalization performance in visually similar domains previously unseen, without diminishing the performance of guiding the agent to accomplish tasks. The extraction method of domain-general latent features is proposed to enhance their quality based on the variational autoencoder. Extensive experiments have shown that our policy, compared with state-of-the-art counterparts, does not diminish the performance of the policy guiding the agent to accomplish tasks after generalization.

Index terms

Autonomous Vehicle Navigation, Intelligent Transportation Systems