V-STRONG: Visual Self-Supervised Traversability Learning for Off-Road Navigation

Sanghun Jung, JoonHo Lee, Xiangyun Meng, Byron Boots, Alexander Lambert

Abstract

Reliable estimation of terrain traversability is critical for the successful deployment of autonomous systems in wild, outdoor environments. Given the lack of large-scale annotated datasets for off-road navigation, strictly supervised learning approaches remain limited in their generalization ability. To this end, we introduce a novel, image-based self-supervised learning method for traversability prediction, leveraging a state-of-the-art vision foundation model for improved out-of-distribution performance. Our method employs contrastive representation learning using both human driving data and instance-based segmentation masks during training. We show that this simple yet effective technique drastically outperforms recent methods in predicting traversability for both on- and off-trail driving scenarios. We compare our method with recent baselines on both a common benchmark and our own datasets, covering a diverse range of outdoor environments and varied terrain types. We also demonstrate the compatibility of the resulting costmap predictions with a model-predictive controller. Finally, we evaluate our approach on zero- and few-shot tasks, demonstrating unprecedented performance for generalization to new environments. Videos and additional material can be found here: https://sites.google.com/view/visual-traversability-learning.

Index terms

Deep Learning for Visual Perception; Learning from Experience; Field Robots