BEV-ODOM: Reducing Scale Drift in Monocular Visual Odometry with BEV Representation

Yufei Wei, Sha Lu, Fuzhang Han, Rong Xiong, Yue Wang

Abstract

Monocular visual odometry (MVO) is vital in autonomous navigation and robotics, providing a cost-effective and flexible motion tracking solution, but the inherent scale ambiguity in monocular setups often leads to cumulative errors over time. In this paper, we present BEV-ODOM, a novel MVO framework leveraging the Bird’s Eye View (BEV) Representation to address scale drift. Unlike existing approaches, BEV-ODOM integrates a depth-based perspective-view (PV) to BEV encoder, a correlation feature extraction neck, and a CNN-MLP-based decoder, enabling it to estimate motion across three degrees of freedom without the need for depth supervision or complex optimization techniques. Our framework reduces scale drift in long-term sequences and achieves accurate motion estimation across various datasets, including NCLT, Oxford, and KITTI. The results indicate that BEV-ODOM outperforms current MVO methods, demonstrating reduced scale drift and higher accuracy.
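
To make the notion of scale drift concrete, the following is a minimal sketch of how per-frame three-degree-of-freedom motion estimates could be chained into a trajectory, and how a slowly growing scale error in the translations prevents a closed loop from closing. It is not the authors' code. It assumes the three degrees of freedom are planar (x, y, yaw), which the abstract does not state explicitly, and the helper names (compose_se2, integrate, apply_scale_drift) and the multiplicative drift model are hypothetical, chosen only for illustration.

```python
import math

def compose_se2(pose, delta):
    """Apply a body-frame relative motion (dx, dy, dyaw) to a global pose (x, y, yaw)."""
    x, y, yaw = pose
    dx, dy, dyaw = delta
    # Rotate the body-frame translation into the global frame, then add it.
    x_new = x + math.cos(yaw) * dx - math.sin(yaw) * dy
    y_new = y + math.sin(yaw) * dx + math.cos(yaw) * dy
    # Wrap the heading into [-pi, pi].
    yaw_sum = yaw + dyaw
    yaw_new = math.atan2(math.sin(yaw_sum), math.cos(yaw_sum))
    return (x_new, y_new, yaw_new)

def integrate(deltas, start=(0.0, 0.0, 0.0)):
    """Chain relative motions into a list of global poses (dead reckoning)."""
    poses = [start]
    for delta in deltas:
        poses.append(compose_se2(poses[-1], delta))
    return poses

def apply_scale_drift(deltas, rate):
    """Assumed drift model: step k has its translation scaled by (1 + rate) ** k."""
    return [
        (dx * (1.0 + rate) ** k, dy * (1.0 + rate) ** k, dyaw)
        for k, (dx, dy, dyaw) in enumerate(deltas)
    ]

# Ground truth: a 1 m square, four steps of "1 m forward, then turn left 90 degrees".
square = [(1.0, 0.0, math.pi / 2)] * 4

true_end = integrate(square)[-1]
drifted_end = integrate(apply_scale_drift(square, rate=0.05))[-1]

# Distance from the origin at the end of the loop (zero if the loop closes).
true_gap = math.hypot(true_end[0], true_end[1])
drifted_gap = math.hypot(drifted_end[0], drifted_end[1])
```

With exact estimates the square closes (true_gap is zero up to floating-point error), whereas a 5% per-step growth in translation scale leaves a gap of roughly 0.15 m after only four steps. Over long sequences this kind of error accumulates, which is the failure mode the abstract refers to as scale drift.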

Index terms

Visual Tracking, Localization, SLAM