Research Analyzer
← Back

RelationGrasp: Object-Oriented Prompt Learning for Simultaneously Grasp Detection and Manipulation Relationship in Open Vocabulary

Songting Liu, Tat Joo Teo, Zhiping Lin, Haiyue Zhu

PDF
Key figure (auto-extracted from paper)

Abstract

Autonomous robotic grasping under complex, clustered, and unstructured environments is a fundamental but challenging task. To achieve human-like rationality in dealing with the grasping task, the agent requires hybrid intelligence from multilateral aspects. This paper introduces RelationGrasp, a unified framework employing a transformer encoder-decoder structure to simultaneously achieve open-vocabulary object detection, manipulation relationship inference, and grasp pose detection. A unique object-oriented prompt learning mecha- nism is designed to seamlessly bridge the grasp pose and manipulation relationship branches, delivering high fidelity of object-grasp affiliation for object-aware grasping and grasp sequence planning. By formulating the relationship detection as an adjacency matrix regression task under multi-task learning, our framework significantly increases the relationship accuracy with reduced computational overhead. Moreover, to facilitate the robust and adaptive deployment of the proposed Relation- Grasp to novel environments, we propose a consistency-based self-supervised adaptation strategy to adapt the pre-trained network to new scenarios and improve grasp accuracy on unseen objects. Our proposed network achieved state-of-the-art performance on various public dataset such as VMRD, OCID, etc., in both grasp detection and manipulation relationship classification, and real-world robot experiments has also been conducted to show the practical usages.

Index terms

Deep Learning in Grasping and Manipulation Perception for Grasping and Manipulation Grasping