Autonomous object detection and manipulation using a mobile cobot
Publication date
2026-05-07
Document type
Conference paper
Author
Nordhoff, Tim Yago
Gaida, Daniel
Organisational unit
TH Köln
Publisher
Universitätsbibliothek der HSU/UniBw H
Book title
Machine learning for cyber physical systems : proceedings of the conference ML4CPS 2026
First page
50
Last page
59
Peer-reviewed
✅
Part of the university bibliography
No
Language
English
Keywords
Autonomous mobile robots
Mobile manipulation
Open-vocabulary object detection
Frontier-based exploration
Robotic grasping
Cyber-physical systems
Abstract
Autonomous mobile manipulators operating in unknown environments must tightly couple exploration, perception, and manipulation under strict computational and sensing constraints. This paper presents a fully onboard exploration-to-grasp system that enables a mobile cobot to autonomously search for, detect, and grasp a target object specified by a natural-language prompt without prior maps or object-specific training. The proposed system integrates frontier-based exploration with camera-aware coverage planning to reduce redundant motion and promote informative viewpoints. Open-vocabulary object detection is performed using a lightweight vision-language model optimized for real-time inference on embedded GPU hardware. Upon stable detection, a deterministic detection-to-grasp pipeline computes feasible standoff poses and executes a constrained grasp sequence tailored to the target object geometry. The approach is evaluated in two real-world indoor environments with multiple exploration scenarios. Experimental results demonstrate that frontier-based exploration significantly outperforms a straight-line baseline in terms of execution time, traveled path length, and grasp success, particularly in environments with occlusions and narrow passages. The findings highlight the practical feasibility of integrating open-vocabulary perception and autonomous exploration for reliable mobile manipulation on resource-constrained cyber-physical systems.
Description
This contribution is part of the conference proceedings, which are licensed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/).
Version
Published version
Access right on openHSU
Open access
