Our Research

We investigate robots that can understand their environment semantically and geometrically in order to perform manipulation and other safety-critical tasks in proximity to humans. This encompasses semantic understanding under open-set conditions, map representations of the environment, active perception and planning, as well as adaptation and continual self-supervised learning.

Method for setting more precisely a position and/or orientation of a device head

US Patent 12,226,867, 2025

CroCoDL: Cross-device Collaborative Dataset for Localization

CVPR 2025

3D-MOOD: Lifting 2D to 3D for Monocular Open-Set Object Detection

ICCV 2025

arxiv PDF website code

ActLoc: Learning to Localize on the Move via Active Viewpoint Selection

CoRL 2025

arxiv PDF website

FunGraph: Functionality Aware 3D Scene Graphs for Language-Prompted Scene Interaction

IROS 2025

arxiv PDF website

SpotLight: Robotic Scene Understanding through Interaction and Affordance Detection

HUMANOIDS 2025

arxiv PDF website

FrontierNet: Learning Visual Cues to Explore

IEEE Robotics and Automation Letters 2025

doi website arxiv PDF code

osmAG-LLM: Zero-Shot Open-Vocabulary Object Navigation via Semantic Maps and Large Language Models Reasoning

2025

arxiv PDF

ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding

CVPR 2025

arxiv PDF labelmaker.org code

DepthSplat: Connecting Gaussian Splatting and Depth

CVPR 2025

arxiv PDF website code openaccess.thecvf.com

Spot-On: A Mixed Reality Interface for Multi-Robot Cooperation

2025

arxiv PDF

Loop Closure from Two Views: Revisiting PGO for Scalable Trajectory Estimation through Monocular Priors

2025

arxiv PDF

Lost & Found: Tracking Changes from Egocentric Observations in 3D Dynamic Scene Graphs

IEEE Robotics and Automation Letters (RA-L) 2025

arxiv PDF website doi

NeuSurfEmb: A Complete Pipeline for Dense Correspondence-based 6D Object Pose Estimation without CAD Models

IROS 2024

Method for localizing a mobile construction robot on a construction site using semantic segmentation, construction robot system and computer program product

US Patent App. 18/284,646, 2024

HoloSpot: Intuitive Object Manipulation via Mixed Reality Drag-and-Drop

2024

arxiv PDF website

Learning Where to Look: Self-supervised Viewpoint Selection for Active Localization Using Geometrical Information

ECCV 2024

doi arxiv PDF code

“Where am I?” Scene Retrieval with Language

ECCV 2024

doi arxiv PDF website

OpenDAS: Open-Vocabulary Domain Adaptation for Segmentation

2024

arxiv PDF website

SNI-SLAM: Semantic Neural Implicit SLAM

CVPR 2024

arxiv PDF code openaccess.thecvf.com

A 3D Mixed Reality Interface for Human-Robot Teaming

ICRA 2024

arxiv PDF code

Active Visual Localization for Multi-Agent Collaboration: A Data-Driven Approach

ICRA 2024

arxiv PDF

OpenDAS: Open-Vocabulary Domain Adaptation for 2D and 3D Segmentation

2024

arxiv PDF

Spot-Compose: A Framework for Open-Vocabulary Object Retrieval and Drawer Manipulation in Point Clouds

ICRA 2024

arxiv PDF website code

LabelMaker: Automatic Semantic Label Generation from RGB-D Trajectories

3DV 2024

arxiv PDF labelmaker.org code

Unsupervised Continual Semantic Adaptation through Neural Rendering

CVPR 2023

doi code website

Robot Perception and Learning Lab
