We extend the lamar.ethz.ch benchmark to develop accurate methods that co-register drones, legged robots, wheeled robots, smartphones, and mixed-reality headsets based on visual SLAM.
Mixed-reality headsets and handheld devices offer the most intuitive interface to operate robots, e.g. giving them commands in 3D or checking what they plan to do next. Such human-robot teaming, however, requires that we can register the mixed-reality devices and the robots in the same 3D environment. Robots and humans have quite different viewpoints and motion patterns, which makes this registration difficult. At the same time, nobody wants to wave their camera around for five minutes before they can operate the robot, so the registration needs to be possible from only a few images.
As part of this project, you will operate wheeled, legged, or flying robots to collect data from buildings and labs at ETH. We then use the data to generate highly accurate ground-truth poses and measure how well existing SLAM algorithms register the agents with respect to each other.
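As a rough sketch, the core registration step can be seen as absolute pose estimation from a few 2D-3D matches against a pre-built map. The snippet below illustrates this with OpenCV's PnP+RANSAC on synthetic correspondences; all names and data are hypothetical placeholders, not LaMAR code, and feature matching is assumed to have happened already.

```python
# Minimal sketch: estimate the absolute pose of a device from a handful of images,
# given 2D-3D matches against a pre-built map (matching itself is assumed done).
import numpy as np
import cv2

def register_device(points_3d: np.ndarray, points_2d: np.ndarray, K: np.ndarray):
    """Estimate camera pose (map -> camera) from 2D-3D correspondences via PnP+RANSAC."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        points_3d.astype(np.float64),   # Nx3 points in the map frame
        points_2d.astype(np.float64),   # Nx2 detections in the query image
        K, None,                        # intrinsics, no distortion
        reprojectionError=4.0,
    )
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)          # rotation vector -> 3x3 rotation matrix
    return R, tvec, inliers

# Toy usage with random correspondences as a stand-in for real feature matches.
K = np.array([[600., 0., 320.], [0., 600., 240.], [0., 0., 1.]])
pts3d = np.random.uniform(-1, 1, (50, 3)) + np.array([0., 0., 5.])
proj = pts3d @ K.T
pts2d = proj[:, :2] / proj[:, 2:3]
print(register_device(pts3d, pts2d, K))
```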
Please send your CV and transcript to blumh@uni-bonn.de and zuria.bauer@inf.ethz.ch.
Scene graphs make it easy to model dynamic environments for robots: every object is a node in a graph and can be moved around, edges connect nodes based on geometric or semantic proximity, and connections to higher levels group nodes per room, floor, and building.
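A minimal sketch of such a structure is shown below, with hypothetical names; in the thesis, the nodes would be built automatically from LabelMaker instance segmentations rather than by hand.

```python
# Minimal scene-graph sketch: object nodes with same-level proximity edges and
# higher-level nodes (room, floor) grouping their children.
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    level: str                                      # "object", "room", "floor", or "building"
    centroid: tuple = (0.0, 0.0, 0.0)
    children: list = field(default_factory=list)    # nodes of the next lower level
    neighbors: list = field(default_factory=list)   # same-level edges (geometric/semantic proximity)

def connect(a: Node, b: Node):
    """Add an undirected proximity edge between two nodes on the same level."""
    a.neighbors.append(b)
    b.neighbors.append(a)

# Toy example: two objects in one room, grouped under a floor.
chair = Node("chair_0", "object", (1.0, 0.2, 0.0))
table = Node("table_0", "object", (1.3, 0.2, 0.4))
connect(chair, table)                               # the chair stands next to the table
room  = Node("room_0", "room", children=[chair, table])
floor = Node("floor_0", "floor", children=[room])
```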
In this thesis, we want to create a tool that easily and automatically builds such scene graphs from a single iPad scan of a room, so that a robot can be deployed to the environment. We base this on LabelMaker, extending it with instance segmentation and the separation of rooms and floors.
Our goal is a tool that is easy enough for anybody to use, yet scales to large datasets such as Taskonomy, so that we can create the first large, multi-room scene-graph dataset for the research community.
Experience with Python, ideally with Open3D; Docker is a plus.
Please send your CV and transcript to blumh@uni-bonn.de.
To mine large-scale training data, LabelMaker can be used to automatically annotate phone scans. So far, the tool can only output point clouds. Using the latest advancements in NeRFs and Gaussian Splatting, we will develop an efficient pipeline that lifts segmentation labels as accurately as possible into a renderable 3D scene representation.
With this, we build the largest existing indoor semantic segmentation dataset.
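A minimal sketch of the label-lifting idea in the point-cloud setting (what LabelMaker currently outputs) is shown below: project each 3D point into every posed frame and majority-vote the 2D semantic labels. A NeRF or Gaussian Splatting version would replace the point cloud with a renderable representation, but the voting idea is the same; all poses, intrinsics, and label maps below are synthetic placeholders.

```python
# Minimal sketch: lift per-frame 2D semantic labels onto a 3D point cloud
# by projecting points into each posed frame and majority-voting the labels.
import numpy as np

def lift_labels(points, frames, K, num_classes):
    """points: Nx3 world points; frames: list of (T_world_to_cam 4x4, HxW label map)."""
    votes = np.zeros((len(points), num_classes), dtype=np.int32)
    pts_h = np.hstack([points, np.ones((len(points), 1))])        # homogeneous Nx4
    for T, labels in frames:
        cam = (T @ pts_h.T).T[:, :3]                              # points in camera frame
        in_front = cam[:, 2] > 0.1
        uv = cam @ K.T
        uv = uv[:, :2] / uv[:, 2:3]                               # pixel coordinates
        u, v = uv[:, 0].round().astype(int), uv[:, 1].round().astype(int)
        h, w = labels.shape
        valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        idx = np.where(valid)[0]
        votes[idx, labels[v[idx], u[idx]]] += 1                   # one vote per observation
    return votes.argmax(axis=1)                                   # per-point majority label

# Toy usage: 100 random points, one identity-pose frame with a constant label map.
K = np.array([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])
points = np.random.uniform(-1, 1, (100, 3)) + np.array([0., 0., 3.])
frames = [(np.eye(4), np.full((480, 640), 2, dtype=np.int64))]
print(lift_labels(points, frames, K, num_classes=5))
```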
Experience with PyTorch.
Please send your CV and transcript to blumh@uni-bonn.de.