Learning Spatiotemporal Occupancy Grid Maps for Lifelong Navigation in Dynamic Scenes
In collaboration with University of Toronto
Authors: Hugues Thomas, Matthieu Gallet de Saint Aurin, Jian Zhang, Timothy D. Barfoot
We present a novel method for generating, predicting, and using Spatiotemporal Occupancy Grid Maps (SOGMs), which embed future information of dynamic scenes. Our automated generation process creates ground-truth SOGMs from previous navigation data. We build on prior work to annotate lidar points based on their dynamic properties; the annotated points are then projected onto time-stamped 2D grids, the SOGMs. We design a 3D-2D feedforward architecture, trained to predict the future time steps of SOGMs given 3D lidar frames as input. Our pipeline is entirely self-supervised, thus enabling lifelong learning for robots. The network is composed of a 3D back-end that extracts rich features and enables the semantic segmentation of the lidar frames, and a 2D front-end that predicts the future information embedded in the SOGMs within planning. We also design a navigation pipeline that uses these predicted SOGMs. We provide both quantitative and qualitative insights into the predictions and validate our choices of network design with a comparison to the state of the art and ablation studies.
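To make the projection step concrete, the following is a minimal sketch of how annotated points could be binned into a stack of time-stamped 2D grids. It is not the authors' implementation: the function name `build_sogm`, its parameters, and the assumption that points are already expressed in a 2D map frame with a per-point dynamic label are all hypothetical and chosen only for illustration.

```python
import numpy as np

def build_sogm(points_xy, timestamps, dynamic_mask,
               t0, dt, n_steps, grid_size, cell, origin=(0.0, 0.0)):
    """Project dynamic points onto a stack of time-stamped occupancy grids.

    Hypothetical helper: returns a boolean array of shape
    (n_steps, grid_size, grid_size), indexed as [time layer, row (y), col (x)].
    """
    pts = np.asarray(points_xy, dtype=float)
    ts = np.asarray(timestamps, dtype=float)
    dyn = np.asarray(dynamic_mask, dtype=bool)

    sogm = np.zeros((n_steps, grid_size, grid_size), dtype=bool)

    # Time layer of each point: layer k covers [t0 + k*dt, t0 + (k+1)*dt).
    layer = np.floor((ts - t0) / dt).astype(int)
    # Grid cell of each point (column from x, row from y).
    ij = np.floor((pts - np.asarray(origin, dtype=float)) / cell).astype(int)

    # Keep only dynamic points that fall inside the time window and the grid.
    in_time = (layer >= 0) & (layer < n_steps)
    in_grid = np.all((ij >= 0) & (ij < grid_size), axis=1)
    keep = dyn & in_time & in_grid

    sogm[layer[keep], ij[keep, 1], ij[keep, 0]] = True
    return sogm

# Toy usage: three points, two of them labeled dynamic.
demo = build_sogm(
    points_xy=[(0.5, 0.5), (1.5, 0.5), (2.5, 2.5)],
    timestamps=[0.0, 0.6, 0.1],
    dynamic_mask=[True, True, False],
    t0=0.0, dt=0.5, n_steps=2, grid_size=3, cell=1.0,
)
```

In this toy setup each time layer marks only the cells occupied by dynamic points during that interval; a ground-truth SOGM as described in the abstract would be built from real annotated lidar data rather than hand-written coordinates.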
How PARTs Assemble into Wholes: Learning the Relative Composition of Images
February 6, 2026. Research areas: Computer Vision, Methods and Algorithms. Conference: Northern Lights Deep Learning Conference (NLDL)
The composition of objects and their parts, along with object-object positional relationships, provides a rich source of information for representation learning. Hence, spatial-aware pretext tasks have been actively explored in self-supervised learning. Existing works commonly start from a grid structure, where the goal of the pretext task involves predicting the absolute position index of patches within a fixed grid. However, grid-based…
Self-Supervised Learning of Lidar Segmentation for Autonomous Indoor Navigation
September 23, 2021. Research area: Computer Vision. Conference: ICRA
We present a self-supervised learning approach for the semantic segmentation of lidar frames. Our method is used to train a deep point cloud segmentation architecture without any human annotation. The annotation process is automated with the combination of simultaneous localization and mapping (SLAM) and ray-tracing algorithms. By performing multiple navigation sessions in the same environment, we are able to identify permanent structures, such…