Description

This project focuses on generating monocular pseudo-LiDAR point clouds for the KITTI dataset using multiple depth estimation backbones including NeWCRFs and Depth Anything v2. The pipeline converts 2D images into 3D point clouds by estimating depth and backprojecting pixels to 3D space using camera calibration data.
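The backprojection step described above can be sketched with the standard pinhole camera model. This is an illustrative implementation, not the project's exact code; the function name and the plain `fx, fy, cx, cy` intrinsics interface are assumptions (in practice KITTI provides these inside the `P2` projection matrix of the calibration files).

```python
import numpy as np

def backproject_to_pseudo_lidar(depth, fx, fy, cx, cy):
    """Backproject a dense depth map (H, W) into an (N, 3) point cloud
    using the pinhole camera model. Intrinsics fx, fy, cx, cy would be
    read from the KITTI calibration files (illustrative interface)."""
    h, w = depth.shape
    # pixel coordinate grids: u runs over columns, v over rows
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    # drop pixels with non-positive (invalid) depth
    return points[points[:, 2] > 0]
```

Each valid pixel thus becomes one 3D point, yielding a pseudo-LiDAR cloud in the camera frame.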

The generated pseudo-LiDAR point clouds are evaluated with PointRCNN (via OpenPCDet) for 3D object detection in autonomous driving scenes. Experiments include grayscale intensity mapping, geometry-only representations, and mask-guided sampling strategies aimed at improving point cloud quality.
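One way to realize the mask-guided sampling idea is to reserve a larger share of the point budget for points falling inside 2D object masks, so foreground objects stay dense after downsampling. This is a hedged sketch: the function name, the `fg_ratio` parameter, and the boolean per-point mask flags are all assumptions, not the project's actual interface.

```python
import numpy as np

def mask_guided_sample(points, mask_flags, n_points, fg_ratio=0.7, seed=0):
    """Illustrative mask-guided sampling (assumed interface):
    points     -- (N, 3) pseudo-LiDAR points
    mask_flags -- (N,) bool, True where the source pixel lies in an object mask
    n_points   -- total point budget after sampling
    fg_ratio   -- assumed fraction of the budget reserved for masked points"""
    rng = np.random.default_rng(seed)
    fg_idx = np.flatnonzero(mask_flags)
    bg_idx = np.flatnonzero(~mask_flags)
    # fill the foreground quota first, then top up from the background
    n_fg = min(len(fg_idx), int(n_points * fg_ratio))
    n_bg = min(len(bg_idx), n_points - n_fg)
    chosen = np.concatenate([
        rng.choice(fg_idx, n_fg, replace=False),
        rng.choice(bg_idx, n_bg, replace=False),
    ])
    return points[chosen]
```

Compared with uniform random sampling, this keeps small or distant objects from being thinned out of the cloud before detection.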

Pipeline

Key features include statistical outlier removal, downsampling to a fixed budget of ~16k-40k points, and depth-accuracy analysis across distance ranges. The project demonstrates how monocular depth estimation can be leveraged to build cost-effective 3D perception systems for autonomous vehicles.
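The cleanup stage can be sketched as follows: statistical outlier removal flags points whose mean distance to their k nearest neighbors is far above the global average, and random downsampling then enforces the point budget. This is a minimal sketch, not the project's actual code; it assumes SciPy is available for the k-NN queries, and the function names and default parameters are illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree

def statistical_outlier_removal(points, k=20, std_ratio=2.0):
    """Remove points whose mean k-nearest-neighbor distance exceeds the
    global mean by more than std_ratio standard deviations (the same
    criterion Open3D's remove_statistical_outlier uses)."""
    tree = cKDTree(points)
    # query k+1 neighbors because each point's nearest neighbor is itself
    dists, _ = tree.query(points, k=k + 1)
    mean_d = dists[:, 1:].mean(axis=1)
    thresh = mean_d.mean() + std_ratio * mean_d.std()
    return points[mean_d < thresh]

def random_downsample(points, n_points=16384, seed=0):
    """Randomly downsample to a fixed budget (e.g. ~16k points)."""
    if len(points) <= n_points:
        return points
    idx = np.random.default_rng(seed).choice(len(points), n_points, replace=False)
    return points[idx]
```

Running outlier removal before downsampling avoids spending part of the point budget on depth-estimation artifacts such as floating points at object boundaries.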