publications

On Bundle Adjustment for Multiview Point Cloud Registration

Huaiyang Huang, Yuxiang Sun, Jin Wu, Jianhao Jiao, Xiangcheng Hu, Linwei Zheng, Lujia Wang, Ming Liu.

IEEE Robotics and Automation Letters (RA-L), 2021

Multiview registration estimates Rigid Body Transformations (RBTs) from multiple frames and reconstructs a scene from the corresponding scans. Despite the success of pairwise registration and pose synchronization, Bundle Adjustment (BA) has been shown to better maintain global consistency. In this work, we make multiview point-cloud registration more tractable by approaching range-based BA from a different perspective. Based on this analysis, we propose an objective function that takes both measurement noise and computational cost into account. For the feature parameter update, instead of calculating the global distribution parameters from the raw measurements, we aggregate the local distributions upon the pose update at each iteration. The computational cost of the feature update then depends only on the number of scans. Finally, we develop a multiview registration system using voxel-based quantization that can be applied in real-world scenarios. The experimental results demonstrate that our method outperforms the baselines in both accuracy and speed. Moreover, the results show that our average positioning errors reach the centimeter level.

@article{huang2021bundle, title={On Bundle Adjustment for Multiview Point Cloud Registration}, author={Huang, Huaiyang and Sun, Yuxiang and Wu, Jin and Jiao, Jianhao and Hu, Xiangcheng and Zheng, Linwei and Wang, Lujia and Liu, Ming}, journal={IEEE Robotics and Automation Letters}, volume={6}, number={4}, pages={8269--8276}, year={2021}, publisher={IEEE} }

Incorporating Learnt Local and Global Embeddings into Monocular Visual SLAM

Huaiyang Huang, Haoyang Ye, Yuxiang Sun, Lujia Wang, Ming Liu

Autonomous Robots (AURO), Springer, 2021

Traditional approaches to Visual Simultaneous Localization and Mapping (VSLAM) rely on low-level vision information for state estimation, such as handcrafted local features or the image gradient. While significant progress has been made along this track, the performance of state-of-the-art monocular VSLAM systems generally degrades under more challenging conditions, e.g., varying illumination. As a consequence, the robustness and accuracy of monocular VSLAM remain open concerns. This paper presents a monocular VSLAM system that fully exploits learnt features for better state estimation. The proposed system leverages both learnt local features and global embeddings in different modules of the system: direct camera pose estimation, inter-frame feature association, and loop closure detection. With a probabilistic explanation of keypoint prediction, we formulate the camera pose tracking in a direct manner and parameterize local features with uncertainty taken into account. To alleviate the quantization effect, we adapt the mapping module to generate better 3D landmarks and thereby improve the system's robustness. Detecting temporal loop closure via deep global embeddings further improves the robustness and accuracy of the proposed system. The proposed system is extensively evaluated on public datasets (Tsukuba, EuRoC, and KITTI) and compared against state-of-the-art methods. The competitive performance in camera pose estimation confirms the effectiveness of our method.

@article{huang2021incorporating, title={Incorporating learnt local and global embeddings into monocular visual SLAM}, author={Huang, Huaiyang and Ye, Haoyang and Sun, Yuxiang and Wang, Lujia and Liu, Ming}, journal={Autonomous Robots}, volume={45}, number={6}, pages={789--803}, year={2021}, publisher={Springer} }

3D Surfel Map-Aided Visual Relocalization with Learned Descriptors

Haoyang Ye, Huaiyang Huang, Marco Hutter, Timothy Sandy, Ming Liu.

IEEE International Conference on Robotics and Automation (ICRA), 2021

In this paper, we introduce a method for visual relocalization using the geometric information from a 3D surfel map. A visual database is first built by global indices from the 3D surfel map rendering, which provides associations between image points and 3D surfels. Surfel reprojection constraints are utilized to optimize the keyframe poses and map points in the visual database. A hierarchical camera relocalization algorithm then utilizes the visual database to estimate 6-DoF camera poses. Learned descriptors are further used to improve the performance in challenging cases. We present evaluations under real-world conditions and in simulation to show the effectiveness and efficiency of our method, and to show that the final camera poses are consistently well aligned with the 3D environment.

@inproceedings{ye20213d, title={3D Surfel Map-Aided Visual Relocalization with Learned Descriptors}, author={Ye, Haoyang and Huang, Huaiyang and Hutter, Marco and Sandy, Timothy and Liu, Ming}, booktitle={2021 IEEE International Conference on Robotics and Automation (ICRA)}, pages={5574--5581}, year={2021}, organization={IEEE} }

GMMLoc: Structure Consistent Visual Localization with Gaussian Mixture Models

Huaiyang Huang, Haoyang Ye, Yuxiang Sun, Ming Liu

IEEE Robotics and Automation Letters (RA-L), 2020

Incorporating prior structure information into visual state estimation can generally improve the localization performance. In this letter, we aim to address the paradox between accuracy and efficiency in coupling visual factors with structure constraints. To this end, we present a cross-modality method that tracks a camera in a prior map modelled by a Gaussian Mixture Model (GMM). With the pose initially estimated by the front-end, the local visual observations and map components are associated efficiently, and the visual structure from the triangulation is refined simultaneously. By introducing the hybrid structure factors into the joint optimization, the camera poses are bundle-adjusted with the local visual structure. By evaluating our complete system, namely GMMLoc, on a public dataset, we show that it can provide centimeter-level localization accuracy with only trivial computational overhead. In addition, comparative studies with state-of-the-art vision-dominant state estimators demonstrate the competitive performance of our method.

@article{huang2020gmmloc, title={GMMLoc: Structure Consistent Visual Localization with Gaussian Mixture Models}, author={Huang, Huaiyang and Ye, Haoyang and Sun, Yuxiang and Liu, Ming}, journal={IEEE Robotics and Automation Letters}, volume={5}, number={4}, pages={5043--5050}, year={2020}, publisher={IEEE} }

Monocular Direct Sparse Localization in a Prior 3D Surfel Map

Haoyang Ye, Huaiyang Huang and Ming Liu

IEEE International Conference on Robotics and Automation (ICRA), 2020

In this paper, we introduce an approach to tracking the pose of a monocular camera in a prior surfel map. By rendering vertex and normal maps from the prior surfel map, the global planar information for the sparse tracked points in the image frame is obtained. The tracked points with and without the global planar information introduce both global and local frame constraints into the system. Our approach formulates all constraints in the form of direct photometric errors within a local window of frames. The final optimization utilizes these constraints to provide accurate estimation of global 6-DoF camera poses with absolute scale. Extensive simulation and real-world experiments demonstrate that our monocular method can provide accurate camera localization results under various conditions.

@inproceedings{ye2020monocular, title={Monocular direct sparse localization in a prior 3d surfel map}, author={Ye, Haoyang and Huang, Huaiyang and Liu, Ming}, booktitle={2020 IEEE International Conference on Robotics and Automation (ICRA)}, pages={8892--8898}, year={2020}, organization={IEEE} }

Monocular Visual Odometry using Learned Repeatability and Description

Huaiyang Huang, Haoyang Ye, Yuxiang Sun, Ming Liu

IEEE International Conference on Robotics and Automation (ICRA), 2020

@inproceedings{huang2020monocular, title={Monocular visual odometry using learned repeatability and description}, author={Huang, Huaiyang and Ye, Haoyang and Sun, Yuxiang and Liu, Ming}, booktitle={2020 IEEE International Conference on Robotics and Automation (ICRA)}, pages={8913--8919}, year={2020}, organization={IEEE} }

Metric Monocular Localization Using Signed Distance Fields

Huaiyang Huang, Yuxiang Sun, Haoyang Ye, Ming Liu.

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2019

Metric localization plays a critical role in vision-based navigation. To overcome the degradation of photometric matching under appearance changes, recent research has resorted to introducing geometric constraints from the prior scene structure. In this paper, we present a metric localization method for a monocular camera, using the Signed Distance Field (SDF) as a global map representation. Leveraging the volumetric distance information from SDFs, we aim to relax the assumption, made in previous methods, of an accurate structure from the local Bundle Adjustment (BA). By tightly coupling the distance factor with temporal visual constraints, our system corrects the odometry drift and jointly optimizes global camera poses with the local structure. We validate the proposed approach on both indoor and outdoor public datasets. Compared to state-of-the-art methods, it achieves comparable performance with a minimal sensor configuration.

@inproceedings{huang2019metric, title={Metric monocular localization using signed distance fields}, author={Huang, Huaiyang and Sun, Yuxiang and Ye, Haoyang and Liu, Ming}, booktitle={2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}, pages={1195--1201}, year={2019}, organization={IEEE} }

Geometric Structure Aided Visual Inertial Localization

Huaiyang Huang, Haoyang Ye, Jianhao Jiao, Yuxiang Sun, Ming Liu.

arXiv, 2020

Visual localization is an essential component of autonomous navigation. Existing approaches are based on either the visual structure from SLAM/SfM or the geometric structure from dense mapping. To take advantage of both, in this work we present a complete visual inertial localization system based on a hybrid map representation to reduce the computational cost and increase the positioning accuracy. Specifically, we propose two modules for data association and batch optimization, respectively. We develop an efficient data association module to associate map components with local features, which takes only 2 ms to generate temporal landmarks. For batch optimization, instead of using visual factors, we develop a module to estimate a pose prior from the instant localization results to constrain poses. The experimental results on the EuRoC MAV dataset demonstrate competitive performance compared to the state of the art. Specifically, our system achieves an average position error of 1.7 cm with 100% recall. The timings show that the proposed modules reduce the computational cost by 20-30%.

@article{huang2020geometric, title={Geometric structure aided visual inertial localization}, author={Huang, Huaiyang and Ye, Haoyang and Jiao, Jianhao and Sun, Yuxiang and Liu, Ming}, journal={arXiv preprint arXiv:2011.04173}, year={2020} }