References
-
R. Mur-Artal, J. M. M. Montiel, and J. D. Tardos, “ORB-SLAM: A versatile and accurate monocular SLAM system,” IEEE Transactions on Robotics, vol. 31, no. 5, pp. 1147–1163, Oct. 2015.
In this paper, the authors propose a new algorithm for simultaneous localization and mapping called ORB-SLAM. It is a feature-based monocular SLAM system that works in small and large environments, both indoors and outdoors. The main idea of this paper is to use parallel tracking and mapping (PTAM) together with loop closing. PTAM is a visual SLAM algorithm that uses keyframes and the differences between them.
ORB-SLAM greatly reduced the computation time per image, but it seems that the tracking accuracy could be higher. Also, it builds only a sparse map, but it can be a good basis for later algorithms that build a dense map.
This paper is relevant to my current topic because it is one of the most famous SLAM systems to date and I want to improve SLAM in the self-driving car field.
-
D. DeTone, T. Malisiewicz, and A. Rabinovich, “SuperPoint: Self-supervised interest point detection and description,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Apr. 2018.
In this paper, the authors apply deep learning to the front end of the SLAM system. A shared encoder keeps the model fast and easy to train, and its output is processed by an interest point decoder for detection and a descriptor decoder for description.
The paper uses self-supervised learning, so the method works well across a wide range of viewpoints and lighting conditions compared with classic feature extraction and matching algorithms.
The attempt to combine deep learning with feature extraction can be a good framework for my further work on an end-to-end neural network model for SLAM.
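To make the shared-encoder layout concrete for my own notes, here is a minimal sketch in plain Python. It is not the SuperPoint network: the "encoder" is a hand-written per-cell statistic rather than a learned convolutional network, and the function names, the 2x2 cell size, and the contrast threshold are all assumptions chosen for illustration. The only point it shows is that the features are computed once and then consumed by two separate heads.

```python
import math


def shared_encoder(image, cell=2):
    """Toy stand-in for a learned encoder: one (mean, contrast) feature per cell."""
    height, width = len(image), len(image[0])
    features = []
    for r in range(0, height - cell + 1, cell):
        row = []
        for c in range(0, width - cell + 1, cell):
            patch = [image[r + i][c + j] for i in range(cell) for j in range(cell)]
            row.append((sum(patch) / len(patch), max(patch) - min(patch)))
        features.append(row)
    return features


def interest_point_decoder(features, threshold=0.5):
    """Detection head: keep the cells whose contrast exceeds a (hypothetical) threshold."""
    return [
        (r, c)
        for r, row in enumerate(features)
        for c, (_, contrast) in enumerate(row)
        if contrast > threshold
    ]


def descriptor_decoder(features):
    """Description head: one L2-normalised vector per cell."""
    descriptors = []
    for row in features:
        out_row = []
        for mean, contrast in row:
            norm = math.hypot(mean, contrast) or 1.0  # avoid dividing by zero
            out_row.append((mean / norm, contrast / norm))
        descriptors.append(out_row)
    return descriptors


def extract(image):
    """Run the encoder once, then feed the same features to both heads."""
    features = shared_encoder(image)
    keypoints = interest_point_decoder(features)
    descriptors = descriptor_decoder(features)
    return [(kp, descriptors[kp[0]][kp[1]]) for kp in keypoints]


# A 4x4 image with a single bright pixel in the top-left cell.
image = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
result = extract(image)
```

Because both heads read the same `features`, the expensive part of the computation is not repeated, which is the reason given in the paper summary for the model being fast.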
-
E. Parisotto, D. S. Chaplot, J. Zhang, and R. Salakhutdinov, “Global pose estimation with an attention-based recurrent network,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Feb. 2018.
In this paper, the authors apply deep learning to the back end of the SLAM system. The main concept of this paper is a neural global optimizer that consists of a local pose estimation model, a pose aggregation module, and neural graph optimization. It predicts the relative pose changes and then optimizes those predictions.
This paper was one of the early attempts to use deep learning in SLAM. Although it is not an end-to-end model, it is still worthwhile. The approach has enormous potential for development because several other convolutional network models could be applied in place of the FlowNet used in this paper.
The neural global optimizer is good related work for my further work on an end-to-end neural network model for SLAM.
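A small worked example helps show why predicted relative pose changes need a global optimization step. The sketch below does not reproduce any of the paper's networks. It only chains relative motions into a trajectory, assuming planar poses (x, y, heading) and a made-up constant heading bias of 0.05 rad per prediction, and measures how far the end of a closed square path lands from its start.

```python
import math


def compose(pose, delta):
    """Apply a relative motion (dx, dy, dtheta), given in the current frame, to a global pose."""
    x, y, theta = pose
    dx, dy, dtheta = delta
    return (
        x + dx * math.cos(theta) - dy * math.sin(theta),
        y + dx * math.sin(theta) + dy * math.cos(theta),
        theta + dtheta,
    )


def integrate(relative_poses, start=(0.0, 0.0, 0.0)):
    """Chain relative pose changes into a list of global poses."""
    trajectory = [start]
    for delta in relative_poses:
        trajectory.append(compose(trajectory[-1], delta))
    return trajectory


def closure_error(trajectory):
    """Distance between the first and last position of a path that should be closed."""
    (x0, y0, _), (x1, y1, _) = trajectory[0], trajectory[-1]
    return math.hypot(x1 - x0, y1 - y0)


# Four 1 m forward steps, each followed by a 90 degree left turn, trace a square.
ideal = integrate([(1.0, 0.0, math.pi / 2)] * 4)

# The same path with a small, hypothetical heading bias in every prediction.
drifted = integrate([(1.0, 0.0, math.pi / 2 + 0.05)] * 4)

ideal_error = closure_error(ideal)      # essentially zero
drift_error = closure_error(drifted)    # clearly non-zero after only four steps
```

Each local estimate is only slightly wrong, but the errors accumulate along the chain. A global optimizer, neural or classical, exists to redistribute that accumulated error over the whole trajectory.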
-
G. Klein and D. Murray, “Parallel tracking and mapping for small AR workspaces,” in 2007 6th IEEE and ACM International Symposium on Mixed and Augmented Reality, 2007.
Earlier visual SLAM processes every frame to pick feature points, whereas Parallel Tracking and Mapping (PTAM) picks them only in keyframes, so it is faster. It runs tracking and mapping separately in parallel threads. Tracking requires little computation, so it is applied to every video frame to keep the system real-time and accurate, while mapping is applied only to keyframes.
However, PTAM is suitable only for fixed, limited spaces such as an office desk. Also, it has no loop detection, which would make the result more accurate, and the user must perform the initialization that sets the scale before running the algorithm.
This work became the basis of ORB-SLAM.
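The split between a per-frame tracking thread and a keyframe-only mapping thread can be sketched with Python's standard `threading` and `queue` modules. This is only an illustration of the structure, not PTAM itself: the function name `run_ptam_like` is hypothetical, the "every fifth frame" keyframe rule is an assumption made for simplicity, and the actual tracking and mapping work is replaced by list appends.

```python
import queue
import threading

_STOP = object()  # sentinel that tells the mapping thread to finish


def run_ptam_like(frames, keyframe_every=5):
    """Track every frame in one thread and map only keyframes in another."""
    keyframe_queue = queue.Queue()
    tracked = []    # written only by the tracking thread
    keyframes = []  # written only by the mapping thread

    def tracking():
        for index, frame in enumerate(frames):
            tracked.append(frame)             # stand-in for the cheap per-frame pose update
            if index % keyframe_every == 0:   # hypothetical keyframe rule
                keyframe_queue.put(frame)
        keyframe_queue.put(_STOP)

    def mapping():
        while True:
            keyframe = keyframe_queue.get()
            if keyframe is _STOP:
                break
            keyframes.append(keyframe)        # stand-in for the expensive map update

    threads = [threading.Thread(target=tracking), threading.Thread(target=mapping)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return tracked, keyframes


tracked, keyframes = run_ptam_like(list(range(12)), keyframe_every=5)
```

The queue is the only point of contact between the two threads, so the tracking loop never waits for the slower mapping work. That is what lets tracking run on every frame.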
-
S. Milz, G. Arbeiter, C. Witt, B. Abdallah, and S. Yogamani, “Visual SLAM for automated driving: Exploring the applications of deep learning,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2018.
In this paper, the authors briefly summarize some important SLAM algorithms to date, introduce the fundamental pipeline of SLAM, and explain how and why deep learning can help improve it. There are also short descriptions of the problems to consider in the self-driving vehicle field.
Because this paper was written in early 2018, it does not cover the important research of late 2018 and 2019, but it is helpful for reading quickly about classical visual SLAM approaches such as PTAM, ORB-SLAM, and MonoSLAM.
Development of convolutional neural network based SLAM approaches has been slow. This paper helped me learn SLAM briefly and gave me some insights into adopting deep learning for it.
-
R. Kang, J. Shi, X. Li, Y. Liu, and X. Liu, “DF-SLAM: A deep-learning enhanced visual SLAM system based on deep local features,” arXiv preprint arXiv:1901.07223, 2019.
DF-SLAM uses deep local feature descriptors produced by a neural network, which can replace ORB, SIFT, or other classic feature descriptors. The system runs three threads in parallel: tracking, local mapping, and loop closing. The new local feature descriptors are extracted before the tracking thread handles a new frame.
DF-SLAM can be more stable than systems built on classic descriptors. However, it has the limitation that it is hard to handle difficult localization and mapping problems under extreme conditions. There are also some points to improve, including global bundle adjustment.
This paper can help me establish an end-to-end system for a powerful, deep-learning-enhanced SLAM algorithm.
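Replacing ORB with a learned descriptor mainly changes how two descriptors are compared. The sketch below uses hypothetical helper names and tiny made-up descriptors, and it is not code from DF-SLAM. It runs the same nearest-neighbour matcher with a Hamming distance, as is usual for binary descriptors such as ORB, and with an L2 distance, which I assume here for float-valued learned descriptors.

```python
import math


def hamming(a, b):
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def l2(a, b):
    """Euclidean distance between two float descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match(query, train, distance):
    """For each query descriptor, return the index of the nearest train descriptor."""
    return [
        min(range(len(train)), key=lambda j: distance(q, train[j]))
        for q in query
    ]


# Binary (ORB-like) descriptors, shortened to 4 bits for readability.
binary_matches = match([0b1010, 0b0011], [0b1111, 0b1000, 0b0001], hamming)

# Float (learned-style) descriptors, shortened to 2 dimensions.
float_matches = match([(0.9, 0.1), (0.0, 1.0)], [(0.1, 0.9), (1.0, 0.0)], l2)
```

Because the matcher only depends on the `distance` argument, the rest of a tracking pipeline can stay unchanged when the descriptor type is swapped.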