Camera pose tracking failure is a critical issue in visual SLAM systems.
Although various failure recovery methods have been proposed, they often struggle when the number of shared features before and after the failure is insufficient.
In this work, we propose an approach for robust failure recovery that leverages text detection to enhance the reliability of feature matching. Our recovery is built on Location-Relevant Text Detection (LRTD).
(a) Failure recovery is achieved by utilizing text detection.
(b) LRTD filters out irrelevant text, enhancing robustness and computational efficiency.
(c) A dataset generation pipeline is designed to automatically create training data for LRTD.
This is a demo of our main model, Location-Relevant Text Detection (LRTD).
LRTD is designed to take an image as input and output the bounding boxes of location-relevant text segments.
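LRTD's input/output contract can be sketched as follows. The record layout, field names, and the relevance threshold are illustrative assumptions for this sketch, not the actual implementation.

```python
from dataclasses import dataclass

@dataclass
class TextDetection:
    """One detected text segment (hypothetical output record)."""
    bbox: tuple          # (x_min, y_min, x_max, y_max) in pixels
    text: str            # recognized text content
    relevance: float     # location-relevance score in [0, 1] (assumed)

def filter_location_relevant(detections, threshold=0.5):
    """Keep only detections judged location-relevant (threshold is assumed)."""
    return [d for d in detections if d.relevance >= threshold]

# Example: a room-number plate is kept, a generic sale poster is filtered out.
dets = [
    TextDetection((10, 20, 120, 60), "Room 301", 0.92),
    TextDetection((200, 40, 380, 90), "SALE 50%", 0.11),
]
kept = filter_location_relevant(dets)
```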
These are our experimental results across different SLAM methods. We observed a substantial reduction in the number of tracking failures across all three types of SLAM systems.
A simple example of a trajectory comparison between our proposed method and ORB-SLAM.
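Trajectory comparisons like this are commonly summarized by the absolute trajectory error (ATE). The repository uses evo for visualization; the minimal sketch below, with toy positions of our own invention, shows how an ATE RMSE can be computed once the two trajectories are aligned and associated.

```python
import math

def ate_rmse(est, gt):
    """RMSE over per-pose position errors between two associated,
    already-aligned trajectories given as lists of (x, y, z)."""
    assert len(est) == len(gt)
    sq = [sum((e - g) ** 2 for e, g in zip(p, q)) for p, q in zip(est, gt)]
    return math.sqrt(sum(sq) / len(sq))

gt  = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
est = [(0.0, 0.1, 0.0), (1.0, -0.1, 0.0), (2.0, 0.1, 0.0)]
err = ate_rmse(est, gt)  # 0.1 for this toy example
```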
Sooyong Shin | Youngsun Jae | Chaehyeuk Lee
This project requires Python 3.10+. Install the dependencies with:
pip install -r requirements.txt
Due to size limits, sample data is hosted externally.
Make sure the 'data/' and 'results/' directories are created in this step.
mkdir data && cd data
gdown https://drive.google.com/uc?id=1tZsYiypBhw_9EdzqTGKThjxZBzSjsgU7
unzip example_sequence.zip
cd .. && mkdir results
In 'env.sh', set the path below to the absolute path of your code directory.
RUN_DIR="absolute/path/to/your/code"
The command below runs the full pipeline of our system.
This pipeline requires a CUDA-compatible GPU.
bash run_all_pipeline.sh
This script sequentially executes:
- src/runLRTD - Perform LRTD on all keyframes
- src/search4frames - Text-guided frame search & local map generation
- src/alignmaps - Align the two maps using the local map
- evo_traj - Visualize the trajectory comparison between our method and ORB-SLAM
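The text-guided frame search step can be illustrated with a simple sketch: rank candidate keyframes by the overlap between their detected text and the query frame's text. The Jaccard scoring and all names below are our assumptions for illustration, not the repository's actual matching logic.

```python
def text_similarity(texts_a, texts_b):
    """Jaccard overlap between two collections of detected text strings."""
    a, b = set(texts_a), set(texts_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def search_frames(query_texts, frame_texts, top_k=4):
    """Return the top-k frame ids whose detected text best matches the query."""
    scored = sorted(frame_texts.items(),
                    key=lambda kv: text_similarity(query_texts, kv[1]),
                    reverse=True)
    return [frame_id for frame_id, _ in scored[:top_k]]

# Toy example: frames that share the "Room 301" sign rank highest.
frames = {
    "kf_001": ["Room 301", "EXIT"],
    "kf_002": ["Cafeteria"],
    "kf_003": ["Room 301", "Fire Extinguisher"],
}
best = search_frames(["Room 301"], frames, top_k=2)
```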
All inputs should be stored in:
data/your_sequence_name
Each sequence directory should contain:
- images/ - RGB images of keyframes
- orb_result/KeyframeTrakectoryXX.txt - Trajectories of the built maps
- orb_result/timestamp.txt - Timestamps of relocalization & tracking failures
- Ground_Truth.txt - Ground-truth trajectory
- ORB-SLAM.txt - Aligned trajectory without LRTD
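A quick way to catch layout mistakes before running the pipeline is to check that the fixed-name inputs exist. This helper is not part of the repository; it is a sketch we provide, and it deliberately omits the numbered trajectory files.

```python
from pathlib import Path

# Fixed-name inputs from the layout above (numbered trajectory files excluded).
REQUIRED = [
    "images",
    "orb_result/timestamp.txt",
    "Ground_Truth.txt",
    "ORB-SLAM.txt",
]

def check_sequence(seq_dir):
    """Return the list of required inputs missing from a sequence directory."""
    root = Path(seq_dir)
    return [rel for rel in REQUIRED if not (root / rel).exists()]

missing = check_sequence("data/your_sequence_name")
if missing:
    print("Missing inputs:", ", ".join(missing))
```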
All outputs will be stored in:
results/your_sequence_name
This directory will contain:
- COLMAP/
- LRTD_images/
- log_4images.txt
- log_colmap.txt
- log_tracking_fail.txt
- LRTD_filtered_info.csv
- LRTD_info.csv
- ORB-SLAM_with_LRTD.txt - Aligned trajectory with LRTD
You can configure:
- Frame search hyperparameters (in 'src/search4frames/config.yaml')
- Data and result paths (in 'env.sh')