To achieve results more closely aligned with the paper #16
From my experiments, I (curiously) got better results training from scratch than loading pretrained_ViT, so my results are without the pretrained ViT. If you read the "args.pkl" files (https://github.com/aofrancani/TSformer-VO?tab=readme-ov-file#2-pre-trained-models), you will see the exact configuration I used to train the models. They are .pkl files, but they can easily be read with pandas:

```python
import pandas as pd
args = pd.read_pickle("args.pkl")
```

From your images, it looks like you are not applying the 7DoF alignment mentioned in the paper (other good works also do this, so I had to apply it to make a fair comparison with them). The unaligned results are really poor since we deal with the monocular case and have no scale information. For this alignment I used https://github.com/Huangying-Zhan/kitti-odom-eval; it already saves the plots and other analyses for you. I just modified their code to save the data I needed and made the plots my own way.

I also observed a high variance in the results depending on which epoch you pick for evaluation. I believe this is due to the poor results in the angle estimation (the angles have small values in the Euler representation). One thing that might help is to weight the loss for the angles, as in the DeepVO paper. It is already implemented when you set a value for "weighted_loss" (maybe 10, 20, 50, 100, idk), but in the end I had no resources to test and evaluate it :(
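The angle-weighted loss idea mentioned above can be sketched as follows. This is only an illustration in the spirit of the DeepVO loss, not the repository's actual implementation; the `[tx, ty, tz, rx, ry, rz]` pose ordering and the name `weighted_pose_loss` are assumptions.

```python
import numpy as np

def weighted_pose_loss(pred, target, angle_weight=100.0):
    """MSE pose loss with the Euler-angle terms up-weighted (DeepVO-style).

    Assumes each pose is ordered [tx, ty, tz, rx, ry, rz]; the actual
    ordering in the repository may differ.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    t_err = np.mean((pred[..., :3] - target[..., :3]) ** 2)  # translation MSE
    r_err = np.mean((pred[..., 3:] - target[..., 3:]) ** 2)  # rotation MSE
    return t_err + angle_weight * r_err
```

The idea is that Euler angles between consecutive frames are numerically tiny compared to the translations, so without the weight the optimizer under-penalizes rotation errors, which then accumulate along the trajectory.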
Thank you for your kind response. After learning that applying 7DoF alignment is crucial for reproducing the paper's results, I ran evaluations with the https://github.com/Huangying-Zhan/kitti-odom-eval repository, as you suggested. The results are as follows: the first graph shows the inference results from the pretrained model shared in this repository (TSformer-VO-3), and the second graph shows the results after applying 7DoF alignment. Given how dramatically the results change depending on whether 7DoF alignment is applied, I can't help but ask the following question: Q1) What is 7DoF alignment, and could you recommend a resource for studying it? I apologize for asking so many questions. I'm in a position where I need to use visual odometry even though I'm not very familiar with it. I appreciate your kind and helpful responses, as always.
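For context on the question above: 7DoF alignment usually means fitting a similarity transform (1 scale + 3 rotation + 3 translation parameters) that best maps the estimated trajectory onto the ground truth before computing errors, which is necessary for monocular methods since scale is unobservable. A minimal sketch following the closed-form solution of Umeyama (1991), assuming NumPy and `(N, 3)` arrays of corresponding trajectory positions:

```python
import numpy as np

def umeyama_alignment(X, Y):
    """Least-squares similarity transform (7 DoF: scale s, rotation R,
    translation t) aligning point set X onto Y, per Umeyama (1991).

    X, Y: (N, 3) arrays of corresponding positions.
    Returns (s, R, t) such that s * R @ x + t approximates y.
    """
    mu_x, mu_y = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mu_x, Y - mu_y
    cov = Yc.T @ Xc / X.shape[0]          # cross-covariance of the two sets
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1                      # avoid a reflection solution
    R = U @ S @ Vt
    var_x = (Xc ** 2).sum() / X.shape[0]  # variance of the source set
    s = np.trace(np.diag(D) @ S) / var_x
    t = mu_y - s * R @ mu_x
    return s, R, t
```

The kitti-odom-eval toolkit mentioned above applies this kind of alignment internally when you request 7DoF evaluation, so you normally do not need to implement it yourself.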
Hi. A few days ago, I encountered an error while trying to run the pretrained_ViT model; I managed to resolve it through another issue. The reason I tried the pretrained_ViT model in the first place was that the results of the non-pretrained model were inconsistent with the paper's results reported in this repository. So, after resolving the pretrained issue, I trained the model with pretrained_ViT set to True and obtained results for sequences 01, 03, 04, 05, 06, 07, and 10 as follows:
Here are the settings in train.py:
The results are similar to those of the pretrained models provided on GitHub, namely Model1, Model2, and Model3.
It seems like I might have made a mistake somewhere. Could you kindly advise on what I should correct?