
Performance of different front-ends #754

Open
sarlinpe opened this issue Dec 4, 2023 · 1 comment


sarlinpe commented Dec 4, 2023

Hi folks,

I've read your GTSfM paper: nice work, and thanks for pushing it to arXiv. I enjoyed reading it and appreciate the huge effort that went into building it. I am very surprised by the conclusion that SuperPoint+SuperGlue/LightGlue is not as good as SIFT; we have always observed the exact opposite with incremental SfM (COLMAP) on various easy and difficult datasets (ETH3D, IMC 2020/21/22/23). I went through the code but didn't find anything obvious.

  1. The point clouds of SP+SG/LG look pretty sparse on several datasets, as do the matches in Fig. 3.

“the shorter image side is resized to at most 760 pixels in length”

So that'd give a 1351×760 px image for a 1920×1080 input - this seems fine.
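
For concreteness, here is how I read that resizing policy (a minimal sketch of my understanding, not GTSfM's actual code; the function name is mine):

# Scale so the shorter image side is at most 760 px, preserving aspect ratio.
def resized_shape(width: int, height: int, max_short_side: int = 760) -> tuple[int, int]:
    scale = min(1.0, max_short_side / min(width, height))
    return round(width * scale), round(height * scale)

print(resized_shape(1920, 1080))  # -> (1351, 760)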

“A maximum of 5000 keypoints are used for each of the following front-ends”

Do you know how many keypoints are actually extracted by SuperPoint per image? How often is the 5k limit hit, compared to SIFT?
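
(A quick way to measure this; detect() and images are hypothetical placeholders, not GTSfM API:

hits = sum(len(detect(image)) >= 5000 for image in images)
print(f"{hits}/{len(images)} images hit the 5000-keypoint cap")

)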

self._config = {"weights_path": weights_path}  # only the weights path; all other SuperPoint settings stay at their defaults
self._model = SuperPoint(self._config).eval()

Do I understand correctly that you use the default settings? Did you try to tweak them? As is, it cannot return 5k keypoints on these kinds of images, unlike SIFT. I recommend the following:

  • decrease the detection threshold: keypoint_threshold=0.001
  • decrease the NMS radius: nms_radius=3
  • if images are smaller than the limit (760px), upsample them

This should make SuperPoint competitive with SIFT in terms of keypoint detection.
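
Concretely, something along these lines (a sketch assuming the config keys of the magicleap SuperPoint implementation, with weights_path kept from your wrapper; the exact values are starting points, not tuned):

self._config = {
    "weights_path": weights_path,
    "keypoint_threshold": 0.001,  # default 0.005; a lower threshold keeps more detections
    "nms_radius": 3,              # default 4; a smaller radius allows denser keypoints
    "max_keypoints": 5000,        # match the 5k budget given to the other front-ends
}
self._model = SuperPoint(self._config).eval()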

  2. We do know that these deep matchers are more easily tricked by symmetries, as you point out in Fig. 3. This seems confirmed by Table 3: compared to SIFT, the mean of the front-end errors is much higher than their median, and they have many more view-graph (VG) outliers, especially on South Building and Crane.
  • Did you try tuning the filtering threshold (minimum number of inliers, cycle consistency) for each front-end? 15 and 7° seem pretty loose for front-ends that have a high recall.
  • Did you try running the averaging+BA on edges that are inliers according to the GT poses?
  • It seems that the motion averaging does not have any robustness built in. Zhang et al. (ICCV 2023) show that using a robust cost function is critical (their Table 5) and that weighting by inlier count or two-view covariance can often help. Did you try this? That paper actually shows that SuperPoint+SuperGlue can work perfectly well for global SfM.
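
For reference, here is a minimal sketch of what I mean by a robust cost, using GTSAM's robust noise models (the sigma, Huber constant, keys, and measurement below are illustrative placeholders, not tuned values):

import gtsam

# Wrap the nominal rotation noise in a Huber M-estimator so that outlier
# relative rotations are down-weighted instead of dominating the objective.
base = gtsam.noiseModel.Isotropic.Sigma(3, 0.1)  # nominal noise on so(3)
robust = gtsam.noiseModel.Robust.Create(
    gtsam.noiseModel.mEstimator.Huber.Create(1.345), base)

i, j = 0, 1            # illustrative camera indices
R_ij = gtsam.Rot3()    # measured relative rotation (placeholder)
factor = gtsam.BetweenFactorRot3(i, j, R_ij, robust)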

Thanks!
cc @Phil26AT @ducha-aiki

dellaert commented Dec 6, 2023

Thanks for your comments! We'll certainly discuss and try some of the things you suggest. We were not rooting for SIFT in any way :-) We appreciate the advice and hope simply to get the best possible performance.
