
How to get the A matrices which satisfy the geometric constraint condition EP=-AEP? #34

Open
stilcrad opened this issue Jun 6, 2023 · 7 comments

Comments

@stilcrad

stilcrad commented Jun 6, 2023

I hope everything is well with you. I am a Ph.D. student researching image matching and pose estimation. Recently I read your work "Repeatability Is Not Enough: Learning Affine Regions via Discriminability", ran the code linked from the paper (github.com/ducha-aiki/affnet/tree/pytorch1-4_python3) several times, and reviewed the relevant materials. However, I ran into some issues I could not solve, so I am writing to ask for your advice. I would greatly appreciate a reply at your convenience.
Firstly, I ran the image-matching demo and obtained a set of n affine transformation parameters (work_LAFs, an n x 2 x 3 classical [A, (x; y)] matrix). As I understand it, these parameters contain the A matrix together with the coordinates of the feature points, but I am not sure whether that is correct, and if not, I would like to know how to obtain the A matrix.
Secondly, we obtained the affine transformation parameters for two views (work_LAFs1 and work_LAFs2), each of which contains a 2x2 matrix per feature. I am unsure how to obtain the so-called A matrix from these two matrices.
Thirdly, we obtained the A matrix and performed image matching, but we found that the accuracy of the resulting affine correspondences is unsatisfactory when compared with the ground truth. I would therefore like to know how to compare them with the ground truth, and what accuracy a correctly obtained A matrix should reach.
Last but not least, with this method we want to obtain the A matrix that satisfies the geometric constraint condition EP=-AEP, where E is the essential matrix and A is the local affine transformation matrix.
Thank you very much for your consideration.
Best wishes!

@ducha-aiki
Owner

Hi,

I don't fully understand what you are doing and why, but I will try to answer as far as I understood.
First, you don't need work_LAFs or anything else from the training code. For inference, you can either go here

https://github.com/ducha-aiki/affnet/blob/pytorch1-4_python3/examples/hesaffnet/WBS%20demo.ipynb

Or - better - to the kornia tutorial, since AffNet is nicely integrated there.

https://kornia-tutorials.readthedocs.io/en/latest/_nbs/image_matching_adalam.html

This gives you pairs of corresponding lafs in both images.

> I am unsure about how to obtain the so-called A matrix from these two matrices.
If you need the local affine transformation, you can take A1 = lafs1[:, :, :2, :2], A2 = lafs2[:, :, :2, :2].
For simplicity, let's assume a single image in the batch:
A1 = lafs1[0, :, :2, :2], A2 = lafs2[0, :, :2, :2].

Then you can compute A = A1 @ torch.inverse(A2), or, if you need the other direction, A2 @ torch.inverse(A1). This gives you [N x 2 x 2] transformation matrices.
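A minimal sketch of the above in plain PyTorch (the random LAFs here are just placeholders for the ones a detector such as kornia's would return):

```python
import torch

# Toy stand-ins for kornia-style LAFs (B x N x 2 x 3 [A | t] matrices);
# adding the identity keeps the 2x2 linear parts well-conditioned.
N = 5
lafs1 = torch.rand(1, N, 2, 3)
lafs2 = torch.rand(1, N, 2, 3)
lafs1[..., :2, :2] += torch.eye(2)
lafs2[..., :2, :2] += torch.eye(2)

# Single image in the batch: take the 2x2 linear parts.
A1 = lafs1[0, :, :2, :2]      # N x 2 x 2
A2 = lafs2[0, :, :2, :2]      # N x 2 x 2

# Local affine transformation between the matched regions;
# swap the factors for the opposite direction.
A = A1 @ torch.inverse(A2)    # N x 2 x 2
```

By construction, A maps the local frame of image 2 onto the local frame of image 1, so A @ A2 recovers A1 up to numerical error.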

You can also incorporate the keypoint center coordinates: if you turn the LAFs into homographies by padding them with [0, 0, 1], you get lafh, an N x 3 x 3 matrix you can use the same way you would use a homography.
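The padding step can be sketched like this (toy LAF values, single image, plain PyTorch):

```python
import torch

# Toy 2x3 LAFs [A | t] for N regions in one image (placeholder values).
N = 4
lafs = torch.rand(N, 2, 3)

# Pad each LAF with the row [0, 0, 1] to get an N x 3 x 3 "lafh"
# that acts on homogeneous points the way a homography would.
bottom = torch.tensor([[0.0, 0.0, 1.0]]).expand(N, 1, 3)
lafh = torch.cat([lafs, bottom], dim=1)   # N x 3 x 3

# Example: mapping the local-frame origin (0, 0, 1) of each region
# yields the keypoint center (x, y, 1).
origin = torch.tensor([0.0, 0.0, 1.0])
centers = lafh @ origin                   # N x 3
```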

> However, we found that the accuracy of the resulting affine correspondences is unsatisfactory when compared with the ground truth.
Of course. The quality of affine correspondences is known to be poor, as they are mostly used for improving descriptors, not the geometry.

I recommend checking this paper - https://arxiv.org/abs/2007.10032 - which describes how to alleviate the poor precision of affine frames, as well as this tutorial

https://cvpr22-affine-tutorial.com

Specifically "using covariant features in practice" talk - https://www.youtube.com/watch?v=NWWD7Vqt-Ho&feature=youtu.be

@stilcrad
Author

Thanks for your reply and advice! Maybe I didn't describe it clearly: the A matrices I want to find are those that satisfy a geometric constraint like formula (6) in https://arxiv.org/pdf/1912.10776.pdf or formula (8) in https://arxiv.org/pdf/1706.01649.pdf. Is it possible for the method in the paper to achieve a level of accuracy similar to https://github.com/danini/affine-correspondences-for-camera-geometry?

@ducha-aiki
Owner

Man, I have no idea. You can try yourself and write results here :)

@stilcrad
Author

stilcrad commented Mar 9, 2024

I did some exploration, and it took me some time, sorry. I still have some doubts about how matrix A helps with wide-baseline matching ("as they mostly are used for improving descriptors"). Assuming we obtain n x [x1, y1, x2, y2, a1, a2, a3, a4], how could we improve the matching quality?

@ducha-aiki
Owner

That's simple, if you look at the following example.
AffNet performs the last step of estimating the affine transformation, which makes patches from different images look more similar. After the patch extraction, the descriptor is run on the extracted patch. The easier the matching task, the fewer mistakes we make, and therefore the better the matching quality.

[image: patch extraction and affine normalization pipeline]

The image is from the Computer Vision course at the Czech Technical University in Prague.
https://cw.fel.cvut.cz/wiki/courses/mpv/labs/2_correspondence_problem/start#computing_local_invariant_description
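To illustrate the normalization step above, here is a hypothetical sketch (not the actual AffNet code): a 32x32 patch is resampled under a 2x3 affine matrix, given in the normalized [-1, 1] coordinates that PyTorch's grid utilities expect, before a descriptor would be run on it.

```python
import torch
import torch.nn.functional as F

# Hypothetical setup: one grayscale 32x32 patch (B x C x H x W).
patch = torch.rand(1, 1, 32, 32)

# A 2x3 affine matrix in normalized coordinates; the scale/shear
# values here are just an example, not learned by any network.
A = torch.tensor([[[1.2, 0.1, 0.0],
                   [0.0, 0.9, 0.0]]])

# Resample the patch under the affine transformation; a descriptor
# (e.g. HardNet) would then see `warped` instead of `patch`.
grid = F.affine_grid(A, patch.shape, align_corners=False)
warped = F.grid_sample(patch, grid, align_corners=False)
```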

@stilcrad
Author

stilcrad commented Mar 9, 2024

Thank you very much for your answer. I think I understand now, but I'm not sure if it's correct: the 32x32 patch centered on the first point, after being warped by the affine matrix, becomes more similar to the patch centered on the second (matching) point, which benefits matching.

@ducha-aiki
Owner

Correct
