Reproducing the VQA candidate answers from the dataset and paper #135

MagnusOstertag opened this issue Jan 26, 2024 · 0 comments

Hi,
first of all, thanks for the amazing work!

You wrote in the paper: "For a fair comparison with existing methods, we constrain the decoder to only generate from the $3,192$ candidate answers". In the data available for download, however, the answer_list contains only $3,128$ elements. I first suspected a typo (an off-by-one), because the paper you cite there says: "The number of outputs is determined by the minimum occurrence of the answer in unique questions as nine times in the dataset, which is $3,129$."

When I try to reproduce the answer_list, either from the given answers or directly from VQAv2, I get a different number of answers, and nearly 300 of them differ from the provided list. So how was the answer list actually created?
(I count unique answers, not unique questions, since counting per question would make the problem ambiguous. I use a threshold of at least 9 occurrences per answer and normalize each answer as in VQAEval. When I include not only the VQAv2.0 answers but also Visual Genome, I get a much higher number of candidate answers.)
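For reference, this is roughly what I did to rebuild the list. It is only a minimal sketch: the file names are the official VQAv2 annotation downloads, and `normalize_answer` is a simplified stand-in for the full VQAEval normalization (which also strips articles and punctuation and maps number words):

```python
import json
import re
from collections import Counter

def normalize_answer(ans: str) -> str:
    # Simplified stand-in for the VQAEval answer processing.
    ans = ans.lower().strip()
    ans = re.sub(r"\s+", " ", ans)
    return ans

def build_answer_list(annotation_files, min_count=9):
    """Count every normalized answer across all annotations and keep
    those occurring at least `min_count` times."""
    counts = Counter()
    for path in annotation_files:
        with open(path) as f:
            annotations = json.load(f)["annotations"]
        for ann in annotations:
            for a in ann["answers"]:
                counts[normalize_answer(a["answer"])] += 1
    return sorted(ans for ans, c in counts.items() if c >= min_count)

answers = build_answer_list([
    "v2_mscoco_train2014_annotations.json",
    "v2_mscoco_val2014_annotations.json",
])
print(len(answers))  # does not match the 3,128 entries in the provided answer_list
```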

I further noticed that you seem to have excluded 7 questions from VQAv2.0 in vqa_train/vqa_val, namely the questions with IDs 268735002, 293514000, 147314003, 68003002, 451818000, 362391000, and 196280004. Why were these excluded?
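I found these by diffing the question IDs, roughly as below. The file names and the `question_id` field in your vqa_train.json are assumptions on my part, so please correct me if the provided files are structured differently:

```python
import json

# Question IDs in the provided training annotations (assuming a "question_id" field).
with open("vqa_train.json") as f:
    provided_ids = {item["question_id"] for item in json.load(f)}

# Question IDs in the official VQAv2 training questions file.
with open("v2_OpenEnded_mscoco_train2014_questions.json") as f:
    official_ids = {q["question_id"] for q in json.load(f)["questions"]}

# IDs present in VQAv2 but missing from the provided data.
print(sorted(official_ids - provided_ids))
```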

Best,
Magnus
