Open-Vocabulary Video Question Answering: A New Benchmark for Evaluating the Generalizability of Video Question Answering Models (ICCV 2023)
Updated Apr 23, 2024 (Python)
Given a video, we are able to automatically answer questions about what is happening in it.
FreeVA: Offline MLLM as Training-Free Video Assistant
A simple attention-based deep learning model that answers questions about a given video, returning the most relevant video intervals as answers.
Code for ACL SustaiNLP 2023 paper "Is a Video worth n × n Images? A Highly Efficient Approach to Transformer-based Video Question Answering"
Code for ACL SRW 2023 paper "Semantic-aware Dynamic Retrospective-Prospective Reasoning for Event-level Video Question Answering"
Data and PyTorch code for the LifeQA LREC 2020 paper.
Part of my work for my Bachelor's Thesis Project on Counterfactual Reasoning for Videos.
[TIP 2022] Official code of paper “Video Question Answering with Prior Knowledge and Object-sensitive Learning”
LifeQA website code
[ICCV 2021] On the hidden treasure of dialog in video question answering
WildQA website code
[NAACL 2024] Official Implementation of paper "Self-Adaptive Sampling for Efficient Video Question Answering on Image-Text Models"
Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)
Multi-Scale Progressive Attention Network for Video Question Answering
Contrastive Video Question Answering via Video Graph Transformer (IEEE T-PAMI'23)
[CVPR 2022] A large-scale public benchmark dataset for video question answering, focused on evidence and commonsense reasoning. Includes the code used in the paper "From Representation to Reasoning: Towards both Evidence and Commonsense Reasoning for Video Question-Answering".
A PyTorch implementation of EmpiricalMVM
A new multi-shot video understanding benchmark Shot2Story with comprehensive video summaries and detailed shot-level captions.
[ICCV2023] Tem-adapter: Adapting Image-Text Pretraining for Video Question Answer