facebookresearch/mmf (Star 5.4k)
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR).
Topics: deep-learning, dialog, pytorch, vqa, pretrained-models, captioning, multimodal, multi-tasking, textvqa, hateful-memes
Updated Mar 3, 2024. Language: Python
yashkant/sam-textvqa (Star 62)
Official code for the paper "Spatially Aware Multimodal Transformers for TextVQA", published at ECCV 2020.
Topics: language, vision, eccv, textvqa
Updated Sep 15, 2021. Language: Python
phiyodr/vqaloader (Star 6)
PyTorch DataLoader for many VQA datasets.
Topics: pytorch, vqa, dataloader, gqa, textvqa, vqav2
Updated Jan 10, 2023. Language: Python