Undergraduate thesis project: Video Cover Generation
Updated May 27, 2023 - Jupyter Notebook
A curated publication list on visual dialog
[Frontiers in AI Journal] Implementation of the paper "Interpreting Vision and Language Generative Models with Semantic Visual Priors"
[CVPR'24] The official implementation of the paper "Synthesize, Diagnose, and Optimize: Towards Fine-Grained Vision-Language Understanding"
With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning. ICCV 2023
[CVPR'24 Highlight] SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection
PyTorch code for the Findings of NAACL 2022 paper "Probing the Role of Positional Information in Vision-Language Models".
PyTorch implementation of NeuralTwinsTalk, presented at IEEE HCCAI 2020.
[ICCV 2023 CLVL Workshop] Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts
[TIP 2022] Official code for the paper “Video Question Answering with Prior Knowledge and Object-sensitive Learning”
This repository hosts the code for Jan Hadl's master's thesis at TU Wien: GS-VQA, a zero-shot visual question answering (VQA) pipeline that uses vision-language models (VLMs) for visual perception and answer-set programming (ASP) for symbolic reasoning.
[AAAI 2024] NuScenes-QA: A Multi-modal Visual Question Answering Benchmark for Autonomous Driving Scenario.
Fourier Transform Enhanced Vision Language Multi-goal Navigation
Quality-Aware Image-Text Alignment for Real-World Image Quality Assessment
The code for generating natural distribution shifts on image and text datasets.
Mixed vision-language Attention Model that gets better by making mistakes
[NLPCC'23] PyTorch implementation of "ZeroGen: Zero-shot Multimodal Controllable Text Generation with Multiple Oracles"
VizWiz Challenge Term Project for Multi Modal Machine Learning @ CMU (11777)
🔥🔥🔥 Object State Description & Change Detection