[Arxiv 2024] Official Implementation of the paper: "InstrAug: Automatic Instruction Augmentation for Multimodal Instruction Fine-tuning"
Code for the ICML 2024 paper: "Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition"
This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"
A framework streamlining Training, Finetuning, Evaluation and Deployment of Multi Modal Language models
FreeVA: Offline MLLM as Training-Free Video Assistant
A curated list of awesome image captioning studies, aimed at annotating and reporting CT / MRI scans
Multimodal RAG and comparisons between language models. (Project for Deep Learning Module at the FHSWF)
A PyTorch-based system for highly accurate drug-target interaction predictions utilizing multi-modal large language models to discern structural affinities in drug-target pairs.
Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"
Composition of Multimodal Language Models From Scratch
A Video Chat Agent with Temporal Prior
[ACL 2024] An Easy-to-use Hallucination Detection Framework for LLMs.
This repository contains code to evaluate various multimodal large language models using different instructions across multiple multimodal content comprehension tasks.
Moondream is a lightweight multimodal large language model
mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigation
A curated list of awesome Multimodal studies.
UMBRAE: Unified Multimodal Decoding of Brain Signals | Unveiling the 'Dark Side' of Brain Modality
Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters".