# multimodal-large-language-models

Here are 47 public repositories matching this topic...

sitamgithub-MSIT / TechSage

chatbot artificial-intelligence gradio techbot gemini-api multimodal-data huggingface-spaces generative-ai multimodal-large-language-models gemini-pro-vision gemini-pro

Updated Apr 11, 2024
Python

CKeibel / FHSWF-deep-learning

Multimodal RAG and comparisons between language models. (Project for Deep Learning Module at the FHSWF)

machine-learning deep-learning multimodal rag multimodal-large-language-models multimodal-rag

Updated May 20, 2024
Python

adithya-s-k / eagle

A framework streamlining training, finetuning, evaluation and deployment of multimodal language models

vlm llm multimodal-large-language-models

Updated May 18, 2024

nicolay-r / Awesome-Image-Captioning-MLLMs

A curated list of awesome Image captioning studies, aimed at annotating and reporting CT / MRI scans

nlp image text reports multimodality languagemodels multimodal-large-language-models

Updated Apr 12, 2024

sitamgithub-MSIT / well-being

chatbot artificial-intelligence gradio gemini-api multimodal-data huggingface-spaces generative-ai multimodal-large-language-models gemini-pro-vision gemini-pro

Updated Apr 11, 2024
Python

alexander-moore / vlm

Composition of Multimodal Language Models From Scratch

machine-learning ai vlm llm mllm vision-language-model multimodal-large-language-models mmllm

Updated May 7, 2024
Jupyter Notebook

NishilBalar / Awesome-LVLM-Hallucination

An up-to-date, curated list of awesome state-of-the-art research work, papers & resources on LVLM hallucinations

hallucination large-vision-language-models multimodal-large-language-models hallucination-evaluation hallucination-detection hallucination-mitigation hallucination-survey

Updated May 23, 2024

hari-huynh / viVQA-voice-assistant

Voice assistant using Multimodal LLMs - LLaVA-NeXT (Mistral 7B) finetuned & PhoWhisper

text-to-speech lora visual-question-answering llava multimodal-large-language-models audio-speech-recognition mistral-7b

Updated May 15, 2024
Python

scofield7419 / Video-of-Thought

Code for the ICML 2024 paper: "Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition"

video video-reasoning chain-of-thought multimodal-large-language-models chain-of-thought-reasoning video-model

Updated May 6, 2024

zjunlp / EasyDetect

[ACL 2024] An Easy-to-use Hallucination Detection Framework for LLMs.

natural-language-processing artificial-intelligence knowledge-graph generation multimodal hallucination aigc large-language-models generative-ai model-editing knowledge-editing multimodal-large-language-models knowlm easydetect hallucination-detection

Updated May 18, 2024
Python

declare-lab / InstrAug

[Arxiv 2024] Official Implementation of the paper: "InstrAug: Automatic Instruction Augmentation for Multimodal Instruction Fine-tuning"

instruction-following large-language-models llm multimodal-instruction-tuning multimodal-large-language-models llm-robustness

Updated May 15, 2024
Jupyter Notebook

weihaox / UMBRAE

UMBRAE: Unified Multimodal Decoding of Brain Signals | Unveiling the 'Dark Side' of Brain Modality

neuroimaging brain-mri multimodal-large-language-models

Updated May 19, 2024
Jupyter Notebook

MileBench / MileBench

This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"

benchmark machine-learning natural-language-processing deep-neural-networks computer-vision deep-learning evaluation multimodality visual-question-answering multimodal foundation-models large-language-models llm llms long-context-transformers multimodal-large-language-models large-multimodal-models long-context-modeling

Updated May 19, 2024
Python

bigai-nlco / LSTP-Chat

A Video Chat Agent with Temporal Prior

spatial-temporal video-language llm mllm visual-instruction-tuning multimodal-large-language-models

Updated Feb 28, 2024
Python

friedrichor / Awesome-Multimodal-Papers

A curated list of awesome Multimodal studies.

deep-learning multimodal-learning multimodal multimodal-deep-learning multimodal-data multimodal-dialogue multimodal-large-language-models large-multimodal-models

Updated May 21, 2024
HTML

whwu95 / FreeVA

FreeVA: Offline MLLM as Training-Free Video Assistant

chatbot video-understanding zero-shot-video-captioning video-question-answering chatgpt vision-language-model llava training-free multimodal-large-language-models

Updated May 22, 2024
Python

declare-lab / MM-InstructEval

This repository contains code to evaluate various multimodal large language models using different instructions across multiple multimodal content comprehension tasks.

multimodal-large-language-models multimodal-content-comprehension-tasks

Updated May 2, 2024
Python

hpc203 / Chinese-CLIP-opencv-onnxrun

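Deploys Chinese CLIP with OpenCV + onnxruntime for text-to-image search: given a sentence describing the desired image, it retrieves the matching images from a gallery. Includes both C++ and Python versions of the program.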

clip opencv-dnn image-text-retrieval multimodal-large-language-models

Updated Jan 15, 2024
C++

VisualWebBench / VisualWebBench

Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"

machine-learning natural-language-processing computer-vision deep-learning evaluation question-answering visual-question-answering multimodal multimodal-deep-learning foundation-models large-language-models llm llms mllm multimodal-large-language-models large-multimodal-models

Updated Apr 17, 2024
Python

Lzcstan / DrugLAMP

A PyTorch-based system for highly accurate drug-target interaction predictions utilizing multi-modal large language models to discern structural affinities in drug-target pairs.

attention-mechanism drug-target-interactions contrastive-learning multimodal-large-language-models

Updated Mar 26, 2024
Python
