Official code for the paper "Mantis: Multi-Image Instruction Tuning"
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
Personal project: MPP-Qwen14B (Multimodal Pipeline Parallel-Qwen14B). Don't let poverty limit your imagination! Train your own 14B LLaVA-like MLLM on an RTX 3090/4090 with 24 GB of VRAM.
Datasets, case studies and benchmarks for extracting structured information from PDFs, HTML files or images, created by the Parsee.ai team. Datasets also on Hugging Face: https://huggingface.co/parsee-ai
Grounded Multimodal Large Language Model with Localized Visual Tokenization
InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension.
Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought, Instruction-Tuning and Multimodality.
Composition of Multimodal Language Models From Scratch
Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"
Unified Multi-modal Image Aesthetic Assessment (IAA) Baseline and Benchmark
[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"
Awesome_Multimodel is a curated GitHub repository that provides a comprehensive collection of resources for Multimodal Large Language Models (MLLMs). It covers datasets, tuning techniques, in-context learning, visual reasoning, foundation models, and more. Stay updated with the latest advancements.