[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable, open-source multimodal chat model approaching GPT-4V performance.
Overview of Japanese LLMs
A comprehensive collection of multilingual datasets and large language models, meticulously curated for evaluating and enhancing the performance of large language models across diverse languages and tasks.
FreeVA: Offline MLLM as Training-Free Video Assistant
Official implementation of our IEEE Access paper (2024), ZEN-IQA: Zero-Shot Explainable and No-Reference Image Quality Assessment with Vision Language Model
[ACL ARR Under Review] Dataset and Code of "ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction"
A library for marking web pages for Set-of-Mark (SoM) prompting with vision-language models.
This study explores the vulnerabilities of the Pathology Language-Image Pretraining (PLIP) model, a vision-language foundation model for medical AI, under targeted attacks such as the PGD adversarial attack.
A Python tool to evaluate the performance of VLMs in the medical domain.
A curated list of awesome knowledge-driven autonomous driving (continually updated)
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
[ICML 2024] Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models
Grounded Multimodal Large Language Model with Localized Visual Tokenization
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
The official repository for the Vista dataset, a Vietnamese multimodal dataset containing more than 700,000 samples of conversations and images.
Towards a text-based quantitative and explainable histopathology image analysis (MICCAI 2024)
[ICASSP 2024] VGDiffZero: Text-to-image Diffusion Models Can Be Zero-shot Visual Grounders
Multi-Aspect Vision Language Pretraining - CVPR2024
[ICPR 2024] The official repo for FIDAVL: Fake Image Detection and Attribution using Vision-Language Model