An open source implementation of CLIP.
Chinese version of CLIP, enabling Chinese cross-modal retrieval and representation generation.
🥂 Gracefully solve hCaptcha challenges with an embedded MoE (ONNX) solution.
Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration
A curated list of Visual Question Answering (VQA, covering image and video question answering), Visual Question Generation, Visual Dialog, Visual Commonsense Reasoning, and related areas.
The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".
A concise but complete implementation of CLIP with various experimental improvements from recent papers
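The CLIP implementations listed here train image and text encoders with a symmetric contrastive (InfoNCE) objective over matched image–text pairs. A minimal NumPy sketch of that loss, for illustration only (the function name, batch layout, and temperature value are assumptions, not taken from any listed repository):

```python
import numpy as np

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of matched image/text embeddings."""
    # L2-normalize so the dot product is cosine similarity.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    # Pairwise similarity logits; matching pairs lie on the diagonal.
    logits = image_emb @ text_emb.T / temperature
    n = logits.shape[0]
    labels = np.arange(n)

    def cross_entropy(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(n), labels].mean()

    # Average the image-to-text and text-to-image directions.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

When the two embedding batches are aligned row-for-row, the loss is small; shuffling one side breaks the diagonal correspondence and drives the loss up, which is the signal the encoders are trained on.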
[CVPR2020] Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation
Build high-performance AI models with modular building blocks
[CVPR 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
CVPR 2023-2024 Papers: dive into advanced research presented at the leading computer vision conference and keep up to date with the latest developments in computer vision and deep learning. Code included. ⭐ Support visual intelligence development!
[ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"
[ICCV 2023] Implicit Neural Representation for Cooperative Low-light Image Enhancement
Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey
PyTorch version of the HyperDenseNet deep neural network for multi-modal image segmentation
A Python tool to perform deep learning experiments on multimodal remote sensing data.
[ICML 2023] Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining
Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.
Code repository for Rakuten Data Challenge: Multimodal Product Classification and Retrieval.
SAM-SLR-v2 is an improved version of SAM-SLR for sign language recognition.