The open source implementation of the model from "Scaling Vision Transformers to 22 Billion Parameters"
Updated May 17, 2024 - Python
Implementation of Qformer from BLIP2 in Zeta Lego blocks.
Limit the use of end-to-end data for Speech Translation (by leveraging Automatic Speech Recognition and Machine Translation data instead) using zero-shot multilingual text translation techniques.
This repository provides a Streamlit application that lets a user upload a screenshot, which is then queried against a database of PDF documents. Both the image structure and any text it contains are used to find matching documents in a self-defined set.
Code for ASGEA: Exploiting Logic Rules from Align-Subgraphs for Entity Alignment
Repository for an air, water, and land surveillance robot developed as part of the DRDO Robotics and Unmanned Systems Exposition.
This library provides packages on DoubleML / Causal Machine Learning and Neural Networks in Python for Simulation and Case Studies.
Implementation of VisionLLaMA from the paper: "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" in PyTorch and Zeta
Omni-Modality Processing, Understanding, and Generation
Project based on VQA (Visual Question Answering), one of the tasks in multi-modal learning
Multi-modal data augmentation for machine learning
MMIR: Multimodal Image Registration
Repository For [ICASSP2023] [VISION, DEDUCTION AND ALIGNMENT: AN EMPIRICAL STUDY ON MULTI-MODAL KNOWLEDGE GRAPH ALIGNMENT]
Implementation of "PaLM2-VAdapter" from the multi-modal model paper: "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter"
Open-Source Multi-discipline Agentic Framework that adapts and leverages LLMs for your unique needs.
RnD Topic: Object Detection in Adverse Weather Conditions using Tightly-coupled Data-driven Multi-modal Sensor Fusion
Code for the ICML 2024 paper: "EMC^2: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence"
Kedro pipelines for multimodal ML in TensorFlow.
A Knowledge Network implementation from Knowledge Graphs
Implementation of M2PT in PyTorch from the paper: "Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities"