Image Recommendation for Wikipedia Articles
Create a large, well-managed, and clean dataset for the task of music composition for video soundtracks.
All experiments were done to classify multimodal data.
Official Git repository for "Hakimov, S., and Schlangen, D., (2023). Images in Language Space: Exploring the Suitability of Large Language Models for Vision & Language Tasks. Findings of the Association for Computational Linguistics (ACL 2023 Findings)"
AI-multimodal: Modeling the new text-video retrieval framework
Pre-Processing of Annotated Music Video Corpora (COGNIMUSE and DEAP)
Data and code of the Findings of EMNLP'23 paper MuG: A Multimodal Classification Benchmark on Game Data with Tabular, Textual, and Visual Fields
Collects a multimodal dataset of Wikipedia articles and their images
Real-world photo sequence question answering system (MemexQA). CVPR'18 and TPAMI'19
Code and data to evaluate LLMs on the ENEM, the main standardized Brazilian university admission exam.
[Paperlist] Awesome paper list of multimodal dialog, including methods, datasets and metrics
This repository provides a comprehensive collection of research papers on multimodal representation learning, all of which are cited and discussed in the recently accepted survey: https://dl.acm.org/doi/abs/10.1145/3617833
Compose multimodal datasets 🎹
[NeurIPS 2023 Oral] Quilt-1M: One Million Image-Text Pairs for Histopathology.
A dataset of 500,000 multimodal short videos with baseline models (TensorFlow 2.0).
PyTorch implementation of Multimodal Fusion Transformer for Remote Sensing Image Classification.
This repository is built in association with our position paper, "Multimodality for NLP-Centered Applications: Resources, Advances and Frontiers". As part of this release, we share information about recent multimodal datasets that are available for research purposes. We found that although 100+ multimodal language resources are available…
LAVIS - A One-stop Library for Language-Vision Intelligence