ezosa/M3L-topic-model

Multimodal and multilingual topic model with pretrained embeddings

Code for our COLING 2022 paper "Multilingual and Multimodal Topic Modelling with Pretrained Embeddings"

Abstract

We present M3L-Contrast, a novel multimodal multilingual (M3L) neural topic model for comparable data that maps multilingual texts and images into a shared topic space using a contrastive objective. As a multilingual topic model, it produces aligned language-specific topics, and as a multimodal model, it infers textual representations of semantic concepts in images. We also show that our model performs almost as well on unaligned embeddings as it does on aligned embeddings.

Our proposed topic model is:

  • multilingual
  • multimodal (image-text)
  • multimodal and multilingual (M3L)

Our model is based on the Contextualized Topic Model (Bianchi et al., 2021).

We use the PyTorch Metric Learning library for the InfoNCE/NT-Xent loss.
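
For illustration, here is a minimal sketch (not the repository's actual training code) of how the NT-Xent loss from PyTorch Metric Learning can pull paired text and image embeddings together in a shared space; the batch size, embedding dimensionality, and temperature below are illustrative assumptions.

    import torch
    from pytorch_metric_learning.losses import NTXentLoss

    loss_fn = NTXentLoss(temperature=0.07)  # temperature value is an assumption

    # Stand-ins for projected text and image representations of the same
    # batch of documents; row i of each tensor describes document i.
    batch_size, dim = 8, 256
    text_emb = torch.randn(batch_size, dim, requires_grad=True)
    image_emb = torch.randn(batch_size, dim, requires_grad=True)

    # Giving the text and image views of document i the same label marks
    # them as a positive pair; all other rows in the batch act as negatives.
    labels = torch.arange(batch_size)
    embeddings = torch.cat([text_emb, image_emb], dim=0)
    loss = loss_fn(embeddings, torch.cat([labels, labels], dim=0))
    loss.backward()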

Model architecture

Dataset

  • Aligned articles from the Wikipedia Comparable Corpora
  • Images from the WIT dataset
  • We will release the article titles and image URLs for the train and test sets (soon!)

Talks and slides

  • Slides and video from my talk at the Helsinki Language Technology seminar

Trained models

We have shared some of the models we trained:

  • M3L topic model trained with CLIP embeddings for texts and images
  • M3L topic model trained with multilingual SBERT for text and CLIP for images
  • M3L topic model trained with monolingual SBERT models for the English and German texts and CLIP for images
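
These models take pretrained embeddings as input. Below is a minimal sketch of producing such embeddings with the sentence-transformers library; the checkpoint names ("clip-ViT-B-32", "paraphrase-multilingual-mpnet-base-v2") and the image path are assumptions and not necessarily the exact checkpoints used in the paper.

    from PIL import Image
    from sentence_transformers import SentenceTransformer

    # CLIP encoder for images (and English text); checkpoint name is an assumption
    clip = SentenceTransformer("clip-ViT-B-32")
    # Multilingual SBERT encoder for text; checkpoint name is an assumption
    sbert = SentenceTransformer("paraphrase-multilingual-mpnet-base-v2")

    text_emb = sbert.encode(["An example sentence.", "Ein Beispielsatz."])
    image_emb = clip.encode(Image.open("example.jpg"))  # hypothetical image file

    print(text_emb.shape, image_emb.shape)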

Citation

@inproceedings{zosa-pivovarova-2022-multilingual,
    title = "Multilingual and Multimodal Topic Modelling with Pretrained Embeddings",
    author = "Zosa, Elaine  and  Pivovarova, Lidia",
    booktitle = "Proceedings of the 29th International Conference on Computational Linguistics",
    month = oct,
    year = "2022",
    address = "Gyeongju, Republic of Korea",
    publisher = "International Committee on Computational Linguistics",
    url = "https://aclanthology.org/2022.coling-1.355",
    pages = "4037--4048",
}
