[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch impl. of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Video Foundation Models & Data for Multimodal Understanding
A collection of literature after or concurrent with Masked Autoencoder (MAE) (Kaiming He et al.).
[Survey] Masked Modeling for Self-supervised Representation Learning on Vision and Beyond (https://arxiv.org/abs/2401.00897)
Official Codes for "Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality"
SimpleClick: Interactive Image Segmentation with Simple Vision Transformers (ICCV 2023)
PyTorch implementation of BEVT (CVPR 2022) https://arxiv.org/abs/2112.01529
Reproduction of semantic segmentation using Masked Autoencoder (MAE)
[CVPR2023] Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning (https://arxiv.org/abs/2212.04500)
Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representations
[CVPR'23] Hard Patches Mining for Masked Image Modeling
Official Implementation of the CrossMAE paper: Rethinking Patch Dependence for Masked Autoencoders
Unofficial PyTorch implementation of Masked Autoencoders that Listen
Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning'
Masked Modeling Duo: Towards a Universal Audio Pre-training Framework
[SIGIR'2023] "MAERec: Graph Masked Autoencoder for Sequential Recommendation"
[NeurIPS 2022 Spotlight] VideoMAE for Action Detection
Official implementation of "A simple, efficient and scalable contrastive masked autoencoder for learning visual representations".
Multi-scale Transformer Network for Cross-Modality MR Image Synthesis (IEEE TMI)
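Most of the repositories above build on the same core idea from MAE: split each image into non-overlapping patches, randomly mask a large fraction (typically 75%), and train the encoder on the visible patches only. The sketch below illustrates that masking step with NumPy; the function name and shapes are illustrative assumptions, not code from any repo listed here.

```python
import numpy as np

def random_mask_patches(images, patch_size=4, mask_ratio=0.75, seed=0):
    """MAE-style random masking sketch (hypothetical helper):
    patchify images and keep a random subset of patches per image."""
    rng = np.random.default_rng(seed)
    n, h, w, c = images.shape
    ph, pw = h // patch_size, w // patch_size
    # Patchify: (n, H, W, C) -> (n, num_patches, patch_dim)
    patches = images.reshape(n, ph, patch_size, pw, patch_size, c)
    patches = patches.transpose(0, 1, 3, 2, 4, 5).reshape(n, ph * pw, -1)
    # Independent random permutation per image; keep the first num_keep ids
    num_keep = int(ph * pw * (1 - mask_ratio))
    ids = np.argsort(rng.random((n, ph * pw)), axis=1)
    keep = ids[:, :num_keep]
    visible = np.take_along_axis(patches, keep[..., None], axis=1)
    return visible, keep

# Two 16x16 RGB images -> 16 patches each; at 75% masking, 4 stay visible.
imgs = np.zeros((2, 16, 16, 3), dtype=np.float32)
vis, keep = random_mask_patches(imgs, patch_size=4, mask_ratio=0.75)
print(vis.shape)  # (2, 4, 48): 4 visible patches of 4*4*3 values each
```

The kept-index tensor `keep` is what an MAE-style decoder later uses to scatter encoded patches back to their original positions before reconstructing the masked ones.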