
MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild

This repository provides an official implementation for the paper MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild.


Installation

Please create an environment with Python 3.10 and use the requirements file to install the remaining libraries:
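For example, using conda (a hypothetical setup; any Python 3.10 environment should work):

conda create -n mma-dfer python=3.10
conda activate mma-dfer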

pip install -r requirements.txt

Data preparation

We provide code for the DFEW and MAFW datasets, which you will need to download. Then, please refer to the DFER-CLIP repository for transforming the annotations provided in the annotations/ folder to your own paths. To extract faces from the MAFW dataset, please refer to data_utils, which contains an example face detection pipeline.
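As a hypothetical sketch, if the annotation files store absolute dataset paths, the root could be rewritten in place with sed (the actual annotation format may differ; adjust the pattern and file extension accordingly):

for f in annotations/*.txt; do
    # Replace the placeholder dataset root with your local path (hypothetical paths).
    sed -i 's|/old/dataset/root|/your/dataset/root|g' "$f"
done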

You will also need to download pre-trained checkpoints for the vision encoder from https://github.com/FuxiVirtualHuman/MAE-Face and for the audio encoder from https://github.com/facebookresearch/AudioMAE. Please extract them and rename the audio checkpoint to 'audiomae_pretrained.pth'. Both checkpoints are expected to be in the root folder.
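For instance, assuming the downloaded audio checkpoint is named pretrained.pth (the actual filename depends on the AudioMAE release you download):

# Hypothetical filenames; adjust to what you actually downloaded.
mv pretrained.pth audiomae_pretrained.pth   # both checkpoints go in the repository root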

Running the code

The main script is main.py. You can invoke it by running:

./train_DFEW.sh
./train_MAFW.sh

Evaluation

You can download models pre-trained on DFEW from here. Please respect the dataset license when downloading the models. Evaluation can be done as follows:

python evaluate.py --fold $FOLD --checkpoint $CHECKPOINT_PATH --img-size $IMG_SIZE
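As a usage sketch, DFEW is evaluated with 5-fold cross-validation, so all folds could be evaluated in a loop (the checkpoint naming and 224 input size here are assumptions; adjust to your downloaded files):

# Hypothetical checkpoint names and image size; adjust to your setup.
for FOLD in 1 2 3 4 5; do
    python evaluate.py --fold $FOLD --checkpoint checkpoint_fold${FOLD}.pth --img-size 224
done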

References

This repository is based on DFER-CLIP (https://github.com/zengqunzhao/DFER-CLIP). We also thank the authors of MAE-Face (https://github.com/FuxiVirtualHuman/MAE-Face) and AudioMAE (https://github.com/facebookresearch/AudioMAE).

Citation

If you use our work, please cite as:

@article{chumachenko2024mma,
  title={MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild},
  author={Chumachenko, Kateryna and Iosifidis, Alexandros and Gabbouj, Moncef},
  journal={arXiv preprint arXiv:2404.09010},
  year={2024}
}
