
MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild

This repository provides an official implementation for the paper MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild.


Installation

Please create an environment with Python 3.10 and use the requirements file to install the remaining libraries:
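For example, using conda (a hypothetical setup; any Python 3.10 environment should work):

conda create -n mma-dfer python=3.10
conda activate mma-dfer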

pip install -r requirements.txt

Data preparation

We provide code for the DFEW and MAFW datasets, which you will need to download. Then, please refer to the DFER-CLIP repository for transforming the annotations provided in the annotations/ folder to your own paths. To extract faces from the MAFW dataset, please refer to data_utils, which contains an example face detection pipeline.
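As a hypothetical sketch, if the annotation files store absolute dataset paths, the root could be rewritten in place with sed (the actual annotation format may differ; adjust the pattern and file extension accordingly):

for f in annotations/*.txt; do
    # Replace the placeholder dataset root with your local path (hypothetical paths).
    sed -i 's|/old/dataset/root|/your/dataset/root|g' "$f"
done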

You will also need to download pre-trained checkpoints for the vision encoder from https://github.com/FuxiVirtualHuman/MAE-Face and for the audio encoder from https://github.com/facebookresearch/AudioMAE. Please extract them and rename the audio checkpoint to 'audiomae_pretrained.pth'. Both checkpoints are expected to be in the root folder.
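For instance, assuming the downloaded audio checkpoint is named pretrained.pth (the actual filename depends on the AudioMAE release you download):

# Hypothetical filenames; adjust to what you actually downloaded.
mv pretrained.pth audiomae_pretrained.pth   # both checkpoints go in the repository root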

Running the code

The main script is main.py. You can invoke it by running:

./train_DFEW.sh
./train_MAFW.sh

Evaluation

You can download models pre-trained on DFEW from here. Please respect the dataset license when downloading the models. Evaluation can be done as follows:

python evaluate.py --fold $FOLD --checkpoint $CHECKPOINT_PATH --img-size $IMG_SIZE
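As a usage sketch, DFEW is evaluated with 5-fold cross-validation, so all folds could be evaluated in a loop (the checkpoint naming and 224 input size here are assumptions; adjust to your downloaded files):

# Hypothetical checkpoint names and image size; adjust to your setup.
for FOLD in 1 2 3 4 5; do
    python evaluate.py --fold $FOLD --checkpoint checkpoint_fold${FOLD}.pth --img-size 224
done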

References

This repository is based on DFER-CLIP (https://github.com/zengqunzhao/DFER-CLIP). We also thank the authors of MAE-Face (https://github.com/FuxiVirtualHuman/MAE-Face) and AudioMAE (https://github.com/facebookresearch/AudioMAE).

Citation

If you use our work, please cite as:

@article{chumachenko2024mma,
  title={MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild},
  author={Chumachenko, Kateryna and Iosifidis, Alexandros and Gabbouj, Moncef},
  journal={arXiv preprint arXiv:2404.09010},
  year={2024}
}
