IDEA-Research/UniPose
🤩 News

  • 2024.02.14: We updated the file highlighting all 1,237 classes in the UniKPT dataset.
  • 2023.11.28: We are excited to highlight UniPose's 68-point face keypoint detection across arbitrary categories in this figure. The definition of the face keypoints follows this dataset.
  • 2023.11.9: Thanks to OpenXLab, you can try a quick online demo. We look forward to your feedback!
  • 2023.11.1: We released the inference code, demo, checkpoints, and the annotations of the UniKPT dataset.
  • 2023.10.13: We released the arXiv version.

In-the-wild Test via UniPose

UniPose has strong fine-grained localization and generalization abilities across image styles, categories, and poses.


Detecting any Face Keypoints:


🗒 TODO

  • Release inference code and demo.
  • Release checkpoints.
  • Release UniKPT annotations.
  • Release training codes.

💡 Overview

• UniPose is the first end-to-end prompt-based keypoint detection framework.


• It supports multi-modality prompts, including textual and visual prompts, to detect keypoints of arbitrary objects (e.g., articulated, rigid, and soft objects).

Visual Prompts as Inputs:


Textual Prompts as Inputs:


🔨 Environment Setup

  1. Clone this repo
git clone https://github.com/IDEA-Research/UniPose.git
cd UniPose
  2. Install the required packages
pip install -r requirements.txt
  3. Compile the CUDA operators
cd models/UniPose/ops
python setup.py build install
# unit test (all checks should report True)
python test.py
cd ../../..

▶ Demo

1. Guidelines

• We have released the textual-prompt branch for inference. Since visual prompts involve a substantial amount of user input, we are still exploring a more user-friendly platform to support them.

• Since UniPose has learned strong structural priors, it is best to use the predefined skeletons in predefined_keypoints.py as the keypoint textual prompts.

• If users don't provide a keypoint prompt, we try to match an appropriate skeleton based on the user's instance category. If that fails, we fall back to the animal skeleton, which covers the widest range of categories and testing requirements.
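The fallback described above can be sketched as follows. This is illustrative only: `PREDEFINED_SKELETONS` and `select_skeleton` are stand-ins we made up for the entries in predefined_keypoints.py, not the repo's actual code.

```python
# Sketch of the skeleton-selection fallback: user prompt > category match >
# generic animal skeleton. Keypoint lists are truncated for illustration.
PREDEFINED_SKELETONS = {
    "person": ["nose", "left_eye", "right_eye"],
    "animal": ["nose", "left_ear", "right_ear"],  # widest-coverage fallback
}

def select_skeleton(category, keypoint_prompt=None):
    """Pick keypoint text prompts for an instance category."""
    if keypoint_prompt is not None:
        return keypoint_prompt  # user-supplied prompt wins
    # Try the category's own skeleton, else fall back to the animal skeleton.
    return PREDEFINED_SKELETONS.get(category, PREDEFINED_SKELETONS["animal"])
```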

2. Run

Replace {GPU ID}, image_you_want_to_test.jpg, and "dir you want to save the output" with appropriate values in the following command. The -t flag takes the instance category (e.g., "person", "face", "left hand", "horse", "car", "skirt", "table"); the optional -k flag takes a keypoint skeleton text, chosen from the predefined_keypoints.py file if needed.

CUDA_VISIBLE_DEVICES={GPU ID} python inference_on_a_image.py \
  -c config/UniPose_SwinT.py \
  -p weights/unipose_swint.pth \
  -i image_you_want_to_test.jpg \
  -o "dir you want to save the output" \
  -t "instance categories" \
  -k "keypoint_skeleton_text"
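For scripted batch runs, the command above can be assembled programmatically. A minimal sketch: the flag names (-c, -p, -i, -o, -t, -k) come from the command above, but the helper name and default paths are our own, not part of the repo.

```python
import shlex

# Hypothetical convenience wrapper around the inference command shown above.
def build_inference_cmd(gpu_id, image, out_dir, category, skeleton=None,
                        config="config/UniPose_SwinT.py",
                        weights="weights/unipose_swint.pth"):
    parts = [
        f"CUDA_VISIBLE_DEVICES={gpu_id}", "python", "inference_on_a_image.py",
        "-c", config, "-p", weights, "-i", image,
        "-o", shlex.quote(out_dir), "-t", shlex.quote(category),
    ]
    if skeleton is not None:  # -k is optional, per the guidelines above
        parts += ["-k", shlex.quote(skeleton)]
    return " ".join(parts)
```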

We also support inference through a Gradio app.

python app.py

Checkpoints

| # | Name | Backbone | Keypoint AP on COCO | Checkpoint | Config |
|---|---------|----------|---------------------|-------------------------|-------------|
| 1 | UniPose | Swin-T | 74.4 | Google Drive / OpenXLab | GitHub Link |
| 2 | UniPose | Swin-L | 76.8 | Coming Soon | Coming Soon |

The UniKPT Dataset


| Datasets | KPT | Class | Images | Instances | Unify Images | Unify Instances |
|----------------|---------|-------|--------|-----------|--------------|-----------------|
| COCO | 17 | 1 | 58,945 | 156,165 | 58,945 | 156,165 |
| 300W-Face | 68 | 1 | 3,837 | 4,437 | 3,837 | 4,437 |
| OneHand10K | 21 | 1 | 11,703 | 11,289 | 2,000 | 2,000 |
| Human-Art | 17 | 1 | 50,000 | 123,131 | 50,000 | 123,131 |
| AP-10K | 17 | 54 | 10,015 | 13,028 | 10,015 | 13,028 |
| APT-36K | 17 | 30 | 36,000 | 53,006 | 36,000 | 53,006 |
| MacaquePose | 17 | 1 | 13,083 | 16,393 | 2,000 | 2,320 |
| Animal Kingdom | 23 | 850 | 33,099 | 33,099 | 33,099 | 33,099 |
| AnimalWeb | 9 | 332 | 22,451 | 21,921 | 22,451 | 21,921 |
| Vinegar Fly | 31 | 1 | 1,500 | 1,500 | 1,500 | 1,500 |
| Desert Locust | 34 | 1 | 700 | 700 | 700 | 700 |
| Keypoint-5 | 55/31 | 5 | 8,649 | 8,649 | 2,000 | 2,000 |
| MP-100 | 561/293 | 100 | 16,943 | 18,000 | 16,943 | 18,000 |
| UniKPT | 338 | 1237 | - | - | 226,547 | 418,487 |

• UniKPT is a unified dataset drawn from 13 existing datasets and is intended for non-commercial research purposes only.

• All images in the UniKPT dataset originate from the datasets listed in the table above. To access them, please download them from the original repositories.

• We provide annotations with precise textual descriptions of the keypoints for effective training. For convenience, the text annotations can be found in the link.
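Reading such annotations might look like the sketch below. The exact UniKPT schema is not shown in this README; we assume a COCO-style layout where each category additionally carries a list of textual keypoint descriptions (the field name "keypoints_text" is our guess, not confirmed by the repo).

```python
import json

# Tiny hand-written sample in the assumed COCO-style format, with a
# hypothetical "keypoints_text" field holding the textual descriptions.
sample = json.loads("""
{
  "categories": [
    {"id": 1, "name": "person",
     "keypoints_text": ["nose", "left eye", "right eye"]}
  ],
  "annotations": [
    {"category_id": 1,
     "keypoints": [100, 120, 2, 110, 115, 2, 90, 115, 2]}
  ]
}
""")

def keypoint_texts(data, category_id):
    """Return the textual keypoint descriptions for a category, or []."""
    for cat in data["categories"]:
        if cat["id"] == category_id:
            return cat["keypoints_text"]
    return []
```

COCO-style keypoints are stored as flat (x, y, visibility) triplets, so the sample annotation above encodes three keypoints.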

Citing UniPose

If you find this repository useful for your work, please consider citing it as follows:

@article{yang2023unipose,
  title={UniPose: Detecting Any Keypoints},
  author={Yang, Jie and Zeng, Ailing and Zhang, Ruimao and Zhang, Lei},
  journal={arXiv preprint arXiv:2310.08530},
  year={2023}
}
@inproceedings{yang2023neural,
  title={Neural Interactive Keypoint Detection},
  author={Yang, Jie and Zeng, Ailing and Li, Feng and Liu, Shilong and Zhang, Ruimao and Zhang, Lei},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={15122--15132},
  year={2023}
}
@inproceedings{yang2022explicit,
  title={Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation},
  author={Yang, Jie and Zeng, Ailing and Liu, Shilong and Li, Feng and Zhang, Ruimao and Zhang, Lei},
  booktitle={The Eleventh International Conference on Learning Representations},
  year={2023}
}

About

Official implementation of the paper "UniPose: Detecting Any Keypoints"
