An easy-to-use face detection API based on the RetinaFace architecture with a ResNet50 backbone. Influenced by biubug6's implementation and optimized for inference speed and simplicity of integration.
All the dependencies have been packaged into a Dockerfile. To run the server inside Docker, you can run:

```shell
docker build . -t face_detection
```

to build the image, then:

```shell
docker run face_detection
```

to run the image and start the server. Alternatively, just run `docker compose up --build`.
If you want to run it locally instead, you can run:

```shell
pip install -r requirements.txt
```

to install dependencies (using venv or Conda is recommended here), then:

```shell
uvicorn face_detection.api:app --host 0.0.0.0 --port 8080
```

to start the server.
Batch Size / Image Size | 256x256 | 512x512 | 1024x1024 | 2048x2048 |
---|---|---|---|---|
1 | 44ms | 94ms | 303ms | 378ms |
4 | 23ms | 69ms | 267ms | 344ms |
16 | 17ms | 63ms | 261ms | 342ms |
Batch Size / Image Size | 256x256 | 512x512 | 1024x1024 | 2048x2048 |
---|---|---|---|---|
1 | 10ms | 17ms | 38ms | 100ms |
4 | 4ms | 8ms | 23ms | 83ms |
16 | 3ms | 6ms | 22ms | 80ms |
* Measured on a machine with an Intel 14900K CPU, an Nvidia 4090 GPU, and 32GB of DDR5 RAM.
Benchmark Name | Accuracy |
---|---|
WIDER FACE (Easy) | 95% |
WIDER FACE (Medium) | 94% |
WIDER FACE (Hard) | 84% |
When you start the server, it will listen on port 8080 by default. You can navigate to http://localhost:8080/docs and use the GUI to send test requests to the API.

The following endpoints are available:

* `/detect_faces_from_urls/`: Provide a list of image URLs and the server will try to download them.
* `/detect_faces_from_base64/`: Provide images encoded as base64 strings and the server will decode them.
* `/detect_faces_from_files/`: Provide images directly as form-data.
* `/ready/`: Returns HTTP 200 if the server is initialized and ready.
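As a sketch of how a client might prepare a request for the base64 endpoint, the snippet below builds a JSON body. Note that the `"images"` field name is a hypothetical placeholder, not taken from the actual API schema; check http://localhost:8080/docs for the real request shape.

```python
import base64
import json

def build_base64_payload(image_bytes_list):
    """Encode raw image bytes as base64 strings for a JSON request body.

    NOTE: the "images" field name is an assumption; consult the server's
    /docs page for the actual schema before using this in practice.
    """
    return {
        "images": [base64.b64encode(b).decode("ascii") for b in image_bytes_list]
    }

raw = b"\x89PNG\r\n"  # stand-in for the raw bytes of an image file
body = json.dumps(build_base64_payload([raw]))
# The body could then be POSTed to the endpoint, e.g. with requests:
#   requests.post("http://localhost:8080/detect_faces_from_base64/", data=body)
```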
An example response looks like this:

```json
{
  "result": [
    {
      "faces_count": 1,
      "faces": [
        {
          "bounding_box": [
            287.1794128417969,
            147.63372802734375,
            771.133056640625,
            771.7412719726562
          ],
          "landmarks": [
            { "position": [451.84375, 378.625], "type": "left_eye" },
            { "position": [666.25, 376.875], "type": "right_eye" },
            { "position": [585.8125, 518.03125], "type": "nose" },
            { "position": [465.09375, 605.375], "type": "left_mouth" },
            { "position": [647.25, 602.1875], "type": "right_mouth" }
          ],
          "confidence": 0.9999823570251465
        }
      ]
    }
  ]
}
```
The bounding box is represented by four numbers `[x1, y1, x2, y2]`, which are the coordinates of the top-left and bottom-right vertices.
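For instance, a face's width, height, and center can be derived directly from those four coordinates:

```python
# Bounding box in [x1, y1, x2, y2] form, rounded from the example response above.
x1, y1, x2, y2 = 287.18, 147.63, 771.13, 771.74

width = x2 - x1                          # horizontal extent of the face
height = y2 - y1                         # vertical extent of the face
center = ((x1 + x2) / 2, (y1 + y2) / 2)  # midpoint of the box
```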
If you don't need a server but want to do inference directly in your own code, you can also use this repo as a library:

```python
from PIL import Image

from face_detection.service import detect_faces

image = Image.open("some_image.jpg")
results = detect_faces([image])
```
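Assuming the objects returned by `detect_faces` mirror the JSON response shape shown above (an assumption worth verifying against `face_detection.service`), a minimal sketch for visualizing detections could look like:

```python
from PIL import Image, ImageDraw

def draw_boxes(image, faces):
    """Draw a red rectangle around each detected face, modifying the image in place.

    NOTE: this assumes each face is a dict with a "bounding_box" key in
    [x1, y1, x2, y2] form, mirroring the JSON response shape.
    """
    draw = ImageDraw.Draw(image)
    for face in faces:
        x1, y1, x2, y2 = face["bounding_box"]
        draw.rectangle([x1, y1, x2, y2], outline="red", width=3)
    return image

# Demo with a blank image and one hand-written detection:
img = Image.new("RGB", (200, 200), "white")
draw_boxes(img, [{"bounding_box": [50, 50, 150, 150]}])
```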
There are some basic tests under `face_detection/tests`. They can be run with `python -m pytest .` from inside that directory.
```bibtex
@misc{deng2019retinaface,
  title={RetinaFace: Single-stage Dense Face Localisation in the Wild},
  author={Jiankang Deng and Jia Guo and Yuxiang Zhou and Jinke Yu and Irene Kotsia and Stefanos Zafeiriou},
  year={2019},
  eprint={1905.00641},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```