
How to recognize more than one face in one image? #216

Open
bit-scientist opened this issue Dec 7, 2023 · 1 comment

Comments

@bit-scientist

The inference pipeline given in infer.py deals with one face per image. I have several images where each image contains many people, and I would like to recognize every unique person individually across all images.
I have put together an example case below:

(images belong to their respective owners)

[image]

Here, I would like to assign IDs (0 to 5) to each character in these images, and I have been thinking about how to accomplish this.

For now, I can think of one approach: loop all images through MTCNN (keep_all=True) to get the face crops (6 per image), compute their embeddings with the ResNet, then compute their pairwise distance matrix using:

import pandas as pd

dists = [[(e1 - e2).norm().item() for e2 in embeddings] for e1 in embeddings]
print(pd.DataFrame(dists, columns=names, index=names))

But I don't know what to do with these numbers afterwards.
Since each face crop is (for now) treated as unique, I will get a 36x36 matrix, right?
How should I proceed from here?
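For context, one common way to turn such a distance matrix into person IDs is single-linkage grouping under a threshold: any two crops whose embedding distance is below the threshold are assigned the same ID. A minimal sketch with union-find (the `assign_ids` helper, the 0.9 cutoff, and the fake embeddings are illustrative assumptions, not part of facenet-pytorch):

```python
# Sketch: turn an N x N distance matrix into person IDs by grouping
# crops whose pairwise distance falls below a threshold (single-linkage,
# implemented with union-find).
import numpy as np

def assign_ids(dists, threshold=0.9):
    """Return a 0-based ID per face crop; crops linked by any chain of
    below-threshold distances share an ID."""
    n = len(dists)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dists[i][j] < threshold:
                parent[find(i)] = find(j)  # merge the two groups

    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]

# toy check: two tight clusters of fake 512-dim "embeddings"
rng = np.random.default_rng(0)
emb = np.vstack([rng.normal(0.0, 0.01, (3, 512)),
                 rng.normal(5.0, 0.01, (3, 512))])
d = np.linalg.norm(emb[:, None] - emb[None, :], axis=-1)
print(assign_ids(d))  # → [0, 0, 0, 1, 1, 1]
```

The threshold depends on the embedding model; for the VGGFace2 InceptionResnetV1 embeddings, same-person distances are typically well below 1.0, but it is worth calibrating on your own data.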

I hope to get some help, as I am quite new to the face recognition domain. Thanks!

@Aldhanekaa

Hello @bit-scientist, I just figured out how to recognize more than one face using this facenet-pytorch face recognition approach.

As the example in infer.py shows, it extracts an embedding for each person's face and collects them in a list, as in the code below:

aligned = []
names = []
for x, y in loader:
    x_aligned, prob = mtcnn(x, return_prob=True)
    if x_aligned is not None:
        print('Face detected with probability: {:8f}'.format(prob))
        aligned.append(x_aligned)
        names.append(dataset.idx_to_class[y])

aligned = torch.stack(aligned).to(device)
embeddings = resnet(aligned).detach().cpu()

Then you can save both the embeddings and the names to a file:

data = [embeddings, names]
torch.save(data, 'Facenet Pytorch Finetuning embeddings.pt')  # save embeddings + names together

I tried this to classify 5 different people with dozens of images of each person, and the saved file is only about 1.5 MB.

Next up, here is how to use the saved embeddings:

# importing libraries

from facenet_pytorch import MTCNN, InceptionResnetV1
import torch
from torchvision import datasets
from torch.utils.data import DataLoader
from PIL import Image
import cv2


"""Initializing global variables"""

def get_device():
    """Returns the best available device for PyTorch."""

    if torch.backends.mps.is_available():
        device = torch.device("mps")
    elif torch.cuda.is_available():
        device = torch.device("cuda:0")
    else:
        device = torch.device("cpu")

    return device
device = get_device()
load_data = torch.load('Facenet Pytorch Finetuning embeddings.pt',map_location=device) 
embedding_list = load_data[0] 
name_list = load_data[1] 
print(embedding_list.shape)

resnet = InceptionResnetV1(pretrained='vggface2').eval()
# resnet.to(device)
mtcnn = MTCNN(image_size=160, margin=0, min_face_size=20,
    thresholds=[0.6, 0.7, 0.7], factor=0.709, post_process=True,
    keep_all=True)  # keep_all=True so every face in the frame is returned


cam = cv2.VideoCapture(0) 

while True:
    ret, frame = cam.read()
    if not ret:
        print("fail to grab frame, try again")
        break
        
    img = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))  # OpenCV gives BGR; MTCNN expects RGB
    img_cropped_list, prob_list = mtcnn(img, return_prob=True) 
    
    if img_cropped_list is not None:
        boxes, _ = mtcnn.detect(img)
                
        for i, prob in enumerate(prob_list):
            if prob>0.90:
                emb = resnet(img_cropped_list[i].unsqueeze(0)).detach() 
                
                dist_list = [] # list of matched distances, minimum distance is used to identify the person
                
                for idx, emb_db in enumerate(embedding_list):
                    dist = torch.dist(emb.to(device=device), emb_db).item()
                    dist_list.append(dist)

                min_dist = min(dist_list)  # minimum distance
                min_dist_idx = dist_list.index(min_dist)  # index of the minimum distance
                name = name_list[min_dist_idx]  # name corresponding to the minimum distance
                
                box = boxes[i] 

                if min_dist < 0.65:
                    score = 1 - min_dist  # crude similarity score for display only
                    frame = cv2.putText(frame, name + ' ' + str(score), (int(box[0]), int(box[1])), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 1)
                    print(f"{name} {score}")
                else:
                    frame = cv2.putText(frame, "Unknown", (int(box[0]),int(box[1])), cv2.FONT_HERSHEY_SIMPLEX, 1, (0,255,0),1)
                
                frame = cv2.rectangle(frame, (int(box[0]),int(box[1])) , (int(box[2]),int(box[3])), (255,0,0), 2)

    cv2.imshow("IMG", frame)
    if cv2.waitKey(1) == ord('q'):
        break        
        
        
cam.release()
cv2.destroyAllWindows()
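A side note on the design: the inner Python loop above computes one distance per stored embedding; `torch.cdist` can compare all detected faces against the whole database in a single call. A minimal sketch (the `match_faces` helper and the toy 4-dim vectors are illustrative assumptions, not part of the script above):

```python
import torch

def match_faces(face_embs, db_embs, db_names, threshold=0.65):
    """face_embs: (F, D) embeddings of detected faces; db_embs: (N, D)
    stored embeddings. Returns one (name_or_Unknown, distance) per face."""
    d = torch.cdist(face_embs, db_embs)  # (F, N) pairwise L2 distances
    min_dist, idx = d.min(dim=1)         # nearest stored embedding per face
    return [(db_names[i] if md < threshold else "Unknown", round(md.item(), 3))
            for i, md in zip(idx.tolist(), min_dist)]

# toy check with made-up 4-dim "embeddings"
db = torch.tensor([[1.0, 0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0, 0.0]])
faces = torch.tensor([[0.9, 0.1, 0.0, 0.0],   # close to the first entry
                      [0.0, 0.0, 1.0, 0.0]])  # close to neither
print(match_faces(faces, db, ["alice", "bob"]))
```

With real embeddings, `face_embs` would be `resnet(img_cropped_list).detach()` and `db_embs` the loaded `embedding_list`, so the per-face `for idx, emb_db in ...` loop disappears.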

Thanks! Hope it helps 🤠
