Audio and Image Recognition

This project develops several machine learning and deep learning models for audio and image recognition. It covers two tasks:

Audio Recognition: we try to identify the speaker from a one-dimensional signal;
Face Recognition: we aim to identify the subject in an image.

First, we manually collected the audio recordings and the images used to train our models. In particular, we recorded 300 audio clips of five seconds each (later segmented into overlapping fragments of 2 seconds) and collected 300 images containing faces with different facial expressions and under different lighting conditions.
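The segmentation step described above can be sketched as follows. This is a minimal illustration, not the code from the notebooks: the function name `segment_signal`, the 1-second hop between fragments, and the 16 kHz sample rate are assumptions, since the README states only the clip length (5 seconds) and the fragment length (2 seconds).

```python
def segment_signal(signal, sample_rate, fragment_seconds=2.0, hop_seconds=1.0):
    """Split a 1-D signal into overlapping fixed-length fragments.

    hop_seconds (an assumed value) is the distance between the starts of
    two consecutive fragments; fragments overlap whenever it is smaller
    than fragment_seconds.
    """
    frag_len = int(round(fragment_seconds * sample_rate))
    hop_len = int(round(hop_seconds * sample_rate))
    if frag_len <= 0 or hop_len <= 0:
        raise ValueError("fragment and hop lengths must be positive")

    fragments = []
    # Stop as soon as a full fragment no longer fits in the signal.
    for start in range(0, len(signal) - frag_len + 1, hop_len):
        fragments.append(signal[start:start + frag_len])
    return fragments


# Example: a 5-second clip at an assumed 16 kHz sample rate.
sample_rate = 16000
clip = [0.0] * (5 * sample_rate)
fragments = segment_signal(clip, sample_rate)
print(len(fragments), len(fragments[0]))
```

With these assumed settings each 5-second clip yields four 2-second fragments, each sharing one second with its neighbour. A smaller hop produces more training fragments at the cost of more redundancy between them.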

For the audio recognition task we experimented with different approaches:

Support Vector Machine (SVM);
Random Forest (RF);
Neural network built from scratch;
CNN using the spectrograms of the audio clips.
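To show what the CNN input in the last approach might look like, here is a hedged sketch that turns a 1-D signal into a log-magnitude spectrogram with NumPy. The function name `log_spectrogram`, the frame length of 512 samples, the hop of 256 samples, and the Hann window are all assumptions; the notebooks may use a dedicated audio library and different parameters.

```python
import numpy as np


def log_spectrogram(signal, frame_length=512, hop_length=256):
    """Return a log-magnitude spectrogram of shape (freq_bins, n_frames)."""
    signal = np.asarray(signal, dtype=np.float64)
    if signal.ndim != 1:
        raise ValueError("expected a one-dimensional signal")
    if len(signal) < frame_length:
        raise ValueError("signal is shorter than one frame")

    # Cut the signal into overlapping frames and taper each with a Hann window.
    n_frames = 1 + (len(signal) - frame_length) // hop_length
    window = np.hanning(frame_length)
    frames = np.stack([
        signal[i * hop_length:i * hop_length + frame_length] * window
        for i in range(n_frames)
    ])

    # Magnitude of the real FFT of each frame, converted to decibels.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return 20.0 * np.log10(magnitude + 1e-10).T


# Example: a 2-second, 1 kHz test tone at an assumed 16 kHz sample rate.
sample_rate = 16000
t = np.arange(2 * sample_rate) / sample_rate
spec = log_spectrogram(np.sin(2 * np.pi * 1000 * t))
print(spec.shape)
```

The resulting 2-D array can be treated like a single-channel image, which is what makes a convolutional network a natural fit for this representation.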

For the face recognition task we used fine-tuning, testing different CNNs, including:

MobileNetV2;
VGG;
VGGFace.
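As an illustration of the fine-tuning idea, the sketch below builds a classifier on top of a frozen MobileNetV2 backbone using the Keras API bundled with TensorFlow. It is not the code from the notebooks: the function name `build_finetune_model`, the input size, the dropout rate, the optimizer settings, and the number of classes are assumptions chosen for the example.

```python
import tensorflow as tf


def build_finetune_model(num_classes, input_shape=(224, 224, 3), weights="imagenet"):
    """Pretrained MobileNetV2 backbone with a new classification head."""
    base = tf.keras.applications.MobileNetV2(
        input_shape=input_shape, include_top=False, weights=weights
    )
    # Freeze the backbone so that only the new head is trained at first.
    base.trainable = False

    inputs = tf.keras.Input(shape=input_shape)
    # MobileNetV2 expects pixel values scaled from [0, 255] to [-1, 1].
    x = tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1.0)(inputs)
    x = base(x, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(0.2)(x)  # assumed regularization strength
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


# Example: one output per subject; the class count here is hypothetical.
model = build_finetune_model(num_classes=3, input_shape=(96, 96, 3), weights=None)
print(model.output_shape)
```

The example passes `weights=None` only to avoid downloading the pretrained weights; fine-tuning proper would keep the default `weights="imagenet"`. A common follow-up, once the head has converged, is to unfreeze some of the top backbone layers and continue training with a much smaller learning rate.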

About us:

Ginevra Mariani - enigarv

Lorenzo Mora - lomProg

Riccardo Confalonieri - rconfa

Repository files:

AudioAcquisition.ipynb
AudioRecognition.ipynb
FaceRecognition.ipynb
ImageAcquisition.ipynb
README.md
