Food Recognition using Deep Learning

This repository contains a description in both french and english.

Food Recognition using Deep Learning

The main goal of this project is the recognition of food based on a picture of these. This project is a starter for a bigger project aiming to help diabetic people having an estimation of the glucides they are eating, based on a cellphone picture.

The source code of this project is available there.

Useful ressources

Work done

Discovering the object detection world

Discovery

Searching for food datasets
Documentation about various deep learning algorithms
Understanding the Food-101 dataset

First experience

Setting up a IBM Cloud Storage instance
Upload a small subset on it
Using the Cloud Annotation Tool on it
Training a SSD MobileNet model on an IBM Cloud Computing instance
Get the model back
Install and configure a ReactJS app to test the model directly on the webcam

The subset was way too small to have good results. The detections were completely chaotic. Moreover, the choice to be able to detect food based on a video (or webcam) is not relevant, we don't need to sacrifice performance for calculus time on our use case.

Creating a custom dataset

I extracted around 15 classes from the Food101 dataset, choosing those having a high chance of containing other classes in their images, so that I could avoid spending too much time scrapping and annotation time (ex: burger, ketchup and fries are often together !).

I scrapped Google Images to obtain custom classes data like "salad" using the download tool Bulk Image Downloader.

I obtained around 1900 images dispatched in 19 classes, therefore at least 100 images per class. I then resized these into a 250x250 pixels format.

For the annotation, I used the LabelImg tool, saving boundary boxes as PascalVOC format.

Setting up an AWS GPU optimised instance

Following many problems in the setup and configuration of Tensorflow compiler on Windows, I decided to work on Linux. I ordered a virtual machine at Amazon Web Services, its type is g3.4xlarge, whoch is optimised for GPU computation (16 cores, 47 ECU, 122Gio memory). I then obtained a working environment.

Formatting & splitting data

Send the dataset to the remote server

scp -i <key_path> -r <user>@<server>:<remote_path> <local_path>

Conversion of annotations from PascalVOC to CSV

python voc_to_csv.py

Writting a splitting script for train and test subsets

python labels_test_train_split.py

Compiling the dataset into a tfrecord file, usable bu Tensorflow. This has to be done two times, for training and testing set.

python generate_tfrecord.py

Creating a class map file

Model

We are using the transfer learning method to initialise our model based on an already-trained model, saving a lot a time and pain.

We had the choice between lighter models and more accurate ones. I chose te second option following my first experience described above. The model used is Inception v2 trained on the Coco dataset. I decided that time was not really a factor in my use case.

Setup

Clone the project from the TensorFlow GitHub
Integrate to ours, and configure the pipeline

Training

We can launch the training that way:

python object_detection/model_main.py \
    --pipeline_config_path=training/inception_v2.config \
    --model_dir=<RESULT_MODEL_PATH>/ \
    --num_train_steps=<NUM_TRAIN_STEP> \
    --sample_1_of_n_eval_examples=1 \
    --alsologtostderr

Let's supervise the training with TensorBoard (evaluation jobs):

# ssh into the server and tunnel the ports
ssh -L 127.0.0.1:<local_port>:127.0.0.1:<remote_port> <user>@<server>
# launch tensorboard
cd path/to/project
tensorboard log_dir=.

Finally lets save the model as a .pb file:

python export_inference_graph.py 
    --input_type image_tensor 
    --pipeline_config_path training/inception_v2.config
    --trained_checkpoint_prefix <last_checkpoint_path>
    --output_directory <out_directory>

Roadmap

Conversion script to convert the model from a .pb file format to JSON, for it to be usable on the web
Modify the demo app so we can test the new model on static image and not webcam
Make sure we chose the best fitting algorithm for our use case
Optimise the chosen model
Develop an API serving our model to the world
Refactor the architecture of our project so that it is more clean
Add classes to be recognised ?

Conclusion

I encountered many difficulties to setup a working environment on Windows 10, moreover the custom dataset creation is a very boring and repetitive task.

I took a lot of pleasure setting up my own unique dataset, and to put it to pratice on a Deep Learning algorithm. Finally, despite many possible optimisations, the model has correct performances, and that makes me happy.

Author

Georges Cosson : LinkedIn - GitHub

Licence

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Name		Name	Last commit message	Last commit date
Latest commit History 31 Commits
annotations		annotations
data		data
images		images
models		models
report		report
scripts		scripts
servers		servers
training		training
.gitignore		.gitignore
LICENSE		LICENSE
README.fr.md		README.fr.md
README.md		README.md
package-lock.json		package-lock.json
sem2.md		sem2.md

License

stressGC/food-deep-learning-detection

Folders and files

Latest commit

History

Repository files navigation

Food Recognition using Deep Learning

Useful ressources

Work done

Discovering the object detection world

Discovery

First experience

Creating a custom dataset

Setting up an AWS GPU optimised instance

Formatting & splitting data

Model

Setup

Training

Roadmap

Conclusion

Author

Licence

About

Topics

Resources

License

Stars

Watchers

Forks

Languages