simple_object_localization_app

This project localizes and classifies an object in an image. Note: it only detects cucumber, eggplant, and mushroom, because the dataset I used contains only those objects. I use Flask as the backend to create an API and HTML as the interface to turn it into a web app.

Dataset

You can get the dataset from Kaggle - Image Localization Dataset. The dataset contains object images in JPG format and XML files containing the annotations for the corresponding images.

image

Notebook

I built the model in an .ipynb file using Google Colab. Here is an explanation of the notebook:

  1. First, I tested plotting an image with its bounding box. Using the xml.etree.ElementTree library, I extracted xmin, ymin, xmax, and ymax from the XML file that matches the image, then drew the bounding box on the image with cv2.rectangle() using those coordinates. This is the result:

image
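The XML extraction in step 1 can be sketched as follows (a minimal sketch assuming a Pascal VOC-style annotation layout; the helper name `parse_annotation` is hypothetical):

```python
import xml.etree.ElementTree as ET

def parse_annotation(xml_string):
    # Extract the label and box corners from a Pascal VOC-style annotation
    root = ET.fromstring(xml_string)
    label = root.find(".//name").text
    box = root.find(".//bndbox")
    coords = tuple(int(box.find(tag).text)
                   for tag in ("xmin", "ymin", "xmax", "ymax"))
    return label, coords
```

The box can then be drawn on the loaded image with `cv2.rectangle(img, (xmin, ymin), (xmax, ymax), (0, 255, 0), 2)`.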

  2. Then I read all the XML files to extract the label, xmin, ymin, xmax, and ymax and appended them to a list. I encoded the categorical labels as numbers ({"cucumber": 0, "eggplant": 1, "mushroom": 2}). I also read all the image files and appended the images to a list.
  3. I used np.array() to convert the lists of images and outputs (label, xmin, ymin, xmax, and ymax) into arrays.
  4. Then I split the input and output arrays into x_train, x_test, y_train, and y_test using sklearn.model_selection.train_test_split() with test_size = 0.3 and random_state = 42.
  5. Because y_train and y_test contain five values each (label, xmin, ymin, xmax, and ymax), I separated the label from the other values (the coordinates xmin, ymin, xmax, and ymax used to build the bounding box), because the model has two outputs (label and bounding-box coordinates) and one input (the image array).
  6. I one-hot encoded the labels using tf.keras.utils.to_categorical().
  7. For the base model I used the pretrained MobileNetV2 with input_shape = (224, 224, 3), weights = 'imagenet', and include_top = False (the task has 3 classes).
  8. Then I added my own layers on top of the pretrained model and compiled it with the Adam optimizer (lr = 1e-4) and two losses: categorical_crossentropy for classification and mse for the bounding box. I also used two metrics: accuracy for classification and mse for the bounding box. Then I fit the model for 50 epochs and got this result:

image
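Steps 2–5 above can be sketched with dummy data (the variable names `images` and `outputs` are assumptions, and `np.eye` stands in for `tf.keras.utils.to_categorical` to keep the sketch self-contained):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Dummy stand-ins for the real data: N images plus N output rows of
# [label, xmin, ymin, xmax, ymax], with the label already encoded as 0/1/2
N = 10
images = np.random.rand(N, 224, 224, 3)
outputs = np.column_stack([
    np.random.randint(0, 3, N),           # encoded label
    np.random.randint(0, 224, (N, 4)),    # xmin, ymin, xmax, ymax
])

# 70/30 split, as in the notebook
x_train, x_test, y_train, y_test = train_test_split(
    images, outputs, test_size=0.3, random_state=42)

# Split the 5-value output into the two targets the model expects
train_labels = np.eye(3)[y_train[:, 0].astype(int)]   # one-hot, like to_categorical
train_boxes = y_train[:, 1:].astype("float32")         # box coordinates
```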

  9. I saved the model for use in the API later.
  10. I tested the model on an image and got the following predicted object localization:

image
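A sketch of the two-output model described in steps 7–8 (the head layers on top of MobileNetV2 are assumptions, since the notebook does not list them here; the `weights` argument is parameterized so the sketch can be run without downloading the ImageNet weights):

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_model(num_classes=3, weights="imagenet"):
    # Pretrained backbone without its classification head
    base = tf.keras.applications.MobileNetV2(
        input_shape=(224, 224, 3), include_top=False, weights=weights)
    x = layers.GlobalAveragePooling2D()(base.output)
    # Two heads: class probabilities and bounding-box coordinates
    class_out = layers.Dense(num_classes, activation="softmax",
                             name="class_output")(x)
    box_out = layers.Dense(4, name="box_output")(x)
    model = Model(inputs=base.input, outputs=[class_out, box_out])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
        loss={"class_output": "categorical_crossentropy",
              "box_output": "mse"},
        metrics={"class_output": "accuracy", "box_output": "mse"})
    return model

# model.fit(x_train, {"class_output": train_labels, "box_output": train_boxes},
#           epochs=50)
# model.save("model.h5")  # assumed filename; saved for the Flask API
```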

Web APP

       For the web app I have:

  1. app.py for the backend and the API
  2. a static folder to store static files such as uploaded images and predicted images
  3. a template folder for the HTML front end
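A minimal sketch of what app.py could look like (the route, template name, and the `decode_prediction` helper are assumptions, not the repository's actual code; the label order matches the encoding used in the notebook):

```python
import numpy as np
from flask import Flask, render_template, request

app = Flask(__name__)
LABELS = ["cucumber", "eggplant", "mushroom"]  # encoding from the notebook

def decode_prediction(class_pred, box_pred):
    # Turn raw model outputs into a label and integer box corners
    label = LABELS[int(np.argmax(class_pred))]
    xmin, ymin, xmax, ymax = (int(v) for v in box_pred)
    return label, (xmin, ymin, xmax, ymax)

@app.route("/", methods=["GET", "POST"])
def index():
    if request.method == "POST":
        # Here the real app would read the upload from request.files,
        # resize it to (224, 224), run model.predict, draw the box with
        # cv2.rectangle, save the result under static/, and render it.
        pass
    return render_template("index.html")
```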

       Here's the result

image
