
This repository is a small POC that implements a DVC pipeline for classifying the Fashion MNIST dataset with CNNs. It also implements a CI-CD pipeline through GitHub Actions, which runs the whole DVC pipeline in the background whenever changes are pushed to the repository.


vmalgi/Fashion_MNIST_DVC


Aim of this small POC

The aim of this POC is to build an end-to-end machine learning pipeline that classifies Fashion MNIST images using the DVC (Data Version Control) framework, and then to deploy the whole ML pipeline using GitHub Actions as the CI-CD pipeline. An additional goal is to understand how ML pipelines work with a state-of-the-art framework such as DVC and to view the model's results in DVC Studio. This POC can be extended further to deploy the ML pipeline on AWS.

About the data set

Fashion-MNIST is a dataset of Zalando's article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. Zalando intends Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits.
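To make the data layout concrete, here is a minimal sketch of reading Fashion-MNIST files in the IDX binary format that the dataset shares with the original MNIST. It is not taken from this repository: the function names are hypothetical, and it assumes the file has already been decompressed into bytes (the published files are gzipped, so something like gzip.open would be needed first).

```python
import struct
from typing import List

# Class names for labels 0-9, as published with the Fashion-MNIST dataset.
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

IMAGE_MAGIC = 2051  # IDX magic number of an unsigned-byte image file
LABEL_MAGIC = 2049  # IDX magic number of an unsigned-byte label file


def parse_idx_images(raw: bytes) -> List[bytes]:
    """Split an uncompressed IDX image file into one bytes object per image."""
    # Header: four big-endian unsigned 32-bit integers.
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != IMAGE_MAGIC:
        raise ValueError(f"unexpected image magic number: {magic}")
    size = rows * cols  # 28 * 28 = 784 bytes per Fashion-MNIST image
    if len(raw) != 16 + count * size:
        raise ValueError("image file length does not match its header")
    return [raw[16 + i * size: 16 + (i + 1) * size] for i in range(count)]


def parse_idx_labels(raw: bytes) -> List[int]:
    """Read an uncompressed IDX label file into a list of integer labels."""
    # Header: two big-endian unsigned 32-bit integers.
    magic, count = struct.unpack(">II", raw[:8])
    if magic != LABEL_MAGIC:
        raise ValueError(f"unexpected label magic number: {magic}")
    if len(raw) != 8 + count:
        raise ValueError("label file length does not match its header")
    return list(raw[8:])
```

Each parsed image is 784 bytes (28 x 28 pixels, one byte per grayscale value), and CLASS_NAMES[label] maps an integer label to its class name.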

Steps to recreate this experiment

  1. Create a conda environment in your working directory (for example, from the VSCode terminal). Alternatively, clone this repository and create the environment inside it.
     conda create --prefix ./env python=3.7 -y
  2. Activate the conda environment.
     conda activate ./env

     OR

     source activate ./env
  3. Install the requirements.
     pip install -r requirements.txt
  4. Initialize the DVC project.
     dvc init
  5. Run the ML pipeline.
     dvc repro
  6. View the results of the model in DVC Studio.
  7. View the ML pipeline setup.
     dvc dag

[Image: ML-Pipeline-FMNIST]
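As a rough mental model of what dvc repro does with the pipeline graph, the sketch below reruns a stage only when the content of its dependencies has changed or one of its outputs is missing. This is a simplified illustration, not DVC's implementation: the Stage type, the repro function, the in-memory workspace and the stage names are all hypothetical, and DVC itself records hashes in its dvc.lock file rather than in a Python dict.

```python
import hashlib
from typing import Callable, Dict, List, NamedTuple

# File name -> file content; an in-memory stand-in for files on disk.
Workspace = Dict[str, bytes]


class Stage(NamedTuple):
    name: str
    deps: List[str]                    # files the stage reads
    outs: List[str]                    # files the stage writes
    run: Callable[[Workspace], None]   # the stage's command


def deps_digest(stage: Stage, ws: Workspace) -> str:
    """Hash the names and contents of all dependencies into one digest."""
    h = hashlib.sha256()
    for dep in sorted(stage.deps):
        h.update(dep.encode("utf-8"))
        h.update(ws[dep])
    return h.hexdigest()


def repro(stages: List[Stage], ws: Workspace, lock: Dict[str, str]) -> List[str]:
    """Run stages (listed in dependency order) whose inputs changed.

    Returns the names of the stages that were actually executed.
    """
    executed = []
    for stage in stages:
        digest = deps_digest(stage, ws)
        missing_out = any(out not in ws for out in stage.outs)
        if lock.get(stage.name) != digest or missing_out:
            stage.run(ws)
            lock[stage.name] = digest  # remember what the stage last ran against
            executed.append(stage.name)
    return executed


# Hypothetical two-stage pipeline: prepare the data, then train on it.
def prepare(ws: Workspace) -> None:
    ws["prepared.bin"] = ws["raw.bin"].upper()


def train(ws: Workspace) -> None:
    ws["model.bin"] = b"model:" + ws["prepared.bin"]


STAGES = [
    Stage("prepare", ["raw.bin"], ["prepared.bin"], prepare),
    Stage("train", ["prepared.bin"], ["model.bin"], train),
]
```

Calling repro(STAGES, ws, lock) twice in a row on an unchanged workspace executes nothing the second time; editing raw.bin causes both stages to run again, because the change propagates through prepared.bin.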

Note:

  1. DVC must be installed before running any dvc command, such as dvc init or dvc repro. It can be installed using pip install dvc.
  2. Experiment results can be viewed in DVC Studio (https://studio.iterative.ai).
  3. CI-CD pipelines can be created on GitHub using Continuous Machine Learning (CML) (https://github.com/iterative/cml#getting-started).
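For results to be viewable as described in the notes above, an evaluation stage typically writes them to a small metrics file that DVC can track (DVC supports JSON metrics files declared under a stage's metrics in dvc.yaml). The sketch below shows one way to do that; the function names and the metrics.json file name are assumptions for illustration, not taken from this repository.

```python
import json
from typing import Dict, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("label sequences must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def write_metrics(path: str, metrics: Dict[str, float]) -> None:
    """Write metrics as JSON so that a tool such as DVC can read them."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(metrics, f, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical labels; a real stage would use the model's test predictions.
    acc = accuracy([0, 1, 2, 3], [0, 1, 2, 0])
    write_metrics("metrics.json", {"accuracy": acc})
```

Keeping metrics in a plain JSON file means the same numbers can be compared across commits without rerunning the model.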
