harjeet-blue/Ensemble-Learning-Bagging-and-Boosting

Ensemble-Learning-Bagging-and-Boosting

Ensemble learning is a machine learning approach in which multiple models, often trained with the same learning algorithm, are combined into a single predictor. Bagging reduces the variance of the prediction by generating additional training sets from the original dataset: examples are sampled with replacement (combinations with repetitions), producing multisets of the original data. Boosting is an iterative technique that adjusts the weight of each observation based on the previous classification: if an observation was classified incorrectly, its weight is increased. In general, boosting builds strong predictive models.
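The reweighting idea described above can be sketched as follows. This is a minimal illustration that assumes the AdaBoost update rule for binary labels; the function name `reweight` and the toy labels are hypothetical and not taken from this repository.

```python
import math

def reweight(weights, y_true, y_pred):
    """One AdaBoost-style reweighting step (assumed rule, for illustration).

    Misclassified observations get a larger weight, correctly classified
    ones a smaller weight, and the result is normalised to sum to 1.
    """
    total = sum(weights)
    # Weighted error rate of the current classifier.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p) / total
    # Clamp to avoid division by zero or log(0) for perfect/useless classifiers.
    err = min(max(err, 1e-10), 1 - 1e-10)
    # Classifier weight: larger when the error is smaller.
    alpha = 0.5 * math.log((1 - err) / err)
    updated = [
        w * math.exp(alpha if t != p else -alpha)
        for w, t, p in zip(weights, y_true, y_pred)
    ]
    norm = sum(updated)
    return [w / norm for w in updated], alpha

# Toy example: four observations, the last one is misclassified.
weights = [0.25, 0.25, 0.25, 0.25]
y_true = [1, 1, 0, 0]
y_pred = [1, 1, 0, 1]
new_weights, alpha = reweight(weights, y_true, y_pred)
print(new_weights, alpha)
```

After the update, the misclassified observation carries more weight than each correctly classified one, so the next model in the sequence is pushed to get it right.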

In this repository, we apply bagging and gradient boosting with decision trees to the MNIST datasets and obtain an accuracy of 90%.

Bagging

Bagging is used when the goal is to reduce the variance of a decision tree classifier. The idea is to create several subsets of the training data, each drawn at random with replacement. Each subset is used to train its own decision tree, so we end up with an ensemble of different models. The predictions of the individual trees are then combined (averaged, or decided by majority vote for classification), which is more robust than relying on a single decision tree classifier.
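The procedure above can be sketched in plain Python. This is a minimal illustration, not the code used in this repository: it assumes depth-1 decision trees (stumps) in place of full decision trees, a tiny one-feature toy dataset in place of MNIST, and majority voting to combine predictions. All function names are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(X, y, rng):
    """Draw len(X) examples uniformly at random with replacement."""
    n = len(X)
    idx = [rng.randrange(n) for _ in range(n)]
    return [X[i] for i in idx], [y[i] for i in idx]

def fit_stump(X, y):
    """Fit a depth-1 decision tree: one feature, one threshold.

    Returns (feature, threshold, left_label, right_label) with the
    fewest training errors among all candidate splits.
    """
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = (sum(lab != left_label for lab in left)
                      + sum(lab != right_label for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    return best[1:]

def predict_stump(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label

def fit_bagging(X, y, n_estimators=25, seed=0):
    """Train one stump per bootstrap sample."""
    rng = random.Random(seed)
    return [fit_stump(*bootstrap_sample(X, y, rng)) for _ in range(n_estimators)]

def predict_bagging(ensemble, row):
    """Combine the individual predictions by majority vote."""
    votes = Counter(predict_stump(stump, row) for stump in ensemble)
    return votes.most_common(1)[0][0]

# Toy data: small values belong to class 0, large values to class 1.
X = [[0.0], [1.0], [2.0], [3.0], [7.0], [8.0], [9.0], [10.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
ensemble = fit_bagging(X, y)
print(predict_bagging(ensemble, [0.0]), predict_bagging(ensemble, [10.0]))
```

Because every stump sees a different bootstrap sample, the learned thresholds differ from model to model; the vote smooths out those differences, which is where the variance reduction comes from.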

Gradient Boosting

Gradient Boosting is a popular boosting algorithm. In gradient boosting, each predictor corrects the errors of its predecessor. In contrast to AdaBoost, the weights of the training instances are not adjusted; instead, each predictor is trained using the residual errors of its predecessor as labels.
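Fitting each predictor to the residuals of its predecessor can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's implementation: it uses regression with squared error (where the residual is exactly the quantity to fit), depth-1 regression trees, a one-feature toy dataset, and a fixed learning rate. All names are hypothetical.

```python
def fit_regression_stump(xs, targets):
    """Depth-1 regression tree on one feature.

    Returns (threshold, left_mean, right_mean) minimising squared error.
    """
    best = None
    # Skip the largest value so the right side is never empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, targets) if x <= t]
        right = [r for x, r in zip(xs, targets) if x > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, left_mean, right_mean)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def fit_gradient_boosting(xs, ys, n_estimators=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_estimators):
        # Residual errors of the current ensemble become the new labels.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_regression_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, stumps, learning_rate

def predict_boosted(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)

def training_mse(model, xs, ys):
    return sum((y - predict_boosted(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)

# Toy data: y = x^2 on ten points.
xs = [float(i) for i in range(10)]
ys = [x * x for x in xs]
model = fit_gradient_boosting(xs, ys)
print(training_mse(model, xs, ys))
```

Each round shrinks the remaining residuals a little, so the training error falls as predictors are added; the learning rate controls how much of each correction is applied.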

MNIST and FMNIST Datasets

MNIST Dataset

The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used for training various image processing systems. The database is also widely used for training and testing in the field of machine learning.

FMNIST Dataset

The Fashion MNIST dataset is an alternative to the standard MNIST dataset. Instead of handwritten digits, it contains 70,000 grayscale images of 28x28 pixels showing ten types of fashion items.