The purpose of this repository is to develop the fundamental ML algorithms from scratch and to study and review their concepts. **It is also worth mentioning that this idea came to me during a period of global quarantine.**
- Naive Bayes Multinomial/Bernoulli: uses only categorical features (with Bernoulli or multinomial distributions) to perform the classification.
- Naive Bayes Gaussian: uses only numerical features (under the assumption of normality of the data) to perform the classification.
- ID3 (Iterative Dichotomiser 3): was developed in 1986 by Ross Quinlan. The algorithm creates a multiway tree, finding for each node (i.e. in a greedy manner) the categorical feature that will yield the largest information gain for categorical targets. Trees are grown to their maximum size and then a pruning step is usually applied to improve the ability of the tree to generalise to unseen data.
- Euclidean Distance K Nearest Neighbours: a non-parametric method proposed by Thomas Cover, used for classification and regression. In both cases, the input consists of the k closest training examples in the feature space.
- Simple Linear Regression: a simplified version of the regression method that can be used in place of ordinary least squares. It fits a sample of two-dimensional points, modelling a dependent variable as a linear function of a single independent variable.
- Least Squares Regression: fits a linear model in the least-squares sense, minimizing the sum of squared residuals. Includes both closed-form solutions via derivatives and gradient descent training (with Ridge and Lasso regularization).
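As a rough illustration of the categorical Naive Bayes idea above, here is a minimal sketch (not the repository's implementation; function names and the Laplace smoothing parameter `alpha` are my own choices) that estimates smoothed per-class value frequencies and classifies by the highest log-posterior:

```python
import math
from collections import Counter

def categorical_nb_fit(X, y, alpha=1.0):
    # Estimate P(class) and Laplace-smoothed P(feature value | class)
    # for purely categorical features.
    n = len(y)
    class_counts = Counter(y)
    value_counts = {c: [Counter() for _ in X[0]] for c in class_counts}
    feature_values = [set() for _ in X[0]]
    for xi, c in zip(X, y):
        for j, v in enumerate(xi):
            value_counts[c][j][v] += 1
            feature_values[j].add(v)
    return class_counts, value_counts, feature_values, n, alpha

def categorical_nb_predict(model, x):
    class_counts, value_counts, feature_values, n, alpha = model
    best, best_score = None, float("-inf")
    for c, cc in class_counts.items():
        score = math.log(cc / n)  # log prior
        for j, v in enumerate(x):
            num = value_counts[c][j][v] + alpha
            den = cc + alpha * len(feature_values[j])
            score += math.log(num / den)  # log likelihood per feature
        if score > best_score:
            best, best_score = c, score
    return best
```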
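For the Gaussian variant, a minimal sketch under the normality assumption (again, illustrative only; the variance floor `1e-9` is an arbitrary guard I added against zero-variance features):

```python
import math

def gaussian_nb_fit(X, y):
    # Per-class feature means, variances, and class priors.
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        means = [sum(col) / len(rows) for col in zip(*rows)]
        variances = [sum((v - m) ** 2 for v in col) / len(rows)
                     for col, m in zip(zip(*rows), means)]
        model[c] = (means, variances, len(rows) / len(y))
    return model

def gaussian_nb_predict(model, x):
    # Pick the class maximizing the log-posterior under a normal likelihood.
    best_class, best_score = None, float("-inf")
    for c, (means, variances, prior) in model.items():
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            var = max(var, 1e-9)  # guard against zero variance
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_class, best_score = c, score
    return best_class
```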
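The greedy information-gain splitting that ID3 performs can be sketched as follows (a simplified version without the pruning step mentioned above; the dict-based tree representation is my own choice):

```python
import math
from collections import Counter

def entropy(labels):
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def id3(X, y, features):
    # Recursively build a multiway tree, choosing at each node the
    # categorical feature (by index) with the highest information gain.
    if len(set(y)) == 1:              # pure node: return the class
        return y[0]
    if not features:                  # no features left: majority class
        return Counter(y).most_common(1)[0][0]

    def info_gain(j):
        remainder = 0.0
        for v in set(x[j] for x in X):
            subset = [yi for xi, yi in zip(X, y) if xi[j] == v]
            remainder += len(subset) / len(y) * entropy(subset)
        return entropy(y) - remainder

    best = max(features, key=info_gain)
    rest = [f for f in features if f != best]
    tree = {"feature": best, "branches": {}}
    for v in set(x[best] for x in X):
        Xv = [xi for xi in X if xi[best] == v]
        yv = [yi for xi, yi in zip(X, y) if xi[best] == v]
        tree["branches"][v] = id3(Xv, yv, rest)
    return tree

def tree_predict(tree, x):
    while isinstance(tree, dict):
        tree = tree["branches"][x[tree["feature"]]]
    return tree
```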
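The k-nearest-neighbours classification described above reduces to a few lines; here is a minimal sketch using Euclidean distance and majority voting (function name and the default `k=3` are my own choices):

```python
import math
from collections import Counter

def knn_classify(X_train, y_train, x, k=3):
    # Sort training points by Euclidean distance to x, then take a
    # majority vote among the labels of the k closest ones.
    distances = sorted(
        (math.dist(p, x), label) for p, label in zip(X_train, y_train)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```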
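Simple linear regression over two-dimensional points has a well-known closed form for the slope and intercept; a minimal sketch (illustrative, not the repository's code):

```python
def simple_linear_regression(xs, ys):
    # Fit y = b0 + b1 * x by minimizing squared error:
    # b1 = cov(x, y) / var(x), b0 = mean(y) - b1 * mean(x).
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    b0 = mean_y - b1 * mean_x
    return b0, b1
```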
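The gradient-descent variant with Ridge (L2) regularization can be sketched like this (hyperparameters `lr`, `alpha`, `epochs` are illustrative defaults, and leaving the bias unregularized is a common convention I adopted, not necessarily the repository's):

```python
def ridge_gradient_descent(X, y, lr=0.01, alpha=0.0, epochs=2000):
    # Train linear-regression weights by gradient descent on the
    # L2-regularized mean squared error; the bias b is not penalized.
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (grad_w[j] / n + alpha * wj) for j, wj in enumerate(w)]
        b -= lr * grad_b / n
    return w, b
```

Setting `alpha=0.0` recovers plain least-squares gradient descent; larger values shrink the weights toward zero. The Lasso (L1) case differs only in replacing the `alpha * wj` penalty gradient with `alpha * sign(wj)`.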
TODO