
ML-Logistic-Regression-Python

To understand Logistic Regression, we first need to understand the concept of maximum likelihood. The following steps will help us do that.

Maximum Likelihood

Step-1

Suppose we have a distribution of weights, e.g. mouse weights.

Step-2

We need to find the normal distribution curve that best fits this data; for that we need the optimum values of mu (the mean) and sigma (the standard deviation).
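
To make "best fit" concrete, the snippet below is a minimal sketch (not part of this repository; it assumes NumPy/SciPy and a made-up set of mouse weights) that scores one candidate pair of mu and sigma by how likely the observed weights are under that curve. Step-3 explains how the optimum pair is found.

```python
import numpy as np
from scipy.stats import norm

# Made-up mouse weights in grams, purely for illustration
weights = np.array([17.5, 19.1, 20.3, 21.0, 22.4, 23.8, 24.1, 25.6])

# One candidate normal curve to score
mu, sigma = 20.0, 3.0

# The likelihood of the whole sample is the product of the individual densities;
# summing the log-densities gives the same ranking and avoids numerical underflow.
log_likelihood = norm.logpdf(weights, loc=mu, scale=sigma).sum()
print(f"log-likelihood for mu={mu}, sigma={sigma}: {log_likelihood:.3f}")
```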

Step-3

  • We first try out various values of the mean and pick the one that brings the curve closest to the weights and gives the maximum likelihood; while doing this, we fix sigma at some value and do not change it
  • Once we have the optimum value of the mean (mu), we switch to sigma and vary it to finalize the curve's spread for maximum likelihood; during this step the mean stays fixed at the optimum value obtained in the previous step
  • There is a catch here: we have n observations, so to calculate the overall likelihood we calculate the likelihood of each individual sample and multiply them together
  • Finally we get our optimum values of the mean and standard deviation, with which we can best fit a normal distribution to our data (see the sketch after this list)
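
Here is a minimal sketch of that coordinate-wise search (again not the repository's code; it assumes NumPy/SciPy and the same made-up weights as above): sigma is held fixed while candidate means are scored, then the best mean is held fixed while candidate sigmas are scored.

```python
import numpy as np
from scipy.stats import norm

weights = np.array([17.5, 19.1, 20.3, 21.0, 22.4, 23.8, 24.1, 25.6])

def log_likelihood(mu, sigma):
    # Sum of log densities = log of the product of the individual likelihoods
    return norm.logpdf(weights, loc=mu, scale=sigma).sum()

# Fix sigma and try many candidate means
sigma_fixed = 3.0
mus = np.linspace(weights.min(), weights.max(), 200)
best_mu = max(mus, key=lambda m: log_likelihood(m, sigma_fixed))

# Fix the best mean and try many candidate standard deviations
sigmas = np.linspace(0.5, 10.0, 200)
best_sigma = max(sigmas, key=lambda s: log_likelihood(best_mu, s))

print(f"optimum mu ~ {best_mu:.2f}, optimum sigma ~ {best_sigma:.2f}")
# For a normal distribution the answer also has a closed form:
print(f"closed-form check: mean={weights.mean():.2f}, std={weights.std():.2f}")
```

For a normal distribution the maximum-likelihood answer happens to have a closed form (the sample mean and standard deviation), so the search above is only there to mirror the steps described in this section.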

Logistic Regression

  • On the left side of the image we have our base data; the data can have n variables, so imagine an n-dimensional graph with all our data points plotted in it.

  • Logistic regression uses the logit function to calculate probabilities given a set of values for the variables. The graph with the formula of the logit function is given below.

  • To get the best-fitting curve of the logit function, we need to follow a few steps:

    • First, convert the y-axis from probability to log odds using the formula: log odds = log(p / (1 - p))
    • Once the probability is converted to log odds, the y-axis ranges from -infinity (former 0) to +infinity (former 1)

    • We now select a random line to start with and project the data points onto it

    • From the step above, we have the log odds for each point; we now use the following formula to convert back to a probability: probability = e^(log odds) / (1 + e^(log odds))
  • We now calculate the likelihood of each point, which is simply the probability p for points labelled 1 and 1 - p for points labelled 0. Earlier we multiplied the individual likelihoods to get the final value; here we instead calculate the log likelihood, which is the sum of the logs of the individual likelihood values

  • The algorithm now rotates the line in a way that increases the log likelihood. We repeat the above steps until we get the best-fitting squiggle, i.e. the maximum log likelihood (a minimal sketch of this idea is shown after this list)
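
The sketch below is not the repository's implementation; it is a minimal illustration (assuming NumPy and a made-up one-variable dataset) of the loop described above, with plain gradient ascent on the log likelihood standing in for the "rotate the line" step.

```python
import numpy as np

# Made-up 1-D example: feature x (e.g. weight) and binary label y
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([0, 0, 0, 1, 0, 1, 1, 1])

def sigmoid(log_odds):
    # probability = e^(log odds) / (1 + e^(log odds))
    return 1.0 / (1.0 + np.exp(-log_odds))

def log_likelihood(b0, b1):
    p = sigmoid(b0 + b1 * x)
    # Sum of log(p) for the points labelled 1 and log(1 - p) for the points labelled 0
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Start from an arbitrary line on the log-odds axis: log odds = b0 + b1 * x
b0, b1, lr = 0.0, 0.0, 0.01

for _ in range(10000):
    p = sigmoid(b0 + b1 * x)   # project the points onto the line, convert to probabilities
    # Gradient-ascent step: nudge the line so the log likelihood increases
    b0 += lr * np.sum(y - p)
    b1 += lr * np.sum((y - p) * x)

print(f"intercept={b0:.2f}, slope={b1:.2f}, log-likelihood={log_likelihood(b0, b1):.3f}")
```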

About

Logistic Regression is one of the basic yet complex machine learning algorithms. It is often the starting point of a classification problem. This repository will help in understanding the theory/working behind logistic regression, and the code will help in implementing the same in Python. Also, this is a basic implementation of Logistic Regressi…
