Diagnostic Plots for Linear Regression Analysis #1211

Open
pkaf opened this issue Jan 25, 2022 · 4 comments
Labels
type: feature a new visualizer or utility for yb

Comments

@pkaf
Contributor

pkaf commented Jan 25, 2022

If this is not already being considered or developed, it would be neat to have the standard diagnostic plots for linear regression analysis, mainly:

a. residual vs fitted
b. normal q-q
c. scale-location
d. residual vs leverage

as shown in https://data.library.virginia.edu/diagnostic-plots/. Example plots:
[Four screenshots of example plots]

I am happy to open a PR.

@pkaf
Contributor Author

pkaf commented Jan 25, 2022

I would love to hear your thoughts on the above, @bbengfort.

@bbengfort added the "type: feature" label Feb 19, 2022
@bbengfort
Member

@pkaf We'd certainly be open to more regression analysis tools or adaptations of our current tools to support these types of analyses.

The ResidualsPlot currently plots the residuals against the fitted values, so I think it covers plot 1. It also has the option to show a Q-Q plot alongside, which I think is plot 2. Perhaps it could be modified to plot the residuals against the actual values instead of the predicted values?
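For reference, here is a minimal sketch of the data behind plots 1 and 2, using only NumPy and the standard library rather than Yellowbrick itself. The helper names `ols_fit` and `qq_points` are hypothetical, the data is synthetic, and the plotting positions `(i - 0.5) / n` are just one common convention:

```python
import numpy as np
from statistics import NormalDist


def ols_fit(X, y):
    """Fit ordinary least squares with an intercept; return (fitted, residuals)."""
    X1 = np.column_stack([np.ones(len(y)), X])      # prepend an intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)   # least-squares coefficients
    fitted = X1 @ beta
    return fitted, y - fitted


def qq_points(residuals):
    """Return (theoretical normal quantiles, sorted residuals) for a Q-Q plot."""
    n = len(residuals)
    probs = (np.arange(1, n + 1) - 0.5) / n         # assumed plotting positions
    theoretical = np.array([NormalDist().inv_cdf(p) for p in probs])
    return theoretical, np.sort(residuals)


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 + 3.0 * x + rng.normal(scale=1.0, size=50)

fitted, residuals = ols_fit(x, y)                   # plot 1: residuals vs fitted
theoretical, ordered = qq_points(residuals)         # plot 2: normal Q-Q
```

Scattering `residuals` against `fitted` gives plot 1, and `ordered` against `theoretical` gives plot 2.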

The scale-location plot (your third plot) also seems like it could be an adaptation of the ResidualsPlot that standardizes the residuals rather than using the raw residuals. This would be a great param to add!
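A rough sketch of what "standardizing" could mean here, assuming internally studentized residuals (each residual divided by its estimated standard deviation, which depends on the point's leverage). This is plain NumPy on synthetic data, not an existing Yellowbrick parameter, and `standardized_residuals` is a hypothetical name:

```python
import numpy as np


def standardized_residuals(X, y):
    """Return (fitted values, internally studentized residuals) for an OLS fit."""
    X1 = np.column_stack([np.ones(len(y)), X])      # design matrix with intercept
    n, p = X1.shape
    pinv = np.linalg.pinv(X1)
    fitted = X1 @ (pinv @ y)
    resid = y - fitted
    # Leverage is the diagonal of the hat matrix H = X1 @ pinv(X1).
    leverage = np.einsum("ij,ji->i", X1, pinv)
    sigma = np.sqrt(resid @ resid / (n - p))        # residual standard error
    return fitted, resid / (sigma * np.sqrt(1.0 - leverage))


# Synthetic data, for illustration only.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=40)
y = 1.0 + 0.5 * x + rng.normal(scale=0.3, size=40)

fitted, std_resid = standardized_residuals(x, y)
scale_location = np.sqrt(np.abs(std_resid))         # y-axis of the scale-location plot
```

Plotting `scale_location` against `fitted` gives the third plot; a roughly flat trend suggests constant residual variance.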

We also have a CooksDistance visualizer, which may be related to your last plot of standardized residuals vs. leverage, or might be a building block towards that visualization.
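To illustrate how the two are related, the sketch below computes leverage and Cook's distance from the same ingredients using the textbook formula D_i = r_i^2 / p * h_i / (1 - h_i), where r_i is the standardized residual and h_i the leverage. It is plain NumPy on synthetic data and does not call the CooksDistance visualizer; `leverage_and_cooks` is a hypothetical name:

```python
import numpy as np


def leverage_and_cooks(X, y):
    """Return (leverage, standardized residuals, Cook's distance) for an OLS fit."""
    X1 = np.column_stack([np.ones(len(y)), X])      # design matrix with intercept
    n, p = X1.shape
    pinv = np.linalg.pinv(X1)
    resid = y - X1 @ (pinv @ y)
    leverage = np.einsum("ij,ji->i", X1, pinv)      # diagonal of the hat matrix
    mse = resid @ resid / (n - p)
    std_resid = resid / np.sqrt(mse * (1.0 - leverage))
    cooks = std_resid ** 2 / p * leverage / (1.0 - leverage)
    return leverage, std_resid, cooks


# Synthetic data with one deliberately influential point appended at x = 30.
rng = np.random.default_rng(2)
x = np.append(np.linspace(0, 10, 30), 30.0)
y = 1.0 + 2.0 * x + rng.normal(scale=1.0, size=31)
y[-1] -= 25.0                                       # pull the far point off the line

leverage, std_resid, cooks = leverage_and_cooks(x, y)
```

The fourth plot scatters `std_resid` against `leverage`; contours of constant Cook's distance can be overlaid using the same formula.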

If the ResidualsPlot is not sufficient, perhaps you could look into creating a ResidualsDiagnostics visualizer that plots all four of these graphs in 4 separate axes? We haven't done a lot of multi-axes plotting, but this could be a good start toward that.
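As a starting point for discussion, a four-axes layout could look roughly like the sketch below. It uses matplotlib and NumPy directly and does not follow the Yellowbrick Visualizer API; the function `residuals_diagnostics` is a hypothetical stand-in for the proposed visualizer, and the data is synthetic:

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
from statistics import NormalDist


def residuals_diagnostics(X, y):
    """Hypothetical helper (not Yellowbrick API): four diagnostics on a 2x2 grid."""
    X1 = np.column_stack([np.ones(len(y)), X])      # design matrix with intercept
    n, p = X1.shape
    pinv = np.linalg.pinv(X1)
    fitted = X1 @ (pinv @ y)
    resid = y - fitted
    lev = np.einsum("ij,ji->i", X1, pinv)           # leverage (hat matrix diagonal)
    std = resid / np.sqrt(resid @ resid / (n - p) * (1.0 - lev))
    probs = (np.arange(1, n + 1) - 0.5) / n         # assumed plotting positions
    theo = np.array([NormalDist().inv_cdf(q) for q in probs])

    fig, axes = plt.subplots(2, 2, figsize=(10, 8))
    panels = [
        (axes[0, 0], fitted, resid, "Residuals vs Fitted",
         "Fitted values", "Residuals"),
        (axes[0, 1], theo, np.sort(std), "Normal Q-Q",
         "Theoretical quantiles", "Standardized residuals"),
        (axes[1, 0], fitted, np.sqrt(np.abs(std)), "Scale-Location",
         "Fitted values", "sqrt(|standardized residuals|)"),
        (axes[1, 1], lev, std, "Residuals vs Leverage",
         "Leverage", "Standardized residuals"),
    ]
    for ax, xs, ys, title, xlabel, ylabel in panels:
        ax.scatter(xs, ys, s=12)
        ax.set(title=title, xlabel=xlabel, ylabel=ylabel)
    fig.tight_layout()
    return fig, axes


# Synthetic data, for illustration only.
rng = np.random.default_rng(3)
x = rng.uniform(0, 10, size=60)
y = 4.0 - 1.5 * x + rng.normal(scale=2.0, size=60)
fig, axes = residuals_diagnostics(x, y)
```

A real visualizer would presumably wrap a fitted estimator and reuse the existing ResidualsPlot and CooksDistance logic instead of refitting the model by hand.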

@pkaf
Contributor Author

pkaf commented Mar 15, 2022

@bbengfort I recently pushed an example to statsmodels that produces the graphs above: https://www.statsmodels.org/devel/examples/notebooks/generated/linear_regression_diagnostics_plots.html. We could adapt it here too.

@bbengfort
Member

@pkaf awesome - we welcome any PRs that you might open for Yellowbrick!
