Proposal for reordering the contents in M1 and M2 #398
Comments
So I see at least four different things here:

Removing cross-validation from M1
With M1 we aimed at a tight session that goes as efficiently as possible from knowing almost nothing about scikit-learn to a realistic scikit-learn pipeline (i.e. something that we would use in practice). Removing cross-validation completely from M1 does not seem great from this point of view. Basically, people involved in this MOOC will tell you cross-validation is super important.
Generally speaking, repeating things or covering the same thing in a slightly different way is completely fine, and I would argue it is actually a good thing pedagogically. So I don't see a huge problem with this, especially given the cost of moving things around (in our repo: changing the exercises that use cross-validation, checking all the notebooks that may mention cross-validation, deciding where to move this or say "in the next module we will see this in more detail", and probably also in FUN for the quizzes).

Light refactoring within M2
It seems like you want to move train and test scores to their own lesson, why not? Can you explain a bit more why you don't like the way it is done currently? I guess at one point we had in mind that the videos would come first to give the intuitions and the code later to reinforce the intuitions.

Adding content about score distributions and model improvement/deterioration
Mentioned in #366, let's keep the discussion there.

More content on handling missing data
Let's do it in #361.
Regarding the cross-validation and evaluation => #415
Missing values
Module 2 => #416
The proposal of Arturo is good:
I opened several issues to split the work we agreed on, so I am closing this one.
Proposal to be voted on in v.2: Some changes in the ordering of the topics in M1 and M2 may make the content clearer.
Motivation:
Notation:
Module
Current contents:
Module 1. The Predictive Modeling Pipeline
Module 2. Selecting the Best Model
Proposed contents:
Module 1. The Predictive Modeling Pipeline
cross-validation and cross-validation, pipeline with imputer + cross-validation and score distributions, (🔺) more on handling missing data
Module 2. Selecting the Best Model
train and test errors, cross-validation in detail, error distributions, target distribution