Rigorous treatment effect estimation #59
Comments
Adding some references for completeness: |
I would love to add
I would recommend installation and importing metrics/bootstrap procedure. This could automatically be calculated on the |
The original motivation behind the toolbox was to emphasise biological study design and to provide a new perspective on analysing neural networks (inspired by Jon Frankle). This means replicable experiment pipelines (git hash, random seeds, configuration cloning, protocol logging, etc.). Most of this has been implemented by now.
But another super useful perspective comes from Economics and the evaluation of treatments (Thx Nandan!). In ML, most result presentations do not report any significance levels or test results. Often the confidence intervals of the learning curves even overlap. The entire process is not scientific and can benefit from tools from Econ, such as diff-in-diff style experiments, and from simply being rigorous about the effect you are trying to sell. I want to incorporate this into the toolbox and make it part of my daily research routine for testing scientific ML hypotheses!
In the near future:
Note I: I think that it makes sense to add a `causality` subdirectory, which collects the different tests and can easily be extended. E.g. starting with a base class `TreatmentTest`, we can have different models as well as different standard error and correction formulations.

Note II: Many of these features will also be needed in the population-based training pipeline, e.g. a Wald t-test for deciding whether two sampled population members are performing better/worse.
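To make Note I and Note II a bit more concrete, here is a minimal sketch of what such a base class could look like. Only the name `TreatmentTest` comes from the note above; the method names (`standard_error`, `run`), the two subclasses and the result dictionary are assumptions made for illustration, not the toolbox's actual API. The test itself is a Wald-style comparison of two group means using a normal approximation (stdlib only), so for a handful of seeds a t-distribution based version would be more appropriate.

```python
import math
from abc import ABC, abstractmethod
from statistics import NormalDist, mean, variance


class TreatmentTest(ABC):
    """Hypothetical base class: compares a treatment group to a control group."""

    @abstractmethod
    def standard_error(self, treatment, control):
        """Return the standard error of the difference in group means."""

    def run(self, treatment, control, alpha=0.05):
        # Estimated treatment effect: difference in sample means.
        effect = mean(treatment) - mean(control)
        se = self.standard_error(treatment, control)
        # Wald statistic, compared against a standard normal (an approximation).
        statistic = effect / se
        p_value = 2.0 * (1.0 - NormalDist().cdf(abs(statistic)))
        return {
            "effect": effect,
            "se": se,
            "statistic": statistic,
            "p_value": p_value,
            "significant": p_value < alpha,
        }


class UnequalVarianceTest(TreatmentTest):
    """Standard error that allows different variances in the two groups."""

    def standard_error(self, treatment, control):
        return math.sqrt(
            variance(treatment) / len(treatment) + variance(control) / len(control)
        )


class PooledVarianceTest(TreatmentTest):
    """Standard error that assumes a common variance across both groups."""

    def standard_error(self, treatment, control):
        n_t, n_c = len(treatment), len(control)
        pooled = (
            (n_t - 1) * variance(treatment) + (n_c - 1) * variance(control)
        ) / (n_t + n_c - 2)
        return math.sqrt(pooled * (1.0 / n_t + 1.0 / n_c))


# Made-up final accuracies over 5 seeds per group (illustrative numbers only).
treated_runs = [0.91, 0.93, 0.92, 0.94, 0.90]
control_runs = [0.88, 0.87, 0.89, 0.86, 0.90]
result = UnequalVarianceTest().run(treated_runs, control_runs)
print(result["effect"], result["se"], result["significant"])
```

New standard error or correction formulations would then only need to override `standard_error`, and the same `run` call could be reused in the population-based training pipeline to compare two population members.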
In the distant future:
Things to Think About