Reduce false positive rate of timing tests and add tools for handling them #673
See also #106.
Actually, we should be careful with sample sizes: too small a sample will not detect effect sizes that matter in practice. See https://stats.stackexchange.com/a/2522/289885 : to detect a 1% effect size we need a sample size of around 10k, and a sample size of around 1M to detect a 0.1% effect size.
We also need to remember that the false positive rate is independent of sample size: for alpha of 0.05 it stays at 5% no matter how large the sample. So with very large sample sizes and quick response times we may need to check the practical importance of the result, not just its statistical significance (if a test tells us that one class differs from another by less than one CPU cycle, that is not a meaningful result), see https://stats.stackexchange.com/a/7849/289885
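To make the two points above concrete, here is a minimal sketch (not tlsfuzzer code) of a sample-size estimate for a given standardized effect size, plus a practical-importance check that compares the confidence interval of the timing difference against a threshold of roughly one CPU cycle. The effect-size definition (Cohen's d), the 80% power target, and the ~3 GHz cycle time are illustrative assumptions:

```python
# Sketch only: approximate power analysis and practical-importance check
# for two-sample timing comparisons.
import numpy as np
from scipy.stats import norm


def required_sample_size(effect_size, alpha=0.05, power=0.8):
    """Approximate per-group sample size for a two-sample comparison,
    using the normal approximation and a standardized (Cohen's d)
    effect size."""
    z_alpha = norm.ppf(1 - alpha / 2)  # two-sided test
    z_beta = norm.ppf(power)
    return 2 * ((z_alpha + z_beta) / effect_size) ** 2


def practically_important(a, b, threshold_s=0.33e-9, confidence=0.95):
    """Report a difference only if the whole confidence interval for the
    difference in mean timing clears a practically meaningful threshold
    (here one cycle of an assumed ~3 GHz CPU), not merely zero."""
    diff = np.mean(a) - np.mean(b)
    se = np.sqrt(np.var(a, ddof=1) / len(a) + np.var(b, ddof=1) / len(b))
    z = norm.ppf(1 - (1 - confidence) / 2)
    lower, upper = diff - z * se, diff + z * se
    return lower > threshold_s or upper < -threshold_s


for d in (0.1, 0.01, 0.001):
    print(f"effect size {d}: ~{required_sample_size(d):,.0f} observations per group")
```

Note that the linked Cross Validated answer may use a different definition of effect size, so the exact numbers differ; the point is the same: the required sample size grows with the inverse square of the effect size we want to detect.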
While we now have tests to verify Lucky13 and Bleichenbacher, they have a quite significant false positive rate (>20%). We should improve the statistical classifiers used, the handling of outliers, the way the data is collected, etc., so that the false positive rate becomes more manageable (<5%).
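One way to measure (and then track) the false positive rate is to run the decision procedure repeatedly on data where both classes are drawn from the same distribution; every reported difference is then by definition a false positive. The sketch below assumes a simple Mann-Whitney-based decision rule with percentile outlier trimming and lognormal noise standing in for real timing data; the actual tlsfuzzer classifiers and data format are not reproduced here:

```python
# Sketch only: estimate the false positive rate of a hypothetical
# timing-difference classifier on null data (no real difference).
import numpy as np
from scipy.stats import mannwhitneyu


def trim_outliers(sample, lower_pct=5, upper_pct=95):
    """Drop observations outside the given percentiles to reduce the
    influence of scheduler noise and other outliers."""
    lo, hi = np.percentile(sample, [lower_pct, upper_pct])
    return sample[(sample >= lo) & (sample <= hi)]


def classes_differ(a, b, alpha=0.05):
    """Hypothetical classifier: report a timing difference if the
    Mann-Whitney U test rejects the null after outlier trimming."""
    _, p_value = mannwhitneyu(trim_outliers(a), trim_outliers(b))
    return p_value < alpha


def estimate_false_positive_rate(runs=1000, samples=10_000, seed=42):
    """Run the classifier on pairs of samples drawn from the *same*
    distribution; the fraction of reported differences is the
    empirical false positive rate."""
    rng = np.random.default_rng(seed)
    false_positives = 0
    for _ in range(runs):
        # lognormal noise roughly mimics the long right tail of timing data
        a = rng.lognormal(mean=0.0, sigma=0.1, size=samples)
        b = rng.lognormal(mean=0.0, sigma=0.1, size=samples)
        if classes_differ(a, b):
            false_positives += 1
    return false_positives / runs


if __name__ == "__main__":
    print(f"estimated false positive rate: {estimate_false_positive_rate():.3f}")
```

Running this kind of null experiment against the real test harness (rather than synthetic noise) would show whether the >20% rate comes from the classifier itself, from the outlier handling, or from how the data is collected.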