Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Iterate through all revisions to find first good revision before bisecting. #3934

Draft
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

letitz
Copy link
Collaborator

@letitz letitz commented Apr 9, 2024

This change builds on #3927 to change the way regression task operates.

To review, given GitHub's lack of support for stacked PRs, this link should work: https://github.com/google/clusterfuzz/pull/3934/files/1bdca4468269ee897b5145f6dc24c395b6eb4d2d..dc74698caba2f0ce6512eca69d93d41d1692aef7

The overall goal is to be more dogged when trying to find the first good revision to start bisecting from, and try all revisions. This may take too long, so we respect the task deadline and ensure that this step is resumable in a subsequent regression task. Regression task already supports resuming timed out regressions later, as this is how the core bisection works.

Previously, regression task would start its first run (detected by the absence of last_regression_min or last_regression_max metadata on the testcase) by:

  • checking whether the regression happened in the last N revisions, ending early if so, having found the regression range
  • checking whether one of the first N revisions was good, and:
    • if not, ending early with an error (this is what is changing)
    • if so, and the first good revision crashed, ending early, having found the regression range to be [0, min)
    • if so, and the first good revision did not crash, going on to bisect

This change does a few things:

  • splits the notion of first run into two: check separately for last_regression_max and last_regression_min
  • when last_regression_max is absent from the testcase:
    • check the latest N revisions for a regression
    • update last_regression_max on the testcase to indicate this happened
  • when last_regression_min is absent from the testcase:
    • check all revisions sequentially to find the first good revision
    • update last_regression_min as we go along to record our progress
    • time out if this takes too long
  • side benefit: since we now record the progress made in these two steps, we can avoid:
    • testing the same revision in check_latest_revisions and find_earliest_good_revision
    • bisecting revisions that were already tested in check_latest_revisions or find_earliest_good_revision

TODO: An integration test of some kind that checks that forward progress is made after timeouts.

@letitz letitz requested a review from alhijazi April 9, 2024 15:03
@letitz
Copy link
Collaborator Author

letitz commented Apr 9, 2024

@alhijazi PTAL while I work out how to write some kind of integration test.

@letitz
Copy link
Collaborator Author

letitz commented Apr 12, 2024

Thanks! Onto @jonathanmetzman, who may have suggestions on how/where to write an overarching test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants