feature(invariant) - persist and replay failure #7899

grandizzy · 2024-05-09T10:30:03Z

Motivation

Partially addresses #2552: persist and replay the shrunk failing invariant sequence. CC @mds1

- When an invariant fails, the shrunk sequence is persisted and replayed on subsequent runs. If the sequence still fails, the test exits immediately. The failed sequence is replayed with the current config (for example, if the persisted sequence fails in 100 steps but the config was altered to run with a depth of 50, the persisted sequence will pass).
- Only the last failed sequence is persisted. The default file is PROJ_ROOT/cache/invariant/failures/{TEST_SUITE_NAME}/{INVARIANT_NAME}, so the cache dir gains an invariant/failures subtree.
- The default failure dir can be changed in foundry.toml:

[invariant]
failure_persist_dir="/path/to/failure/dir"

or by using inline config

/// forge-config: default.invariant.failure-persist-dir = /path/to/failure/dir
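A minimal sketch of how the failure file path described above could be assembled. The helper name `failure_file` and its parameters are hypothetical, and the assumption that `failures/{TEST_SUITE_NAME}/{INVARIANT_NAME}` is appended to a custom dir the same way as to the default one is mine, not something this PR states.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: builds the failure file path from the pieces named in
/// the description. Not the actual forge implementation.
fn failure_file(
    project_root: &Path,
    failure_persist_dir: Option<&Path>,
    test_suite: &str,
    invariant: &str,
) -> PathBuf {
    // Fall back to the default `cache/invariant` dir when no override is configured.
    let base = match failure_persist_dir {
        Some(dir) => dir.to_path_buf(),
        None => project_root.join("cache").join("invariant"),
    };
    // Assumption: the same `failures/<suite>/<invariant>` suffix applies to custom dirs.
    base.join("failures").join(test_suite).join(invariant)
}

fn main() {
    let path = failure_file(Path::new("/proj"), None, "OwnedTest", "invariant_owner");
    println!("{}", path.display());
}
```

Keeping one file per invariant matches the "only the last failed sequence is persisted" behaviour: a new failure simply overwrites the previous file.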

Currently the sequence is persisted in a proprietary format (but it could easily be changed to a standard corpus format when one is available; see "Allow Echidna & Medusa to share the same corpus", crytic/medusa#234 (comment)). For example, the shrunk sequence for a two-step ownership transfer is saved as

[
  {
    "sender": "0x0000000000000000000000000000000000000003",
    "addr": "0x2e234dae75c793f67a35089c9d99245e1c58470b",
    "calldata": "0x6d4354210000000000000000000000007fa9385be102ac3eac297483dd6233d62b3e14960000000000000000000000000000000000000000000000000000000000000e72",
    "contract_name": "test/Owned.t.sol:Handler",
    "signature": "transferOwnership(address,address)",
    "args": "0x7FA9385bE102ac3EAc297483Dd6233D62b3e1496, 0x0000000000000000000000000000000000000e72"
  },
  {
    "sender": "0x0000000000000000000000000000000000000001",
    "addr": "0x2e234dae75c793f67a35089c9d99245e1c58470b",
    "calldata": "0x51710e450000000000000000000000000000000000000000000000000000000000000e72",
    "contract_name": "test/Owned.t.sol:Handler",
    "signature": "acceptOwnership(address)",
    "args": "0x0000000000000000000000000000000000000e72"
  }
]
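To make the relation between the `calldata`, `signature` and `args` fields easier to see, here is a small sketch that splits calldata into its 4-byte selector and the 32-byte words that follow. `split_calldata` is a hypothetical helper, not part of this PR, and the layout it assumes only holds for static argument types such as `address` and `uint256`.

```rust
/// Splits hex calldata into its 4-byte selector and the 32-byte argument words.
/// Returns `None` if the input is not `0x`-prefixed, is too short, or ends in a
/// partial word. Hypothetical helper for illustration only.
fn split_calldata(calldata: &str) -> Option<(&str, Vec<&str>)> {
    let hex = calldata.strip_prefix("0x")?;
    // The ASCII check keeps the byte-index slicing below on char boundaries.
    if !hex.is_ascii() || hex.len() < 8 || (hex.len() - 8) % 64 != 0 {
        return None;
    }
    let (selector, args) = hex.split_at(8);
    let words = (0..args.len() / 64)
        .map(|i| &args[i * 64..(i + 1) * 64])
        .collect();
    Some((selector, words))
}

fn main() {
    // Rebuild the second persisted call: selector 0x51710e45 followed by one
    // address argument left-padded with zeros to 32 bytes (64 hex chars).
    let calldata = format!("0x51710e45{:0>64}", "e72");
    let (selector, words) = split_calldata(&calldata).expect("well-formed calldata");
    println!("selector = 0x{selector}");
    for (i, word) in words.iter().enumerate() {
        // An address occupies the low 20 bytes (last 40 hex chars) of its word.
        println!("arg {i} = 0x{}", &word[24..]);
    }
}
```

The `signature` and `args` fields carry the same information in readable form; `calldata` is what gets replayed against `addr`.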

Solution

- Add a failure dir option to the invariant config; defaults to cache/invariant.
- Remove the dir on forge clean.
- Run the invariant check only once when replaying: checking after each call adds no valuable info for a passing scenario (the invariant call result is always success) or for a failing scenario (the invariant call result is always success until the last call, which breaks it).
- Reuse the check_sequence shrink utility to check the persisted failure.
- Store args as String in BaseCounterExample so it can be serialized / deserialized and used in the replayed counterexample.
- If a file with a failed scenario exists, load and check the sequence: exit if it still fails, continue on success. If the file doesn't exist or its data cannot be loaded, the file is ignored and the test continues. Only the last failed scenario is persisted in the failure file. If a failed scenario fails to be persisted, an error is displayed.
- Added tests to check failure replay; moved duplicated code for getting the counterexample into a get_counterexample macro.
- Added a test to make sure the dir is removed on forge clean.
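The load-and-replay rule in the list above can be sketched roughly as follows. `ReplayDecision` and `decide` are hypothetical names used only for illustration; the real logic lives in the runner and works on `BaseCounterExample` values rather than a generic `C`.

```rust
/// What to do after looking for a persisted failure (hypothetical type).
#[derive(Debug, PartialEq)]
enum ReplayDecision {
    /// No usable failure file, or the old failure no longer reproduces:
    /// run the invariant campaign as usual.
    RunCampaign,
    /// The persisted sequence still breaks the invariant: fail right away.
    FailImmediately,
}

/// `persisted` is `None` when the file is missing or could not be loaded.
/// `still_fails` stands in for replaying the sequence and checking the
/// invariant once at the end.
fn decide<C>(persisted: Option<Vec<C>>, still_fails: impl Fn(&[C]) -> bool) -> ReplayDecision {
    match persisted {
        Some(sequence) if still_fails(&sequence) => ReplayDecision::FailImmediately,
        _ => ReplayDecision::RunCampaign,
    }
}

fn main() {
    let persisted = Some(vec!["transferOwnership", "acceptOwnership"]);
    println!("{:?}", decide(persisted, |seq| seq.len() == 2));
    println!("{:?}", decide(None::<Vec<&str>>, |_| true));
}
```

Treating an unreadable file the same as a missing one keeps a stale or corrupt cache from blocking the test run.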

mattsse

really love the tests!

only have one pedantic nit

mattsse · 2024-05-17T09:58:30Z

crates/config/src/lib.rs

+ macro_rules! remove_test_cache {
+ ($cache_dir:expr) => {
+ if let Some(test_cache) = $cache_dir {
+ let path = project.root().join(test_cache);
+ if path.exists() {
+ std::fs::remove_dir_all(&path).map_err(|e| SolcIoError::new(e, path))?;
+ }
+ }
+ };
 }


can we convert this into either a private function or closure instead?

not a fan of using private macros for this

Changed in 9b22080. I opted not to return the SolcIoError anymore (as it's not really solc-related) and just ignore the error; I can re-add it if you think it should be there.
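For readers without the commit at hand, a closure-based version could look roughly like the sketch below. This is not the code from 9b22080: the function name `remove_test_caches` and its signature are assumptions, and only the "ignore the I/O error" behaviour is taken from the reply above.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Removes each configured test cache dir under `root`, ignoring I/O errors.
/// Hypothetical stand-in for the macro being discussed.
fn remove_test_caches(root: &Path, cache_dirs: &[Option<PathBuf>]) {
    // A closure replaces the private macro; it captures `root` by reference.
    let remove_test_cache = |dir: &Option<PathBuf>| {
        if let Some(dir) = dir {
            let path = root.join(dir);
            if path.exists() {
                // Best effort: the error is dropped rather than mapped to SolcIoError.
                let _ = fs::remove_dir_all(&path);
            }
        }
    };
    cache_dirs.iter().for_each(remove_test_cache);
}

fn main() {
    let root = std::env::temp_dir().join(format!("clean_sketch_{}", std::process::id()));
    fs::create_dir_all(root.join("cache/invariant")).expect("create temp cache dir");
    remove_test_caches(&root, &[Some(PathBuf::from("cache/invariant")), None]);
    println!("removed: {}", !root.join("cache/invariant").exists());
    let _ = fs::remove_dir_all(&root);
}
```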

mattsse · 2024-05-17T10:00:25Z

crates/forge/src/runner.rs

+ if let Ok(data) = fs::read_to_string(failure_file) {
+ if let Ok(call_sequence) = serde_json::from_str::<Vec<BaseCounterExample>>(&data) {


I believe there's a foundry common fs function for reading json, perhaps we can use this here and get rid of one scope?

oh, yes, indeed, I used them for both read and write in 9b22080

- replace test cache rm macro with closure - use commons for load / persist failure sequence

mds1 · 2024-05-17T14:25:05Z

This is great!

> (for example if the persisted sequence fails in 100 steps but config was altered to run with a depth of 50 then persisted sequence will pass)

In this case, are we only running the first 50 steps of the sequence, which results in it passing? I think we may still want the full 100-step sequence to run, to avoid thinking my test is now passing when in reality it isn't.

> PROJ_ROOT/cache/invariant/failures/{TEST_NAME}/{INVARIANT_NAME}

Two small comments on this path:

1. I think TEST_NAME is a typo and you meant CONTRACT_NAME?
2. What if two invariants have the same name but different signatures, e.g. invariant1(uint256 x) and invariant1(uint256 x, uint256 y)? The file name probably needs to be the full normalized signature (selector also works, but is less user-friendly).

klkvr

LGTM

I think there can be potential issues with fail_on_revert set to true. If user changes setUp routines, then persisted sequence might become invalid because contract addresses will change and cached calldata might be invalid for a given address.

For example, if persisted sequence is

call foo() on contract A
call bar() on contract B

then adding more deployments to setUp will cause the addresses of contracts A/B to change due to a deployer nonce shift, and the sequence above might start calling bar() on A or vice versa.

Not sure how we can address that, perhaps display some kind of warning when persisted sequence is reverting rather than failing an invariant?

grandizzy · 2024-05-17T14:36:41Z

> This is great!
>
> (for example if the persisted sequence fails in 100 steps but config was altered to run with a depth of 50 then persisted sequence will pass)
>
> In this case, are we only running the first 50 steps of the sequence, which results in it passing? I think we may still want the full 100-step sequence to run, to avoid thinking my test is now passing when in reality it isn't.

Yes, that's correct: only the first 50 steps will run, resulting in a test pass. That was done to avoid situations like the one described in crytic/echidna#1231, where changes in config are not applied on failure replay. Happy to change it if you think it should be.

> PROJ_ROOT/cache/invariant/failures/{TEST_NAME}/{INVARIANT_NAME}
>
> Two small comments on this path:
>
> I think TEST_NAME is a typo and you meant CONTRACT_NAME?

Yeah, I meant test suite (that is, the test contract).

> What if two invariants have the same name but different signatures, e.g. invariant1(uint256 x) and invariant1(uint256 x, uint256 y)? The file name probably needs to be the full normalized signature (selector also works, but is less user-friendly).

Right now we don't support params in invariant functions; it is something that was requested in #4834, but there is no resolution for it yet.
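The depth behaviour confirmed earlier in this comment (a 100-step persisted sequence replayed under a depth of 50 only runs its first 50 steps) amounts to taking a prefix of the persisted calls. A tiny sketch, with `replayed_prefix` as a hypothetical helper name:

```rust
/// Returns the part of a persisted sequence that is replayed under the
/// current `depth`. Hypothetical helper, not the actual runner code.
fn replayed_prefix<C>(persisted: &[C], depth: usize) -> &[C] {
    &persisted[..persisted.len().min(depth)]
}

fn main() {
    let persisted: Vec<u32> = (0..100).collect();
    let replayed = replayed_prefix(&persisted, 50);
    println!("replaying {} of {} calls", replayed.len(), persisted.len());
}
```

If the call that breaks the invariant sits beyond the prefix, the replay passes, which is the situation discussed above.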

mds1 · 2024-05-17T14:51:30Z

> Yes, that's correct, there'll only be first 50 runs resulting in test pass. That was done to avoid situations like the one described here crytic/echidna#1231 where changes in configs are not applied on failure replay. Happy to change it if you think should be.

Thanks, I left a comment there to better understand

> Rn we don't support params in invariant functions

Oh right, forgot about that 😅

grandizzy · 2024-05-17T15:00:04Z

> LGTM
>
> I think there can be potential issues with fail_on_revert set to true. If user changes setUp routines, then persisted sequence might become invalid because contract addresses will change and cached calldata might be invalid for a given address.
>
> For example, if persisted sequence is
>
> 1. call `foo()` on contract `A`
> 2. call `bar()` on contract `B`
>
> then adding more deployments to setUp will cause the addresses of contracts A/B to change due to a deployer nonce shift, and the sequence above might start calling bar() on A or vice versa.
>
> Not sure how we can address that, perhaps display some kind of warning when persisted sequence is reverting rather than failing an invariant?

Good point. A simpler scenario would be renaming handler functions, e.g. foo() on contract A becoming foo1(). Will display something like "replay reverted" in such cases to avoid confusion.

grandizzy · 2024-05-17T17:42:27Z

> Not sure how we can address that, perhaps display some kind of warning when persisted sequence is reverting rather than failing an invariant?

@klkvr please check db1b619

klkvr · 2024-05-17T18:22:57Z

crates/evm/evm/src/executors/invariant/shrink.rs

 let (sender, (addr, bytes)) = &calls[call_index];
 let call_result =
 executor.call_raw_committing(*sender, *addr, bytes.clone(), U256::ZERO)?;
- if call_result.reverted && failed_case.fail_on_revert {
+ if call_result.reverted && fail_on_revert {
 // Candidate sequence fails test.
 // We don't have to apply remaining calls to check sequence.
 sequence_failed = true;
 break;


can't we just return here instead of managing sequence_failed flag and counting calls for replayed_entirely?

yep yep, nice cleanup! 7686c03
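A rough sketch of the early-return shape suggested above, under stated assumptions: `CallResult`, the `execute` closure and the `invariant_holds` closure are hypothetical stand-ins for the executor and the invariant call, and the `(success, replayed_entirely)` return pair is my reading of the flags mentioned in the review, not a copy of 7686c03.

```rust
/// Result of applying one call (hypothetical stand-in for the raw call result).
struct CallResult {
    reverted: bool,
}

/// Replays `calls` in order and returns `(success, replayed_entirely)`.
fn check_sequence<C>(
    calls: &[C],
    fail_on_revert: bool,
    mut execute: impl FnMut(&C) -> CallResult,
    invariant_holds: impl FnOnce() -> bool,
) -> (bool, bool) {
    for call in calls {
        if execute(call).reverted && fail_on_revert {
            // The sequence fails on a revert; remaining calls need not be applied,
            // so return directly instead of tracking a `sequence_failed` flag.
            return (false, false);
        }
    }
    // Every call was applied, so the invariant is checked once at the end.
    (invariant_holds(), true)
}

fn main() {
    let calls = ["foo()", "bar()", "baz()"];
    let outcome = check_sequence(
        &calls,
        true,
        |call| CallResult { reverted: *call == "bar()" },
        || true,
    );
    println!("success = {}, replayed entirely = {}", outcome.0, outcome.1);
}
```

Returning `replayed_entirely = false` lets the caller tell a replay that reverted apart from one that actually broke the invariant, which is what the "persisted failure revert" message relies on.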

klkvr · 2024-05-17T18:24:07Z

crates/forge/src/runner.rs

+ ))
+ } else {
+ Some(format!(
+ "{} persisted failure revert",


yep, that should explain what actually happened

grandizzy · 2024-05-19T06:16:34Z

> Yes, that's correct, there'll only be first 50 runs resulting in test pass. That was done to avoid situations like the one described here crytic/echidna#1231 where changes in configs are not applied on failure replay. Happy to change it if you think should be.
>
> Thanks, I left a comment there to better understand

@mds1 I think I am on the same page as what Rappie mentioned in crytic/echidna#1231 (comment).
In foundry we have plans to add regression test generation, where we capture all the conditions needed to replicate a failure (test driver solidity code, env vars used at the time of the test run, etc.). This should avoid the impression that a test is passing when in reality it isn't, wdyt?

mds1 · 2024-05-20T21:19:13Z

> @mds1 I think I am on the same page as Rappie mentioned in crytic/echidna#1231 (comment)
> In foundry we have plans to add regression test generation where we capture all the conditions to replicate failure (test driver solidity code, env vars used at the time of test running, etc.). This should avoid the impression that test is passing when in reality it isn't, wdyt?

From rappie in that issue:

> the config leads, meaning sequences in the corpus which are illegal according to the current config should be skipped.

That does make sense to me, I'm onboard 👌

grandizzy added 2 commits May 9, 2024 12:46

feature(invariant) - persist and replay failure

9b9ccf8

Fix unit test

f6e1eef

grandizzy marked this pull request as ready for review May 9, 2024 11:01

grandizzy requested review from DaniPopes, Evalir and mattsse as code owners May 9, 2024 11:01

mattsse requested changes May 17, 2024

Changes after review:

9b22080

- replace test cache rm macro with closure - use commons for load / persist failure sequence

grandizzy requested a review from mattsse May 17, 2024 11:54

mattsse approved these changes May 17, 2024

klkvr approved these changes May 17, 2024

Changes after review: display proper message if replayed sequence reverts before checking invariant

db1b619

klkvr reviewed May 17, 2024

klkvr approved these changes May 17, 2024

Changes after review: simplify check sequence logic

7686c03

Merge remote-tracking branch 'origin' into invariant-failure-persistence

69c523d

grandizzy merged commit c9ae920 into foundry-rs:master May 21, 2024
18 of 19 checks passed

grandizzy deleted the invariant-failure-persistence branch May 21, 2024 05:00

0xalpharush mentioned this pull request May 31, 2024

feat: fuzz corpus saving and replay #2552

Open
