fix: invalid `txs_results` returned for legacy ABCI responses
#3031
base: main
👍
I am working on some additional tests to ensure the logic holds and makes sense.
No reservations about this change.
I would move the added unit test elsewhere, though.
…ometbft into andy/3002-invalid-txs-results
state/store.go
Outdated
if err := legacyResp.Unmarshal(buf); err != nil {
	// only return an error, this method is only invoked through the `/block_results` not for state logic and
	// some tests, so no need to exit cometbft if there's an error, just return it.
	store.Logger.Debug("failed to unmarshall legacy ABCI response: %s", err.Error())
Suggested change:
- store.Logger.Debug("failed to unmarshall legacy ABCI response: %s", err.Error())
+ store.Logger.Debug("failed to unmarshall FinalizeBlockResponse (also tried as legacy ABCI response): %s", err.Error())
I would also include the error you get in line 662. The reason is, if it was indeed not a legacy response, but it failed to deserialise, the error you get when you later try to deserialise as legacy is not useful.
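To illustrate the point about keeping the first error, here is a minimal sketch of a "try the current format, then fall back to legacy" decoder that reports both failures. Everything in it is hypothetical: `currentResp`, `legacyResp`, `strictUnmarshal`, and `decodeWithFallback` are stand-ins, and JSON replaces protobuf so the example needs only the standard library. It assumes Go 1.20+ for `errors.Join`.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// currentResp and legacyResp are hypothetical stand-ins for the current and
// legacy response types (not the real CometBFT types).
type currentResp struct {
	TxResults []string `json:"tx_results"`
}

type legacyResp struct {
	DeliverTxs []string `json:"deliver_txs"`
}

// strictUnmarshal rejects unknown fields so that a payload written in one
// format does not silently decode into the other with empty fields.
func strictUnmarshal(buf []byte, v any) error {
	dec := json.NewDecoder(bytes.NewReader(buf))
	dec.DisallowUnknownFields()
	return dec.Decode(v)
}

// decodeWithFallback tries the current format first and the legacy format
// second. If both fail, the returned error carries both causes.
func decodeWithFallback(buf []byte) (*currentResp, error) {
	cur := new(currentResp)
	errCur := strictUnmarshal(buf, cur)
	if errCur == nil {
		return cur, nil
	}
	leg := new(legacyResp)
	errLeg := strictUnmarshal(buf, leg)
	if errLeg == nil {
		return &currentResp{TxResults: leg.DeliverTxs}, nil
	}
	return nil, errors.Join(
		fmt.Errorf("decoding as current response: %w", errCur),
		fmt.Errorf("decoding as legacy response: %w", errLeg),
	)
}

func main() {
	res, err := decodeWithFallback([]byte(`{"deliver_txs":["a"]}`))
	fmt.Println(res, err)

	_, err = decodeWithFallback([]byte(`{"other":1}`))
	fmt.Println(err) // both causes are included in the message
}
```

With both causes attached, a caller (or a log line) can tell a genuinely corrupted payload apart from one that simply is not in the legacy format.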
state/store.go
Outdated
	// only return an error, this method is only invoked through the `/block_results` not for state logic and
	// some tests, so no need to exit cometbft if there's an error, just return it.
	store.Logger.Debug("failed to unmarshall legacy ABCI response: %s", err.Error())
	return nil, ErrABCIResponseCorruptedOrSpecChangeForHeight{Height: height}
Also, if we don't break any API (not sure) it'd be good to include the error inside this error, so the caller can print it if needed
}

responseDeliverTx := abciv1beta2.ResponseDeliverTx{
	Code: abci.CodeTypeOK,
Should you populate the rest of the fields here? (`GasWanted`, `Codespace`, etc.)
require.Equal(t, 1, len(legacyABCIResponse.DeliverTxs))
require.Equal(t, 1, len(legacyABCIResponse.BeginBlock.Events))
require.Equal(t, 1, len(legacyABCIResponse.EndBlock.Events))
I think we should check the values of these more exhaustively, not just their lengths.
require.Error(t, err)
require.ErrorContains(t, err, "unexpected EOF")
I don't understand. Looking at the production changes below, I had understood that the bug is that this unmarshalling doesn't actually error, but leaves some fields unfilled...
require.NotNil(t, legacyResponseWithNull.DeliverTxs)
require.Nil(t, legacyResponseWithNull.EndBlock)
require.NotNil(t, legacyResponseWithNull.BeginBlock)
Shouldn't we do the same checks here as in `TestStateProtoV1Beta2ToV1` above?
Co-authored-by: Sergio Mena <[email protected]>
Hey, is there already a timeline for when this fix will be released? `/block` and `/block_results` are super important queries for Cosmos/CometBFT nodes.
Please also add the
// responseFinalizeBlockFromLegacy is a convenience function that takes the old abci responses and morphs
// it to the finalize block response. Note that the app hash is missing
func responseFinalizeBlockFromLegacy(legacyResp *cmtstate.LegacyABCIResponses) *abci.ResponseFinalizeBlock {
	// Add BeginBlock attribute to BeginBlock events
	for idx, event := range legacyResp.BeginBlock.Events {
		legacyResp.BeginBlock.Events[idx].Attributes = append(event.Attributes, abci.EventAttribute{
			Key:   "mode",
			Value: "BeginBlock",
			Index: false,
		})
	}
	// Add EndBlock attribute to EndBlock events
	for idx, event := range legacyResp.EndBlock.Events {
		legacyResp.EndBlock.Events[idx].Attributes = append(event.Attributes, abci.EventAttribute{
			Key:   "mode",
			Value: "EndBlock",
			Index: false,
		})
	}
	return &abci.ResponseFinalizeBlock{
		TxResults:             legacyResp.DeliverTxs,
		ValidatorUpdates:      legacyResp.EndBlock.ValidatorUpdates,
		ConsensusParamUpdates: legacyResp.EndBlock.ConsensusParamUpdates,
		Events:                append(legacyResp.BeginBlock.Events, legacyResp.EndBlock.Events...),
		// NOTE: AppHash is missing in the response but will
		// be caught and filled in consensus/replay.go
	}
}
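The function quoted above dereferences `BeginBlock` and `EndBlock` directly, which would panic if either pointer were nil, and the tests quoted earlier in this thread include a legacy response whose `EndBlock` is nil. One possible shape for a nil-tolerant conversion is sketched below. It uses simplified, hypothetical stand-in types (`Event`, `LegacyResponses`, `FinalizeBlock`) rather than the real `abci` and `cmtstate` types, and it is not necessarily what this PR implements.

```go
package main

import "fmt"

// Simplified, hypothetical stand-ins for the ABCI types.
type Event struct {
	Type string
	Mode string // plays the role of the "mode" attribute
}

type BeginBlock struct{ Events []Event }
type EndBlock struct{ Events []Event }

type LegacyResponses struct {
	DeliverTxs []string
	BeginBlock *BeginBlock
	EndBlock   *EndBlock
}

type FinalizeBlock struct {
	TxResults []string
	Events    []Event
}

// tagged returns a copy of events with Mode set, leaving the input untouched.
func tagged(events []Event, mode string) []Event {
	out := make([]Event, 0, len(events))
	for _, ev := range events {
		ev.Mode = mode
		out = append(out, ev)
	}
	return out
}

// fromLegacy converts a legacy response and tolerates nil sections.
func fromLegacy(legacy *LegacyResponses) *FinalizeBlock {
	resp := &FinalizeBlock{}
	if legacy == nil {
		return resp
	}
	resp.TxResults = legacy.DeliverTxs
	if legacy.BeginBlock != nil {
		resp.Events = append(resp.Events, tagged(legacy.BeginBlock.Events, "BeginBlock")...)
	}
	if legacy.EndBlock != nil {
		resp.Events = append(resp.Events, tagged(legacy.EndBlock.Events, "EndBlock")...)
	}
	return resp
}

func main() {
	resp := fromLegacy(&LegacyResponses{
		DeliverTxs: []string{"tx0"},
		BeginBlock: &BeginBlock{Events: []Event{{Type: "mint"}}},
		// EndBlock deliberately left nil.
	})
	fmt.Println(len(resp.TxResults), len(resp.Events), resp.Events[0].Mode)
}
```

Copying the events before tagging them also avoids mutating the stored legacy response, which the in-place `append` in the quoted function does.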
Noble would love to see this released as well!
Just an update: I am still working on this and need to add some additional testing to make sure these changes are sound, especially when backporting to old releases. Hopefully it should be ready to merge sooner rather than later.
Just an update: the fix is still ongoing. We believe we have a solution, but it might be hard to test with live production data (archive nodes). Would anyone who benefits from this fix be able to test against real data before we cut an official release? Once the fix is in, we could backport it to the `v0.38.x` branch (but we would not cut a release until we have some confirmation that the fix works). You could then test using a binary compiled from that branch and let us know.
close: #3002
This PR fixes the issue reported above.
This is not a storage issue in particular: the results are still in storage after an upgrade, but they are not returned properly by the RPC endpoint. The fix is to make the `/block_results` endpoint in `v0.38` properly return a legacy ABCI response created with `v0.37`.
Once this fix is merged on `v0.38` and a patch release is cut, any node on `v0.38` (e.g. an archive node) that applies the patch release should have the results returned properly by the RPC `/block_results` endpoint.
PR checklist
`.changelog` (we use unclog to manage our changelog)
Updated relevant documentation (`docs/` or `spec/`) and code comments