New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fetch chunk metadata #2152
base: dev
Are you sure you want to change the base?
Fetch chunk metadata #2152
Conversation
pwndbg/heap/ptmalloc.py
Outdated
for field in include_only_fields: | ||
if field is ChunkField.PREV_SIZE: | ||
requested_fields.add(prev_size_field_name) | ||
elif field is ChunkField.SIZE: | ||
requested_fields.add(size_field_name) | ||
elif field is ChunkField.FD: | ||
requested_fields.add("fd") | ||
elif field is ChunkField.BK: | ||
requested_fields.add("bk") | ||
elif field is ChunkField.FD_NEXTSIZE: | ||
requested_fields.add("fd_nextsize") | ||
elif field is ChunkField.BK_NEXTSIZE: | ||
requested_fields.add("bk_nextsize") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this might actually be a good time to use a dictionary, mapping ChunkField.PREV_SIZE
to prev_size_field_name
, and so on. Then the for loop becomes simple and easy to read, and we can use the same dictionary in the next for loop, simplifying that code as well and make it less error prone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dictionary was my first thought, but I knew @disconnect3d would say "this dictionary is created on every function call" and I'd have to turn it into an if/elif
statement 😅
So I'll let you two pick which way you think is best.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we actually need an enum? :P
It feels like without it we could just do include_only_fields=include_only_fields
below and we seem to do the same job twice (note: we don't even validate if one passed an invalid value in include_only_fields
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My justification for the Enum is in the description old chap ☝️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the dictionary, my thought process is usually if it simplifies more than one if-statement it's worth it, otherwise usually an if-statement is better. And in this case it would simplify two.
If the issue is about constructing the dictionary inside the function (I personally don't mind), it could be constructed outside, and the size/prev_size fields could be set inside the function the first time its run. Alternatively, you can create a second function that essentially does what a dictionary would do, taking a key and returning a value, but I honestly don't know how a function call compares to creating a dictionary in terms of performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the enum, if we don't devs to have to think too much about it, we could also just accept all names for a parameter, i.e. size
and mchunk_size
. The downsides of that are mainly that this function gets a bit more complex, and that it may not be clear to the caller what's actually going to happen if the field was renamed. With an enum, it's more clear that we'll handle the field names for the caller.
I'm fine either way, will leave it to @disconnect3d.
Needs fixing conflicts :< |
So if I keep the import statement for |
Thanks to @gsingh93 for figuring out what I'd done wrong here: capitalized "Set" must be used for compatibility with Python 3.8
@disconnect3d Conflicts resolved & linter is happy 👍 |
This PR builds on #2082 by adding a
fetch_chunk_metadata()
function to ptmalloc.pyIt can be used to fetch all of a chunk's metadata given its address, or just fetch specific fields. It returns a dictionary of
ChunkField
:int
. It also handles resolution of chunk fields that have had different names over the years e.g. "prev_size" vs. "mchunk_prev_size".I know using a ChunkField enum looks over-engineered, but I think it has some benefits:
Screenshot of the function in action:
The next step is to integrate
fetch_chunk_metadata()
into the Chunk class, which will hopefully remove any direct references to the gdb module. Testing should be a bit easier after that, you could create fake chunks for testing by building a dictionary or mocking return values of calls tofetch_chunk_metadata()