Fetch chunk metadata #2152

CptGibbon · 2024-05-08T21:39:22Z

This PR builds on #2082 by adding a fetch_chunk_metadata() function to ptmalloc.py.

Given a chunk's address, it can fetch all of the chunk's metadata or only specific fields. It returns a dictionary of ChunkField : int. It also resolves chunk fields that have had different names over the years, e.g. "prev_size" vs. "mchunk_prev_size".

I know using a ChunkField enum looks over-engineered, but I think it has some benefits:

Developers don't need to care about renamed chunk fields: they can use ChunkField.SIZE without knowing whether the underlying name is "size" or "mchunk_size".
Future field renames require minimal changes: if "fd" becomes "mchunk_fd", developers just keep using ChunkField.FD.
Developers don't need to look up chunk field names in the glibc source code: they can check the enum instead.
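As a rough illustration of the points above, here is a minimal sketch of how an enum can hide renamed fields from callers. The lookup table `KNOWN_NAMES` and the helper `resolve_field_name()` are hypothetical names made up for this example; they are not the PR's actual resolve_renamed_struct_field() implementation, whose signature is not shown in this thread.

```python
from enum import Enum, auto
from typing import Dict, Iterable, Tuple


class ChunkField(Enum):
    PREV_SIZE = auto()
    SIZE = auto()
    FD = auto()
    BK = auto()
    FD_NEXTSIZE = auto()
    BK_NEXTSIZE = auto()


# Hypothetical table: every name a field is known by, newest name first.
KNOWN_NAMES: Dict[ChunkField, Tuple[str, ...]] = {
    ChunkField.PREV_SIZE: ("mchunk_prev_size", "prev_size"),
    ChunkField.SIZE: ("mchunk_size", "size"),
    ChunkField.FD: ("fd",),
    ChunkField.BK: ("bk",),
    ChunkField.FD_NEXTSIZE: ("fd_nextsize",),
    ChunkField.BK_NEXTSIZE: ("bk_nextsize",),
}


def resolve_field_name(field: ChunkField, struct_field_names: Iterable[str]) -> str:
    """Return whichever known name for `field` exists in the struct."""
    available = set(struct_field_names)
    for name in KNOWN_NAMES[field]:
        if name in available:
            return name
    # Mirrors the "Raise ValueError when field name not found" behaviour.
    raise ValueError(f"no known name for {field.name} found in struct")
```

With this shape, a future rename only needs a new entry in the table; callers keep passing ChunkField.SIZE regardless of which glibc version is being debugged.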

Screenshot of the function in action:

The next step is to integrate fetch_chunk_metadata() into the Chunk class, which will hopefully remove any direct references to the gdb module. Testing should be a bit easier after that: you could create fake chunks by building a dictionary or by mocking the return values of calls to fetch_chunk_metadata().
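To make the testing idea concrete, here is a minimal sketch of a fake chunk built from a plain dictionary and served through a mock. It assumes the fetcher takes an address and returns a ChunkField : int dictionary, as described above; `real_chunk_size()` and the injected `fetch` parameter are hypothetical and exist only for this example.

```python
from enum import Enum, auto
from typing import Callable, Dict
from unittest import mock


class ChunkField(Enum):
    PREV_SIZE = auto()
    SIZE = auto()


# glibc keeps three flag bits (PREV_INUSE, IS_MMAPPED, NON_MAIN_ARENA)
# in the low bits of the size field.
SIZE_FLAG_BITS = 0x7


def real_chunk_size(address: int, fetch: Callable[[int], Dict[ChunkField, int]]) -> int:
    """Hypothetical consumer: chunk size with the flag bits masked off."""
    metadata = fetch(address)
    return metadata[ChunkField.SIZE] & ~SIZE_FLAG_BITS


# A fake chunk is just a dictionary; no debugger session is needed.
fake_fetch = mock.Mock(
    return_value={ChunkField.PREV_SIZE: 0, ChunkField.SIZE: 0x91}
)
size = real_chunk_size(0x555555559290, fake_fetch)
```

Passing the fetcher in as a parameter is only one way to do this; patching the function with unittest.mock.patch would work as well.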

gsingh93 · 2024-05-09T04:18:23Z

pwndbg/heap/ptmalloc.py

+    for field in include_only_fields:
+        if field is ChunkField.PREV_SIZE:
+            requested_fields.add(prev_size_field_name)
+        elif field is ChunkField.SIZE:
+            requested_fields.add(size_field_name)
+        elif field is ChunkField.FD:
+            requested_fields.add("fd")
+        elif field is ChunkField.BK:
+            requested_fields.add("bk")
+        elif field is ChunkField.FD_NEXTSIZE:
+            requested_fields.add("fd_nextsize")
+        elif field is ChunkField.BK_NEXTSIZE:
+            requested_fields.add("bk_nextsize")


I think this might actually be a good time to use a dictionary, mapping ChunkField.PREV_SIZE to prev_size_field_name, and so on. Then the for loop becomes simple and easy to read, and we can use the same dictionary in the next for loop, simplifying that code as well and making it less error-prone.
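A minimal sketch of the dictionary approach suggested here. The helper names (`build_field_map`, `to_struct_names`, `to_chunk_fields`) are hypothetical, and the second helper assumes the "next for loop" converts the fetched name : value pairs back into ChunkField : value pairs, which is not shown in the diff excerpt.

```python
from enum import Enum, auto
from typing import Dict, Iterable, Set


class ChunkField(Enum):
    PREV_SIZE = auto()
    SIZE = auto()
    FD = auto()
    BK = auto()
    FD_NEXTSIZE = auto()
    BK_NEXTSIZE = auto()


def build_field_map(prev_size_field_name: str, size_field_name: str) -> Dict[ChunkField, str]:
    """One mapping replaces the whole if/elif chain."""
    return {
        ChunkField.PREV_SIZE: prev_size_field_name,
        ChunkField.SIZE: size_field_name,
        ChunkField.FD: "fd",
        ChunkField.BK: "bk",
        ChunkField.FD_NEXTSIZE: "fd_nextsize",
        ChunkField.BK_NEXTSIZE: "bk_nextsize",
    }


def to_struct_names(fields: Iterable[ChunkField], field_map: Dict[ChunkField, str]) -> Set[str]:
    """First loop: enum members to struct field names."""
    return {field_map[field] for field in fields}


def to_chunk_fields(raw: Dict[str, int], field_map: Dict[ChunkField, str]) -> Dict[ChunkField, int]:
    """Second loop (assumed): struct field names back to enum members."""
    name_to_field = {name: field for field, name in field_map.items()}
    return {name_to_field[name]: value for name, value in raw.items()}
```

A side effect of the lookup is that passing anything other than a ChunkField member raises KeyError instead of being silently ignored.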

Dictionary was my first thought, but I knew @disconnect3d would say "this dictionary is created on every function call" and I'd have to turn it into an if/elif statement 😅
So I'll let you two pick which way you think is best.

Do we actually need an enum? :P

It feels like without it we could just pass include_only_fields=include_only_fields below; as it stands, we seem to do the same job twice (note: we don't even validate whether an invalid value was passed in include_only_fields).

My justification for the Enum is in the description old chap ☝️

Regarding the dictionary, my rule of thumb is that it's worth it if it simplifies more than one if-statement; otherwise an if-statement is usually better. In this case it would simplify two.

If the issue is constructing the dictionary inside the function (I personally don't mind), it could be constructed outside, and the size/prev_size fields could be set inside the function the first time it's run. Alternatively, you can create a second function that essentially does what a dictionary would do, taking a key and returning a value, but I honestly don't know how a function call compares to creating a dictionary in terms of performance.

Regarding the enum, if we don't want devs to have to think too much about it, we could also just accept all names for a parameter, i.e. size and mchunk_size. The downsides are mainly that this function gets a bit more complex, and that it may not be clear to the caller what's actually going to happen if the field was renamed. With an enum, it's clearer that we'll handle the field names for the caller.

I'm fine either way, will leave it to @disconnect3d.

disconnect3d · 2024-05-22T11:58:27Z

Needs fixing conflicts :<

CptGibbon · 2024-05-22T20:32:04Z

So if I keep the import statement for typing.Set, the linter complains, and if I remove it, it also complains...
What am I missing?

@gsingh93

Thanks to @gsingh93 for figuring out what I'd done wrong here: capitalized "Set" must be used for compatibility with Python 3.8
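For context, a minimal sketch of the Python 3.8 constraint. The exact linter message is not shown in this thread, and `filter_fields()` is a hypothetical function written for this example. Subscripting the builtin (`set[str]`) only works at runtime from Python 3.9 onward, so on 3.8 the capitalized typing.Set has to be imported and used in the hint.

```python
from typing import Dict, Optional, Set


def filter_fields(
    values: Dict[str, int],
    include_only_fields: Optional[Set[str]] = None,
) -> Dict[str, int]:
    """Keep only the requested keys; None means keep everything."""
    if include_only_fields is None:
        return dict(values)
    return {name: value for name, value in values.items() if name in include_only_fields}


chunk = {"mchunk_prev_size": 0, "mchunk_size": 0x91, "fd": 0, "bk": 0}
only_size = filter_fields(chunk, {"mchunk_size"})
```

Writing the hint as lowercase set[str] while still importing typing.Set leaves the import unused, and dropping the import while keeping Set[str] leaves the name undefined, which may be why both variants tripped the linter.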

CptGibbon · 2024-05-23T18:42:02Z

@disconnect3d Conflicts resolved & linter is happy 👍

CptGibbon added 4 commits May 8, 2024 15:51

Add resolve_renamed_struct_field()

f2c3ca5

Add ChunkField enum

8f354f3

Raise ValueError when field name not found

8e5dc08

Add fetch_chunk_metadata()

7f718d2

gsingh93 reviewed May 9, 2024

CptGibbon added 3 commits May 9, 2024 11:57

Use None for include_only_fields argument

0f0ed8b

Use None as default value for include/exclude filters in fetch_struct_as_dictionary

608475a

Resolve C408

baa6e53

CptGibbon added 2 commits May 22, 2024 09:14

Merge branch 'dev' into fetch-chunk-metadata

f05ed94

Remove unused typing.Set import

582f7ea

CptGibbon added 3 commits May 23, 2024 10:04

Correct use of Set in type hints

50bcecd

Correct use of Set in type hints

9c051de

Import Set from Typing in ptmalloc.py

de6b28c


CptGibbon commented May 8, 2024

gsingh93 May 9, 2024

CptGibbon May 9, 2024

disconnect3d May 9, 2024

CptGibbon May 9, 2024

gsingh93 May 9, 2024

gsingh93 May 9, 2024

disconnect3d commented May 22, 2024

CptGibbon commented May 22, 2024

CptGibbon commented May 23, 2024
