Improve bin corruption checks #1564

gsingh93 · 2023-02-03T01:26:58Z

The existing bin corruption checks would mark some corrupted bins as empty instead of corrupted. There are now some additional checks, including checking if the fd and bk pointers are in the libc data section.
Handle the case where the chain is less than 2 elements
Check the full list integrity only if we've fully traversed the fd and bk lists (addresses some of Heap chain corruption check is incorrect for long free lists #1196)
Make the list integrity check more clear
In the case we can't traverse the entire lists, do a basic sanity check that chunk->bk->fd = chunk

gsingh93 · 2023-02-03T01:29:25Z

pwndbg/heap/ptmalloc.py

+ chunk = Chunk(chain_fd[0])
+ bk = Chunk(chunk.bk)
+ if bk.fd != chain_fd[0]:
+ is_chain_corrupted = True


I'm wondering instead of a boolean, this could be one of three values: corrupted, not corrupted, not sure.

gsingh93 · 2023-02-03T01:30:14Z

pwndbg/heap/ptmalloc.py

+ # chain, so we can't check the integrity of the entire chain, but we can at
+ # least check if `chunk->bk->fd = chunk`
+ chunk = Chunk(chain_fd[0])
+ bk = Chunk(chunk.bk)


I wonder if chunk.bk could return a Chunk instead of an int?

You could add a previous_chunk() method for example, since this is meant to mimic the malloc_chunk struct's fields. If someone wanted the bk value itself, that is lost as soon as you get the previous chunk. That would also be in line with the current next_chunk() method, but I suppose it would always return None for singly-linked lists.

Sorry if I'm misunderstanding, but Chunk.next_chunk seems to return the next chunk in memory, whereas I'm wondering if Chunk.fd and Chunk.bk should just return a Chunk for convenience. If we'd like to keep this inline with the structs fields, maybe Chunk.fd_chunk and Chunk.bk_chunk could just be implemented as def fd_chunk(self): return Chunk(self.fd).

It's not a major difference, but I'd prefer if the Chunk class took care of this for me.

I agree that Chunk should handle this, I can put together something like you suggest 👍

On revisiting this, it's not as simple as it might seem.

Since tcache fd's are treated differently to other bin fd's (they point to user data rather than chunk metadata) the Chunk object needs to check if it represents a chunk that's in a tcachebin, so it can either do Chunk(chunk.fd) or Chunk(chunk.fd - (size_sz*2)).

This could involve checking the size field, and if it's in tcache range searching the appropriate tcachebin for this chunk's address. But assuming corruption requires searching every tcachebin for this chunk.

Since the bin_at() method only operates on the "normal" bins, there's no need to do this calculation. I think in this case using Chunk(chunk.bk) avoids unnecessary reads from gdb whilst still being clear as to what it's doing.

disconnect3d · 2023-03-05T22:13:46Z

What's the status here? Seems that some heap tests are failing

gsingh93 · 2023-03-19T18:28:44Z

Will be very busy for a bit longer, I think I can complete this before the next release.

gsingh93 · 2023-03-29T06:06:58Z

@CptGibbon tests are passing now, should be ready for review

lebr0nli · 2023-03-29T12:51:29Z

pwndbg/heap/ptmalloc.py

+ # corrupted. If we can't find any mapping for the pointer or it's in
+ # some other mapping, we assume the bin is corrupted
+ page = pwndbg.gdblib.vmmap.find(chain_fd[0])
+ if page and page.rw and "libc" in page.objfile:


I do not very understand the logic behind this yet, so probably I misunderstood something:

I have some concerns about the check here:

What if vmmap does not work well(e.g. for the native qemu-user), will it make pwndbg think the empty bins are corrupted bins?

If we are debugging a binary with statically linked libc, will it make pwndbg think the empty bins are corrupted bins since "libc" in page.objfile is never be True?

IIRC, bins in non-main arena might not in someone's .data sectioin, does this check also true for the bins not in main_arena?

Seems like it might pass the check if the address is on the same page as the .data section, but not actually "in" the .data section? Maybe parsing info files will be better if we want to know if the address is in .data section or not?

What if vmmap does not work well(e.g. for the native qemu-user), will it make pwndbg think the empty bins are corrupted bins?

Hmm, I think you're right. Maybe we should have a function that returns whether the vmmap output can be trusted or not?

If we are debugging a binary with statically linked libc, will it make pwndbg think the empty bins are corrupted bins since "libc" in page.objfile is never be True?

I didn't think about this, you're right.

IIRC, bins in non-main arena might not in someone's .data sectioin, does this check also true for the bins not in main_arena?

Good point. I can add a check here to only do this logic if it's the main arena?

Seems like it might pass the check if the address is on the same page as the .data section, but not actually "in" the .data section? Maybe parsing info files will be better if we want to know if the address is in .data section or not?

Sure, I'm more worried though about a non-corrupted chain being marked as corrupted, not a corrupted chain being marked as not corrupted. The latter is just best effort on our part, while the former would be considered a bug.

I can add a check here to only do this logic if it's the main arena?

Hmm, instead of still checking the fd_chain/bk_chain point to the .data section or not, could we check the front and back pointers instead of the fd_chain/bk_chain?
I mean, something like this:

--- a/pwndbg/heap/ptmalloc.py +++ b/pwndbg/heap/ptmalloc.py @@ -1200,13 +1200,7 @@ class GlibcMemoryAllocator(pwndbg.heap.heap.MemoryAllocator): is_corrupted = True elif len(chain_fd) == len(chain_bk) == 2 and chain_fd == chain_bk and chain_fd[-1] == 0: # Check if bin[index] points to itself (is empty) - - # If the fd pointer is in libc data section, we can be more - # confident that this list is actually empty and not just - # corrupted. If we can't find any mapping for the pointer or it's in - # some other mapping, we assume the bin is corrupted - page = pwndbg.gdblib.vmmap.find(chain_fd[0]) - if page and page.rw and "libc" in page.objfile: + if front == back and front == current_base: chain_fd = [0] chain_bk = [0] else:

Is this correct? (This is entirely from my observation, I'm not very sure :p)

CptGibbon · 2023-03-29T18:43:36Z

@CptGibbon tests are passing now, should be ready for review

I can take a thorough look next week old chap 👍

CptGibbon · 2023-04-02T22:19:14Z

pwndbg/heap/ptmalloc.py

+ elif len(chain_fd) < 2 or len(chain_bk) < 2:
+ # Chains containing less than two entries are corrupted, as the smallest
+ # chain (an empty bin) would look like something like `[main_arena+88, 0]`.
+ is_corrupted = True


I may be missing something, but I don't see this variable being used anywhere.
If that is the case, then a test for this scenario could be useful.

Sorry, I meant is_chain_corrupted here.

Yeah I guess there must be a gap in the current tests since this still passed even with an unused variable.
Probably worth a adding a test that checks for this case wouldn't you say?

disconnect3d · 2023-04-14T20:27:45Z

Ping @CptGibbon :)

disconnect3d · 2023-07-21T00:00:37Z

What's the status of this? Can we re-trigger CI here? For some reason I think I can't? Maybe we should just push whatever here

gsingh93 · 2023-07-21T00:03:18Z

Was just planning on adding a test but that can be done in a future PR.

pwndbg/heap/ptmalloc.py

codecov · 2023-11-02T02:31:03Z

Codecov Report

Attention: 9 lines in your changes are missing coverage. Please review.

Comparison is base (758fb9c) 59.18% compared to head (c36e1ea) 60.55%.
Report is 219 commits behind head on dev.

Additional details and impacted files

@@            Coverage Diff             @@
##              dev    #1564      +/-   ##
==========================================
+ Coverage   59.18%   60.55%   +1.37%     
==========================================
  Files         175      180       +5     
  Lines       21019    22236    +1217     
  Branches     1859     2088     +229     
==========================================
+ Hits        12441    13466    +1025     
- Misses       7977     8062      +85     
- Partials      601      708     +107

Files	Coverage Δ
pwndbg/heap/ptmalloc.py	`71.07% <57.14%> (-0.47%)`	⬇️

... and 180 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

gsingh93 added the heap label Feb 3, 2023

gsingh93 commented Feb 3, 2023

View reviewed changes

gsingh93 marked this pull request as draft February 3, 2023 01:39

gsingh93 marked this pull request as ready for review March 1, 2023 15:06

gsingh93 added this to the 2023.06 milestone Mar 19, 2023

gsingh93 force-pushed the corrupted-lists branch 2 times, most recently from 2b27003 to e7699c8 Compare March 29, 2023 05:58

lebr0nli reviewed Mar 29, 2023

View reviewed changes

CptGibbon reviewed Apr 2, 2023

View reviewed changes

Improve bin corruption checks

f21354e

gsingh93 force-pushed the corrupted-lists branch from e7699c8 to f21354e Compare April 19, 2023 03:29

gsingh93 modified the milestones: 2023.06, Future Jun 20, 2023

disconnect3d reviewed Nov 2, 2023

View reviewed changes

pwndbg/heap/ptmalloc.py Outdated Show resolved Hide resolved

Update pwndbg/heap/ptmalloc.py

c36e1ea

gsingh93 modified the milestones: 2024.01, Future Apr 28, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve bin corruption checks #1564

Improve bin corruption checks #1564

gsingh93 commented Feb 3, 2023 •

edited

gsingh93 Feb 3, 2023

gsingh93 Feb 3, 2023

CptGibbon Feb 4, 2023

gsingh93 Mar 29, 2023

CptGibbon Mar 29, 2023

CptGibbon Apr 19, 2023

disconnect3d commented Mar 5, 2023

gsingh93 commented Mar 19, 2023

gsingh93 commented Mar 29, 2023

lebr0nli Mar 29, 2023

gsingh93 Mar 29, 2023

lebr0nli Mar 30, 2023 •

edited

CptGibbon commented Mar 29, 2023

CptGibbon Apr 2, 2023 •

edited

gsingh93 Apr 19, 2023

CptGibbon Apr 19, 2023

disconnect3d commented Apr 14, 2023

disconnect3d commented Jul 21, 2023

gsingh93 commented Jul 21, 2023

codecov bot commented Nov 2, 2023 •

edited

Improve bin corruption checks #1564

Are you sure you want to change the base?

Improve bin corruption checks #1564

Conversation

gsingh93 commented Feb 3, 2023 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

disconnect3d commented Mar 5, 2023

gsingh93 commented Mar 19, 2023

gsingh93 commented Mar 29, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lebr0nli Mar 30, 2023 • edited

Choose a reason for hiding this comment

CptGibbon commented Mar 29, 2023

CptGibbon Apr 2, 2023 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

disconnect3d commented Apr 14, 2023

disconnect3d commented Jul 21, 2023

gsingh93 commented Jul 21, 2023

codecov bot commented Nov 2, 2023 • edited

Codecov Report

gsingh93 commented Feb 3, 2023 •

edited

lebr0nli Mar 30, 2023 •

edited

CptGibbon Apr 2, 2023 •

edited

codecov bot commented Nov 2, 2023 •

edited