Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Additional type hints #2120

Merged
merged 9 commits into from May 13, 2024
Merged

Additional type hints #2120

merged 9 commits into from May 13, 2024

Conversation

peace-maker
Copy link
Contributor

Add type annotations to the heap code and bunch of other code that's used by the heap.
Activate vermin --eval-annotations to catch invalid type hints for Python 3.8.

Use typing_extensions.ParamSpec to avoid hiding function arguments through decorators.

There are a couple of instances where pwndbg.heap.current is expected to be GlibcMemoryAllocator in common code. I've worked around this by adding explicit isinstance checks, but this probably requires more work if other allocators are to be supported.

@peace-maker

This comment was marked as resolved.

Copy link
Member

@gsingh93 gsingh93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thanks for this :)

I did a quick skim, but didn't get through it all. Despite the size I don't think this will take too long to finish reviewing, as type change reviews are pretty easy.

That being said, I want to make sure #2001 gets merged first so they don't need to do any additional rebase work. After it's merged, once this gets rebased I can finish the review.

@@ -146,7 +155,7 @@ def add_custom_structure(custom_structure_name: str) -> None:


@OnlyWhenStructFileExists
def edit_custom_structure(custom_structure_name: str, custom_structure_path: str) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the default argument initialization?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation is weird with having the OnlyWhenStructFileExists decorator filling in the second argument transparently for the decorated function to use if it was called without it. The code mostly called the functions without explicitly passing a custom_structure_path argument and relying on the decorator to fill it in, but mypy complained about missing required arguments.

pwndbg/commands/cymbol.py: note: In function "edit_custom_structure":
pwndbg/commands/cymbol.py:187: error: Missing positional argument "custom_structure_path" in call to "__call__" of
"_OnlyWhenStructFileExists"  [call-arg]
        load_custom_structure(custom_structure_name)
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I guess another way to solve this is to explictly pass "" as second argument when calling the functions instead of specifying a default argument. I can go that route too if you'd like.

pwndbg/commands/leakfind.py Outdated Show resolved Hide resolved
@gsingh93
Copy link
Member

@peace-maker needs a rebase

peace-maker and others added 3 commits April 25, 2024 15:33
Activate `vermin --eval-annotations` to catch invalid type hints for Python 3.8.

Use `typing_extensions.ParamSpec` to avoid hiding function arguments through decorators.
@peace-maker
Copy link
Contributor Author

@gsingh93 done

pwndbg/gdblib/memory.py Outdated Show resolved Hide resolved
Comment on lines +95 to +101
C2GDB_MAPPING: Dict[Type[ctypes.c_char], gdb.Type] = { # type: ignore[no-redef]
k.__ctype_le__: v for k, v in C2GDB_MAPPING.items()
}
else:
C2GDB_MAPPING = {k.__ctype_be__: v for k, v in C2GDB_MAPPING.items()}
C2GDB_MAPPING: Dict[Type[ctypes.c_char], gdb.Type] = { # type: ignore[no-redef]
k.__ctype_be__: v for k, v in C2GDB_MAPPING.items()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just thinking aloud here, we could get rid of the no-redef ignore by moving the if into the dictionary comprehension, but it might look a bit messy...

@@ -123,7 +131,7 @@ def __init__(

class CStruct2GDB:
code = gdb.TYPE_CODE_STRUCT
_c_struct: pwndbg.gdblib.ctypes.Structure = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not looked at this code in some time, but I think it was intentional to make this a class variable that gets overridden in child classes. I don't know what unintended effects there might be by changing that. cc: @CptGibbon in case you have thoughts on this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lebr0nli I shall defer to you old chap

Copy link
Contributor

@lebr0nli lebr0nli Apr 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my original type hint was wrong, thanks @peace-maker for fixing it here!
Since _c_struct is a type instead of an instance of pwndbg.gdblib.ctypes.Structure, also ctypes.Structure is the parent class of pwndbg.gdblib.ctypes.Structure(pwndbg.gdblib.ctypes.Structure should be ctypes.LittleEndianStructure or ctypes.BigEndianStructure), the way it's written in this commit makes more sense to me.

I have not looked at this code in some time, but I think it was intentional to make this a class variable that gets overridden in child classes.

Yep, it's intentional.

I don't know what unintended effects there might be by changing that.

Removing the = None here should not cause unintended effects I think. (At least all the tests related to this code passed :p)

pwndbg/lib/gcc.py Outdated Show resolved Hide resolved
pwndbg/profiling.py Outdated Show resolved Hide resolved
pwndbg/gdblib/ctypes.py Outdated Show resolved Hide resolved
pwndbg/gdblib/abi.py Outdated Show resolved Hide resolved
pwndbg/gdblib/proc.py Show resolved Hide resolved
pwndbg/gdblib/elf.py Show resolved Hide resolved
@@ -327,17 +343,17 @@ def get_ehdr(pointer: int):
ei_class = pwndbg.gdblib.memory.byte(base + 4)

# Find out where the section headers start
Elfhdr = read(Ehdr, base)
Elfhdr: Elf32_Ehdr | Elf64_Ehdr | None = read(Ehdr, base) # type: ignore[type-var]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need the ignore[type-var]?

Comment on lines +21 to +51
pc: str

#: Stack pointer register
stack: str | None = None
stack: str

#: Frame pointer register
frame: str | None = None

#: Return address register
retaddr: Tuple[str, ...] | None = None
retaddr: Tuple[str, ...]

#: Flags register (eflags, cpsr)
flags: Dict[str, BitFlags] | None = None
flags: Dict[str, BitFlags]

#: List of native-size general-purpose registers
gpr: Tuple[str, ...] | None = None
gpr: Tuple[str, ...]

#: List of miscellaneous, valid registers
misc: Tuple[str, ...] | None = None
misc: Tuple[str, ...]

#: Register-based arguments for most common ABI
regs = None
args: Tuple[str, ...]

#: Return value register
retval: str | None = None
retval: str | None

#: Common registers which should be displayed in the register context
common: list[str] = []
common: List[str] = []

#: All valid registers
all: set[str] | None = None
all: Set[str]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually wondering if we need any of these if they don't need to be class variables... I think the typing in __init__ should be sufficient, and the comments here aren't very useful. Maybe we could just delete them?

import pwndbg.gdblib.symbol
import pwndbg.heap.heap
from pwndbg.color import message
from pwndbg.gdblib.config import config
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that we do from pwndbg.gdblib.config import config in pwndbg/gdblib/__init__.py. Maybe there's a way to get the types to check without modifying the code here? Like adding an additional type annotation for config in pwndbg/gdblib/__init__.py?

Copy link
Contributor Author

@peace-maker peace-maker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review, I've changed most of it. I'll try to go over the other comments when I have time.

pwndbg/gdblib/ctypes.py Outdated Show resolved Hide resolved
pwndbg/gdblib/elf.py Show resolved Hide resolved
pwndbg/gdblib/proc.py Show resolved Hide resolved
pwndbg/lib/gcc.py Outdated Show resolved Hide resolved
Copy link
Member

@gsingh93 gsingh93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pretty much looks good to me minus a few nits. @CptGibbon and @lebr0nli, you could you just verify the comments I tagged you in look good? If you would also like more time to skim through ptmalloc.py and heap.py, let me know. If not, that's fine, I'll go ahead and merge this once the other changes are made.

Comment on lines +83 to +84
else: # BinType.NOT_IN_BIN
return []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CptGibbon @lebr0nli are you fine with this behavior of returning an empty list for BinType.NOT_IN_BIN or would you prefer the old behavior of None?

pwndbg/heap/ptmalloc.py Outdated Show resolved Hide resolved
pwndbg/heap/ptmalloc.py Show resolved Hide resolved
renames = {
"mchunk_size": "size",
"mchunk_prev_size": "prev_size",
}
if isinstance(pwndbg.heap.current, DebugSymsHeap):
val = pwndbg.gdblib.typeinfo.read_gdbvalue("struct malloc_chunk", addr)
if isinstance(pwndbg.heap.current.malloc_chunk, gdb.Type):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like different behavior to me, but I'm not very familiar with this code. @CptGibbon or @lebr0nli, can you verify this is correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a two-in-one check. pwndbg.heap.current.malloc_chunk is only of type gdb.Type if the allocator is DebugSymsHeap. Otherwise it would be pwndbg.heap.structs.CStruct2GDB. I can explicitly keep the previous check for DebugSymsHeap too for clearity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this is from an older iteration of the ptmalloc typings where the properties like malloc_chunk had a union return type. Since they have the correct one now the old check for DebugSymsHeap is enough. I'll revert, good catch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or were you commenting on the change of

val = pwndbg.gdblib.typeinfo.read_gdbvalue("struct malloc_chunk", addr)

to

val = pwndbg.gdblib.memory.poi(pwndbg.heap.current.malloc_chunk, addr)

memory.poi was used in ptmalloc and it seemed better to not resolve the type again so I went with the poi way. Both lines do the same thing though.

self._gdbValue = pwndbg.gdblib.memory.poi(pwndbg.heap.current.malloc_chunk, addr)



def format_bin(bins: Bins, verbose=False, offset=None):
def format_bin(bins: Bins, verbose: bool = False, offset: int | None = None) -> List[str]:
assert isinstance(pwndbg.heap.current, GlibcMemoryAllocator)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine for this PR, but we should really think about how we can remove all these assert isinstance(pwndbg.heap.current, GlibcMemoryAllocator) checks in this PR. Maybe this will happen naturally with the jemalloc GSoC project implementing a new allocator, not sure.

pwndbg/gdblib/elf.py Show resolved Hide resolved
@gsingh93
Copy link
Member

gsingh93 commented May 9, 2024

Also needs another rebase, but should just be a small one.

@disconnect3d
Copy link
Member

@gsingh93 go merge?:)

@gsingh93
Copy link
Member

LGTM, merging.

@gsingh93 gsingh93 merged commit 90dc42e into pwndbg:dev May 13, 2024
13 checks passed
@peace-maker peace-maker deleted the more-types branch May 13, 2024 22:10
@fidgetingbits fidgetingbits mentioned this pull request May 17, 2024
fidgetingbits added a commit to fidgetingbits/pwndbg that referenced this pull request May 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants