Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add return address stack location to context backtrace #1288 #1740

Open
wants to merge 9 commits into
base: dev
Choose a base branch
from

Conversation

dmur1
Copy link
Contributor

@dmur1 dmur1 commented May 24, 2023

  • Add pwndbg.gdblib.find_addr_on_stack

  • Change coloration of context backtrace to use pwnlib.color.memory

  • Remove now defunct address color config from context

  • Update test_context_backtrace_show_proper_symbol_names regex

* Add pwndbg.gdblib.find_addr_on_stack

* Change coloration of context backtrace to use pwnlib.color.memory

* Remove now defunct address color config from context

* Update test_context_backtrace_show_proper_symbol_names regex
@dmur1 dmur1 marked this pull request as draft May 24, 2023 13:18
pwndbg/commands/context.py Show resolved Hide resolved
@@ -136,3 +136,27 @@ def is_executable() -> bool:
nx = True

return not nx


def find_addr_on_stack(addr):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this method is basically copied from here https://github.com/pwndbg/pwndbg/blob/dev/pwndbg/commands/stack.py#L26

i wonder if both should use this centralised approach?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think it makes sense to centralize it. Also I believe there's a small edge case bug, it should be:

-while addresses and start < sp < stop:
+while addresses and start <= sp < stop:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'll make another issue to centralise that in another change i think ( and i'll fix that edge case and note it in the centralisation issue )

@@ -201,21 +201,19 @@ def test_context_backtrace_show_proper_symbol_names(start_binary):
== "─────────────────────────────────[ BACKTRACE ]──────────────────────────────────"
)

assert re.match(r".*0 0x[0-9a-f]+ A::foo\(int, int\)", backtrace[2])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how important is it to tightly fit this regex to the entire output in a test that is mainly of proper symbol names?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more accurate we are the less regressions we will have :). I am fine with more relaxed assertions: its better to have some tests than none and its hard to test for everything.

@dmur1 dmur1 marked this pull request as ready for review June 17, 2023 09:29
@codecov-commenter
Copy link

codecov-commenter commented Jun 17, 2023

Codecov Report

Merging #1740 (5f7190a) into dev (9e85d0d) will decrease coverage by 0.12%.
Report is 57 commits behind head on dev.
The diff coverage is 69.81%.

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

@@            Coverage Diff             @@
##              dev    #1740      +/-   ##
==========================================
- Coverage   60.61%   60.50%   -0.12%     
==========================================
  Files         175      179       +4     
  Lines       21300    22024     +724     
  Branches     1909     2029     +120     
==========================================
+ Hits        12912    13326     +414     
- Misses       7747     8025     +278     
- Partials      641      673      +32     
Files Changed Coverage Δ
pwndbg/ui.py 56.06% <23.07%> (-18.94%) ⬇️
pwndbg/gdblib/kernel/__init__.py 79.14% <68.57%> (-1.47%) ⬇️
pwndbg/commands/context.py 62.83% <100.00%> (-3.91%) ⬇️
pwndbg/commands/stack.py 100.00% <100.00%> (ø)
pwndbg/gdblib/stack.py 80.48% <100.00%> (+8.02%) ⬆️

... and 173 files with indirect coverage changes

Copy link
Member

@gsingh93 gsingh93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work overall, just needs a few minor changes.

pwndbg/commands/context.py Outdated Show resolved Hide resolved
pwndbg/commands/context.py Outdated Show resolved Hide resolved
pwndbg/commands/stack.py Show resolved Hide resolved
pwndbg/gdblib/stack.py Outdated Show resolved Hide resolved
pwndbg/commands/context.py Outdated Show resolved Hide resolved
@disconnect3d
Copy link
Member

@dmur1 can u show a screenshot before and after?

@gsingh93 gsingh93 modified the milestone: 2023.10 Jul 17, 2023
Copy link
Contributor Author

@dmur1 dmur1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

take a look at the newest stuff!

@dmur1
Copy link
Contributor Author

dmur1 commented Jul 27, 2023

@dmur1 can u show a screenshot before and after?

before:
image

after:
image

@@ -136,3 +136,27 @@ def is_executable() -> bool:
nx = True

return not nx


def find_addr_on_stack(addr) -> int:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit worried that this may be (way) too slow.

  • We are reading the whole stack memory: what if stack is super big?
  • What if SP does not point to mapped memory? This will crash
  • This can be optimized out in two ways:
  1. We could leverage the existing search functionality we have which will do a find using GDB's API which is written in C and so should be faster
  2. We could read the whole stack memory and use simple .find(...) to find the offset of the bytes of the addr value

The 2) will rather be slower than 1) but if we read the stack memory just once, then we can find the fetched bytes multiple times to look for different return addresses --- which would probably be faster overall?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this I am not sure if we really want to look for those values on the stack... we are making context slower this way. We can however make this configurable.

@gsingh93 thoughts?

symbol = c.symbol(pwndbg.gdblib.symbol.get(int(frame.pc())))
frame_pc = frame.pc()
frame_pc_on_stack = pwndbg.gdblib.stack.find_addr_on_stack(frame_pc)
padded_text = "0x" + hex(frame_pc_on_stack)[2:].zfill(pwndbg.gdblib.arch.ptrsize * 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hex(...) returns a 0x... string, so we don't need to [2:] it and prepend a 0x string

Also nitpick, but we fetch and compute pwndbg.gdblib.arch.ptrsize * 2 in each loop iteration :<

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hex(1234).zfill(8)
'0000x4d2'

i was trying to avoid this - perhaps theres a way to zfill an already 0x prefixed string?

@disconnect3d

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will hoist the computation out of the loop

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this zfill. Like, do we really want to display those zeroes? :P They are redundant

frame_pc_on_stack = pwndbg.gdblib.stack.find_addr_on_stack(frame_pc)
padded_text = "0x" + hex(frame_pc_on_stack)[2:].zfill(pwndbg.gdblib.arch.ptrsize * 2)
stackaddr = M.get(frame_pc_on_stack, padded_text)
addrsz = "(" + stackaddr + ") " + M.get(frame_pc)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use f-string literals here and wherever we can?

@disconnect3d
Copy link
Member

@dmur1 can u show a screenshot before and after?

before: image

after: image

Random thoughts about the display:

  1. I don't like the 0x<zeros> prepended
  2. Maybe we should display <sp register>+0x<OFFSET> instead of whole addresses?

...or maybe this should be configurable, ugh. cc: @gsingh93

@dmur1
Copy link
Contributor Author

dmur1 commented Aug 3, 2023

@dmur1 can u show a screenshot before and after?

before: image
after: image

Random thoughts about the display:

  1. I don't like the 0x<zeros> prepended
  2. Maybe we should display <sp register>+0x<OFFSET> instead of whole addresses?

...or maybe this should be configurable, ugh. cc: @gsingh93

i have to admit i'm also not a fan especially as it contrasts with the other aligned addresses in the other context sections

@dmur1 dmur1 requested a review from disconnect3d August 3, 2023 13:09
@dmur1
Copy link
Contributor Author

dmur1 commented Feb 22, 2024

bumping this - i'm still not sure i like it what do you folks think?

@disconnect3d
Copy link
Member

I think prefixing with additional zeroes is not great. What else do u ask about? :P

@dmur1
Copy link
Contributor Author

dmur1 commented Feb 22, 2024

I think prefixing with additional zeroes is not great. What else do u ask about? :P

so maybe if just padded as much as was needed to keep everything aligned?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants