Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve internal register API #699

Open
eteran opened this issue Mar 16, 2019 · 6 comments
Open

Improve internal register API #699

eteran opened this issue Mar 16, 2019 · 6 comments

Comments

@eteran
Copy link
Owner

eteran commented Mar 16, 2019

A while ago I checked in the file RegisterRef with the goal of improving the Register API to make it more robust and involve less copying.

Essentially it is kinda analogous to what string_view is to strings. It is a non-owning view of part of the PlatformState object.

The idea is that we can unify all of the "get_xxx_register" functions to always return a RegisterRef regardless of the register type, since it will simply point to the right bytes in the State object and know how many bytes are part of that register.

The example I have checked in needs some work as it is clunky and requires a state object. I think a pointer can be even better and avoid this requirement. I think we don't even really have to worry about alignment issues since the underlying structures that we'd be pointing to would require that the data members be aligned.


Want to back this issue? Post a bounty on it! We accept bounties via Bountysource.

@eteran
Copy link
Owner Author

eteran commented Dec 5, 2019

@10110111 I've started to more seriously prototype this out in a branch and sometime soon, I'd love some feedback.

Basically, our State API isn't amazing for supporting multiple arches. And it requires that the Register object basically be like a variant in that it is as big as our biggest register (which are getting pretty big with the ZMM stuff and beyond!).

The idea is that a RegisterRef would basically be a name, pointer, and a length and like a string_view or span would not own the bytes, but simply be a view into the state.

This should allow us to vastly simplify our IState/PlatformState code in the long run because right now, we have a lot of inconsistent ways of reading registers. Sometimes they return reg_t, sometimes they return address_t, sometimes they return Register, others value80... it's a mess.

I'd like to have a single API where we can just say:

auto reg1 = state.getRegister(edb::string_hash("fpu"), St0);
auto reg2 = state.getRegister(edb::string_hash("gp"), Eax);

etc..

The main benefit being that the state API becomes entirely arch independent.

Thoughts?

@10110111
Copy link
Contributor

10110111 commented Dec 5, 2019

You'd still need the whole Register when you want to modify it, wouldn't you?

@eteran
Copy link
Owner Author

eteran commented Dec 5, 2019

Not necessarily. A write operation would allow for overloading (unlike a read operation since we can't overload on return type). So we could have an API that looks like this:

void setRegisterValue(const RegisterRef &reg, const edb::value32 value);
void setRegisterValue(const RegisterRef &reg, const edb::value64 value);
// etc...

This would allow the underlying PlatformState to do any sanity checks it needs to before modify the actual bytes in the state.

@eteran
Copy link
Owner Author

eteran commented Dec 5, 2019

Another option which I like less... is to instead just use inheritance with polymorphism. Have the state return a heap-allocated object which has an opaque platform-agnostic API, but is an instance of a platform-specific register class.

@10110111
Copy link
Contributor

10110111 commented Dec 5, 2019

void setRegisterValue(const RegisterRef &reg, const edb::value32 value);
void setRegisterValue(const RegisterRef &reg, const edb::value64 value);

This variant sounds good.

Have the state return a heap-allocated object which has an opaque platform-agnostic API, but is an instance of a platform-specific register class.

Isn't this mostly the same as returning Register, just with additional overhead of heap allocation and indirections?

In any case, I don't have more thoughts currently. I need to see the changes you're going to do as concrete code.

@eteran
Copy link
Owner Author

eteran commented Dec 5, 2019

Yea, for writing values, the path seems pretty easy, it's the reading values without so much inefficiency that's hard.

I suppose... We could have the getRegisterValue take a template parameter saying "I expect this register to have the type I've specified, error if it isn't".

Less user-friendly, but would be optimally efficient (as far as we continue to copy data around goes). The "Ref" version would be copy-less for reads.... but we'd want to copy when the user asks for a concrete editable value, so maybe that's not a huge win overall?

I'll play with it some more and see if I get to a point where it makes sense to push a branch to look at.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants