Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion: Feature IFDEFs and unions/bitfields in public API structs #280

Open
flobernd opened this issue Dec 7, 2021 · 5 comments
Open
Labels
A-build Area: Build system C-enhancement Category: Enhancement of existing features

Comments

@flobernd
Copy link
Member

flobernd commented Dec 7, 2021

Our internal structs are using IFDEFs to enable/disable feature specific fields on deman (e.g. AVX-512 related fiels). As well, bitfields are used to optimize struct sizes.

While bindings prevents us to do the same optimizations for the public API structs, we could introduce a new ZYDIS_SIZE_OPTIMIZED_UNSAFE_FOR_BINDINGS CMake option (naming open for discussion as well).

This option would indicate that Zydis is used in a pure C/C++ project and always built on demand (-> we know the feature defines of headers and code files are the same; this is the essential part!). This would allow us to e.g. change the ZydisDecodedOperand struct to union or to introduce bitfields in the public API.

@flobernd flobernd added C-enhancement Category: Enhancement of existing features A-build Area: Build system labels Dec 7, 2021
@athre0z
Copy link
Member

athre0z commented Dec 12, 2021

Hmmm, these compile-time settings are kind of hard to get to work when Zydis is installed from e.g. OS repositories. When building the binaries for Zydis in e.g. Debian or vcpkg repos, we'd have to make a decision. Since we'd also want bindings to work with these binaries, we'd have to disable these optimizations. Since vcpkg seems to become increasingly popular amongst our users, only a fraction of users would thus be able to really make use of this. I'm not opposed to having it as optional switch, but we can't rely on this to shrink our struct in the general case.

With regards to unions, I think we should just use them and make it the binding's problem on how to deal with that. In retrospective, I think restricting our API to the smallest common denominator of all languages that we want bindings to was a mistake. If most languages can support a feature, then we should use it.

Bitfields are still quite problematic, because you can't really support them in bindings even if you're willing to do it all manually, simply because C doesn't provide enough ABI guarantees to ever get that to work. This would thus be one of the candidates for being behind ZYDIS_SIZE_OPTIMIZED_UNSAFE_FOR_BINDINGS (we'd really need a better name of this though :p).

@flobernd
Copy link
Member Author

flobernd commented Dec 13, 2021

That's fine! Was always meant to be an optional switch (default off).

Can we make a list of languages that support unions? Just a few I know they support them:

  • C/C++ (obviously)
  • C# ([FieldOffset()] attribute, C compatible)
  • Delphi (case construct; last field of a struct only)
  • Go (cgo; ugly binary level access using getters)
  • Java (javalution; constraints?)
  • Python (ctypes, C compatible)
  • Rust (constraints?)

@bjorn3
Copy link
Contributor

bjorn3 commented Dec 13, 2021

Rust has native support for unions. (https://doc.rust-lang.org/reference/items/unions.html)

@athre0z
Copy link
Member

athre0z commented Dec 13, 2021

My idea for Rust was to reorder our structs slightly so that the tag (e.g. type field for operands) immediately precedes the union, allowing us to use the really_tagged_unions feature to model them as safe enums. This is more convenient than using Rust's language feature for unions because they require unsafe on every use, so we'd have to wrap them with safe getters. While that wouldn't be a big deal either (and might be required for the union in the raw portion of the instruction struct, since it doesn't really have a dedicated tag), it's nice to have proper enums that you can match on for the operands.

For Python bindings, we use Cython and thus support this with no additional effort (since it's essentially like writing C in a weird Python-esque syntax).

@bjorn3
Copy link
Contributor

bjorn3 commented Dec 13, 2021

My idea for Rust was to reorder our structs slightly so that the tag (e.g. type field for operands) immediately precedes the union, allowing us to use the really_tagged_unions feature to model them as safe enums.

I thought about that feature too, but couldn't find if it had already been implemented yet or not. Just got confirmation that it was already effectively implemented when the rfc got merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-build Area: Build system C-enhancement Category: Enhancement of existing features
Projects
None yet
Development

No branches or pull requests

3 participants