This repository has been archived by the owner on Sep 22, 2022. It is now read-only.

ARM64 AES acceleration? #43

Open
snej opened this issue Feb 26, 2021 · 4 comments

@snej

snej commented Feb 26, 2021

Do you plan to add support for ARM64 AES instructions, as you did for x86?

@erthink
Owner

erthink commented Feb 27, 2021

Ideally, this would be nice, but I'm not going to do it yet:

  1. The AESNI-accelerated variants of t1ha0() ("Just Only Faster", but not portable/stable) are designed specifically for x86, and even there they comprise two significantly different implementations for different CPU families. Therefore, to get appropriate performance on ARM64, I would need to create a new implementation rather than try to port one of the x86 ones.
  2. It is quite difficult to design an implementation of a pretty fast hardware-accelerated hash function for ARM64, since:
    • ARMv8.x has a lot of optional features that are useful to a t1ha0() implementation (crc, crypto, simd, sve, sve2, aes, sha2, sha3, sm3/sm4, sve2+sm4/aes/sha3), but they may perform differently depending on the CPU model.
      Moreover, a particular implementation using AES or SHA2 acceleration may be significantly faster than the portable t1ha2() on one CPU model, but significantly slower on another;
    • ARMv8.x has no common/generalized method for determining the availability of optional features at compile time and/or at runtime. On the contrary, these features have to be enabled by compiler-dependent command-line options, detected via compiler-defined macros and (in some cases) re-checked for availability at runtime with a probe and a SIGILL handler;
    • Therefore, for a good result, I would have to develop a set of functions for different (most popular and/or promising) ARM64 families, taking into account their capabilities, while having access to the corresponding set of hardware (i.e. a reasonable subset of cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, ares, exynos-m1, falkor, neoverse-n1, neoverse-v1, neoverse-e1, qdf24xx, saphira, thunderx, vulcan, xgene1, xgene2, M1, etc).
  3. This is a big and interesting job. But I don't have any projects (or any other work) related to ARM64 right now. So I don't have the time or other resources to do this.
    Moreover, for now I don't have a "vision" of the ARM64 market that would let me choose an optimal/reasonable set of ARM64 families/vendors/features as baseline targets.
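
For readers unfamiliar with the "probe and SIGILL handler" technique mentioned in point 2, here is a minimal sketch in C, assuming a POSIX system and a single-threaded caller. It is not code from t1ha. The names `probe_runs_without_sigill`, `stand_in_supported` and `stand_in_unsupported` are hypothetical, and the two stand-in functions only simulate a supported and an unsupported instruction (the latter by raising SIGILL itself). A real probe would execute one candidate instruction, such as an AES round, inside the probed function.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Jump target used to escape from the SIGILL handler.
 * A single static buffer: this sketch is not thread-safe. */
static sigjmp_buf probe_env;

static void on_sigill(int signo) {
  (void)signo;
  siglongjmp(probe_env, 1); /* abandon the faulting code path */
}

/* Runs try_it() with a temporary SIGILL handler installed.
 * Returns true if try_it() completed, false if it raised SIGILL
 * (or if the handler could not be installed). */
static bool probe_runs_without_sigill(void (*try_it)(void)) {
  struct sigaction sa, old;
  sa.sa_handler = on_sigill;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = 0;
  if (sigaction(SIGILL, &sa, &old) != 0)
    return false;

  volatile bool ok = false; /* volatile: modified between sigsetjmp and siglongjmp */
  if (sigsetjmp(probe_env, 1) == 0) { /* 1: also save and restore the signal mask */
    try_it();
    ok = true;
  }

  sigaction(SIGILL, &old, NULL); /* restore the previous handler */
  return ok;
}

/* Hypothetical stand-ins for illustration only. */
static void stand_in_supported(void) { /* pretend the instruction executed */ }
static void stand_in_unsupported(void) { raise(SIGILL); /* simulate an illegal instruction */ }
```

Calling `probe_runs_without_sigill(stand_in_supported)` returns true and `probe_runs_without_sigill(stand_in_unsupported)` returns false. Such a probe only tells whether an instruction exists on the current CPU, not whether using it is faster, which is the other half of the problem described above.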

@erthink
Owner

erthink commented Feb 27, 2021

Related to #42

@erthink
Owner

erthink commented Feb 27, 2021

@snej, please do not close this issue; it will be more useful to leave it open as a FAQ.

@snej
Author

snej commented Mar 2, 2021

Thanks for the explanation!

I'm looking at this primarily for iOS and Android, and those are probably the biggest use cases overall. (In the embedded world, ARM CPUs are almost always 32-bit Cortex.)

I am guessing that all Apple ARM CPUs have AES instructions since iOS relies heavily on file encryption, so iOS (and ARM Mac) support wouldn't require too many #ifdefs. Android of course is another story.
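
To make the #ifdef question concrete, here is a rough sketch in C of how such a selection might look. It is not part of t1ha, and the names `aes_compiled_in`, `cpu_reports_aes`, `choose_impl` and the `hash_impl` enum are hypothetical. It assumes the ACLE macros `__ARM_FEATURE_AES` / `__ARM_FEATURE_CRYPTO` for the compile-time side and, on Linux and Android, `getauxval(AT_HWCAP)` with `HWCAP_AES` for the runtime side. The Apple branch simply hard-codes the guess above that every Apple ARM64 CPU has AES, which is an unverified assumption.

```c
#include <stdbool.h>

#if defined(__aarch64__) && defined(__linux__) /* also true on Android */
#include <sys/auxv.h>  /* getauxval, AT_HWCAP */
#include <asm/hwcap.h> /* HWCAP_AES */
#endif

typedef enum { IMPL_PORTABLE, IMPL_ARM64_AES } hash_impl;

/* Compile-time side: was the compiler allowed to emit ARM64 AES instructions
 * (e.g. via a compiler-dependent option that enables the AES extension)? */
static bool aes_compiled_in(void) {
#if defined(__aarch64__) && \
    (defined(__ARM_FEATURE_AES) || defined(__ARM_FEATURE_CRYPTO))
  return true;
#else
  return false;
#endif
}

/* Runtime side: does the running CPU report AES support? */
static bool cpu_reports_aes(void) {
#if defined(__aarch64__) && defined(__linux__)
  return (getauxval(AT_HWCAP) & HWCAP_AES) != 0;
#elif defined(__aarch64__) && defined(__APPLE__)
  return true; /* ASSUMPTION from this thread, not verified here */
#else
  return false; /* unknown platform: stay on the portable path */
#endif
}

/* Use the accelerated variant only when both sides agree. */
static hash_impl choose_impl(bool compiled_in, bool cpu_has_aes) {
  return (compiled_in && cpu_has_aes) ? IMPL_ARM64_AES : IMPL_PORTABLE;
}
```

A caller would evaluate `choose_impl(aes_compiled_in(), cpu_reports_aes())` once at startup. Note that this only answers whether AES instructions are usable, not whether they beat the portable code on a given core.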
