Autodetect asm.cpu whenever possible #3747

Open
XVilka opened this issue Aug 11, 2023 · 9 comments · May be fixed by #4196

@XVilka
Member

XVilka commented Aug 11, 2023

It is common to have an ELF built for the ARM Cortex-M profile, but the profile is not reflected in the ELF header information:

iorw     false
block    0x100
type     EXEC (Executable file)
arch     arm
cpu      N/A
baddr    0x342a0000
binsz    0x00d01f0b
bintype  elf
bits     32
class    ELF32
compiler GCC: 
dbg_file N/A
endian   LE
hdr.csum N/A
guid     N/A
intrp    N/A
laddr    0x00000000
lang     c++
machine  ARM
maxopsz  4
minopsz  2
os       linux
cc       N/A
pcalign  2
rpath    NONE

But the CPU profile can affect analysis drastically. In the case of ARM Cortex-M, for example, the profile adds instructions and, because it is Thumb, changes how the instruction sequence is disassembled.

We should figure out a way to detect Cortex-M ELFs whenever possible. Currently you have to specify the CPU on the command line:

$ rizin -A -e asm.cpu=cortexm firmware.elf

It would be nice to autodetect the cortexm/cortexa profiles whenever possible.

Quite often compilers add a special section, .ARM.attributes, that holds this information (note the Tag_CPU_arch_profile and Tag_CPU_arch attributes):

> readelf -A cortex-a8.out
Attribute Section: aeabi
File Attributes
  Tag_conformance: "2.10"
  Tag_CPU_arch: v7
  Tag_CPU_arch_profile: Application
  Tag_ARM_ISA_use: Yes
  Tag_THUMB_ISA_use: Thumb-2
  Tag_PCS_config: Bare platform
  Tag_ABI_align_needed: 8-byte
  Tag_ABI_align_preserved: 8-byte, except leaf SP
  Tag_ABI_enum_size: small
  Tag_ABI_VFP_args: compatible
  Tag_CPU_unaligned_access: v6
  Tag_DIV_use: Not allowed

> readelf -A cortex-m33.out
Attribute Section: aeabi
File Attributes
  Tag_conformance: "2.10"
  Tag_CPU_arch: v8-M.mainline
  Tag_CPU_arch_profile: Microcontroller
  Tag_THUMB_ISA_use: Yes
  Tag_FP_arch: FPv5/FP-D16 for ARMv8
  Tag_PCS_config: Bare platform
  Tag_ABI_align_needed: 8-byte
  Tag_ABI_align_preserved: 8-byte, except leaf SP
  Tag_ABI_enum_size: forced to int
  Tag_ABI_HardFP_use: SP only
  Tag_ABI_VFP_args: compatible
  Tag_CPU_unaligned_access: v6
  Tag_DIV_use: Not allowed

See https://stackoverflow.com/questions/70071681/how-can-i-know-if-an-elf-file-is-for-cortex-a-or-cortex-m for more information
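For reference, here is a minimal sketch of how the two tags could be read from the raw contents of .ARM.attributes. It is not Rizin code: the ArmAttrs type and all function names are hypothetical, and it assumes a little-endian file, a one-byte subsection tag, and the value-encoding rules from the Arm build attributes ABI addenda (tags 4 and 5 and odd tags above 32 are NUL-terminated strings, tag 32 is a ULEB128 followed by a string, everything else is a ULEB128).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Tag numbers as defined by the Arm build attributes ABI addenda. */
enum {
	TAG_FILE = 1,
	TAG_CPU_RAW_NAME = 4,
	TAG_CPU_NAME = 5,
	TAG_CPU_ARCH = 6,
	TAG_CPU_ARCH_PROFILE = 7,
	TAG_COMPATIBILITY = 32
};

/* Hypothetical result type, not part of Rizin. */
typedef struct {
	bool found;        /* an "aeabi" File subsection was seen */
	uint32_t cpu_arch; /* raw Tag_CPU_arch value */
	char profile;      /* 'A', 'R', 'M', 'S', or 0 if absent */
} ArmAttrs;

/* Read a little-endian 32-bit value at buf[pos], bounded by end. */
static bool read_le32(const uint8_t *buf, size_t end, size_t pos, uint32_t *out) {
	if (pos > end || end - pos < 4) {
		return false;
	}
	*out = (uint32_t)buf[pos] | (uint32_t)buf[pos + 1] << 8 |
		(uint32_t)buf[pos + 2] << 16 | (uint32_t)buf[pos + 3] << 24;
	return true;
}

/* Decode one ULEB128 value and advance *pos past it. */
static bool read_uleb128(const uint8_t *buf, size_t end, size_t *pos, uint32_t *out) {
	uint32_t value = 0;
	unsigned shift = 0;
	while (*pos < end) {
		uint8_t byte = buf[(*pos)++];
		if (shift < 32) {
			value |= (uint32_t)(byte & 0x7f) << shift;
		}
		shift += 7;
		if (!(byte & 0x80)) {
			*out = value;
			return true;
		}
	}
	return false; /* ran off the end of the buffer */
}

/* Advance *pos past one NUL-terminated string. */
static bool skip_ntbs(const uint8_t *buf, size_t end, size_t *pos) {
	const uint8_t *nul = memchr(buf + *pos, 0, end - *pos);
	if (!nul) {
		return false;
	}
	*pos = (size_t)(nul - buf) + 1;
	return true;
}

/* Walk the attributes of one File subsection stored in buf[pos, end). */
static void parse_file_attrs(const uint8_t *buf, size_t pos, size_t end, ArmAttrs *out) {
	uint32_t tag, value;
	while (pos < end && read_uleb128(buf, end, &pos, &tag)) {
		bool is_string = tag == TAG_CPU_RAW_NAME || tag == TAG_CPU_NAME ||
			(tag > 32 && (tag & 1));
		if (is_string) {
			if (!skip_ntbs(buf, end, &pos)) return;
			continue;
		}
		if (!read_uleb128(buf, end, &pos, &value)) return;
		if (tag == TAG_COMPATIBILITY && !skip_ntbs(buf, end, &pos)) return;
		if (tag == TAG_CPU_ARCH) out->cpu_arch = value;
		else if (tag == TAG_CPU_ARCH_PROFILE) out->profile = (char)value;
	}
}

/* Parse the raw bytes of an .ARM.attributes section (little-endian assumed). */
ArmAttrs parse_arm_attributes(const uint8_t *buf, size_t len) {
	ArmAttrs out = { false, 0, 0 };
	assert(buf != NULL);
	if (len < 1 || buf[0] != 'A') return out; /* format version byte */
	size_t pos = 1;
	uint32_t sec_len;
	while (read_le32(buf, len, pos, &sec_len) && sec_len >= 4 && sec_len <= len - pos) {
		size_t sec_end = pos + sec_len;
		size_t p = pos + 4;
		pos = sec_end;
		if (sec_end - p < 6 || memcmp(buf + p, "aeabi", 6) != 0) continue; /* other vendor */
		p += 6;
		uint32_t size;
		while (read_le32(buf, sec_end, p + 1, &size) && size >= 5 && size <= sec_end - p) {
			if (buf[p] == TAG_FILE) {
				parse_file_attrs(buf, p + 5, p + size, &out);
				out.found = true;
			}
			p += size;
		}
	}
	return out;
}
```

For the cortex-m33 output above, such a parser would be expected to report profile 'M' and a Tag_CPU_arch value of 17 (v8-M.mainline). A real implementation would also need to honour the ELF byte order.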

This should probably be changed somewhere in librz/bin/format/elf/.

See the file librz/bin/format/elf/elf_info.c and the get_cpu_mips() function as an example.

@XVilka XVilka added the good first issue Good for newcomers label Jan 2, 2024
@valdaarhun
Contributor

Hi. I would like to work on this issue. I think I have got an idea on how to resolve this.

@valdaarhun valdaarhun linked a pull request Feb 7, 2024 that will close this issue
@valdaarhun
Contributor

> Quite often compilers add a special section .ARM.attributes that has that information (note the Tag_CPU_arch_profile and Tag_CPU_arch attributes)

Hi. Just to be clear, is our intention to simply recognize the CPU profile (e.g. A, M, R, etc.) or the specific processor family (e.g. Cortex, Neoverse, etc.) that the ELF is expected to run on?

Based on what I have understood from reading ARM's addenda to their ABI and the Wikipedia page listing ARM processors, it's quite clear that the "M" profile implies the Cortex-M processor family or a similar family (like SecurCore) that shares the same features.

However, the "A" CPU profile could imply either the Cortex-A family or the Neoverse family.

I noticed the following struct in librz/asm/p/asm_arm_cs.c:

RzAsmPlugin rz_asm_plugin_arm_cs = {
	.name = "arm",
	.desc = "Capstone ARM disassembler",
	.cpus = "v8,cortexm,arm1176,cortexA72,cortexA8",
	.platforms = "bcm2835,omap3430",
	.features = "v8",
	.license = "BSD",
	.arch = "arm",
	.bits = 16 | 32 | 64,
	.endian = RZ_SYS_ENDIAN_LITTLE | RZ_SYS_ENDIAN_BIG,
	.disassemble = &disassemble,
        ...
}

The cpus field is hard-coded to a specific processor (e.g. cortexA8) or a family (e.g. cortexm). How do I go about dealing with other families such as Neoverse?

@XVilka
Member Author

XVilka commented Feb 23, 2024

@valdaarhun for now, detecting the profile is enough. However, since Rizin's ARM decoding is based on Capstone, only these features make sense for autodetection (https://github.com/capstone-engine/capstone/blob/next/include/capstone/arm.h#L1638):

// Architecture-specific groups
	// generated content <ARMGenCSFeatureEnum.inc> begin
	// clang-format off

	ARM_FEATURE_IsARM = 128,
	ARM_FEATURE_HasV5T,
	ARM_FEATURE_HasV4T,
	ARM_FEATURE_HasVFP2,
	ARM_FEATURE_HasV5TE,
	ARM_FEATURE_HasV6T2,
	ARM_FEATURE_HasMVEInt,
	ARM_FEATURE_HasNEON,
	ARM_FEATURE_HasFPRegs64,
	ARM_FEATURE_HasFPRegs,
	ARM_FEATURE_IsThumb2,
	ARM_FEATURE_HasV8_1MMainline,
	ARM_FEATURE_HasLOB,
	ARM_FEATURE_IsThumb,
	ARM_FEATURE_HasV8MBaseline,
	ARM_FEATURE_Has8MSecExt,
	ARM_FEATURE_HasV8,
	ARM_FEATURE_HasAES,
	ARM_FEATURE_HasBF16,
	ARM_FEATURE_HasCDE,
	ARM_FEATURE_PreV8,
	ARM_FEATURE_HasV6K,
	ARM_FEATURE_HasCRC,
	ARM_FEATURE_HasV7,
	ARM_FEATURE_HasDB,
	ARM_FEATURE_HasVirtualization,
	ARM_FEATURE_HasVFP3,
	ARM_FEATURE_HasDPVFP,
	ARM_FEATURE_HasFullFP16,
	ARM_FEATURE_HasV6,
	ARM_FEATURE_HasAcquireRelease,
	ARM_FEATURE_HasV7Clrex,
	ARM_FEATURE_HasMVEFloat,
	ARM_FEATURE_HasFPRegsV8_1M,
	ARM_FEATURE_HasMP,
	ARM_FEATURE_HasSB,
	ARM_FEATURE_HasDivideInARM,
	ARM_FEATURE_HasV8_1a,
	ARM_FEATURE_HasSHA2,
	ARM_FEATURE_HasTrustZone,
	ARM_FEATURE_UseNaClTrap,
	ARM_FEATURE_HasV8_4a,
	ARM_FEATURE_HasV8_3a,
	ARM_FEATURE_HasFPARMv8,
	ARM_FEATURE_HasFP16,
	ARM_FEATURE_HasVFP4,
	ARM_FEATURE_HasFP16FML,
	ARM_FEATURE_HasFPRegs16,
	ARM_FEATURE_HasV8MMainline,
	ARM_FEATURE_HasDotProd,
	ARM_FEATURE_HasMatMulInt8,
	ARM_FEATURE_IsMClass,
	ARM_FEATURE_HasPACBTI,
	ARM_FEATURE_IsNotMClass,
	ARM_FEATURE_HasDSP,
	ARM_FEATURE_HasDivideInThumb,
	ARM_FEATURE_HasV6M,

Since Rizin doesn't have a way to select individual features, only CPUs with predefined sets of features can be selected for now.

cc @Rot127

@XVilka
Member Author

XVilka commented Feb 23, 2024

@valdaarhun if you check the disassemble() function in librz/asm/p/asm_arm_cs, you will see that only CS_MODE_MCLASS and CS_MODE_V8 are used. Thus, it's fine to detect just those for now.
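
A rough sketch of what "detect just those" could look like, given the profile and architecture values read from .ARM.attributes. The helper name is hypothetical and not an existing Rizin function. Treating only Tag_CPU_arch value 14 (v8-A in the Arm ABI addenda) as "v8" is an assumption made here for illustration, and later v8.x values may need the same treatment.

```c
#include <stddef.h>
#include <stdint.h>

/* Tag_CPU_arch value for v8-A, per the Arm ABI addenda. */
enum { CPU_ARCH_V8A = 14 };

/*
 * Hypothetical helper: pick an asm.cpu name from the build attributes.
 * Returns NULL when neither class applies, so the caller can keep the default.
 */
const char *arm_attrs_to_asm_cpu(char profile, uint32_t cpu_arch) {
	if (profile == 'M') {
		return "cortexm"; /* expected to select CS_MODE_MCLASS */
	}
	if (cpu_arch == CPU_ARCH_V8A) {
		return "v8"; /* expected to select CS_MODE_V8 */
	}
	return NULL;
}
```

With the two readelf outputs from the issue description, this would yield "cortexm" for cortex-m33.out and NULL for cortex-a8.out (v7, Application profile), leaving the latter on the default settings.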

@valdaarhun
Contributor

I see. In that case, I'll just focus on these two classes.

@valdaarhun
Contributor

Hi. The functions get_cpu_mips and get_cpu_arm in librz/bin/format/elf/elf_info.c simply print the CPU name. How do I get Rizin to actually make sense of it before disassembly?

In librz/arch/p/asm_arm_cs:disassemble(), it checks the value of a->cpu. I am guessing it needs to figure out a way to set a->cpu to "cortexm" or "v8". But where is this actually set?

When rizin is run with -e asm.cpu=cortexm, it calls rz_config_eval(). I think this sets the value in r->config. Should I use the same/similar approach in get_cpu_arm()?

@XVilka
Member Author

XVilka commented Mar 14, 2024

Hmm, I thought this value was used somewhere, my bad. OK, you need to pass it to the config somehow, yes. It should probably be done somewhere in librz/core/cbin.c.

@valdaarhun
Contributor

Thank you for your response. I'll take a look at cbin.c.

@Rot127
Member

Rot127 commented Mar 15, 2024

@valdaarhun Sorry, I missed the mention above from @XVilka. It's fine if, for now, it can only check for armv8 or the M-profile. However, please ensure it is easily extensible, so that when we add toggles for all the other CPU features (e.g. see the list above), it takes only minimal effort.
In the best case, implement your solution only for armv8 first and add the cortex-m toggle afterwards. That way you can check whether it is actually easy to add a feature.
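
One way to keep this extensible is a table that maps each Tag_CPU_arch value to a set of feature toggles, so that supporting another architecture means adding one row. The sketch below is only an illustration: the FEAT_* flags, the ArchRule type and the function name are hypothetical, the flags are loosely named after Capstone's ARM_FEATURE_* groups, and which features each architecture implies is an assumption that would need checking against Capstone.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature toggles, loosely named after Capstone's ARM_FEATURE_* groups. */
enum {
	FEAT_MCLASS = 1u << 0,
	FEAT_V6M = 1u << 1,
	FEAT_V7 = 1u << 2,
	FEAT_V8 = 1u << 3,
	FEAT_V8M_MAINLINE = 1u << 4
};

typedef struct {
	uint32_t cpu_arch; /* Tag_CPU_arch value from the Arm ABI addenda */
	uint32_t features; /* feature toggles assumed for that architecture */
} ArchRule;

/* Adding support for another architecture is meant to be one extra row. */
static const ArchRule arch_rules[] = {
	{ 10, FEAT_V7 },                         /* v7 */
	{ 11, FEAT_MCLASS | FEAT_V6M },          /* v6-M */
	{ 13, FEAT_MCLASS | FEAT_V7 },           /* v7E-M */
	{ 14, FEAT_V8 },                         /* v8-A */
	{ 17, FEAT_MCLASS | FEAT_V8M_MAINLINE }, /* v8-M.mainline */
};

/* Look up the feature set for a Tag_CPU_arch value; 0 means "unknown". */
uint32_t features_for_cpu_arch(uint32_t cpu_arch) {
	for (size_t i = 0; i < sizeof(arch_rules) / sizeof(arch_rules[0]); i++) {
		if (arch_rules[i].cpu_arch == cpu_arch) {
			return arch_rules[i].features;
		}
	}
	return 0;
}
```

Starting with only the v8-A row and then adding the M-class rows would be a cheap way to check that a new feature really costs a single line.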
