Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RISCV: remove used of rdW and sign extend in fcvt/fmv #6492

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Sleigh-InSPECtor
Copy link
Contributor

As part of a research project testing the accuracy of the sleigh specifications compared to real hardware, we observed an unexpected behaviour in the fcvt.w.s/d, fcvt.wu.s/d and fmv.x.w instructions. According to section 11.7 of the 20191213 unprivileged specification, the expected behaviour is to sign extend the result to the width of the destination register. In the case of fmv.x.w the expected behaviour is to fill the upper bit with copies of the floating point numbers sign bit. While the current behaviour of these instructions writes no values to the destination register as they are written to the temporary export by rdW instead.

Example Instruction

fcvt.w.s a0,fa0 (fa0 = 0xFFFFFFFF_C0A00000)

Current behaviour
Before: a0 = 0x9DCFE753_DAEC613E, Hardware after: a0 = 0xFFFFFFFF_FFFFFFFB, Ghidra after: 0x9DCFE753_DAEC613E.

After fix
Before: a0 = 0x9DCFE753_DAEC613E, Hardware after: a0 = 0xFFFFFFFF_FFFFFFFB, Ghidra after: 0xFFFFFFFF_FFFFFFFB.

@GhidorahRex
Copy link
Collaborator

What's the difference between fcvt.w.s and fcvt.wu.s? If there's still a TODO there, then it must be doing something incorrect, right?

@GhidorahRex GhidorahRex self-assigned this May 9, 2024
@GhidorahRex GhidorahRex added Type: Bug Something isn't working Feature: Processor/RISC-V Status: Triage Information is being gathered labels May 9, 2024
@Sleigh-InSPECtor
Copy link
Contributor Author

Yes, the fcvt.wu.s instruction still isn't fully correct so I left the todo note. fcvt.w.s converts a single floating point number into a signed 32 bit integer while fcvt.wu.s converts a single floating point number to an unsigned 32 bit integer. We believe the best way to properly handle this case would be to introduce a new pcode operator which handles a float to unsigned int conversion, this would also be beneficial to other architectures.

@GhidorahRex
Copy link
Collaborator

I think trunc(abs(val)) should be the correct way to handle the unsigned case.

@Sleigh-InSPECtor
Copy link
Contributor Author

trunc(abs(val)) wont work as if the value was -5.0 it would first be converted 5.0 with abs and then converted to the integer 5 with trunc. The desired behaviour here is for it to saturate to the lower bounds 0. An if statement could be used to handle when the value is less than 0 but this messes with decompilation. If this was just RISCV then it might be ok to put up with but it is common across other architectures like AARCH with the FCVTZU instruction which has the same saturating behaviour.

@GhidorahRex
Copy link
Collaborator

How does the sign-extension work if a negative result is truncated to 0?

What about:

	local tmp:4 = ((frs1D f<= 0) * 0) + (frs1D f> 0) * trunc(frs1D));
	rd = sext(tmp);

@Sleigh-InSPECtor
Copy link
Contributor Author

Sleigh-InSPECtor commented May 14, 2024

The sign extension is from the line "For XLEN> 32, FCVT.W[U].S sign-extends the 32-bit result to the destination register width" under section 11.7. This is ambiguous but was asked in the issue riscv/riscv-isa-manual#69 and clarified to be propagating bit 31. This also reflects the behaviour of the T-Head TH1520 processor which we tested against.

Using the branches conditions does work but still shows up in decompilation.

Decompilation with incorrect truncation

uint float_maths(float param_1,float param_2)
{
  gp = &__global_pointer$;
  return (int)(param_1 + param_2) * 5;
}

Decompilation with conditional pcode fix

uint float_maths(float param_1,float param_2)
{
  gp = &__global_pointer$;
  return (uint)(0.0 < param_1 + param_2) * (int)(param_1 + param_2) * 5;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • fcvt.w.d looks good
  • fcvt.wu.d likely needs a new pcode operation trunc_unsigned if you are going to trust emulation. If all you need is the decompiler view, then it's good enough as is.

Tested on whisper.cpp exemplar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature: Processor/RISC-V Status: Triage Information is being gathered Type: Bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants