Decompiler erroneously swapping operators without changing the operand in floating-point comparisons where data is marked constant (x86) #6528

K1TCH · 2024-05-16T23:39:58Z

Describe the bug
When a float variable is being compared with a double variable using one of the FCOM instructions, if the double is made constant, the comparator will flip directions, rendering the comparison incorrect.

For example:

if (x < const_1.5) {
    return 0;
}

will become

if (1.5 < x) {
    return 0;
}

To Reproduce
Steps to reproduce the behavior:

Load the attached binary (I used x86:LE:32:default)
Make the value at address 0x0 a double
Make the value at address 0x8 a float
Observe the decompiler
Right-click the double in the listing view
Select Data -> Settings
Click 'normal' in the Mutability settings
Click constant
Click OK
Observe the decompiler again. The operands should be flipped, while the operator stays the same.

Expected behavior
I expected the constant to be placed last in the comparison, flipping the operands and operators as necessary to make the statement still be correct.

Screenshots
Original:

After marking the double value as constant:

Attachments
fcompError

Environment (please complete the following information):

OS: Ubuntu 23.04
Java Version: openjdk 17.0.8 2023-07-18
Ghidra Version: 10.2.3
Ghidra Origin: Official Github release

Additional context
I've also seen the bug on Windows 10 with versions 10.3.2, 10.4, 11.0, and 11.0.2

The text was updated successfully, but these errors were encountered:

The decompiler rule `subflow_convert` would sometimes swap the inputs to the P-Code ops `FLOAT_LESS` and `FLOAT_LESSEQUAL` if the float that was traced happened to be the second input of the operation, because the transformed operation had its inputs hardcoded: the traced float would always be the first input. While this also affected `FLOAT_EQUAL` and `FLOAT_NOTEQUAL`, it does not matter in those cases, because swapping the inputs for those operations is still logically equivalent. Fixes NationalSecurityAgency#6528.

LukeSerne · 2024-05-19T21:08:57Z

Reproducing on Ghidra 11.0.3

I loaded the attached file as x86:LE:32:default:gcc on Ghidra 11.0.3 (Github release) on Ubuntu 22.04.

Here's the decompilation at step 4 (the double is not constant):

Here, the decompiled condition for the case in which 0xffffffff is returned is in fact equivalent to (Y > X) && !NAN(Y) && !NAN(X), as the following derivation shows, with comments between parentheses. It seems the decompiler is not able to simplify this further.

    (byte)(DOUBLE_00000000 < (double)FLOAT_00000008 | (byte)((ushort)((ushort)(NAN(DOUBLE_00000000) || NAN((double)FLOAT_00000008)) << 10) >> 8) | (byte)((ushort)((ushort)(DOUBLE_00000000 == (double)FLOAT_00000008) << 0xe) >> 8)) == 0
 = { '(byte)x' == 'x & 0xFF'  and  let 'X' := '(double)FLOAT_00000008'  and  let 'Y' := 'DOUBLE_00000000' }
    (Y < X | ((ushort)((ushort)(NAN(Y) || NAN(X)) << 10) >> 8) & 0xFF | ((ushort)((ushort)(Y == X) << 14) >> 8) & 0xFF) & 0xFF == 0
 = { 'a | b | c == 0' => '(a == 0) && (b == 0) && (c == 0)' }
    (Y < X) == 0 && (((ushort)((ushort)(NAN(Y) || NAN(X)) << 10) >> 8) & 0xFF == 0) && (((ushort)((ushort)(Y == X) << 14) >> 8) & 0xFF == 0)
 = { ('(a < b) == 0' => '! (a < b)')  and  ('((a << b) >> c)' == 'a << (b - c)' if c < b)  and  removing of unnecessary casts }
    !(Y < X) && (((NAN(Y) || NAN(X)) << 2) == 0) && (((Y == X) << 6) == 0)
 = { 'a << b == 0' => 'a == 0'  and  '(a == b) == 0' => '! (a == b)' }
    !(Y < X) && ((NAN(Y) || NAN(X)) == 0) && !(Y == X)
 = { 'a && b && c' => 'a && c && b' }
    !(Y < X) && !(Y == X) && ((NAN(Y) || NAN(X)) == 0)
 = { ('! (a < b)' => 'a >= b')  and  ('!(a == b)' => 'a != b') }
    (Y >= X) && (Y != X) && ((NAN(Y) || NAN(X)) == 0)
 = { '(a >= b) && (a != b)' => 'a > b' }
    (Y > X) && ((NAN(Y) || NAN(X)) == 0)
 = { '(a || b) == 0' => '!a && !b' }
    (Y > X) && !NAN(Y) && !NAN(X)

And here's the decompilation at step 10 (the double is constant):

Here, the decompiled condition for the case in which 0xffffffff is returned is in fact equivalent to (X > 1.5) && !NAN(X), as the following derivation shows, with comments between parentheses. It seems the decompiler is not able to simplify this further.

	(byte)(FLOAT_00000008 < 1.5 | (byte)((ushort)((ushort)NAN(FLOAT_00000008) << 10) >> 8) | (byte)((ushort)((ushort)(FLOAT_00000008 == 1.5) << 0xe) >> 8)) == 0
 = { '(byte)x' == 'x & 0xFF'  and  let 'X' := 'FLOAT_00000008' }
	(X < 1.5 | ((ushort)(NAN(X) << 10) >> 8) & 0xFF | ((ushort)((X == 1.5) << 14) >> 8) & 0xFF) & 0xFF == 0
 = { 'a | b | c == 0' => '(a == 0) && (b == 0) && (c == 0)' } 
	((X < 1.5) == 0) && (((ushort)(NAN(X) << 10) >> 8) & 0xFF == 0) && (((ushort)((X == 1.5) << 14) >> 8) & 0xFF == 0)
 = { ('(a < b) == 0' => '! (a < b)')  and  ('((a << b) >> c)' == 'a << (b - c)' if c < b)  and  removing of unnecessary casts }
	!(X < 1.5) && (NAN(X) << 2 == 0) && ((X == 1.5) << 6 == 0)
 = { 'a << b == 0' => 'a == 0' }
	!(X < 1.5) && !NAN(X) && !(X == 1.5)
 = { 'a && b && c' => 'a && c && b' }
	!(X < 1.5) && !(X == 1.5) && !NAN(X)
 = { ('! (a < b)' => 'a >= b')  and  ('!(a == b)' => 'a != b') }
	(X >= 1.5) && (X != 1.5) && !NAN(X)
 = { '(a >= b) && (a != b)' => 'a > b' }
	(X > 1.5) && !NAN(X)

So, if the DOUBLE_00000000 is not set as constant (and ignoring the possibility that either float might be NaN), the condition is equivalent to (DOUBLE_00000000 > FLOAT_00000008). If the DOUBLE_00000000 is set as constant, the condition is equivalent to (FLOAT_00000008 > DOUBLE_00000000). So, the bug still exists in Ghidra version 11.0.3.

I also created two xml files for debugging the decompilation, one with the double's mutability set to normal, and one with the mutability set to constant:

fcomp_error_normal.xml
fcomp_error_constant.xml

Tracking down the responsible decompiler rule

Loading the "constant" file in DecompVis, we find that the 49th rule application (of rule subflow_convert) flips the order of the operands to the FLOAT_LESS (incorrectly shown as INT_LESS in the below screenshots because #5063 hasn't been merged yet).

Before the application of rule subflow_convert:

After the application of rule subflow_convert:

Finding and fixing the bug in `subflow_convert`

The subflow_convert rule is implemented by the class RuleSubfloatConvert, which uses the class SubfloatFlow to do the analysis and data flow transformation. The documentation of this rule mentions:

/// \brief Class for tracing changes of precision in floating point variables
///
/// It follows the flow of a logical lower precision value stored in higher precision locations
/// and then rewrites the data-flow in terms of the lower precision, eliminating the
/// precision conversions.

Looking through the implementation, we find this suspicious piece of code in traceForward (code from latest master at the time of writing):

ghidra/Ghidra/Features/Decompiler/src/decompile/cpp/subflow.cc

Lines 2640 to 2660 in 8b5f6da

 case CPUI_FLOAT_EQUAL: 

 case CPUI_FLOAT_NOTEQUAL: 

 case CPUI_FLOAT_LESS: 

 case CPUI_FLOAT_LESSEQUAL: 

 { 

 int4 slot = op->getSlot(vn); 

 TransformVar *rvn2 = setReplacement(op->getIn(1-slot)); 

 if (rvn2 == (TransformVar *)0) return false; 

 if (rvn == rvn2) { 

 list<PcodeOp *>::const_iterator ourIter = iter; 

 --ourIter; // Back up one to our original iterator 

 slot = op->getRepeatSlot(vn, slot, ourIter); 

 } 

 if (preexistingGuard(slot, rvn2)) { 

 TransformOp *rop = newPreexistingOp(2, op->code(), op); 

 opSetInput(rop, rvn, 0); 

 opSetInput(rop, rvn2, 1); 

 terminatorCount += 1; 

 } 

 break; 

 }

On lines 2655 and 2656, a transform operation is created, and rvn and rvn2 are set to use slot 0 and 1 respectively, regardless of what their input slot is in the current operation. Changing this locally to use slot and 1-slot and then recompiling the decompiler, we find that this indeed fixes the issue - the 1.5 constant is on the left of the < operator:

$ ./decomp_dbg
[decomp]> restore /tmp/fcomp_error_constant.xml.txt
/tmp/fcomp_error_constant.xml.txt successfully loaded: Intel/AMD 32-bit x86
[decomp]> lo fu FUN_0000000c
Function FUN_0000000c: 0x0000000c
[decomp]> decompile
Decompiling FUN_0000000c
Decompilation complete
[decomp]> print C

undefined4 FUN_0000000c(void)

{
  undefined4 uVar1;
  
  if ((byte)(1.5 < FLOAT_00000008 | (byte)((ushort)((ushort)NAN(FLOAT_00000008) << 10) >> 8) |
            (byte)((ushort)((ushort)(FLOAT_00000008 == 1.5) << 0xe) >> 8)) == 0) {
    uVar1 = 0xffffffff;
  }
  else {
    uVar1 = 1;
  }
  return uVar1;
}

I created a PR with this change (#6540) that fixes this issue.

LukeSerne linked a pull request May 19, 2024 that will close this issue

Decompiler: Fix inputs sometimes being swapped for operations FLOAT_LESS and FLOAT_LESSEQUAL in rule subflow_convert #6540

Open

ryanmkurtz assigned caheckman May 21, 2024

ryanmkurtz added Feature: Decompiler Status: Triage Information is being gathered labels May 21, 2024

LukeSerne mentioned this issue May 25, 2024

Decompiler: RuleIgnoreNan is too aggressive when removing NaN branches #6580

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Decompiler erroneously swapping operators without changing the operand in floating-point comparisons where data is marked constant (x86) #6528

Decompiler erroneously swapping operators without changing the operand in floating-point comparisons where data is marked constant (x86) #6528

K1TCH commented May 16, 2024

LukeSerne commented May 19, 2024

Decompiler erroneously swapping operators without changing the operand in floating-point comparisons where data is marked constant (x86) #6528

Decompiler erroneously swapping operators without changing the operand in floating-point comparisons where data is marked constant (x86) #6528

Comments

K1TCH commented May 16, 2024

LukeSerne commented May 19, 2024

Reproducing on Ghidra 11.0.3

Tracking down the responsible decompiler rule

Finding and fixing the bug in subflow_convert

Finding and fixing the bug in `subflow_convert`