Chapter 17: Floating-Point — LLVM Validation

LLVM version tested: 23.0.0git (build at ~/llvm-project/build/bin)

Already handled correctly

| Pattern | IR | LLVM output | Notes |
|---|---|---|---|
| Float sign test (`f < 0`) | `bitcast; icmp slt 0` | `movmskps + andl $1` | 2 insns, optimal |
| Float NaN test via `fcmp uno` | `fcmp uno %f, %f` | `ucomiss %xmm0,%xmm0; setp` | 2 insns, optimal |
| Float-to-int via `fptosi` | `fptosi float to i32` | `cvttss2si` | 1 insn, optimal |
| §17-3 sign-magnitude precondition | `ashr $31; lshr $1; xor; sub` | 7 insns | Sequential data-dependent chain; no better alternative |
| §17-4 fast inv sqrt (integer step) | `bitcast; lshr $1; sub $magic` | 5 insns | Faithful; should not fold to `rsqrtss` (different precision) |
| §17-4 fast inv sqrt + Newton | integer step + `fmul/fadd` Newton | 11 insns | Correct: LLVM does not (and should not) fold to `rsqrtss` without `-ffast-math` |
| Float comparison via integer (§17-3) | precondition both operands + `icmp slt` | 10 insns (vectorised) | LLVM correctly does NOT fold to `ucomiss`, because NaN semantics differ |
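
The first row's integer form reads only the sign bit, so it is worth seeing where it agrees with a floating-point `f < 0`. The following is a minimal sketch, not code from the test files; the helper names `sign_bit_set` and `less_than_zero` are hypothetical, and `memcpy` stands in for the `bitcast`.

```c
#include <stdint.h>
#include <string.h>

/* Integer form: signed compare of the bit pattern against 0,
   mirroring `bitcast; icmp slt 0`. This is a sign-bit test. */
static int sign_bit_set(float f) {
    int32_t bits;
    memcpy(&bits, &f, sizeof bits);  /* bit-exact copy, no aliasing issues */
    return bits < 0;
}

/* Floating-point form: an ordered less-than against zero. */
static int less_than_zero(float f) {
    return f < 0.0f;
}
```

The two agree for ordinary negative and positive values. They differ for `-0.0f` (sign bit set, but not less than zero) and for NaNs whose sign bit is set, so the integer form is best read as "sign bit set" rather than a strict `f < 0`.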

Missed optimization: integer NaN check not folded to `fcmp uno`

Pattern

§17-3 gives a table of IEEE field tests expressed as pure integer operations on the float’s bit pattern. The NaN test is:

(*(int*)&f & 0x7FFFFFFF) > 0x7F800000  // true iff f is NaN

In LLVM IR:

%i = bitcast float %f to i32
%a = and  i32 %i, 2147483647   ; 0x7FFFFFFF
%r = icmp ugt i32 %a, 2139095040  ; 0x7F800000 = bits(+∞)

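This is semantically equivalent to `fcmp uno float %f, %f`: once the sign bit is masked off, the only bit patterns greater than bits(+∞) are those with an all-ones exponent and a nonzero mantissa, which are exactly the NaNs.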
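
The equivalence can be spot-checked outside LLVM. The sketch below is an illustration under stated assumptions, not part of `ch17_float.ll`: the function names are hypothetical, `f != f` is used as the C counterpart of an unordered self-compare, and the code must be built without `-ffast-math` (which lets the compiler assume NaNs never occur).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Integer NaN test from the pattern above, applied to raw bits. */
static int is_nan_int(uint32_t bits) {
    return (bits & 0x7FFFFFFFu) > 0x7F800000u;
}

/* Floating-point reference: x != x is true only when x is NaN. */
static int is_nan_fp(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);  /* reinterpret the bits as a float */
    return f != f;
}

/* Count disagreements over the bit patterns 0, stride, 2*stride, ...
   up to 0xFFFFFFFF. A stride of 1 checks every pattern. */
static uint64_t count_mismatches(uint32_t stride) {
    uint64_t bad = 0;
    if (stride == 0) stride = 1;  /* avoid an infinite loop */
    for (uint64_t b = 0; b <= 0xFFFFFFFFull; b += stride) {
        if (is_nan_int((uint32_t)b) != is_nan_fp((uint32_t)b)) bad++;
    }
    return bad;
}
```

Calling `count_mismatches(1)` walks all 2^32 patterns; a larger stride gives a faster sample.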

Instruction counts

| Target | Current (insns) | Optimal (insns) | Miss (insns) |
|---|---|---|---|
| Any x86-64 | 4 | 2 (`ucomiss + setp`) | 2 |

Root cause

`opt -O2` partially folds the pattern: it recognizes `bitcast & 0x7FFFFFFF` as `fabs` and rewrites the comparison to `bitcast(fabs(%f)) ugt bits(+∞)`. It does not take the final step of recognizing that result as `fcmp uno %f, %f`.

The fix belongs in InstCombine: `bitcast(fabs(f)) ugt bits(+∞)` → `fcmp uno f, f`.

Comparison: `float_is_inf` is NOT a miss

The corresponding infinity check `(bits & 0x7FFFFFFF) == 0x7F800000` similarly folds to `fcmp oeq fabs(f), +Inf` in `opt`, but `llc` correctly keeps it as a 4-instruction integer sequence. The reason: `ucomiss(NaN, +Inf)` reports an unordered result by setting ZF=1 and PF=1, so `sete` alone gives the wrong answer (1) for NaN inputs, and a correct floating-point sequence would also have to check PF. The integer path handles NaN correctly at the same cost.

For the NaN check, `ucomiss %xmm0, %xmm0` has no such ambiguity: `setp` reads exactly the PF set by an unordered compare. So 4 insns → 2 is a genuine win.
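
The flag argument can be made concrete with a small model. This is a sketch only: `struct flags`, `ucomiss_model`, and the other names are hypothetical, and the model covers just the ZF/PF/CF results of `ucomiss` (unordered sets all three; equal sets ZF; less-than sets CF; greater-than clears all).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the flags written by ucomiss. */
struct flags { int zf, pf, cf; };

static float from_bits(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

static struct flags ucomiss_model(float a, float b) {
    struct flags r = {0, 0, 0};
    if (a != a || b != b) { r.zf = 1; r.pf = 1; r.cf = 1; }  /* unordered */
    else if (a == b)      { r.zf = 1; }
    else if (a < b)       { r.cf = 1; }
    return r;             /* a > b leaves all three flags clear */
}

/* Integer infinity check, as kept by llc. */
static int is_inf_int(uint32_t bits) {
    return (bits & 0x7FFFFFFFu) == 0x7F800000u;
}

/* ucomiss(fabs(f), +Inf) followed by sete alone: reads only ZF. */
static int is_inf_sete_only(uint32_t bits) {
    float a = from_bits(bits & 0x7FFFFFFFu);  /* fabs via sign-bit mask */
    return ucomiss_model(a, from_bits(0x7F800000u)).zf;
}

/* Same compare, but requiring ZF=1 and PF=0 (ordered and equal). */
static int is_inf_sete_setnp(uint32_t bits) {
    float a = from_bits(bits & 0x7FFFFFFFu);
    struct flags fl = ucomiss_model(a, from_bits(0x7F800000u));
    return fl.zf && !fl.pf;
}

/* NaN check: ucomiss(f, f) followed by setp, which reads only PF. */
static int is_nan_setp(uint32_t bits) {
    float f = from_bits(bits);
    return ucomiss_model(f, f).pf;
}
```

For a quiet NaN such as `0x7FC00000`, `is_inf_sete_only` returns 1 while `is_inf_int` and `is_inf_sete_setnp` return 0, which is why the infinity check needs the extra PF test. `is_nan_setp` needs no second flag, since PF is set only in the unordered case.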

Test files

| File | What it tests |
|---|---|
| `ch17_float.ll` | Sign-magnitude precondition, fast inv sqrt, float_is_nan (int vs fcmp), float_is_inf, float_is_negative |
| `bug-float-nan-int-check.md` | Bug report: integer NaN check not folded to `fcmp uno` |
