Chapter 13: Gray Code — LLVM Validation
LLVM version tested: 23.0.0git (build at ~/llvm-project/build/bin)
Already handled correctly
| Pattern | IR | LLVM output | Notes |
|---|---|---|---|
| binary → Gray (§13-1) | B ^ (B >> 1) | shrl + xorl | 2 instructions, optimal |
| Gray → binary (§13-1) | 5-round prefix-XOR fold | 10 ops (+ movs) | sequential optimal; no hardware intrinsic |
~B & (B+1) with BMI1 (§13-2) | (~B) & (B+1) | leal + andnl | 2 instructions, correctly uses ANDNOT |
Minor miss: ~B & (B+1) without BMI1 is 4 insns instead of 3
Pattern
~B & (B+1) computes the mask of the lowest clear bit of B (= blsi(~B)).
It appears in the efficient Gray-coded integer increment (Figure 13-2 of Hacker’s
Delight §13-2): M = ~B & (B+1); G' = G ^ M.
Current output (no BMI1) — 4 instructions
leal 1(%rdi), %eax ; B+1 (reads %rdi into %eax)
movl %edi, %ecx ; ← extra copy of B
notl %ecx ; ~B in %ecx
andl %ecx, %eax ; ~B & (B+1)Optimal (no BMI1) — 3 instructions
leal 1(%rdi), %eax ; B+1 (reads %rdi; %rdi not yet clobbered)
notl %edi ; ~B (clobbers %rdi — legal because LEA already read it)
andl %edi, %eax ; ~B & (B+1)The leal and the notl access different registers (%eax and %edi).
Since leal only reads %rdi and writes %eax, executing notl %edi after
the leal is valid. LLVM’s register allocator inserts an unnecessary copy
(movl %edi, %ecx) rather than scheduling notl %edi after leal.
With BMI1 (already optimal — 2 instructions)
leal 1(%rdi), %eax
andnl %eax, %edi, %eax ; ANDN: (~%edi) & %eax = ~B & (B+1)LLVM correctly uses andn with -mattr=+bmi.
Severity
Minor — 1 instruction saved (4 → 3) without BMI1. The BMI1 path is already optimal. The miss is a register-allocation scheduling issue, not an algorithmic one.
Test files
| File | What it tests |
|---|---|
ch13_gray.ll | bin2gray, gray2bin, gray_inc (all already-correct patterns) |
ch13_blsi_miss.ll | The ~B & (B+1) 4→3 insn miss (no BMI1) |