Chapter 13: Gray Code — LLVM Validation

LLVM version tested: 23.0.0git (build at ~/llvm-project/build/bin)


Already handled correctly

PatternIRLLVM outputNotes
binary → Gray (§13-1)B ^ (B >> 1)shrl + xorl2 instructions, optimal
Gray → binary (§13-1)5-round prefix-XOR fold10 ops (+ movs)sequential optimal; no hardware intrinsic
~B & (B+1) with BMI1 (§13-2)(~B) & (B+1)leal + andnl2 instructions, correctly uses ANDNOT

Minor miss: ~B & (B+1) without BMI1 is 4 insns instead of 3

Pattern

~B & (B+1) computes the mask of the lowest clear bit of B (= blsi(~B)). It appears in the efficient Gray-coded integer increment (Figure 13-2 of Hacker’s Delight §13-2): M = ~B & (B+1); G' = G ^ M.

Current output (no BMI1) — 4 instructions

leal  1(%rdi), %eax          ; B+1  (reads %rdi into %eax)
movl  %edi,    %ecx          ; ← extra copy of B
notl  %ecx                   ; ~B in %ecx
andl  %ecx,    %eax          ; ~B & (B+1)

Optimal (no BMI1) — 3 instructions

leal  1(%rdi), %eax          ; B+1  (reads %rdi; %rdi not yet clobbered)
notl  %edi                   ; ~B   (clobbers %rdi — legal because LEA already read it)
andl  %edi,    %eax          ; ~B & (B+1)

The leal and the notl access different registers (%eax and %edi). Since leal only reads %rdi and writes %eax, executing notl %edi after the leal is valid. LLVM’s register allocator inserts an unnecessary copy (movl %edi, %ecx) rather than scheduling notl %edi after leal.

With BMI1 (already optimal — 2 instructions)

leal    1(%rdi), %eax
andnl   %eax, %edi, %eax     ; ANDN: (~%edi) & %eax = ~B & (B+1)

LLVM correctly uses andn with -mattr=+bmi.

Severity

Minor — 1 instruction saved (4 → 3) without BMI1. The BMI1 path is already optimal. The miss is a register-allocation scheduling issue, not an algorithmic one.


Test files

FileWhat it tests
ch13_gray.llbin2gray, gray2bin, gray_inc (all already-correct patterns)
ch13_blsi_miss.llThe ~B & (B+1) 4→3 insn miss (no BMI1)