Chapter 10: Integer Division by Constants — LLVM Validation

LLVM version tested: 23.0.0git (build at ~/llvm-project/build/bin)

Run the validation:

./run-validation.sh

Already handled correctly

PatternIRLLVM outputNotes
Exact div odd constant (§10-16)sdiv exact i32 %n, 7single imull $invmod-inverse trick
Exact div even constant (§10-16)sdiv exact i32 %n, 14sarl $1; imull $invshift out 2-factor first
Divisibility test, odd d (§10-17)urem i32 %n, 7; icmp eq 0imull + cmpl + setb3 insns
Divisibility test, even d (§10-17)urem i32 %n, 100; icmp eq 0imull + rorl $2 + cmpl + setbrotate trick for 2-factor
Divisibility test, d=4·7urem i32 %n, 28; icmp eq 0imull + rorl $2 + cmpl + setbgeneralises correctly
Signed divisibility test (§10-17)srem i32 %n, 7; icmp eq 0imull + addl + cmpl + setbsigned bias added
Specific remainder test (§10-17)urem i32 %n, 7; icmp eq 3imull + addl + cmpl + setb
”Best program” divisor (§10-11)udiv i32 %n, 641movl + imulq $6700417 + shrq $32641·6700417=2³²+1, no add needed
”Best program” divisor (§10-11)udiv i32 %n, 6700417movl + imulq $641 + shrq $32symmetric
W=64 “best” divisor (§10-11)udiv i64 %n, 274177movabsq $67280421310721 + mulq274177·67280421310721=2⁶⁴+1
sdiv pow-of-2, non-negative nsdiv i32 range(0,2^31) %n, 8shrl $3range metadata used
srem pow-of-2, non-negative nsrem i32 range(0,2^31) %n, 8andl $7
sdiv -1sdiv i32 %n, -1negl
srem 1srem i32 %n, 1xorl %eax,%eax
(n/d)·d + n%d identity(sdiv n,7)*7 + srem n,7movl %edi,%eaxperfectly folded
(udiv n, 2^k) · 2^k(udiv i32 %n, 8) * 8andl $-8
div+rem sharingsdiv n,7 and srem n,7 togetherone imulq, quotient and remainder derived from itno double multiply

Missed optimization: sdiv i32 by constants with positive magic number

Problem

For sdiv i32 %n, d where the magic number M < 2^31 (positive), LLVM generates 6 instructions but the minimum is 5.

Affected divisors (W=32, from Table 10-1): 3, 5, 6, 9, 10, 11, 12, 25, 125, 625, and any d whose signed magic fits in a positive 32-bit value.

What LLVM generates (sdiv i32 %n, 3)

movslq  %edi, %rax                  ; sign-extend n → rax
imulq   $1431655766, %rax, %rax     ; rax = n · M  (M = 0x55555556)
movq    %rax, %rcx                  ; ← copy product
shrq    $63, %rcx                   ; ← extract sign bit of product (2 insns)
shrq    $32, %rax                   ; quotient before correction
addl    %ecx, %eax                  ; + sign correction

Optimal (5 insns)

movslq  %edi, %rax                  ; reads %edi, does NOT modify it
shrl    $31, %edi                   ; sign bit of n (0 or 1) — %edi still holds n
imulq   $1431655766, %rax, %rax
shrq    $32, %rax
addl    %edi, %eax

Why it is correct

  • movslq %edi, %rax reads %edi without writing it, so %edi still holds n after the instruction.
  • M = 0x55555556 is positive (bit 31 = 0), so sign(n · M) = sign(n) for all n. Extracting the correction bit from n (shrl $31, %edi) is therefore equivalent to extracting it from the product (movq + shrq $63), saving one instruction.
  • Verified correct for all n in [−2³¹, 2³¹) by exhaustive spot-check.

The same sign-extraction argument applies to all d in the affected list since all their magic numbers are < 2^31.

Why the 64-bit case does not benefit

sdiv i64 %n, 3 uses the 1-operand imulq %rcx form (128-bit product in rdx:rax), which always requires a final movq %rdx, %rax. That extra move offsets any saving, so 6 instructions remain optimal there.


Test files

FileWhat it tests
ch10_basic.llAll the “already handled” cases
ch10_sdiv_posM.llThe positive-magic sdiv cases (3, 5, 6, 9, 10, 12)
bug-sdiv-positive-magic.mdBug report draft