Chapter 10: Integer Division by Constants — LLVM Validation
LLVM version tested: 23.0.0git (build at ~/llvm-project/build/bin)
Run the validation:
./run-validation.shAlready handled correctly
| Pattern | IR | LLVM output | Notes |
|---|---|---|---|
| Exact div odd constant (§10-16) | sdiv exact i32 %n, 7 | single imull $inv | mod-inverse trick |
| Exact div even constant (§10-16) | sdiv exact i32 %n, 14 | sarl $1; imull $inv | shift out 2-factor first |
| Divisibility test, odd d (§10-17) | urem i32 %n, 7; icmp eq 0 | imull + cmpl + setb | 3 insns |
| Divisibility test, even d (§10-17) | urem i32 %n, 100; icmp eq 0 | imull + rorl $2 + cmpl + setb | rotate trick for 2-factor |
| Divisibility test, d=4·7 | urem i32 %n, 28; icmp eq 0 | imull + rorl $2 + cmpl + setb | generalises correctly |
| Signed divisibility test (§10-17) | srem i32 %n, 7; icmp eq 0 | imull + addl + cmpl + setb | signed bias added |
| Specific remainder test (§10-17) | urem i32 %n, 7; icmp eq 3 | imull + addl + cmpl + setb | |
| ”Best program” divisor (§10-11) | udiv i32 %n, 641 | movl + imulq $6700417 + shrq $32 | 641·6700417=2³²+1, no add needed |
| ”Best program” divisor (§10-11) | udiv i32 %n, 6700417 | movl + imulq $641 + shrq $32 | symmetric |
| W=64 “best” divisor (§10-11) | udiv i64 %n, 274177 | movabsq $67280421310721 + mulq | 274177·67280421310721=2⁶⁴+1 |
| sdiv pow-of-2, non-negative n | sdiv i32 range(0,2^31) %n, 8 | shrl $3 | range metadata used |
| srem pow-of-2, non-negative n | srem i32 range(0,2^31) %n, 8 | andl $7 | |
| sdiv -1 | sdiv i32 %n, -1 | negl | |
| srem 1 | srem i32 %n, 1 | xorl %eax,%eax | |
| (n/d)·d + n%d identity | (sdiv n,7)*7 + srem n,7 | movl %edi,%eax | perfectly folded |
| (udiv n, 2^k) · 2^k | (udiv i32 %n, 8) * 8 | andl $-8 | |
| div+rem sharing | sdiv n,7 and srem n,7 together | one imulq, quotient and remainder derived from it | no double multiply |
Missed optimization: sdiv i32 by constants with positive magic number
Problem
For sdiv i32 %n, d where the magic number M < 2^31 (positive), LLVM generates
6 instructions but the minimum is 5.
Affected divisors (W=32, from Table 10-1): 3, 5, 6, 9, 10, 11, 12, 25, 125, 625, and any d whose signed magic fits in a positive 32-bit value.
What LLVM generates (sdiv i32 %n, 3)
movslq %edi, %rax ; sign-extend n → rax
imulq $1431655766, %rax, %rax ; rax = n · M (M = 0x55555556)
movq %rax, %rcx ; ← copy product
shrq $63, %rcx ; ← extract sign bit of product (2 insns)
shrq $32, %rax ; quotient before correction
addl %ecx, %eax ; + sign correctionOptimal (5 insns)
movslq %edi, %rax ; reads %edi, does NOT modify it
shrl $31, %edi ; sign bit of n (0 or 1) — %edi still holds n
imulq $1431655766, %rax, %rax
shrq $32, %rax
addl %edi, %eaxWhy it is correct
movslq %edi, %raxreads%ediwithout writing it, so%edistill holdsnafter the instruction.M = 0x55555556is positive (bit 31 = 0), sosign(n · M) = sign(n)for all n. Extracting the correction bit fromn(shrl $31, %edi) is therefore equivalent to extracting it from the product (movq + shrq $63), saving one instruction.- Verified correct for all n in [−2³¹, 2³¹) by exhaustive spot-check.
The same sign-extraction argument applies to all d in the affected list since all their magic numbers are < 2^31.
Why the 64-bit case does not benefit
sdiv i64 %n, 3 uses the 1-operand imulq %rcx form (128-bit product in rdx:rax),
which always requires a final movq %rdx, %rax. That extra move offsets any saving,
so 6 instructions remain optimal there.
Test files
| File | What it tests |
|---|---|
ch10_basic.ll | All the “already handled” cases |
ch10_sdiv_posM.ll | The positive-magic sdiv cases (3, 5, 6, 9, 10, 12) |
bug-sdiv-positive-magic.md | Bug report draft |