Chapter 12: Unusual Bases for Number Systems — LLVM Validation
LLVM version tested: 23.0.0git (build at ~/llvm-project/build/bin)
Already handled correctly
| Pattern | IR | LLVM output | Notes |
|---|---|---|---|
| negabinary→binary, Form 3 (§12-1) | (N ^ 0xAAAAAAAA) - 0xAAAAAAAA | xorl + leal | 2 instructions, optimal |
| binary→negabinary (§12-1) | (B + 0xAAAAAAAA) ^ 0xAAAAAAAA | leal + xorl | 2 instructions, optimal |
Missed optimization: (x & ~C) - (x & C) not folded to (x ^ C) - C
Problem
The three Schroeppel forms for negabinary-to-binary conversion are all equivalent but generate different instruction counts. LLVM does not simplify forms 1 and 2 to the 2-instruction form 3.
| Form | Expression | Current insns | Optimal |
|---|---|---|---|
| 1 | (N & 0x55555555) − (N & 0xAAAAAAAA) | 4 | 2 |
| 2 | N − ((N & 0xAAAAAAAA) << 1) | 5 | 2 |
| 3 | (N ^ 0xAAAAAAAA) − 0xAAAAAAAA | 2 | 2 ✓ |
Algebraic equivalence
With C = 0xAAAAAAAA, E = N & ~C (even-index bits), O = N & C (odd-index bits):
- Form 1:
E − O - Form 2:
(E+O) − 2O = E − O - Form 3:
(N ^ C) − C = (C − O + E) − C = E − O✓
All three compute the same function; verified exhaustively for all 32-bit inputs.
Missing InstCombine rule
(x & ~C) - (x & C) => (x ^ C) - C
for any constant C. The general case is: when C1 | C2 = allones and C1 & C2 = 0,
(x & C1) - (x & C2) = (x ^ C2) - C2.
Context
These conversions appear in:
- Base −2 (negabinary) arithmetic (§12-1, Hacker’s Delight)
- Separating “positive-weight” and “negative-weight” bits in alternating-sign encoding
- Related patterns in any context where even/odd bit positions have opposite signs
Test files
| File | What it tests |
|---|---|
ch12_negabinary.ll | All three negabinary-to-binary forms + binary-to-negabinary |
bug-negabinary-fold.md | Bug report draft |