LLVM CLMUL Validation
This directory checks how current-head LLVM lowers the generic
llvm.clmul.* intrinsic to x86_64.
It covers:
- scalar i64, i32, and i16
- vector v2i64, v4i64, v4i32, and v8i32
- derived high-half and reverse forms built from legal IR patterns
Run:

    bash ./run-clmul-validation.sh

Current findings:
- Scalar i64 lowers directly to one pclmulqdq or vpclmulqdq.
- Scalar i32 and i16 are widened through the same instruction and then truncated on return, which is ABI-correct and reasonable.
- v2i64 and v4i64 lower cleanly using the expected lane masks.
- v4i32 and v8i32 are scalarized into multiple pclmulqdq operations plus shuffles and inserts. That is functionally correct, though noticeably more instruction-heavy.
- A widened i128 CLMUL followed by >> 64 is recognized well on x86 and lowers to one pclmulqdq plus a lane extract.
- The bitreverse form that semantically corresponds to CLMULR also lowers well: one pclmulqdq plus a merge of the low/high halves.
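
For readers who want a bit-exact reference for the forms discussed in the findings, the following is a minimal Python sketch of the arithmetic only. It is not part of the validation script, and every function name in it is hypothetical. It assumes that CLMULR means bits 2w-2 down to w-1 of the full carry-less product (the RISC-V Zbc definition), which is why the reversed form needs both halves of the 128-bit result.

```python
def clmul(a: int, b: int) -> int:
    """Full carry-less product of two non-negative integers (XOR replaces add)."""
    result = 0
    shift = 0
    while b:
        if b & 1:
            result ^= a << shift
        b >>= 1
        shift += 1
    return result


def bitreverse(x: int, width: int) -> int:
    """Reverse the low `width` bits of x."""
    out = 0
    for i in range(width):
        if (x >> i) & 1:
            out |= 1 << (width - 1 - i)
    return out


def clmul_low(a: int, b: int, width: int) -> int:
    """Low `width` bits of the product: the plain scalar form."""
    mask = (1 << width) - 1
    return clmul(a & mask, b & mask) & mask


def clmul_high(a: int, b: int, width: int) -> int:
    """High-half form: widen to 2*width bits, multiply, shift right by width."""
    mask = (1 << width) - 1
    return (clmul(a & mask, b & mask) >> width) & mask


def clmul_reversed(a: int, b: int, width: int) -> int:
    """Bitreverse form: reverse both inputs, take the low product, reverse back."""
    low = clmul_low(bitreverse(a, width), bitreverse(b, width), width)
    return bitreverse(low, width)


if __name__ == "__main__":
    a, b = 0x0123456789ABCDEF, 0xFEDCBA9876543210
    mask64 = (1 << 64) - 1
    # The reversed form equals bits 126..63 of the full product.
    assert clmul_reversed(a, b, 64) == (clmul(a, b) >> 63) & mask64
    # Narrow types can be computed at 64 bits and truncated afterwards.
    assert clmul_low(a, b, 16) == clmul_low(a, b, 64) & 0xFFFF
```

The second assertion illustrates why widening i32 and i16 through the 64-bit instruction is safe: the low bits of a carry-less product depend only on the low bits of the inputs.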