rotorquant

KV cache compression via block-diagonal rotation. Reported results against TurboQuant: lower perplexity (PPL 6.91 vs 7.07), 28% faster decode, 5.3x faster prefill, and 44x fewer parameters. Drop-in llama.cpp integration.
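
To make "block-diagonal rotation" concrete, here is a minimal sketch and not rotorquant's actual code. It assumes the simplest case of 2x2 rotation blocks followed by symmetric 3-bit uniform quantization; the function names (`rotate_pairs`, `quantize`, `compress`, `decompress`) and the random angles are hypothetical and chosen only for illustration.

```python
import math
import random


def rotate_pairs(x, angles, inverse=False):
    """Apply a block-diagonal rotation built from 2x2 blocks.

    Block i rotates the pair (x[2i], x[2i+1]) by angles[i].
    With inverse=True the transpose (inverse) rotation is applied.
    """
    out = [0.0] * len(x)
    for i, theta in enumerate(angles):
        if inverse:
            theta = -theta
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[2 * i], x[2 * i + 1]
        out[2 * i] = c * a - s * b
        out[2 * i + 1] = s * a + c * b
    return out


def quantize(x, bits):
    """Symmetric uniform quantization: integer codes plus one scale."""
    qmax = 2 ** (bits - 1) - 1            # 3 for 3-bit codes
    scale = max(abs(v) for v in x) / qmax or 1.0  # avoid a zero scale
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in x]
    return codes, scale


def dequantize(codes, scale):
    return [c * scale for c in codes]


def compress(x, angles, bits=3):
    """Rotate first, then quantize the rotated vector."""
    return quantize(rotate_pairs(x, angles), bits)


def decompress(codes, scale, angles):
    """Dequantize, then undo the rotation."""
    return rotate_pairs(dequantize(codes, scale), angles, inverse=True)


# Illustrative data only: a random 8-dimensional "key" vector.
random.seed(0)
d = 8
angles = [random.uniform(0.0, 2.0 * math.pi) for _ in range(d // 2)]
key = [random.gauss(0.0, 1.0) for _ in range(d)]

codes, scale = compress(key, angles)
restored = decompress(codes, scale, angles)
error = math.sqrt(sum((a - b) ** 2 for a, b in zip(restored, key)))
print("codes:", codes)
print("reconstruction error (L2):", error)
```

Because each block is orthogonal, the rotation preserves vector norms and is undone exactly by its transpose, so the only loss comes from quantization. In this sketch the d/2 planar blocks need only d/2 angles, whereas a dense d x d rotation has d² entries; how rotorquant itself chooses block sizes and rotation parameters is not stated here.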

Last push

3mo ago

Recent commits

Add RaBitQ module + comprehensive 1-bit/2-bit benchmark suite
616cb93 · John D. Pope · 3mo ago
Update CLAUDE.md with llama.cpp status, PPL results, and TODOs
1ba8989 · John D. Pope · 3mo ago
Credit @ParaMind2025 for PlanarQuant and IsoQuant
d3eae68 · John D. Pope · 3mo ago
Update README: speed benchmarks, architecture evolution, commit history
7511721 · John D. Pope · 3mo ago
Update README: symmetric 3-bit PPL results beat TurboQuant
61154ae · John D. Pope · 3mo ago
