4/21/2025 at 1:25:17 AM
The OP looks like good work, but it's definitely not a quick read. The authors claim theoretical breakthroughs that enable:
* a data-free LLM quantization method which they claim outperforms all prior data-free approaches, including NF4; and
* a method, which they claim is optimal, for finding non-uniform per-layer quantization levels that match a given compression constraint in the "medium bitwidth" regime.
They demonstrate improved accuracy-compression trade-offs on popular LLMs.
Thank you for sharing this on HN.
by cs702