alt.hn

4/19/2026 at 11:35:23 AM

High-Fidelity KV Cache Summarization Using Entropy and Low-Rank Reconstruction

https://jchandra.com/posts/hae-ols/

by jchandra

4/19/2026 at 11:50:30 AM

Interesting approach. Curious about the latency tradeoff: OLS + SVD are much heavier than Top-K. Have you benchmarked end-to-end inference latency?

by vivahir215

4/19/2026 at 11:57:37 AM

[dead]

by jchandra

4/19/2026 at 11:36:37 AM

[dead]

by jchandra