5/19/2026 at 4:40:27 PM
KV Sharing, MHC, and Compressed Attention
https://magazine.sebastianraschka.com/p/recent-developments-in-llm-architectures
by gmays
5/19/2026 at 7:27:48 PM
Cool stuff. My comp sci major feels almost completely redundant in this new vibecoding era, and I feel like the only way to stay relevant as a programmer is to learn all these compute primitives and become an LLM systems guy.
by nibab
5/19/2026 at 9:06:02 PM
Has anyone seen a similar deep dive that looks a bit more closely at the infrastructure building blocks that power each of the components? I mean something a bit more physically grounded, like how much compute goes to each portion when serving a frontier model.
by redwood