7/1/2026 at 1:56:38 PM
these LFM 2.5 models are crazy fast. the 8B-A1B model (the biggest in the series) produces 35-40 t/s on an aged 6-core CPU using llama.cpp. it's my go-to model whenever i need fast local inference. it's also pretty good at tool calling. would love to see more finetunes on HF, but it appears not many people have discovered it yet. by potus_kushner
7/1/2026 at 3:52:33 PM
[dead] by mpfect