3/2/2026 at 5:55:22 AM
This is pretty cool and useful, but I only wish this was a website. I don’t like the idea of running an executable for something that can perfectly well be done as a website. (Other than some minor features; tbh you can even enable CORS and still check the installed models from a web browser.) Sounds like a fun personal project though.
by BloondAndDoom
3/2/2026 at 11:50:01 AM
> I only wish this was a website. I don’t like the idea of running an executable for something that can perfectly be done as a website.

The tool depends on hardware detection. From https://github.com/AlexsJones/llmfit?tab=readme-ov-file#how-... :
How it works
Hardware detection -- Reads total/available RAM via sysinfo, counts CPU cores, and probes for GPUs:
NVIDIA -- Multi-GPU support via nvidia-smi. Aggregates VRAM across all detected GPUs. Falls back to VRAM estimation from GPU model name if reporting fails.
AMD -- Detected via rocm-smi.
Intel Arc -- Discrete VRAM via sysfs, integrated via lspci.
Apple Silicon -- Unified memory via system_profiler. VRAM = system RAM.
Ascend -- Detected via npu-smi.
Backend detection -- Automatically identifies the acceleration backend (CUDA, Metal, ROCm, SYCL, CPU ARM, CPU x86, Ascend) for speed estimation.
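A rough sketch of that detection flow in Python (the actual tool is written in Rust and uses the sysinfo crate; here I'm approximating with stdlib calls only, and the nvidia-smi probe just fails gracefully when the binary isn't present):

```python
import os
import subprocess

def detect_hardware():
    """Rough stand-in for the detection step: CPU cores, total RAM, GPU VRAM."""
    info = {"cpu_cores": os.cpu_count() or 1, "ram_gb": None, "vram_gb": None}

    # Total RAM: Linux-only here; a real tool abstracts this per-OS.
    try:
        with open("/proc/meminfo") as f:
            for line in f:
                if line.startswith("MemTotal:"):
                    info["ram_gb"] = int(line.split()[1]) / 1024 / 1024
                    break
    except OSError:
        pass  # not Linux, or /proc unavailable

    # NVIDIA VRAM via nvidia-smi, aggregated across all GPUs; no binary -> None.
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        )
        info["vram_gb"] = sum(int(x) for x in out.stdout.split()) / 1024
    except (FileNotFoundError, subprocess.CalledProcessError):
        pass

    return info
```

None of this is possible from inside a browser sandbox, which is the whole point.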
Therefore, a website running JavaScript is restricted by the browser sandbox, so it can't see the same low-level details such as total system RAM, the exact count of GPUs, etc.

To implement your idea so it's only a website while working around the JavaScript limitations, a different kind of workflow would be needed. E.g. run the macOS system report to generate a .spx file, or run inxi on Linux to generate a hardware devices report... and then upload those to the website for analysis to derive an "LLM best fit". But those OS report files may still be missing some details that the GitHub tool gathers.
Another way is to have the website with a bunch of hardware options where the user has to manually select the combination. Less convenient but then again, it has the advantage of doing "what-if" scenarios for hardware the user doesn't actually have and is thinking of buying.
(To be clear, I'm not endorsing this particular GitHub tool. Just pointing out that an LLMfit website has technical limitations.)
by jasode
3/2/2026 at 1:47:41 PM
That’s like 4 or 5 fields to fill in on a form. Way less intrusive than installing this thing.
by CoolGuySteve
3/2/2026 at 2:13:21 PM
It can become complicated when you run it inside a container.
by amelius
3/2/2026 at 2:23:06 PM
Why would it need to be a container?
by bilekas
3/2/2026 at 3:00:02 PM
My ollama and GPU are in k8s.
by riddley
3/2/2026 at 2:56:39 PM
Are you asking why people run things in a container?
by amelius
3/2/2026 at 5:19:38 PM
No, I'm asking why a website where someone could fill in a few fields and get the optimized LLM for them would need to run in a container. It's a webform.
by bilekas
3/2/2026 at 2:36:26 PM
I just discovered the other day that Hugging Face allows you to do exactly this. With the caveat that you enter your hardware manually. But are we really at the point yet where people are running local models without knowing what they are running them on..?
by seemaze
3/3/2026 at 7:49:48 AM
> But are we really at the point yet where people are running local models without knowing what they are running them on..?

I can only speak for myself: it can be daunting for a beginner to figure out which model fits your GPU, as the model size in GB doesn't directly translate to your GPU's VRAM capacity.
There is value in learning what fits and runs on your system, but that's a different discussion.
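To make the GB-to-VRAM translation concrete, a common back-of-envelope (my numbers and fudge factor, not anything from the tool): weight memory is roughly parameter count times bytes per weight, plus overhead for the KV cache and runtime buffers.

```python
def est_vram_gb(params_b: float, bits_per_weight: float,
                overhead: float = 1.2) -> float:
    """Rough VRAM estimate: params (in billions) x bytes per weight,
    times a fudge factor for KV cache and runtime buffers. Illustrative only."""
    return params_b * (bits_per_weight / 8) * overhead

# A 7B model at ~Q4 (about 4.5 effective bits/weight) is roughly:
print(round(est_vram_gb(7, 4.5), 1))  # ~4.7 GB -> fits an 8 GB card
# The same model at FP16 does not fit that card:
print(round(est_vram_gb(7, 16), 1))   # ~16.8 GB
```

Which is exactly why a "7 GB file" can fail to load on an 8 GB GPU at higher precision but run fine quantized.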
by mongrelion
3/2/2026 at 5:31:12 PM
The other nice part of Hugging Face’s setup is you can add theoretical hardware and search that way too.
by roxolotl
3/2/2026 at 8:46:05 PM
People out there are probably vibecoding their usernames / passwords for websites. Don't underestimate dumb people.
by mmmlinux
3/2/2026 at 8:31:13 AM
Came across a website for this recently that may be worth a look: https://whatmodelscanirun.com
by Trigg3r
3/2/2026 at 11:41:28 AM
It's wildly inaccurate for me.
by Tepix
3/2/2026 at 7:08:55 AM
Hugging Face has it built in.
by hhh
3/2/2026 at 7:15:56 AM
Where?
by azinman2
3/2/2026 at 9:31:55 AM
In your preferences there is a "Local Apps and Hardware" section. I guess it's a little different because I just open the page of a model and it shows the hardware I've configured and which quants fit.
by hhh
3/3/2026 at 1:28:14 AM
I haven't seen a page on HF that'll show me "what models will fit"; it's always model by model. The shared tool gives a list of a whole bunch of models, their respective scores, and an estimated tok/s, so you can compare and contrast.

I wish it didn't require running on the machine though. Just let me define my spec on a web page and spit out the results.
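For the tok/s part, a decode-speed estimate like that is often just memory bandwidth divided by the bytes read per token. This is my simplification, not necessarily the formula the tool uses:

```python
def est_tokens_per_sec(model_gb: float, bandwidth_gbs: float) -> float:
    """Decode is typically memory-bound: each generated token reads roughly
    the whole model once, so tok/s is about bandwidth / model size.
    Ignores compute limits, batching, and KV-cache traffic."""
    return bandwidth_gbs / model_gb

# Hypothetical numbers: a ~4.7 GB Q4 7B model on 400 GB/s of memory bandwidth:
print(round(est_tokens_per_sec(4.7, 400)))  # ~85 tok/s upper bound
```

Given manually entered specs, a website could compute this just as well as a local binary, which is why the "just give me a form" argument holds for the estimation part.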
by Twirrim
3/3/2026 at 1:15:22 PM
I wouldn't mind a set of well-known Unix commands that produce a text output of your machine stats to paste into this hypothetical website of yours (think: neofetch?)
by Natfan
3/2/2026 at 10:51:49 AM
Here's a website for a community-run database of LLM models with details on configs and their token/s: https://inferbench.com/
by binsquare
3/3/2026 at 8:10:02 AM
Inferbench is a great idea (similar to Geekbench, etc.) but as of the time of writing it's got only 83 submissions, which is underwhelming.
by mongrelion
3/2/2026 at 3:31:10 PM
The whole point is to measure your hardware capability. How would you do that as a website?
by hidelooktropic
3/2/2026 at 6:49:24 AM
Always liked this website that kinda does something similar: https://apxml.com/tools/vram-calculator
by kristopolous