4/21/2025 at 7:39:48 PM
Yikes, what's the bar for dead simple these days? Even my totally non-technical gamer friends are messing around with ollama because I just have to give them one command to get any of the popular LLMs up and running.

Now of course "non-technical" here is still a PC gamer who's had to fix drivers once or twice and messaged me to ask "hey how do i into LLM, Mr. AI knower", but I don't think twice these days about showing any PC owner how to use ollama because I know I probably won't be on the hook for much technical support. My sysadmin friends are easily writing clever scripts against ollama's JSON output to do log analysis and other stuff.
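For what it's worth, that kind of script can be tiny. Here's a rough sketch against Ollama's local HTTP API; it assumes Ollama is listening on its default port (11434), that the model tag ("llama3" here) is something you've already pulled, and the log path is just a placeholder:

```python
# Minimal sketch of scripting against Ollama's local HTTP API.
# Assumes Ollama is running on localhost:11434 and the model tag below
# has already been pulled; the log path is only a placeholder.
import json
import urllib.request

def summarize_log_chunk(chunk: str, model: str = "llama3") -> str:
    payload = json.dumps({
        "model": model,
        "prompt": f"Summarize any errors or anomalies in this log excerpt:\n{chunk}",
        "stream": False,  # ask for a single JSON object instead of a token stream
    }).encode("utf-8")
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["response"]  # generated text comes back in the "response" field

if __name__ == "__main__":
    with open("/var/log/syslog") as f:
        print(summarize_log_chunk(f.read()[-4000:]))  # last ~4 KB of the log
```

Setting `"stream": False` keeps the script dumb and pipeable, which is most of the appeal for this kind of one-off log analysis.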
by thot_experiment
4/21/2025 at 8:00:33 PM
By "too hard" I do not mean getting started with them to run inference on a prompt. Ollama especially makes that quite easy. But as an application developer, I feel these platforms are too hard to build around. The main issues being: getting the correct small enough task specific model and how long it takes to download these models for the end user.by aazo11
4/21/2025 at 8:42:53 PM
I guess it depends on expectations; if your expectation is a CRUD app that opens in 5 seconds, then sure, it's definitely tedious. People do install things though: the companion app for DJI action cameras is 700mb (which is an abomination, but still), and modern games are > 100gb on the high side, so downloading 8-16gb of tensors one time is nbd. You mentioned that there are 663 different models of dsr1-7b on huggingface, sure, but if you want that model on ollama it's just `ollama run deepseek-r1`.

As a developer, the amount of effort I'm likely to spend on the infra side of getting the model onto the user's computer and getting it running is now FAR FAR below the amount of time I'll spend developing the app itself or getting together a dataset to tune the model I want, etc. Inference is solved enough. "Getting the correct small enough model" is something I would spend a day or two thinking about/testing when building something regardless. It's not hard to check how much VRAM someone has and get the right model; the decision tree for that will have like 4 branches (rough sketch below). It's just so little effort compared to everything else you're going to have to do to deliver something of value to someone, especially in the set of users that have a good reason to run locally.
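For concreteness, the four-branch decision tree might look something like this. The VRAM query leans on nvidia-smi (so NVIDIA-only), and the cutoffs and deepseek-r1 size tags are illustrative guesses, not tuned recommendations:

```python
# Rough sketch of a VRAM-based model picker, as an assumption-laden example.
# nvidia-smi is used for the VRAM query (NVIDIA GPUs only); the thresholds
# and model tags below are illustrative placeholders.
import subprocess

def detect_vram_gb() -> float:
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return int(out.splitlines()[0]) / 1024  # MiB -> GiB

def pick_model(vram_gb: float) -> str:
    if vram_gb >= 24:
        return "deepseek-r1:32b"
    if vram_gb >= 12:
        return "deepseek-r1:14b"
    if vram_gb >= 8:
        return "deepseek-r1:7b"
    return "deepseek-r1:1.5b"  # low-VRAM / CPU fallback

if __name__ == "__main__":
    model = pick_model(detect_vram_gb())
    subprocess.run(["ollama", "pull", model], check=True)
```

That's the whole "infra" decision for a lot of local-first apps: pick a tag, `ollama pull` it once, and move on to the actual product work.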
by thot_experiment