6/12/2026 at 5:48:52 PM
For pure, weird, late-night LLM chats, I've recently started using Qwen3.6-35B-A3B-Uncensored just running with llama-cli, and it is a very refreshing chat experience. An uncensored model means it will not deny any requests (at least I have yet to come across one); if you grew up in the 90s, it sort of feels like coming across The Anarchist Cookbook for the first time (though with more accurate content). Using llama-cli means the session is entirely local and entirely ephemeral. As a bonus, all the reasoning steps are fully visible to the user.
The base Qwen3.6-35B-A3B is more than adequate for "weird late-night brainstorming chats", and I've really started to dislike the natural tendency to self-censor when the model is willing to refuse (and potentially report) any request it feels is "inappropriate" and all these private thoughts are stored on someone else's server.
by crystal_revenge
6/12/2026 at 6:17:46 PM
Even for work questions about sensitive IP/code, Qwen3.6-35B-A3B is a great option on macOS (35 t/s) when you don't want info leaving your laptop. I'm using it with oMLX.
by Xeoncross
6/12/2026 at 7:24:09 PM
I switched to oMLX today from LM Studio. Really nice, but I have found Qwen3.6 sometimes fails to call tools correctly.
by vorticalbox
6/12/2026 at 9:46:38 PM
I use oMLX and also have this issue with Qwen. Not sure what it is about oMLX + Qwen, but they don't seem to play well together.
by nozzlegear
6/12/2026 at 6:27:39 PM
I run a Gemma 4 32b abliteration (int8) and it's remarkably good. It's been a real step up from Qwen in my experience.
by bastawhiz
6/14/2026 at 1:33:23 AM
you can even use stuff to uncensor any FOSS model
by tough