alt.hn

5/30/2026 at 9:05:40 PM

Rotary GPU: Exploring Local Execution for Large MoE Models Under Limited VRAM

by dryarzeg

5/31/2026 at 1:18:43 AM

Why is this a paper? It's just using the n-cpu-moe option on llama.cpp? What am I missing here?

by martinald

5/31/2026 at 1:59:16 AM

It's amazingly vacuous, isn't it? I think the most interesting part was that they were surprised llama.cpp crashed when they used a bad set of command-line arguments.

Although in the section immediately above that observation, they claimed that they ran 10 whole completions with a 100% success rate. So who knows.

I have to admit I slightly miss the flood of AI-psychosis research papers that seemed to be popping up a couple of months ago. Good to know there's still one or two new ones floating around.

by Farmadupe

5/31/2026 at 2:42:53 AM

Apparently the author has a patent about it, too.

by LoganDark

5/31/2026 at 1:06:45 AM

Um, doesn't the 4060 laptop card have the ability to share system memory?

Wait... My mistake. Google AI says the 4060 mobile can access system memory but tech sheets say no.

by sandworm101