alt.hn

6/22/2026 at 7:45:05 PM

Self-Harness: Harnesses That Improve Themselves

https://arxiv.org/abs/2606.09498

by jonnonz

6/25/2026 at 10:35:03 AM

I like this idea. I asked Codex to build a Pi extension after reading this paper.

https://github.com/skorotkiewicz/nano-agent/blob/main/pi_ext...

by modinfo

6/24/2026 at 8:07:01 PM

What else is new? Put it in emacs and let the model improve the harness over time.

by behnamoh

6/24/2026 at 10:17:42 PM

Was surprised and somewhat disappointed that the article doesn’t appear to evaluate how well the models work when running in the harnesses optimized for the other models. Do they still do better than with the baseline harness? Does each model do worse with a harness optimized (by this process) for the other models than it does with the harness optimized for itself?

by drdeca

6/24/2026 at 11:29:27 PM

Not really an article, but yeah, I was hoping they had gone into the underlying mechanism a bit deeper. This paper could be confirmation of what localllamaians have been saying for months: keep your harness surface small, and allow the model to use the harness to build _your workflow_.

I have been doing a LOT of work around this with Qwen3.6 and it's been super fun. There are some neat benchmarks that help guide, but nothing beats reading the output... and there is a lot of output to read when trying different quants, etc. Which leads me to...

The other thing I have learned is that the "harness" is only as good as the model tuning that goes into it. If your prompt(s) are buggered from the beginning, you are going to have a bad time. The prompt structure and special tokens can be a PITA or really help, depending on how much you know.

I don't know how agentic harnesses can work without being optimized for the models running within them. This is the biggest insight into working with agents for me. The first things I have always looked at are the prompts and parameters... everything else is orchestration to me.

by monkmartinez

6/25/2026 at 1:41:57 PM

Where would I find a good write-up on where to start with this?

by clickety_clack

6/24/2026 at 8:46:09 PM

Pretty obvious stuff; see Terminator for the conclusion (SkyNet). Or the Matrix. We really need more work on model alignment, trustworthiness, and control.

by 7e

6/24/2026 at 9:01:01 PM

[flagged]

by tlarkworthy

6/23/2026 at 5:44:43 AM

[dead]

by mncharity