6/7/2026 at 11:00:04 AM
I have a multi-agent (MA) system set up for personal use. You give it a problem, then refine it: a fast, cheaper model asks you questions, and your answers produce a better input prompt. You then choose an MA strategy. For example, break the problem into sections and have a final judge conclude, or run multi-turn, where agents debate and a judge summarises the debate.
The best approach is what I call 'all angles', where all these strategies run in parallel and a final meta-judge synthesises the response. The most useful part, which I recently added, is a view showing the variance in each strategy.
Been using this for life stuff - housing search, schools, family challenges!
Perhaps I should make a video of it in action. If people in the HN community are interested, let me know.
by monkeydust
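For readers who want something concrete, the 'all angles' flow described above might look roughly like the sketch below. This is not the poster's implementation: every name here (`Model`, `decompose_strategy`, `debate_strategy`, `all_angles`, `disagreement`) is hypothetical, models are stand-in functions from prompt text to answer text, and the `;` section split is a placeholder for a real decomposition step. The thread does not say how variance is measured, so `disagreement` is just one possible proxy (word-set overlap between strategy outputs).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Tuple

# Assumption: a "model" is any function from prompt text to answer text.
Model = Callable[[str], str]


def decompose_strategy(problem: str, agent: Model, judge: Model) -> str:
    """Split the problem into sections, answer each, then let a judge conclude."""
    # Placeholder split on ';' (a real system would ask a model to decompose).
    sections = [s.strip() for s in problem.split(";") if s.strip()]
    answers = [agent(section) for section in sections]
    return judge("\n".join(answers))


def debate_strategy(problem: str, agent: Model, judge: Model, turns: int = 2) -> str:
    """Agents respond to the running transcript, then a judge summarises it."""
    transcript = [problem]
    for _ in range(turns):
        transcript.append(agent("\n".join(transcript)))
    return judge("\n".join(transcript))


def all_angles(
    problem: str,
    strategies: Dict[str, Callable[[str], str]],
    meta_judge: Model,
) -> Tuple[Dict[str, str], str]:
    """Run every strategy in parallel, then have a meta-judge synthesise."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, problem) for name, fn in strategies.items()}
        results = {name: future.result() for name, future in futures.items()}
    combined = "\n".join(f"[{name}] {text}" for name, text in results.items())
    return results, meta_judge(combined)


def disagreement(results: Dict[str, str]) -> float:
    """Mean pairwise Jaccard distance between the word sets of the outputs."""
    word_sets = [set(text.lower().split()) for text in results.values()]
    pairs = [(a, b) for i, a in enumerate(word_sets) for b in word_sets[i + 1:]]
    if not pairs:
        return 0.0
    return sum(1 - len(a & b) / len(a | b) for a, b in pairs if a | b) / len(pairs)
```

Each entry in `strategies` would typically be a small lambda that binds the agent and judge models, e.g. `lambda p: debate_strategy(p, agent, judge)`. A high `disagreement` score is a hint that the strategies diverged and the meta-judge's synthesis deserves a closer look.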
6/7/2026 at 4:12:23 PM
Right, here is the video demo of what I built - https://streamable.com/e49cgt
by monkeydust
6/8/2026 at 4:12:38 PM
Details and repo post on ShowHN here - https://github.com/monkeydust/rightmind
by monkeydust
6/7/2026 at 6:10:41 PM
I have also developed a similar system, though not one focused on exploratory refinement of prompts. It is more focused on cybernetic-style feedback loops: maintaining the stability of prompt outputs through a growing library of deterministic checks and autofixes. Any "problem" that isn't covered by that library is surfaced to the human driving the process.
by ethanwillis
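The checks-and-autofixes loop described above could be sketched as follows. This is a guess at the general shape, not ethanwillis's code: `Check`, `stabilise`, and the example checks are all hypothetical names, and the round limit is an assumption to keep a bad autofix from looping forever.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Check:
    name: str
    passes: Callable[[str], bool]                   # deterministic test on the output
    autofix: Optional[Callable[[str], str]] = None  # deterministic repair, if known


def stabilise(output: str, library: List[Check], max_rounds: int = 3) -> Tuple[str, List[str]]:
    """Apply known autofixes until nothing changes; return the output plus
    the names of checks that still fail and need a human."""
    for _ in range(max_rounds):
        changed = False
        for check in library:
            if check.autofix is not None and not check.passes(output):
                output = check.autofix(output)
                changed = True
        if not changed:
            break
    unresolved = [check.name for check in library if not check.passes(output)]
    return output, unresolved


# Example library: two problems with known fixes, one without.
library = [
    Check("no_trailing_space", lambda s: s == s.rstrip(), lambda s: s.rstrip()),
    Check("ends_with_period", lambda s: s.endswith("."), lambda s: s + "."),
    Check("mentions_budget", lambda s: "budget" in s),  # no autofix: goes to the human
]
```

Anything left in `unresolved` is what gets surfaced to the person driving the process. Once they work out a repeatable fix, it can be added to the library as a new `autofix`, which is how the library grows.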
6/7/2026 at 12:30:10 PM
You mention cost in one of the replies. Can you elaborate on the cost profile (ballpark) for various problem types? I would also be curious to understand the strategies employed and what the costs look like across each.
by chrisss395
6/7/2026 at 12:21:12 PM
Definitely interested, would love to see a video :)
by Folcon
6/7/2026 at 2:09:06 PM
Sure, let me do that. Can I post this as a ShowHN if it's just a video? The rules say people need to be able to try it out, but that would cost me a small fortune :) ... I could perhaps post it on GitHub and people can set up the repo themselves with their own OpenRouter key, if that works. Have never done a ShowHN but it would be fun to try.
by monkeydust
6/7/2026 at 6:03:18 PM
The cheap models may ask subpar questions, leading to subpar solutions.
by whattheheckheck
6/7/2026 at 11:10:16 AM
So what harness are you using? And which LLMs?
by uxhacker
6/7/2026 at 11:30:20 AM
Homebrew harness and all the frontier ones plus DeepSeek. All via OpenRouter at the moment. Works well enough but can get expensive, so I use it for real high-value challenges. Interestingly, the refine feature has been the most useful to me and the people I have shown it to. Essentially, people are lazy when expressing the initial problem (me included!). Refine asks relevant questions about the initial problem, then refines the initial statement; the user can accept/reject/edit before submitting.
by monkeydust
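A rough sketch of the refine step as described above, with the model calls and the user interaction passed in as plain functions. All names (`refine`, `ask_questions`, `answer`, `rewrite`, `review`) are hypothetical, and the convention that `review` returns `"accept"`, `"reject"`, or the user's edited text is an assumption made for illustration.

```python
from typing import Callable, List


def refine(
    problem: str,
    ask_questions: Callable[[str], List[str]],  # fast, cheap model: problem -> questions
    answer: Callable[[str], str],               # the user answering one question
    rewrite: Callable[[str, List[str]], str],   # model: problem + Q&A -> refined statement
    review: Callable[[str], str],               # user: "accept", "reject", or edited text
) -> str:
    """Ask clarifying questions, draft a refined statement, let the user decide."""
    questions = ask_questions(problem)
    qa_pairs = [f"Q: {q}\nA: {answer(q)}" for q in questions]
    draft = rewrite(problem, qa_pairs)

    decision = review(draft)
    if decision == "accept":
        return draft
    if decision == "reject":
        return problem  # fall back to the original statement
    return decision     # anything else is treated as the user's edited version
```

Keeping the user in the loop at the end matters here: if the cheap model asked poor questions, the reject and edit paths stop a bad refinement from reaching the expensive strategies.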
6/7/2026 at 1:09:52 PM
I came to a similar conclusion. I think the default options in many IDEs (Ask/Plan/Agent) are limited... 'Refine' feels like an improved 'Plan' in that it doesn't just jump right into building a list of tasks based on the initial prompt, because who knows what sort of flaws or deficiencies were present in the initial prompt! Can't always get everything right on the first try. XP
I don't think a specific harness is even necessary to get a boost from 'Refine'. Even a simple custom agent is portable enough... it's easy enough to take the existing 'Plan' agent definition present in VS Code and tweak it to be 'Refine' instead.
by Cherub0774
6/8/2026 at 1:17:26 AM
There is a 5-line skill I’ve been using for refinement called grill-me that works quite well.
by SOLAR_FIELDS
6/7/2026 at 7:02:56 PM
[flagged]
by flowbarai
6/8/2026 at 1:04:27 PM
The problem with these kinds of systems (they have been well studied) is that the overall output is ultimately anchored to the dumbest models used. I.e. you cannot end up with a more intelligent output by adding more, dumber models (that is: dumber than the most intelligent model used).
It's generally best to refine your prompt and send it to (at most) the two smartest frontier models available. Then have the smartest model review the output from the second smartest.
by saberience
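The two-model alternative suggested above is simple enough to sketch in a few lines. This is one reading of the comment, not a tested recipe: `two_model_review` and the review prompt wording are hypothetical, and `best` / `second_best` stand in for whichever two frontier models you rank highest.

```python
from typing import Callable, Dict

# Assumption: a "model" is any function from prompt text to answer text.
Model = Callable[[str], str]


def two_model_review(prompt: str, best: Model, second_best: Model) -> Dict[str, str]:
    """Send the refined prompt to both models, then have the stronger model
    review the weaker model's answer."""
    best_answer = best(prompt)
    second_answer = second_best(prompt)
    review = best(
        "Review the following answer for errors and omissions.\n\n"
        f"Prompt: {prompt}\n\nAnswer: {second_answer}"
    )
    return {
        "best_answer": best_answer,
        "second_answer": second_answer,
        "review": review,
    }
```

This costs three model calls per problem, which also makes it a useful baseline for comparing against the cost of the multi-strategy runs.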