1/19/2026 at 3:35:40 AM
I went down (and continue to go down) this rabbit hole and agree with the author.
I tried a few different ideas, and the most stable/useful so far has been giving the agent a single run_bash tool, explicitly prompting it to create and improve composable CLIs, and injecting knowledge about these CLIs back into its system prompt (similar to how agent skills work).
This leads to really cool patterns like:
1. User asks for something
2. Agent can't do it, so it creates a CLI
3. Next time it's aware of the CLI and uses it. If the user asks for something it can't do, it either improves the CLI it made or creates a new CLI.
4. Each interaction results in updated/improved toolkits for the things you ask it for.
You as the user can use all these CLIs as well, which ends up being an interesting side-channel way of interacting with the agent (for example, you add a todo using the same CLI it uses).
It's also incredibly flexible: yesterday I made a "coding agent" by having it create tools to inspect/analyze/edit a codebase, and it could go off and do most things a coding agent can.
by binalpatel
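A minimal sketch of the loop described in the comment above, not the commenter's actual implementation. It assumes the agent's CLIs live as executables in one directory and that each answers `--help`; the names `run_bash`, `describe_tools`, and `build_system_prompt` and the prompt wording are hypothetical.

```python
import os
import subprocess
from pathlib import Path


def run_bash(command: str, timeout: int = 60) -> str:
    """The single tool exposed to the agent: run a shell command, return its output."""
    result = subprocess.run(
        ["bash", "-c", command],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return f"exit={result.returncode}\n{result.stdout}{result.stderr}"


def describe_tools(tools_dir: Path) -> str:
    """List each executable in tools_dir with the first line of its --help output."""
    lines = []
    for path in sorted(tools_dir.iterdir()):
        if not (path.is_file() and os.access(path, os.X_OK)):
            continue
        try:
            help_text = subprocess.run(
                [str(path), "--help"], capture_output=True, text=True, timeout=10
            ).stdout.strip()
        except (OSError, subprocess.TimeoutExpired):
            help_text = ""
        summary = help_text.splitlines()[0] if help_text else "(no --help output)"
        lines.append(f"- {path.name}: {summary}")
    return "\n".join(lines)


def build_system_prompt(base_prompt: str, tools_dir: Path) -> str:
    """Append the current CLI inventory so a fresh session knows what already exists."""
    inventory = describe_tools(tools_dir) if tools_dir.is_dir() else ""
    return (
        f"{base_prompt}\n\n"
        f"CLIs you have already built (in {tools_dir}):\n"
        f"{inventory or '(none yet)'}"
    )
```

Rebuilding the prompt at the start of every session is what would let step 3 work across sessions: the inventory is read from disk rather than remembered by the model.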
1/19/2026 at 12:28:20 PM
Every individual programmer having locally-implemented idiosyncratic versions of sed and awk with imperfect reconstruction between sessions sounds like a regression to me.
by bandrami
1/19/2026 at 1:53:28 PM
Why would it recreate sed and awk? The screenshot from the repo even shows it using sed.
by cocoflunchy
1/19/2026 at 1:16:29 PM
I already treat awk syntax as something idiosyncratic, so not much would change for me.
by whatevaa
1/19/2026 at 5:47:31 PM
But -- I think, only because of the friction of having to read and parse what they did, which, to me, could be greatly alleviated by AI itself.
Put differently -- for those who'd like to share, yes, give me your locally implemented idiosyncraticness with a little AI to help explain to me what's going on, and I feel like that's a sweet spot between "AI do the thing" and "give me raw code".
by jrm4
1/19/2026 at 7:15:15 AM
I've been on a similar path. Will have 1000 skills by the end of this week arranged in an evolving DAG. I'm loving the bottom-up emergence of composable use cases. It's really getting me to rethink computing in general.
by fudged71
1/19/2026 at 9:45:11 AM
Interesting. Could you provide a bit more detail on how the DAG emerges?
by Garlef
1/19/2026 at 2:55:09 PM
A 2026 paper titled "Evolving Programmatic Skill Networks", operationalized in Claude Code
by fudged71
1/20/2026 at 8:23:44 AM
How are they stored?
by actionfromafar
1/19/2026 at 7:16:10 AM
Have you done a comparison on token usage + cost? I'd imagine there would be some level of re-inventing the wheel (i.e. rewriting code for very similar tasks) for common tasks, or do you re-use previously generated code?
by meander_water
1/19/2026 at 7:23:13 AM
It reuses previously generated code, so the tools it creates persist from session to session. It also lets the LLM avoid actually “seeing” the tokens in some cases, since it can pipe directly between tools or write to disk instead of having the output returned into the LLM's context window.
by binalpatel
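One way to keep large outputs out of the context window, sketched under assumptions: the helper name `run_bash_to_file` and the receipt format are hypothetical, not something described in the thread. The pipeline's stdout is spilled to a file, and only a short receipt (exit code, size, path, preview) goes back to the model.

```python
import os
import subprocess
import tempfile


def run_bash_to_file(command: str, preview_chars: int = 200) -> str:
    """Run a shell pipeline, write stdout to a file, and return only a short receipt."""
    # delete=False keeps the file around so a later tool call can read or pipe it.
    with tempfile.NamedTemporaryFile(mode="w", suffix=".out", delete=False) as out:
        result = subprocess.run(
            ["bash", "-c", command],
            stdout=out,
            stderr=subprocess.PIPE,
            text=True,
        )
        path = out.name

    # Read back only a small preview; the full output never enters the prompt.
    with open(path, "r", errors="replace") as f:
        preview = f.read(preview_chars)

    size = os.path.getsize(path)
    return (
        f"exit={result.returncode} bytes={size} saved_to={path}\n"
        f"preview:\n{preview}"
    )
```

The agent can then run follow-up commands such as `grep` or `head` against the saved path, paying tokens only for the slice it actually needs.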
1/19/2026 at 7:52:21 AM
The point where that breaks down is “next time it’s aware of the CLI and uses it”. That only really works well inside the same session, and often in the next session it will create a different tool and use that one.
by rcarmo
1/19/2026 at 9:04:16 AM
> That only really works well inside the same session
That was already "fixed" by people adding snippets to agents.md and it worked. Now it's even more streamlined with skills. You can even have cc create a skill after a session (i.e. prompt it like "extract the learnings from this session and put them into a skill for working with this specific implementation of sqlite"). And it works, today.
by NitpickLawyer
1/19/2026 at 2:26:38 PM
I beg to differ: https://taoofmac.com/space/notes/2026/01/14/0830
by rcarmo
1/19/2026 at 3:39:22 PM
> I prefer the more deterministic behavior of MCP for complex multi-step tasks, and the fact that I can do it effectively using smaller, cheaper models is just icing on the cake.
Yeah, that makes sense. That's not what the person I replied to was talking about, tho. Skills work fine for "loading context pertinent to one type of task", such as working on a feature without "forgetting" what was done in the previous session.
The article deals with specific, somewhat predefined workflows.
by NitpickLawyer
1/19/2026 at 8:12:52 AM
Even if you document the tool and tell it what it can do?
by actionfromafar
1/19/2026 at 8:13:18 PM
Hey, that sounds a lot like the project I’m working on, with the twist that it’s containerized. It’s still in dev: https://github.com/brycewcole/capsule-agents
by trackspike
1/19/2026 at 6:30:15 AM
That’s pretty cool. Is it practical? What have you used it for?
by skybrian
1/19/2026 at 7:06:38 AM
I've been using it daily; so far it's built CLIs for hackernews, BBC news, weather, a todo manager, fetching/parsing webpages etc. I asked it to make a daily briefing one that just composes some of them. So the first thing it runs when I message it in the morning is the daily briefing, which gives me a summary of top tech news/non-tech news, the weather, and my open tasks across work/personal. I can ask for follow-ups like "summarize the top 5 stories on HN" and it can fetch the content and show it to me in full or give me a bullet list of the key points.
Right now I'm thinking through how to make it more "proactive", even if it's just a cron that wakes it up, so it can do things like query my emails/calendar on an ongoing basis + send me alerts/messages I can respond to instead of me always having to message it first.
by binalpatel
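A rough sketch of how a "daily briefing" CLI might compose other CLIs, as the comment above describes. The function name `daily_briefing`, the section titles, and the commands passed in are all placeholders; the thread does not show the real tools or their flags.

```python
import subprocess


def daily_briefing(sections: dict[str, str]) -> str:
    """Run one shell command per section and stitch the outputs into a single report."""
    parts = []
    for title, command in sections.items():
        result = subprocess.run(
            ["bash", "-c", command], capture_output=True, text=True
        )
        if result.returncode == 0:
            body = result.stdout.strip()
        else:
            # A failing tool should not sink the whole briefing.
            body = f"(failed: {result.stderr.strip()})"
        parts.append(f"## {title}\n{body}")
    return "\n\n".join(parts)


if __name__ == "__main__":
    # Stand-in commands; in practice each would be one of the agent-built CLIs.
    print(daily_briefing({
        "Weather": "echo 'sunny, 18C'",
        "Open tasks": "echo '1. review PR'",
    }))
```

Because the briefing is just another command, the same entry point could be run from a cron job for the "proactive" case as well as from the agent's run_bash tool.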