2/24/2026 at 7:57:37 PM
Skills in CC have been a bit frustrating for me. They don't trigger reliably and the emphasis on "it's just markdown" makes it harder to have them reliably call certain tools with the correct arguments.
The idea that agent harnesses should primarily have their functionality dictated by plaintext commands feels like a copout around programming in some actually useful, semi-opinionated functionality (not to mention that it makes capability discoverability basically impossible). For example, Claude Code has three modes: plan, ask about edits, and auto-accept edits. I always start with a plan and then I end up with multiple tasks. I'd like to auto-accept edits for one step at a time, and the only way to do that is to ask CC to do it, but it's not reliable: sometimes it just continues to go into the next step. If this were programmed explicitly into CC rather than relying on agent obedience, we could ditch the nondeterminism and just have a hook on task completion that toggles auto-accept back to "off."
by daturkel
2/24/2026 at 9:10:55 PM
The saving grace of Claude Code skills is that when writing them yourself, you can give them frontmatter like "use when mentioning X" that makes them become relevant for very specific "shibboleths", which you can then use when prompting.
Are we at an ideal balance where Claude Code is pulling things in proactively enough... without bringing in irrelevant skills just because the "vibes" might match in frontmatter? Arguably not. But it's still a powerful system.
by btown
2/25/2026 at 6:14:11 AM
For manual prompting, I use a "macro"-like system where I can just add `[@mymacro]` in the prompt itself and Claude will know to `./lookup.sh mymacro` to load its definition. I can easily chain multiple together: `[@code-review:3][@pycode]` -> 3x parallel code review, initialize subagents with python-code-guide.md or something. ...Also wrote a parser so it gets reminded by additionalContext in hooks.
Interestingly, I've seen Claude do `./lookup.sh relevant-macro` without any prompting by me. Probably due to it being mentioned in the compaction summary.
by winwang
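A rough sketch of how the macro tokens described above might be parsed. This is not winwang's actual parser: the regex, the function names, and the reading of `:3` as a repeat count are all assumptions drawn from the `[@code-review:3][@pycode]` example.

```python
import re

# Matches tokens such as [@pycode] or [@code-review:3].
# Assumption: the optional ":N" suffix is a repeat count, as in the
# "3x parallel code review" example from the comment.
MACRO_RE = re.compile(r"\[@([A-Za-z0-9_-]+)(?::(\d+))?\]")


def parse_macros(prompt: str) -> list[tuple[str, int]]:
    """Return (macro_name, count) pairs in the order they appear."""
    return [
        (name, int(count) if count else 1)
        for name, count in MACRO_RE.findall(prompt)
    ]


def lookup_commands(prompt: str) -> list[str]:
    """Build the lookup command for each macro found in the prompt."""
    return [f"./lookup.sh {name}" for name, _ in parse_macros(prompt)]


print(parse_macros("Review this [@code-review:3][@pycode] please"))
print(lookup_commands("Review this [@code-review:3][@pycode] please"))
```

A hook could then hand the resulting commands back to the agent as a reminder, though how that is wired up depends on the harness.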
2/25/2026 at 8:11:21 PM
Compaction includes all user prompts from the most recent session verbatim, so that's likely what's happening!
by btown
2/25/2026 at 11:56:15 PM
Fun fact, it can miss! I've seen it miss almost half my messages, including some which were actually important, haha.
by winwang
2/24/2026 at 8:52:49 PM
> idea that agent harnesses should primarily have their functionality dictated by plaintext commands feels like a copout
I think it's more along the lines of acknowledging the fast-paced changes in the field, and refusing to cast into code something that's likely to rapidly evolve in the near future.
Once things settle down into tested practices, we'll see more "permanent" instrumentation arise.
by btbuildem
2/24/2026 at 8:56:29 PM
Surely this logic doesn't apply if we're to believe that "code is cheap" now :p
by daturkel
2/25/2026 at 2:09:26 PM
"Code is cheap" has two interpretations here. One is that it's no longer seen as an artisanally crafted fine product; now it's "manufactured". Two, though, is that it's cheaper in ops: once the criteria are fully discovered and there are no more new paths for the agents to roam, things that have been cast into code consume minimal resources (on the AI scale of things), are doggedly deterministic, and are free of heavy dependencies.
So yeah, I believe "it's a phase", but in the sense that it's a development phase, just like planning or prototyping.
by btbuildem
2/24/2026 at 8:30:27 PM
I think unless you're doing simple tasks, skills are unreliable. For better reliability, I have the agent trigger APIs that handle the complex logic (and their own LLM calls) internally. Has anyone found a solid strategy for making complex 'skills' more dependable?
by Frannky
2/24/2026 at 9:54:49 PM
In my experience, all text “instruction” to the agent should be taken on a prayer. If you write compact agent guidance that is not contradictory and is local and useful to your project, the agent will follow it most of the time. There is nothing that you can write that will force the agent to follow it all of the time.
If one can accept failure to follow instructions, then the world is open. That condition does not really comport with how we think about machines. Nevertheless, it is the case.
Right now, a productive split is to place things that you need to happen into tooling and harnessing, and place things that would be nice for the agent to conceptualize into skills.
by selridge
2/24/2026 at 11:37:27 PM
Yeah, that's my experience too
by Frannky
2/24/2026 at 8:47:27 PM
My only strategy is what used to be called slash-commands but are also skills now, i.e., I call them explicitly. I think that actually works quite well, and you can allow specific tools and tell it to use specific hooks for security or validation in the frontmatter properties.
by plufz
2/25/2026 at 4:21:42 AM
I found interrupting and insisting on the skill use the easiest way... got to be better ways like this
by triage8004
2/24/2026 at 8:59:50 PM
Is it that the skills aren't being triggered reliably, or that they get triggered but the skill itself is complex and doesn't work as expected?
by chickensong
2/24/2026 at 9:11:08 PM
both
by Frannky
2/24/2026 at 9:47:09 PM
I haven't done a lot with skills yet, but maybe try and leverage hooks to enforce skill usage, and move most of the skill's logic and complexity into a script so the agent only needs to reason about how to call the script.
by chickensong
2/25/2026 at 1:15:33 AM
I think I'll wait until they are more reliable. For now, I use skills, but they just specify which endpoint to call. It should also be safer: a different VPS, with no access to credentials other than the bearer token.
by Frannky
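For illustration, the "skill just names an endpoint" pattern could boil down to a call like the one below. This is a sketch only: the URL, the payload shape, and the `build_request` helper are hypothetical, since the thread does not describe the actual API.

```python
import json
import urllib.request

# Hypothetical endpoint; the thread does not name a real one.
ENDPOINT = "https://example.invalid/run-task"


def build_request(token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that carries only a bearer token."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request("TOKEN", {"task": "summarize"})
print(req.get_method(), req.full_url)
# Sending it would be: urllib.request.urlopen(req, timeout=30)
```

The complex logic then lives behind the endpoint, and the agent's side stays a single request whose only secret is the token.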
2/25/2026 at 1:43:42 AM
Having the skill be "call this script with these args" seems to reduce the amount of stuff that goes wrong
by Rebelgecko
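A minimal sketch of what such a script might look like, assuming a Python entry point with strict argument validation. The task names and the `--retries` flag are made up for illustration; the point is that a bad call fails loudly instead of being interpreted.

```python
import argparse

# Hypothetical task names, for illustration only.
ALLOWED_TASKS = ("lint", "test", "deploy-staging")


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Reject anything outside the small, explicit argument surface."""
    parser = argparse.ArgumentParser(description="Entry point a skill can call.")
    parser.add_argument("task", choices=ALLOWED_TASKS)
    parser.add_argument("--retries", type=int, default=1)
    return parser.parse_args(argv)


def run(argv: list[str]) -> str:
    args = parse_args(argv)
    # Real work would go here; this sketch only reports what it would do.
    return f"running {args.task} with retries={args.retries}"


# Example call with explicit arguments; a real script would pass sys.argv[1:].
print(run(["lint", "--retries", "2"]))
```

With `choices` and typed flags, a wrong argument makes argparse exit with an error message the agent can read, rather than silently doing something else.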
2/25/2026 at 2:16:18 PM
I view them as more idiosyncratic docs, but focused on how to write code (there is so much huggingface code floating around the internet, the models do quite well with it already).
I have not had much success with skills that have tree-based logic (if a, do x; else do y); they just tend to do everything in the skill (so will do both x and y).
But just as "hey follow this outline of steps a,b,c" it works quite well in my experience.
by apwheele
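One possible workaround for the branching problem described above, sketched under the assumption that the branch can be decided outside the model: resolve the condition in code and hand the agent only the linear outline for the chosen branch. The step text and the `linear_outline` helper are placeholders.

```python
# Hypothetical step lists; the condition and step text are placeholders.
STEPS_IF_A = ["do x", "verify x"]
STEPS_OTHERWISE = ["do y", "verify y"]


def linear_outline(condition_a: bool) -> str:
    """Pick one branch in code and return only that branch as numbered steps."""
    steps = STEPS_IF_A if condition_a else STEPS_OTHERWISE
    return "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))


print(linear_outline(condition_a=True))
```

The agent never sees the branch it should not take, so there is nothing extra for it to run.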
2/24/2026 at 8:11:36 PM
You can publish scripts with skills you author, right? With carefully constructed markdown that should allow the agent to call tools the right way.
by PantaloonFlames
2/24/2026 at 8:56:41 PM
Are you using either CLAUDE.md or .claude/INSTRUCTIONS.md to direct Claude about the different agents?
Also, be aware that when you add new instructions, if you don't tell Claude to reread these files, it will NOT have them in its context window until you tell it to read them OR you start a new CC session. This was a bit frustrating for me because it was not immediately obvious.
by giancarlostoro
2/24/2026 at 9:15:26 PM
https://scottspence.com/posts/measuring-claude-code-skill-ac... works very well
by conception
2/24/2026 at 8:52:48 PM
> sometimes it just continues to go into the next step
Use a structured workflow that loops on every task and includes a pause for user confirmation at the end. Enforce it with a hook. I'm not sure if you can toggle auto-accept this way, but I think the end result is what you're asking for.
I use this with great success, toggling auto-accept on when I'm confident Claude can complete a step without guidance, and off when confidence is low and I want to slow down and steer, with Claude stopping between steps. Now that prompt suggestions are a thing, you can just hit enter to continue with the suggested prompt.
by chickensong
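The "pause for confirmation after every step" workflow above can be pictured as a small gate that refuses to start the next step until the user confirms. This is a generic sketch, not Claude Code's hook API: `StepGate` and its methods are hypothetical, and a real hook would have to consult state like this from outside the agent.

```python
from dataclasses import dataclass


@dataclass
class StepGate:
    """Tracks plan steps and blocks the next one until the user confirms."""

    steps: list[str]
    current: int = 0
    awaiting_confirmation: bool = False

    def may_proceed(self) -> bool:
        """True only if a step remains and nothing is waiting on the user."""
        return not self.awaiting_confirmation and self.current < len(self.steps)

    def complete_current(self) -> None:
        """Mark the current step finished and require confirmation to go on."""
        if not self.may_proceed():
            raise RuntimeError("no step is currently allowed to run")
        self.awaiting_confirmation = True

    def confirm(self) -> None:
        """User confirmation: advance to the next step."""
        if not self.awaiting_confirmation:
            raise RuntimeError("nothing to confirm")
        self.current += 1
        self.awaiting_confirmation = False


gate = StepGate(["write migration", "update models"])
gate.complete_current()
print(gate.may_proceed())  # blocked until the user confirms
gate.confirm()
print(gate.may_proceed())
```

Because the check lives in code, continuing into the next step no longer depends on the agent remembering to stop.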
2/25/2026 at 12:38:58 AM
Behavior trees. They are precisely what we need. Somebody just needs to go build the damn thing.
by ctoth
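For readers who haven't met behavior trees, here is a minimal sketch of the idea with sequence and selector nodes. It leaves out the usual RUNNING status for brevity, and the node names in the example tree (`plan`, `auto_edit`, `ask_user`, `run_tests`) are invented for illustration, not taken from any existing agent harness.

```python
from enum import Enum
from typing import Callable


class Status(Enum):
    SUCCESS = "success"
    FAILURE = "failure"


class Action:
    """Leaf node: runs a callable that returns True on success."""

    def __init__(self, name: str, fn: Callable[[], bool]) -> None:
        self.name = name
        self.fn = fn

    def tick(self) -> Status:
        return Status.SUCCESS if self.fn() else Status.FAILURE


class Sequence:
    """Succeeds only if every child succeeds; stops at the first failure."""

    def __init__(self, *children) -> None:
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            if child.tick() is Status.FAILURE:
                return Status.FAILURE
        return Status.SUCCESS


class Selector:
    """Tries children in order; succeeds at the first child that succeeds."""

    def __init__(self, *children) -> None:
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            if child.tick() is Status.SUCCESS:
                return Status.SUCCESS
        return Status.FAILURE


log: list[str] = []


def step(name: str, ok: bool) -> Action:
    """Build a stub action that records its name and returns a fixed result."""
    def fn() -> bool:
        log.append(name)
        return ok
    return Action(name, fn)


tree = Sequence(
    step("plan", True),
    Selector(step("auto_edit", False), step("ask_user", True)),
    step("run_tests", True),
)
result = tree.tick()
print(result, log)
```

The control flow (what runs, in what order, and what happens on failure) is fixed by the tree, while each leaf could still delegate its actual work to a model.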
2/24/2026 at 8:37:07 PM
You can write skills that have an associated js/python/whatever script.
by DarmokJalad1701
2/24/2026 at 8:54:49 PM
> Skills in CC have been a bit frustrating for me. They don't trigger reliably
Referencing them in AGENTS/CLAUDE.md has increased their usage for me.
by siquick