This is a very information-dense post. It took some time to read in detail. Here are my thoughts.

> OpenAI published a concept in February 2026 that captured what we'd been doing. They called it harness engineering: the primary job of an engineering team is no longer writing code. It is enabling agents to do useful work. When something fails, the fix is never "try harder." The fix is: what capability is missing, and how do we make it legible and enforceable for the agent?
This is what I've at least suspected for a while from working on my personal projects. Thanks for laying it out in clear terms.
> A production system needs to be stable, reliable, and secure. You need a system that can guarantee those properties when AI writes the code. You build the system. The prompts are disposable.
I agree, and the implication is that the primary bottleneck in any engineering project today is AI workflow design, more than the project work itself: the right AI workflow/scaffold/process lets you 10x the productivity of everything else on the project while keeping things production-ready, and keeping things production-ready is really hard.
> The Product Management Bottleneck
> The QA Bottleneck
So now, not only do devs become software architects who design dev processes and set high-level direction rather than doing the development themselves; PM and QA also need to become PM/QA architects who design PM/QA processes and product direction to stay relevant. lol.
> The Headcount Bottleneck
I think it's still an unsolved problem whether AI will reduce the cooperation bottleneck between people (through new cooperation technologies like knowledge consolidation, and AI-driven performance measurement that's harder to game) or increase it (through deep individual knowledge becoming more important, since everyone is an architect). I'd guess the latter in the short term and possibly the former in the long term.
> I had to unify all the code into a single monorepo. One reason: so AI could see everything.
I wonder whether it's better for Git history cleanliness purposes to do one of the following instead:
- Use a "hub" monorepo that uses Git submodules to link to all the other repos in the project. The hub repo contains documentation and AI agent configurations, but the individual project files stay in their respective repos.
- Use an agent harness system that natively wraps over multiple repositories. (More precisely, it would make a temp folder and put the worktrees of multiple repos in that folder. Perhaps it can unpack some documentation and AI agent configs in the root too, with the root repository simply gitignore-ing the individual repo folders instead of using submodules.)
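That second option can be sketched in a few lines. Everything here is illustrative: the repo paths (`backend`, `frontend`) and the `AGENTS.md` file are stand-ins I made up, and a real harness would add cleanup via `git worktree remove`.

```python
# Sketch: build a throwaway workspace holding a detached worktree of each
# repo, plus shared agent docs at the root. Nothing at the root is a git
# repo, so no submodule bookkeeping pollutes anyone's history.
# All repo names/paths below are hypothetical.
import subprocess, tempfile
from pathlib import Path

def make_workspace(repo_paths: list[str], agent_doc: str) -> Path:
    ws = Path(tempfile.mkdtemp(prefix="agent-ws-"))
    for repo in repo_paths:
        name = Path(repo).name
        # Detached worktree: the original checkout stays untouched.
        subprocess.run(
            ["git", "-C", repo, "worktree", "add", "--detach",
             str(ws / name)],
            check=True,
        )
    # Shared instructions for the agent, tracked by no repo at all.
    (ws / "AGENTS.md").write_text(agent_doc)
    return ws
```

The upside over submodules is that there's no hub repo whose history fills up with pointer-bump commits; the workspace is disposable by construction.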
> Every pull request triggers three parallel AI review passes using Claude Opus 4.6:
Pass 1: Code quality. Logic errors, performance issues, maintainability.
Pass 2: Security. Vulnerability scanning, authentication boundary checks, injection risks.
Pass 3: Dependency scan. Supply chain risks, version conflicts, license issues.
I agree that automated PR review with AI agents is very important. Good list of topics; I think this will help with my own implementation.
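The fan-out itself is simple to sketch. `complete` here is any callable that sends a prompt to an LLM and returns text (a thin wrapper over your provider's SDK); the prompt wording below is my guess at the three passes, not the post's actual prompts.

```python
# Sketch: run three independent review passes over a PR diff in parallel.
from concurrent.futures import ThreadPoolExecutor

# Hypothetical prompt templates, one per pass from the post's list.
PASSES = {
    "quality":    "Review this diff for logic errors, performance issues, "
                  "and maintainability problems:\n\n{diff}",
    "security":   "Review this diff for vulnerabilities, authentication "
                  "boundary violations, and injection risks:\n\n{diff}",
    "dependency": "Review dependency changes in this diff for supply-chain "
                  "risk, version conflicts, and license issues:\n\n{diff}",
}

def review_pr(diff: str, complete) -> dict[str, str]:
    """Run all passes concurrently; return {pass_name: review_text}."""
    with ThreadPoolExecutor(max_workers=len(PASSES)) as ex:
        futures = {name: ex.submit(complete, tmpl.format(diff=diff))
                   for name, tmpl in PASSES.items()}
        return {name: f.result() for name, f in futures.items()}
```

Keeping each pass as a separate request (rather than one mega-prompt) also means a failure or timeout in one pass doesn't sink the others.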
> One hour later, the triage engine runs. It clusters production errors from CloudWatch and Sentry, scores each cluster across nine severity dimensions, and auto-generates investigation tickets in Linear. Each ticket includes sample logs, affected users, affected endpoints, and suggested investigation paths.
This is cool, advanced stuff. Though I kind of think that instead of Linear, we need an AI-centric ticketing system designed from the ground up to make it easier for AIs to handle the tickets and for the humans to monitor said AIs. I've used some AI coding kanban board tools and found them to be very helpful (compared to using a separate Forgejo kanban board + AI agent), and maybe a more general AI-powered ticket management tool would be the next step.
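For anyone wanting to try the clustering step, a toy version is easy: group raw error events by a normalized message "signature", then score each cluster. The real system scores nine severity dimensions; the two shown here (volume, affected users) are illustrative stand-ins I chose, not the post's actual dimensions.

```python
# Toy sketch of error-triage clustering. Input events are dicts like
# {"message": str, "user_id": str}; everything else is made up.
import re
from collections import defaultdict

def signature(message: str) -> str:
    # Collapse volatile parts (numbers, hex ids) so "timeout for user 42"
    # and "timeout for user 97" land in the same cluster.
    return re.sub(r"\b(0x[0-9a-f]+|\d+)\b", "<N>", message.lower())

def cluster_errors(events):
    """Group events by signature and score each cluster."""
    clusters = defaultdict(list)
    for e in events:
        clusters[signature(e["message"])].append(e)
    scored = [{
        "signature": sig,
        "volume": len(evs),                               # how often
        "affected_users": len({e["user_id"] for e in evs}),  # how wide
        "samples": evs[:3],   # sample logs to attach to the ticket
    } for sig, evs in clusters.items()]
    # Worst clusters first; a real scorer would weight more dimensions.
    return sorted(scored,
                  key=lambda c: (c["affected_users"], c["volume"]),
                  reverse=True)
```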
> Each tool handles one phase. No tool tries to do everything.
I think the key is to have separate agents handling each phase. They could all be in the same tool. I agree that having one AI agent handle the entire thing isn't going to be enough for the kind of reliability one is looking for here.
> Graphite's merge queue rebases, re-runs CI, merges if green.
This is a tool I hadn't heard of before and the merge queue seems like a very useful concept. I wonder if it handles automatically resolving trivial rebase conflicts with AI. The stacked PR feature sounds pretty good too.
> People assume we're trading quality for speed. User engagement went up. Payment conversion went up. We produce better results than before, because the feedback loops are tighter. You learn more when you ship daily than when you ship monthly.
Obviously these are lofty claims, but intuitively I think this is possible. AI output isn't perfect, but current engineering teams are far from perfect either. And I think AI is more amenable to process design than people are, simply because you can change an AI prompt instantly (and perhaps even A/B-test it with an LLM as judge?) while people need time to train on a new process.
> At CREAO, we pushed AI-native operations into every function:
Product release notes: AI-generated from changelogs and feature descriptions.
Feature intro videos: AI-generated motion graphics.
Daily posts on socials: AI-orchestrated and auto-published.
Health reports and analytics summaries: AI-generated from CloudWatch and production databases.
Using AI for public-facing announcements is a bit of a minefield to be honest. I think it's valuable to have knowledgeable humans do most of this. But maybe AI can be acceptable if you clearly label that it's AI and you genuinely don't have the human bandwidth to do it anymore.
> I believe one-person companies will become common. If one architect with agents can do the work of 100 people, many companies won't need a second employee.
Oh boy.
> the CTO working 18-hour days
This is actually the least believable part of the post to me. I'd somewhat believe it if you said 14-16 hours, but working an 18-hour day seems like a straight-up bad idea. Even assuming you value absolutely nothing else in life besides work, you'd get more done in 14-16 hours with more leisure and sleep than in 18 without them.