6/25/2026 at 1:18:04 PM
You can't unit test for taste if you haven't written down what you mean by taste. If you can externalize it, then you can.
Follow this line of thinking, and the AI-friendly answer is easy: we just have to externalize everything we know, so Claude can implement what I want.
Except that I can't fully externalize myself. Debugging a system takes more resources than running the system. If I could write down everything I know and hand it to a machine, I'd do that, but it's impossible.
People aren't books or hashmaps. If you want to build something, you need to use the tools, not teach the tools to use you.
[edit: I'm trying to figure out if there's something to be done about this. Email me if you want to chat -- tr at tern dot sh]
by trjordan
6/25/2026 at 1:21:32 PM
It can't be written down as code, that's the point.
I am more familiar with taste in coding, and it can at best be described: that the resulting code is too subtly different from something else in the codebase, that you're masking a different bug, that you're not following what the code tells you. The good part is that while this cannot be unit tested, you can write documentation and code comments about it that tell people what they need to know.
But for taste of the kind described in the article there's not even a definition. The logic ended up being "trust a bunch of opaque weights the most"
by bonzini
6/25/2026 at 3:40:43 PM
Apple's Human Interface Guidelines say that some things can be written down, though. It's a very thorough look at UX, and while they don't adhere to them perfectly themselves, it's very much a north star for some ideals. You can't unit test for taste, but you can integration test that bad tastes haven't happened.
by fragmede
6/25/2026 at 5:00:19 PM
I think Apple lost a bit of credibility after the round-corner fiasco that still persists on Tahoe.
by sscaryterry
6/25/2026 at 8:00:52 PM
They wrote the HIG before Alan came in and trashed the place.
by InsideOutSanta
6/25/2026 at 10:02:47 PM
Indeed, I'm sure Steve Jobs is rolling in his grave.
by sscaryterry
6/26/2026 at 12:09:52 AM
Steve Jobs was also responsible for brilliant bits of usability like puck mice, and the need to have two functioning hands in order to right-click.
by vkou
6/26/2026 at 7:54:50 AM
As somebody who uses a claw grip, I loved the puck mouse. Now the stupid mouse where the charger plugs in at the bottom, that one actually sucks.
by InsideOutSanta
6/26/2026 at 11:03:11 AM
... and ensuring the entire UI did not require right click to function. Everything was visible to click.
The usability of iPhones and iPads is a great example of how he was right. They're very easy to use and no functionality was hidden in a right click menu: it had to be visible somewhere.
Right click was still always available as a shortcut for advanced users.
by usef-
6/26/2026 at 12:43:49 PM
Yeah, I think people who didn't use Macs at the time misunderstand the whole "second mouse button"/"context menu" thing. If you were on Windows, you literally couldn't use the computer without context menus. But Mac OS at the time was designed such that every action the user could access was visible in the regular UI, either through a button or through the menubar.
When the context menu was introduced, it was initially designed as a shortcut to actions that were already available elsewhere in the UI.
by InsideOutSanta
6/26/2026 at 5:57:42 PM
By the time the context menu was ubiquitous, their mice still did not have two buttons.
Just because the feature is available somewhere else in the UI doesn't mean that the shortcut for it must be a two-handed one.
by vkou
6/26/2026 at 9:11:14 PM
I've been able to plug in any two-buttoned mouse for the 22 years since I first used a Mac. Their own trackpads and mice allow two finger tap to be enabled for advanced users (but on a laptop one finger can press ctrl while the other taps). I don't know how far back you're talking about when you imply no support for them.
But I remember noticing years ago a large room of tech professionals and 100% of the Windows users had mice plugged into their laptops, and zero percent of Mac users did. It was a failure of the Windows ecosystem that people needed those imho.
by usef-
6/26/2026 at 10:05:58 PM
This is only because IMHO, the trackpad is something you can "live with" (edit: on a Mac) temporarily. It beats carrying a mouse around. Having said that, I know a UX designer that only uses the trackpad. Boggles my mind.
by sscaryterry
6/26/2026 at 11:16:27 PM
It's not temporary: Mac trackpads are precise and the multitouch gestures are integrated well with the system. Mice don't support them.
by usef-
6/26/2026 at 10:19:31 PM
> I don't know how far back you're talking about when you imply no support for them.
It's not 'no support', it was an insane default. For all the talk of 'easy to use', there's a reason context menus exist. You can't just cram every context-specific interaction into an omnibar or a leftclick. Non-trivial software is complicated. Adding that friction to its use does nobody any favours.
Yes, in the decades since... Trackpads have gotten a lot better, but at the time Jobs was pushing for that nonsense, they simply weren't good enough. (And didn't exist at all for non-laptop computers.)
by vkou
6/26/2026 at 11:25:52 PM
Defaults are for the normal consumer, non-trivial software is not, I think? What's something you think must only exist in a context menu?
Note that in non-trivial or professional software it's typical to have a hand on the keyboard, because not even a second mouse button is enough. Hold 'q' while dragging to adjust exposure in Capture One, etc. Or they have dedicated input hardware like mixing consoles. Or they plug in a speciality mouse.
by usef-
6/27/2026 at 3:58:02 AM
All software is non-trivial. You weren't buying a $3000 computer in 1999 to only use 'trivial' software.
by vkou
6/27/2026 at 7:02:19 AM
Ok. So what's an example of something that should only exist in a right click context menu, for the average consumer?
by usef-
6/25/2026 at 5:23:48 PM
Wasn’t it introduced on Tahoe? (Perhaps my memory is failing me here.) Do you mean it still persists on Golden Gate? They seem to have addressed the majority of issues I heard about - unless you mean the issue is that rounded corners exist at all.
by computomatic
6/25/2026 at 5:30:35 PM
See: https://medium.com/@makalin/reclaiming-the-screen-a-develope...
by sscaryterry
6/25/2026 at 10:41:24 PM
AI written article with two fullscreen popups?
by trumpdong
6/25/2026 at 9:59:36 PM
Apple lost all credibility in UI around the time they introduced colorful vomit instead of app icons.
by koiueo
6/25/2026 at 1:46:49 PM
Technically, AI is code, just very complex code.
I'd say there are "simple" things you can do though, like take automated screenshots and detect colours for jarring colour schemes.
by Chris2048
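The screenshot check suggested above could be sketched roughly as follows. This is only a sketch under stated assumptions: "jarring" is swapped for one measurable proxy, the WCAG contrast ratio between a foreground and background colour, and the colour pairs are assumed to have already been sampled from a screenshot by some other tool. The function names and the 4.5 threshold default are illustrative choices, not anything from the comment.

```python
from typing import Iterable, List, Tuple

RGB = Tuple[int, int, int]


def _linear(channel: int) -> float:
    """Convert one 0-255 sRGB channel to linear light (WCAG 2.x formula)."""
    s = channel / 255.0
    return s / 12.92 if s <= 0.03928 else ((s + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: RGB) -> float:
    """WCAG relative luminance of an sRGB colour, between 0.0 and 1.0."""
    r, g, b = (_linear(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(a: RGB, b: RGB) -> float:
    """WCAG contrast ratio, from 1.0 (identical) up to 21.0 (black on white)."""
    la, lb = relative_luminance(a), relative_luminance(b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


def low_contrast_pairs(
    pairs: Iterable[Tuple[RGB, RGB]], minimum: float = 4.5
) -> List[Tuple[RGB, RGB]]:
    """Return the (foreground, background) pairs that fall below `minimum`."""
    return [(fg, bg) for fg, bg in pairs if contrast_ratio(fg, bg) < minimum]


if __name__ == "__main__":
    # Hypothetical pairs, as if sampled from a screenshot.
    sampled = [((0, 0, 0), (255, 255, 255)), ((200, 200, 200), (255, 255, 255))]
    for fg, bg in low_contrast_pairs(sampled):
        print("low contrast:", fg, "on", bg)
```

A check like this catches unreadable combinations, which is narrower than "bad taste"; it says nothing about whether a readable palette is pleasant.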
6/26/2026 at 7:25:33 PM
Must have hit some nerves.
by Chris2048
6/25/2026 at 4:17:01 PM
You absolutely cannot unit test for taste.
I had this experience doing a port from BigQuery to Postgres using Opus. I had unit tests to guarantee parity with the original code, and Opus insisted on building this bespoke query builder (e.g. `def _where(very_complicated_params)`) on top of sqlglot.
Even with the original code being straightforward and legible and repeated instructions to match, I had to fight with it to get close.
In the end, I ended up doing things the "old-fashioned way", where I copied chunks of code into Claude proper and gave explicit instructions for each piece.
I clearly had externalized the requirements, and yet that wasn't sufficient. The only way to unit test further would be to use an AST to evaluate the output against metrics I couldn't even encode.
by fny
6/25/2026 at 5:51:23 PM
The bigger problem I have as a worker is that, once I externalize it (by writing a skill or whatever), it becomes a work-for-hire whose copyright is owned by my employer. Technically this is true of a few other things I do for work, like my .emacs and .bashrc files, small scripts I keep in ~/bin on my workstation, etc., but no employer cares to assert this unless they're being assholes for some unrelated reason. Agent skill files, especially ones that seem to semi-reliably do what they say on the tin (the white whale!), are not like that at all, and I can see them pursuing you if you try to use them at a future employer.
by ElevenLathe
6/25/2026 at 11:51:21 PM
This is a solid point, and the only answer to it I can think of is that execution is 99x harder than ideas. Even if you enumerate everything, someone else trying to use it is still going to muck it up.
by hammock
6/25/2026 at 2:16:15 PM
What's kind of funny is this is how I implemented "gates" for the ticketing system I built for Claude, because Beads would just close tickets without validation. I have tickets that are literally "Human validation" tier, so it will work on the next available thing until I personally tell the model to close it. So, in that spirit, yeah, you can unit test for taste, if you implement external validation.
Unit test runs, waits for human input before passing or failing, which might seem out of the norm, but we already have QA do manual testing.
by giancarlostoro
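A gate of the kind described above might look something like this. It is a minimal sketch, not the commenter's ticketing system: the ticket shape, the "human-validation" tier string, and the function names are all hypothetical, and the prompt function is injectable so the gate can be exercised without a person at the keyboard.

```python
from typing import Callable, Dict


def human_gate(description: str, ask: Callable[[str], str] = input) -> bool:
    """Block until a human answers; only an explicit yes counts as approval."""
    answer = ask(f"Approve '{description}'? [y/N] ").strip().lower()
    return answer in {"y", "yes"}


def close_ticket(ticket: Dict[str, str], ask: Callable[[str], str] = input) -> Dict[str, str]:
    """Close a ticket, unless its (hypothetical) tier demands human sign-off.

    Tickets in the "human-validation" tier stay open until a person approves,
    so an agent cannot mark its own work as done.
    """
    needs_human = ticket.get("tier") == "human-validation"
    if needs_human and not human_gate(ticket["title"], ask):
        return {**ticket, "status": "open"}
    return {**ticket, "status": "closed"}


if __name__ == "__main__":
    ticket = {"title": "Redesign settings page", "tier": "human-validation"}
    print(close_ticket(ticket)["status"])
```

Defaulting to "no" on anything other than an explicit yes keeps a stray Enter key from approving work nobody looked at.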
6/25/2026 at 10:03:29 PM
>You can't unit test for taste if you haven't written down what you mean by taste. If you can externalize it, then you can.
If you can externalize it, you only captured the small part of taste that can be externalized in concrete rules.
You can of course pretend anything else doesn't exist, like a person denying anything that can't be measured by their instruments.
by coldtea
6/25/2026 at 2:43:26 PM
Randomized trial. Half of them pledge to use AI freely and liberally, half of them to never use it, compare via surveys and off-AI tests after X months. Could even flip it so then the non-users used it for X months and vice versa, see if losses/gains are stable.
by Dumblydorr
6/26/2026 at 3:09:34 PM
Gets into the Hard Problem of Consciousness vs AGI which is a discussion that needs to happen in AI.
Subjective "taste" and "feel" are experiences one has, rather than language one predicts out. Language is only produced to report on the experience, like "Wow, that's an ugly couch".
A vision model doesn't model how it experiences or feels (internally) about the image, just objective information about features of the image itself (external).
There are layers to aesthetics - part of it is functionality, utility, the environment vs your needs, but a big part of your style is directly related to your personality, memories, experiences, and how you physically fit with it. It's not correct/incorrect, it's optimizing for the entire circumstance, internal and external.
It can be hard to find the words to explain why an aesthetic works, or feels right (or wrong). What's even more important is when another person agrees. When you can have cohorts, trends, cliques, and hype.
AI can't do any of these inter/intra social activities, and so, like other acts of creation it can never operate at the cutting edge the way a human mind can. But with better and better vision models paired with good language models, synthetic subjectivity will do the job soon enough for most intents and purposes.
by playorizaya
6/25/2026 at 2:31:56 PM
I remember reading an interview with a fireman who described a time when his buddy evacuated a team because he "felt" that a floor would collapse imminently.
He couldn't articulate why but they trusted his gut and it did collapse.
A lot of software engineering relies on that kind of intuition and on a good team you can integrate it and benefit from it and avoid all manner of floor collapses.
by pydry
6/25/2026 at 2:57:06 PM
To play devil’s advocate, intuition is still a physical response to stimuli mixed with knowledge of past experience. Hypothetically it could be modeled; the problem here comes down to how to encode it.
by dyarosla
6/25/2026 at 3:13:44 PM
"Encoding" implies some GOFAI symbolic formal rule machinery.I'd argue that transformers are a pretty good indication that intelligence isn't "encodable" in the way we think it means. Usually, most "model" vocabulary means that we can explain and constrain the "data" from the "rules". Except the mere "data" is trillions of interacting weights.
That may be encoding in a physical sense, but that still doesn't explain the intuition in any legible way to humans.
Cynically, we've been able to encode everything already by just saying everything's a transition in a huge lookup table. Not very informative though.
by sigbottle
6/25/2026 at 2:29:48 PM
> You can't unit test for taste if you haven't written down what you mean by taste. If you can externalize it, then you can.
I'm not so sure. For instance, you can write down what it means for a program to be free of XSS and other injection vulnerabilities. Now, how would you unit test for that property?
by tmoertel
6/25/2026 at 2:02:15 PM
You may be able to effectively externalize taste by "hot or not" style pair testing. Enough comparisons and I'd expect ML to be able to mimic human taste by latching on to features we're not well aware of influencing us.
by delichon
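To make the pair-testing idea concrete, here is a rough sketch of turning "A beat B" judgements into a ranking. It uses a plain Elo-style update, which is a technique swapped in for illustration; the comment does not name a method, and a learned model that generalizes to unseen items would need features and training on top of this. The starting rating, the `k` factor, and all names are assumptions.

```python
from typing import Dict, Iterable, List, Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Elo-style probability that A is preferred over B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def rank_from_pairs(
    comparisons: Iterable[Tuple[str, str]], k: float = 32.0, start: float = 1000.0
) -> List[str]:
    """Rank items from (winner, loser) pairs, most preferred first."""
    ratings: Dict[str, float] = {}
    for winner, loser in comparisons:
        rw = ratings.setdefault(winner, start)
        rl = ratings.setdefault(loser, start)
        # The more surprising the result, the bigger the adjustment.
        delta = k * (1.0 - expected_score(rw, rl))
        ratings[winner] = rw + delta
        ratings[loser] = rl - delta
    return sorted(ratings, key=lambda item: ratings[item], reverse=True)


if __name__ == "__main__":
    # Hypothetical judgements: each tuple is (preferred design, rejected design).
    votes = [("a", "b"), ("a", "c"), ("b", "c")]
    print(rank_from_pairs(votes))
```

The ranking only reflects whoever cast the votes, so pooling many people's comparisons yields an averaged preference rather than any one person's taste.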
6/25/2026 at 2:08:06 PM
This is RL, right? Like, this is exactly why models have mostly converged around obvious style, because we train them literally on thumbs-up/thumbs-down data of what good behavior and good code looks like.
And that's why it's so hard to get a model to reproduce the specific taste of a person or an organization. My taste is different than yours, so if we dump our aggregate preferences into RL, it averages out to nothing interesting.
For the code-writing case, this means you end up reviewing every line of code, looking for places where you'd thumbs-down the code. Not every line of code contains a real decision, though, so it feels like a waste of time.
by trjordan
6/25/2026 at 2:13:18 PM
This is, in short, the big current problem with AI.
LLMs are built for scale so they've given up on the kind of online learning / "long term memory" processes that would individualize them.
The LLM is permanently locked to being a really cracked engineer on their first day at your company, looking at your codebase for the first time.
You can scaffold a bit with .md files, but at the moment they lack the ability to do what humans do: go to sleep, encode things from short to long term memory, and wake up the next day with more specific knowledge baked in.
by paytonjjones
6/25/2026 at 2:19:42 PM
100%. The problem with them isn't making sure they're doing the right thing, it's making sure they're not making bad assumptions.
IMHO this is where code review goes until we fix the individualized model thing: you need to review the decisions the agent made, where you didn't steer. Most will be right. A few will be disastrously wrong. But decision-by-decision is a lot less to review than line-by-line of code.
by trjordan
6/26/2026 at 4:38:04 PM
how are you getting some reviewable artifact with the decisions in it?
by monknomo
6/25/2026 at 3:29:17 PM
Yea, individual learning is super expensive at this point, and scale is the only way to pay for training. Maybe at some point in the future we'll get this.
by pixl97
6/25/2026 at 2:19:42 PM
> LLMs are built for scale so they've given up on the kind of online learning / "long term memory" processes that would individualize them.
I wonder if this is even desirable from a product perspective. You probably don't want online learning in a product that you are selling because you can't guarantee a consistent quality of the product.
by plastic-enjoyer
6/25/2026 at 2:38:58 PM
You could say the same thing about employees!
And to be fair, the ability to fire employees and hire new ones is pretty important for that reason. In cases where you can't easily fire employees (e.g. unions), you encounter the very problem you're describing, and it often leads to companies preferring more consistent automations.
by paytonjjones
6/25/2026 at 3:27:34 PM
It’s supervised learning rather than RL, you’re just training to labels. It doesn’t work (doesn’t generalize) because there is no guarantee or even expectation that any causal relationship is learned, it’s just whatever convenient pattern gets the lowest loss. There is lots of research on this for those unaware.
by andy99
6/25/2026 at 3:13:00 PM
Yes and no.
If I were to ask you - what convention you want to follow for your database columns - camelcase or snakecase? There's no correct global answer. There's no overarching truth that should apply to all databases in existence (even if you'll focus on a certain type of database). Hence the no.
But yes, because in the context of existing system there is a convention. If it's snakecase, you create new tables with snakecase column names.
LLMs will generally follow conventions, but sometimes they will not, because indeed - global truths (or at least, the "last article it read" truths) sometimes win over (I assume).
by eithed
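The "follow whatever the existing system does" rule is one of the few parts of this that can be checked mechanically. A minimal sketch, assuming the existing and proposed column names are already available as plain lists of strings; the regexes, the "either" bucket for single-word names, and the function names are illustrative assumptions:

```python
import re
from collections import Counter
from typing import List, Optional

PLAIN = re.compile(r"[a-z][a-z0-9]*")                    # "email": fits either style
SNAKE = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)+")       # "created_at"
CAMEL = re.compile(r"[a-z][a-z0-9]*([A-Z][a-z0-9]*)+")   # "createdAt"


def classify(name: str) -> str:
    """Label a column name as 'either', 'snake', 'camel', or 'other'."""
    if PLAIN.fullmatch(name):
        return "either"
    if SNAKE.fullmatch(name):
        return "snake"
    if CAMEL.fullmatch(name):
        return "camel"
    return "other"


def dominant_convention(existing: List[str]) -> Optional[str]:
    """The convention most existing columns use, or None if there is no signal."""
    counts = Counter(c for c in map(classify, existing) if c in ("snake", "camel"))
    return counts.most_common(1)[0][0] if counts else None


def violations(existing: List[str], proposed: List[str]) -> List[str]:
    """Proposed column names that break the schema's existing convention."""
    house_style = dominant_convention(existing)
    if house_style is None:
        return []
    return [n for n in proposed if classify(n) not in (house_style, "either")]


if __name__ == "__main__":
    current = ["user_id", "created_at", "email"]
    print(violations(current, ["lastLogin", "updated_at", "name"]))
```

The check derives the rule from the existing schema rather than hard-coding one, which matches the point that there is no globally correct answer, only a local one.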
6/25/2026 at 2:57:56 PM
Wouldn't this style of training suffer from the AI learning things the user didn't intend? I may thumbs down something for a specific detail I don't like, while other things in it are great. Certain traits that tend to occur together go along for the ride. We see similar things happen in natural selection, where mates may be chosen for 1 specific feature, and other less desirable things come along for the ride.
Outside of AI, I run into this issue when taking basic personality tests. A question may be written for a specific reason, which influences the results, but the reason for my answer may be completely unrelated to the reason intended by the person who made the test.
by al_borland
6/25/2026 at 3:16:46 PM
This can usually be solved by scale alone (in all three contexts: RL, evolution, and IRT / psychometric testing).
The co-occurrence thing is often not a bug of the algorithm but a genuine part of the stochastic landscape that must be solved. Evolution isn't "failing" when sickle cell vulnerability is ported along with malaria resistance; it's just a real tradeoff being made in the current biological landscape.
by paytonjjones
6/26/2026 at 9:21:44 AM
This problem predates AI. If we could externalize a thing as fickle as good taste, it wouldn't be such a valuable skill. And God, have people tried. Golden ratios, style guides, naming rules, linters, formatters, templates, margin ratios, color palettes, and on and on and on. And yet here we are.
We can quantize some of the basics, and make a not half bad style guide, but we'll never be able to fully actualize a set of rules to match what humans find generally tasteful. It's too contextual and a moving target.
by xboxnolifes
6/25/2026 at 7:00:01 PM
" I'm trying to figure out if there's something to be done about this."Yes, it is called accepting the concept of "good enough".
If you go for perfection, with the help of AI or not - you will never be done, at least not if your concept of perfect is like mine.
And more concretely here, well you can feed the LLM with enough context about you, so it can better guess what you want. And in some years maybe use a brain computer interface. But I doubt there is a magic bullet here. Just better tools, that we can build. But they won't be perfect either (hard for me to write that, as I set out building the perfect tools).
by lukan
6/25/2026 at 3:03:37 PM
I agree and indeed externalize everything you know *that matters*.
Want to follow a certain pattern or convention? Define it, e.g. active record vs repository pattern, and stick it in an ADR! You don't know what you want? Look at what Claude produces and then acquire taste, mark this as a convention that future sessions will follow, but stick to *one* convention!
Treat your LLMs as junior developers willing to apply various patterns willy nilly, caring only about fulfilling the ACs of given task and not about the longevity or well being of the system in general. They will not look at bigger picture to check if given pattern applies globally, or even if there are any other patterns.
by eithed
6/26/2026 at 9:08:44 AM
> You can't unit test for taste if you haven't written down what you mean by taste. If you can externalize it, then you can.
> Follow this line of thinking, and the AI-friendly answer is easy: we just have to externalize everything we know, so Claude can implement what I want.
First you have to have it, and if you think this is a tasteful solution, then you didn't.
by tripzilch
6/25/2026 at 9:09:36 PM
Pattern language sites / books have existed for years.
The right approach is more to work out what the shared patterns are, make sure a bunch of reasonable ones are post-trained into the models so that it's easy to refer to them by name (e.g. "tim pope / chris beams style commit messages", or "make invalid state unrepresentable"), and then you're in a world where you can define your personal taste through labels rather than repetition of the core arguments.
by joshka
6/25/2026 at 10:26:39 PM
But pattern languages don't encode taste, they encode known working solutions. Making invalid state unrepresentable is not a matter of taste, it's a best practice.
by RossBencina
6/25/2026 at 3:07:34 PM
You cannot externalize taste. You could perhaps mimic someone’s taste, but that’s not the taste. Knowing the taste requires actually tasting it. You can’t capture the taste, it’s already gone.
by deadbabe
6/25/2026 at 3:37:55 PM
[dead]
by cindyllm
6/25/2026 at 6:35:34 PM
You can externalize the things you consider as taste by writing down generalized statements, but those statements also need their boundary conditions and exceptions specified. Except, exceptions have exceptions, and when to apply the rule vs when to use the exception is a contextual judgement. So whatever residual cannot be explicitly, unambiguously, and generally spelled out, we call taste/judgement.
by vinay_ys
6/25/2026 at 2:40:45 PM
If you have enough examples you can train an AI on your preferences, then use that distilled AI as a unit test. Don’t combine multiple into one AI. If they don’t agree you want it to fail so you can decide and retrain the tests.
by punnerud
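The "keep the judges separate and fail on disagreement" idea can be sketched without committing to any particular model. In the sketch below each judge is just a callable returning True or False; in practice each would wrap a separately trained preference model, which is assumed here and not shown. The verdict labels and function name are hypothetical.

```python
from typing import Callable, Iterable

Judge = Callable[[str], bool]


def taste_verdict(sample: str, judges: Iterable[Judge]) -> str:
    """Run every judge independently and refuse to average their answers.

    Returns "pass" or "fail" only when all judges agree; any split is
    surfaced as "needs-human" so a person can decide and retrain.
    """
    votes = {judge(sample) for judge in judges}
    if not votes:
        raise ValueError("at least one judge is required")
    if len(votes) > 1:
        return "needs-human"
    return "pass" if votes.pop() else "fail"


if __name__ == "__main__":
    # Stand-in judges; real ones would be trained on someone's examples.
    def likes_short(code: str) -> bool:
        return len(code) < 80

    def no_bare_except(code: str) -> bool:
        return "except:" not in code

    print(taste_verdict("try:\n    run()\nexcept:\n    pass", [likes_short, no_bare_except]))
```

Treating disagreement as its own outcome, rather than a majority vote, keeps the conflicting cases visible instead of blending them into one opaque score.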
6/25/2026 at 4:07:32 PM
Is there an issue of taste when generating images with AI? Or can we relatively rapidly train people to generate beautiful images with a decent amount of variety?
by petra
6/25/2026 at 4:09:46 PM
ai generated images and art still seem to look cheap or untasteful to a lot of viewers, so it can't be that easy to train people on fixing that.
by nemomarx
6/25/2026 at 2:05:43 PM
Exactly. Every single philosophical statement in history runs up against the issue where you can just say, "yeah, it's pretty much this. You just need to do <arbitrarily hard unspecified thing that is basically unfalsifiability>". (Including this one)
And maybe that's just our limits with philosophy, modeling, assumptions, whatever. The danger is not realizing when we're in that zone.
(Fwiw I think unfalsifiability is a limit with any system - "you didn't compile in my syntax/semantics" is a gotcha that's actually valid and useful, but nobody can really determine the hard line)
by sigbottle
6/26/2026 at 12:02:16 AM
Emailed!
by cadamsdotcom
6/26/2026 at 5:51:06 AM
[flagged]
by zenpe
6/25/2026 at 3:13:57 PM
[flagged]
by jimmypk