3/31/2025 at 3:08:44 PM
Yeah, the "book a flight" agent thing is a running joke now - it was a punchline in the Swyx keynote for the recent AI Engineer event in NYC: https://www.latent.space/p/agentI think this piece is underestimating the difficulty involved here though. If only it was as easy as "just pick a single task and make the agent really good at that"!
The problem is that if your UI involves human beings typing or talking to you in a human language, there is an unbounded set of ways things could go wrong. You can't test against every possible variant of what they might say. Humans are bad at clearly expressing things, but even worse is the challenge of ensuring they have a concrete, accurate mental model of what the software can and cannot do.
by simonw
3/31/2025 at 9:11:13 PM
> The problem is that if your UI involves human beings typing or talking to you in a human language, there is an unbounded set of ways things could go wrong. You can't test against every possible variant of what they might say.

It's almost like we really might benefit from using the advances in AI for stuff like speech recognition to build concrete interfaces with specific predefined vocabularies and a local-first UX. But stuff like that undermines a cloud-based service, a constantly changing interface, and the opportunities for general spying and manufacturing "engagement" while people struggle to use the stuff you've made. And of course, producing actual specifications means that you would have to own bugs. Besides eliminating employees, much of the interest in AI is about completely eliminating responsibility. As a user of ML-based monitoring products and such for years, I've found that "intelligence" usually implies no real specifications, no specifications implies no bugs, and no bugs implies rent-seeking behaviour without the burden of any actual responsibilities.
It's frustrating to see how often even technologists buy the story that "users don't want/need concrete specifications" or that "users aren't smart enough to deal with concrete interfaces". It's a trick.
by photonthug
3/31/2025 at 11:23:49 PM
> concrete interfaces with specific predefined vocabularies and a local-first UX

An app? We don't even need to put AI in it; turns out you can book flights without one.
by freeone3000
4/1/2025 at 2:21:34 AM
I see the AI push as a turnkey WALL-E future.
by cyanydeez
4/1/2025 at 12:22:05 AM
Tech won't freeze in place exactly where it is today, even if some people want that, and even if in some cases it actually would make sense. And if you advocate for this, I think you risk losing credibility. Especially amongst technologists, it's better to think critically about structural problems with the trends and trajectories. AI is fine, change is fine; the question now is really more like why, what for, and in the interest of whom. To the extent models work locally, we'll be empowered in the end.

Similarly, software eating the world was actually pretty much fine, but SaaS is/was a bit of a trap. And anyone who thought SaaS was bad should be terrified about the moats and platform lock-in that billion-dollar models might mean, the enshittification that inevitably follows market dominance, etc.
Honestly we kinda need a new Stallman for the brave new world, someone who is relentlessly beating the drum on this stuff even if they come across as anticorporate and extreme. An extremist might get traction, but a call to preserve things as they are probably cannot / should not.
by photonthug
4/1/2025 at 2:49:34 PM
> And if you advocate for this, I think you risk losing credibility

It's a shame if new interface = credible by default. Look at all the car manufacturers (well, some; probably not enough) finally conceding, after many years, that the change to touch interfaces "because new" was a terrible idea, when the right old tool for the job was simply better... and obvious to end-users very quickly.
by PKop
4/1/2025 at 10:42:23 PM
Again, in that case the newness of different tech isn't actually the real problem, and it feels like the wrong critique. What's problematic is trajectory and intent, with things like planned obsolescence, subscriptions, and ongoing costs in repairs after the initial sale. I'd say that a new interface is barely even an issue compared with that... although FWIW, yes, I prefer buttons rather than touch screens.
by photonthug
4/2/2025 at 1:29:00 AM
> the newness of different tech isn't actually the real problem and feels like the wrong critique

I'm not equating new = bad. I'm saying new = good is wrong. And based on your last sentence, you do think car manufacturers all switching over to all-touch controls was a problem. Almost everyone prefers buttons to touch screens; that's my point. The better, more popular option was rejected because of a false premise, or false belief.
by PKop
4/1/2025 at 4:06:05 AM
If you believe in this to that extent, why can't you be the "new Stallman"?
by MichaelZuo
4/1/2025 at 6:29:52 AM
It's not about what I believe, it's about what we already know. Computing is old enough now that you don't need to be some kind of mad prophet to know things about the future, because you can just look at how things have played out already.

More to the point, though: at the beginning at least, Stallman was a respected hacker, not just some random person pushing politics on a community he was barely involved with. It's got to be that way, I think; anyone who's not a respected AI/ML insider won't get far.
by photonthug
4/1/2025 at 4:18:00 PM
If you are a random outsider, then how do you know there is the room and potential for such an individual?
by MichaelZuo
4/1/2025 at 10:48:58 PM
I remember you now, and I would block you if I could. On the off chance you're not doing this on purpose, read this please: https://en.m.wikipedia.org/wiki/Sealioning
by photonthug
4/2/2025 at 2:21:49 AM
Regardless of what you believe, you still need to write the actual claim/argument down. You don't have any more credibility than most other HN users, so just stating insinuations as if they were self-evident doesn't even make sense.
by MichaelZuo
4/1/2025 at 12:54:42 PM
I am worried about a more modest enshittification. I am already starting to encounter models that are just plain out of date in non-obvious ways. It has the same feeling as trying to explain to someone over the phone how to troubleshoot a version of Windows from two releases ago (e.g., in Vista this was slightly different).
by xemdetia
4/1/2025 at 7:12:39 AM
> for general spying and manufacturing "engagement"

"Oh, there's one tiny feature that management is really, really interested in: make the AI gently upsell the user on a higher tier of subscription if an opportunity presents itself."
by Terr_
4/1/2025 at 3:11:53 PM
With today's models that means it will pitch the upsell every three sentences. They're happy to comply.
by genewitch
3/31/2025 at 3:26:42 PM
Perhaps the solution(s) need to focus less on output quality and more on having a solid process for dealing with errors. Think undo, containers, git, CRDTs, or whatever, rather than zero tolerance for errors. That probably also means some kind of review for the irreversible bits of any process, and perhaps even process changes where possible to make common processes more reversible (which sounds like an extreme challenge in some cases).

I can't imagine we're anywhere even close to the kind of perfection required not to need something like this, if it's even possible. Humans use all kinds of review and audit processes precisely because perfection is rarely attainable, and that might be fundamental.
by emn13
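[Editor's note] The reversible-process idea above can be made concrete with a small sketch: every reversible step records its own undo, and any step without an undo is held for explicit human review before it runs. All names here are illustrative, not from any real framework.

```python
# Sketch: an action log where each reversible step carries its own undo,
# and irreversible steps are queued for explicit review before running.

class ActionLog:
    def __init__(self):
        self.undo_stack = []       # (description, undo_fn) pairs
        self.pending_review = []   # irreversible steps awaiting approval

    def run(self, description, do_fn, undo_fn=None):
        """Run a step now if it is reversible; otherwise queue it for review."""
        if undo_fn is None:
            self.pending_review.append((description, do_fn))
            return None
        result = do_fn()
        self.undo_stack.append((description, undo_fn))
        return result

    def undo_all(self):
        """Roll back the reversible steps in reverse order."""
        while self.undo_stack:
            _, undo_fn = self.undo_stack.pop()
            undo_fn()

    def approve_pending(self):
        """A human reviewer signs off on the irreversible steps."""
        for _, do_fn in self.pending_review:
            do_fn()
        self.pending_review.clear()


# Example: holding a seat is reversible, charging a card is not.
state = {"booked": False, "paid": False}
log = ActionLog()
log.run("hold seat",
        lambda: state.update(booked=True),
        lambda: state.update(booked=False))
log.run("charge card", lambda: state.update(paid=True))  # no undo -> review
assert state["booked"] and not state["paid"]  # payment waits for approval
log.approve_pending()
```

The point is not the specific API but the shape: an agent can act freely inside the undoable region, while the irreversible bits get the review step the comment calls for.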
3/31/2025 at 4:30:00 PM
The biggest issue I've seen is "context window poisoning", for lack of a better term. If it screws something up, it's highly prone to repeating that mistake. It then makes a bad fix that propagates two more errors, then says, "Sure! Let me address that," repeating the cycle to poorly fix those rather than the underlying issue (say, by restructuring code to mitigate it).

It is almost impossible to produce a useful result, as far as I've seen, unless one eliminates that mistake from the context window.
by _bin_
3/31/2025 at 4:55:44 PM
I really really wish that LLMs had an "eject" function - as in I could click on any message in a chat, and it would basically start a new clone chat with the current chat's thread history.There are so many times where I get to a point where the conversation is finally flowing in the way that I want and I would love to "fork" into several directions from that one specific part of the conversation.
Instead I have to rely on a prompt that requests the LLM to compress the entire conversation into a non-prose format that attempts to be as semantically lossless as possible; this sadly never works as intended.
by instakill
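[Editor's note] The "fork" feature being asked for here is simple to model: a chat is just a list of messages, and forking copies the history up to a chosen point so branches can diverge independently. A minimal sketch (not any vendor's actual API):

```python
# Sketch: fork a chat at a given message, keeping the history up to and
# including that message as the start of a new, independent thread.

def fork_chat(messages, at_index):
    """Return a new thread sharing history up to messages[at_index]."""
    if not 0 <= at_index < len(messages):
        raise IndexError("no such message to fork from")
    return list(messages[:at_index + 1])  # a copy, so threads diverge freely

chat = [
    {"role": "user", "content": "Plan a trip"},
    {"role": "assistant", "content": "Where to?"},
    {"role": "user", "content": "Tokyo"},
]
branch = fork_chat(chat, 1)
branch.append({"role": "user", "content": "Osaka"})
assert chat[2]["content"] == "Tokyo" and branch[2]["content"] == "Osaka"
```

This is essentially what the "edit" buttons discussed below do under the hood: resend a copied prefix of the context rather than mutate the original thread.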
3/31/2025 at 7:51:22 PM
This is precisely what the poorly named Edit button does in Claude.
by mvdtnz
4/1/2025 at 3:16:16 PM
LM Studio has a fork button on every chat part (sorry, can't think of a better word): you can fork on any human or AI part. You can also edit, but editing isn't really editing; it essentially creates a copy of the context with the edit and sends the whole thing to the AI. This can overflow your context window, so it isn't recommended. Forking of course does the same thing, but it is obvious that it is doing so, whereas people are surprised to learn that editing sends everything.
by genewitch
3/31/2025 at 5:52:20 PM
Google's UI supports branching and deleting; someone recently made a blog post about how great it is.
by tough
3/31/2025 at 7:47:16 PM
Which Google UI?
by marlott
4/1/2025 at 3:40:03 AM
ai.dev (AI Studio), sorry
by tough
3/31/2025 at 5:50:13 PM
You can use LibreChat, which allows you to fork messages: https://www.librechat.ai/docs/features/fork
by theblazehen
4/1/2025 at 2:52:07 PM
"If it screws something up it’s highly prone to repeating that mistake"Certainly true, but coaching it past sometimes helps (not always).
- roll back to the point before the mistake.
- add instructions so as to avoid the same path. "Do not try X. We tried X it does not work as it leads to Y.
- add resources that could aid a misunderstanding (api documentation, library code)
- rerun the request (improve/reword with observed details or insights)
I feel like some of the agentic frameworks are already including some of these heuristics, but a helping hand still can work to your benefit
by PeterStuer
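[Editor's note] Those heuristics can be sketched as one small loop over the conversation history: truncate to just before the bad turn, inject the avoidance note and reference material, then rerun. The message format and the stand-in "model" below are illustrative only, not a real client.

```python
# Sketch: recover from a poisoned context by truncating the history to
# just before the bad turn, then retrying with explicit guardrails.

def retry_after_mistake(history, bad_turn_index, avoid_note, resources, call_model):
    """Truncate, inject corrective context, and rerun the request."""
    clean = history[:bad_turn_index]           # roll back past the mistake
    clean.append({"role": "user",
                  "content": avoid_note})      # "Do not try X; X leads to Y."
    for doc in resources:                      # API docs, library code, etc.
        clean.append({"role": "user", "content": f"Reference: {doc}"})
    clean.append(call_model(clean))            # rerun with the repaired context
    return clean

# Stand-in "model" that just reports how many turns it saw.
fake_model = lambda msgs: {"role": "assistant",
                           "content": f"seen {len(msgs)} turns"}
history = [{"role": "user", "content": "fix the bug"},
           {"role": "assistant", "content": "broken fix"}]  # the mistake
repaired = retry_after_mistake(history, 1, "Do not try X; it leads to Y.",
                               ["api docs"], fake_model)
assert all("broken fix" not in m["content"] for m in repaired)
```

The essential move, per the comments above, is that the mistaken turn never reaches the model again.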
3/31/2025 at 5:28:15 PM
I think this is one of the core issues people have when trying to program with them. If you have a long conversation with a bunch of edits, it will start to get unreliable. I frequently start new chats to get around this, and it seems to work well for me.
by bongodongobob
3/31/2025 at 11:15:49 PM
Yes, this definitely helps. It's just incredibly annoying because you have to dump context back into it, re-type stuff, consolidate stuff from the prior conversation, etc.
by _bin_
4/1/2025 at 3:08:34 AM
Have the AI maintain a document (a local file or in canvas) with project goals, structure, setup instructions, current state, change log, todos, caveats, etc. You might need to remind it to keep it up to date, but I find this approach quite useful.
by dr_kiszonka
3/31/2025 at 8:45:11 PM
This is what I find. If it makes a mistake, trying to get it to fix the mistake is futile, and you can't "teach" it to avoid that mistake in the future.
by donmcronald
4/1/2025 at 12:39:29 PM
It depends. I ran into this a lot with GPT, but less so with Claude.

But then again, I know how it could avoid the mistake, so I point that out; from that point onwards it seems fine (in that chat).
by johnisgood
3/31/2025 at 7:19:35 PM
> Perhaps the solution(s) need to focus less on output quality and more on having a solid process for dealing with errors. Think undo, containers, git, CRDTs

LLMs are supposed to save us from the toils of software engineering, but it looks like we're going to reinvent software engineering to make AI useful.
Problem: Programming languages are too hard.
Solution: AI!
Problem: AI is not reliable, it's hard to specify problems precisely so that it understands what I mean unambiguously.
Solution: Programming languages!
by ModernMech
3/31/2025 at 7:58:07 PM
With pretty much every new technology, society has bent towards the tech too.

When smartphones first popped up, browsing the web on them was a pain. Now pretty much the whole web has phone versions that make it easier*.
*I recognize the folly of stating this on HN.
by Workaccount2
3/31/2025 at 10:35:05 PM
No, it's still a pain.

There are apps that open links in their embedded browser, where ads aren't blocked. So I need to copy the link and open it in my real browser.
by LtWorf
4/1/2025 at 1:42:12 AM
Or my other favorite trap: an embedded browser where I'm not authenticated. Great, now I have to roll the dice about pasting a password into your "trust me, bro"-looking login page, because I cannot see the URL and the autofill is all "nope".
by mdaniel
4/2/2025 at 5:58:56 AM
> LLMs are supposed to save us from the toils of software engineering

Well, cryptocurrency was supposed to save us from the inefficiencies of the centralized banking system.

There's a lesson to be learned here, but alas our society's collective context window is less than five years.
by otabdeveloper4
3/31/2025 at 3:49:48 PM
But, assuming this is a general thing, not just focused on, say, software development: can you make the tooling around creating this easier than defining the process itself? Everyone, loosely speaking, sees the value in test-driven development, but I think that with complex processes, writing the test is often harder than writing the process.
by techpineapple
3/31/2025 at 3:56:05 PM
I want to make a simple solution where data is parsed by a vision model, and "engineer for the unhappy path" is my assumption from the get-go. Changing the prompt or swapping the model is cheap.
by RicoElectrico
3/31/2025 at 8:40:13 PM
Vision models are also faulty, and sometimes all paths are unhappy paths, so there's really no viable solution. Most of the time, swapping the model completely randomizes the problem space (unless you measure every single corner case, it's impossible to tell if everything got better or if some things got worse).
by herval
3/31/2025 at 4:29:32 PM
[dead]
by dfilppi
3/31/2025 at 4:50:24 PM
I'm old enough to remember having to talk to a (human) agent in order to book flights, and can confirm that in my experience, the modern flight booking website is an order of magnitude better UX than talking to someone about your travel plans.
by yujzgzc
3/31/2025 at 5:19:01 PM
That still exists. The last time I did onsite interviews, every single company that wanted to fly me to their office to interview me asked me to talk to a human agent to book flights. But of course the human agent is just a travel agent with no budgetary power, so I ended up calling the agent to inquire about a booking, then calling the recruiter to confirm that the price was acceptable, and then calling the agent back to confirm the booking.

It doesn't have to be this way. Even before the pandemic, I remember some companies simply gave me access to an internal app to choose flights, where the only flights shown were those of the right date, right airport, and right price.
by kccqzy
3/31/2025 at 7:53:55 PM
Yeah, I much prefer using a well-designed self-service system to trying to explain things over the phone.

The only problem with most of the flights I book now is that they're with low-cost airlines and packed with dark patterns designed to push upgrades.

Would an AI salesman be any better, though? At least the website can't actively try to persuade me to upgrade.
by leoedin
4/1/2025 at 2:42:39 PM
An AI agent will likely be worse, in that you would have to actively haggle with it so it doesn't upsell you by default, which IMO is harder than circumventing the dark patterns.

An actually useful agent is something that is totally doable with technologies even from a decade ago, which you by necessity need to host yourself, with a sizeable amount of DIY and duct tape, since it won't be allowed to exist as a hosted product. The purveyor of goods and services cannot bargain with it so that it puts useless junk into your shopping cart on impulse. You cannot really upsell it, all the ad impressions are lost on it, and you cannot phish it with ad buttons that look like the UI of your site. It goes in with the sole purpose of making your bookings/arrangements; it's a quick in-and-out. It is, by its very definition and design, very adversarial to how most companies with Internet presences run things.
by WesolyKubeczek
4/1/2025 at 4:25:49 AM
I think what we'll come to widely realize is that syncing state between two minds (in your example, the travel agent's mind and your mind; more widely, AI agents and their users' minds) is extremely expensive and slow, and it's going to be very hard to make these systems good enough to overcome the super-low latency of keeping a task contained to a single mind, your own, and just doing most stuff yourself. The CPU/GPU dichotomy as a lens for viewing the world is widely applicable, IME.
by toasterlovin
3/31/2025 at 3:36:33 PM
Even in Operator's original demo, the first thing they showed was booking restaurant reservations and ordering groceries. I understand their need to demo something intuitive, but it's still debatable whether these are tasks that most people want delegated to black-box agents.
by serjester
3/31/2025 at 6:52:29 PM
They don't. I have never once in my life wanted to talk to my smart speaker about what I wanted for dinner. Not because a smart speaker is/can be creepy, not because of social anxiety; no, it's just simpler and more straightforward to open DoorDash on my damn phone and look at a list of restaurants nearby to order from. Or browse a list of products on Amazon to buy. Or just call a restaurant to get a reservation. These tasks are trivial.

And like, as a socially anxious millennial, no, I don't particularly like phone calls. However, I also recognize that, setting my discomfort aside, a direct connection to a human being who can help reason out a problem I'm having is not something easily replaced with a chatbot or an AI assistant. It just isn't. Perfect example: I called a place to make a reservation for myself, my wife, and girlfriend (poly, long story) and found the place didn't usually do reservations on the day in question, but the person did ask when we'd be there. As I was talking to a person, I could provide that information immediately and say "if you don't take reservations, don't worry, that's fine," but it was an off-busy hour, so we got one anyway. How does an AI navigate that conversation more efficiently than me?
As a techie person I basically spend the entire day interacting with various software to perform various tasks, work related and otherwise. I cannot overstate: NONE of these interactions, not a single one, is improved one iota by turning it into a conversation, verbal or text-based, with my or someone else's computer. By definition it makes basic tasks take longer, every time, without fail.
by ToucanLoucan
3/31/2025 at 8:06:02 PM
I've more than once been on a road trip and realized that I wanted something to help me find a meal where I'll be sometime in the next two hours. I have no idea what the options are, and I can't find them. All too often I've settled for some generic fast food when I really wanted something local, but I couldn't get maps to tell me, and such things are one street away where I wouldn't see them. (Remember, too, that if I'm driving I can't spend time scrolling through a list; but even when I'm the navigator, the interface I can find in maps isn't good.)
by bluGill
4/1/2025 at 3:18:53 AM
You definitely would not want the existing SEO-enhanced search results. And definitely not the not-too-distant future of SEO-enhanced, AI-poisoned listings where every eating place proudly declares itself "most likely/probably the best burger joint".

We need to go back to a more innocent time when we could ask a select group of friends and their trusted chain of friends for recommendations. Not what social media is today.
by xarope
4/1/2025 at 3:30:27 PM
I don't have friends all over the country, and I submit that, if the adage "150 people" is true, no one has friends "all over the country".I dislike driving through Texas, and so, most road trips involve McDonalds - the only time I eat the junk.
My car's built-in nav is 13 years out of date, so it knows major throughways but not, for instance, that the road I live on has its own interchange with the "highway", and so on, up to restaurants. Phones are unreliable in a lot of the US, and at one point I had a spare phone with all of its storage dedicated to offline Google Maps just so I wouldn't get stuck in the Rockies somewhere.
Microsoft used to sell trip planning software and those were the good old days.
by genewitch
3/31/2025 at 8:27:52 PM
I'm on a road trip across Utah and Colorado right now, and I've been experimenting with both Gemini and OpenAI Deep Research for this kind of thing, with surprisingly decent results. Here's one transcript from this morning: https://chatgpt.com/share/67e9f968-4e88-8006-b672-13381d5e95...
by simonw
4/1/2025 at 6:54:41 AM
I'm curious what the problem is with that task. I'd open Google Maps, find a larger place in the right direction, confirm with directions that it's about two hours away, search for "dinner/lunch/restaurant/Japanese/tacos/..." in the visible area, and choose something highly rated. I've done that lots of times successfully. What part of that fails for you? (As a non-driver, of course.)
by viraptor
4/1/2025 at 12:58:55 PM
The problem is choice. I don't care about Japanese/tacos; either would be fine, but Argentine would be better (I have no idea if it is even a thing, but if it is, I want to try it). I don't want a chain (well, maybe a local chain); I have plenty of McDonald's near my house if I want that. I want something I can't get near home. Maps will put all the big chains that pay for that top spot right at the top, and I need to scroll through them. More than once I've seen something that might be interesting, but then the map scrolls/resizes and I can't find it anymore.
by bluGill
4/1/2025 at 1:07:47 PM
But you're taking as a given that the AI is going to have any better idea than Google Maps, or be subject to less interference from marketing/paid-placement stuff, when like... I'd be willing to bet a small amount of money that it's going to do what you're decrying: it's going to search $localized_area for "restaurant" and, if you're lucky, maybe add -chain to it. What you want here are locals' notions of what's good and not, and while I absolutely respect the shit out of that (and would love it myself!), I don't really know how to facilitate that at scale without immediately caving to the same negative influences that are screwing it up right now.

Like, really what you're wanting is legitimate information not bound to the whims of advertisers and marketers (and again, to be clear, don't we fucking all), but I don't think an LLM is going to do that for you. If it does it now, and that's a load-bearing if, I have a strong feeling that's because this tech, like all tech, is in its infancy. It hasn't yet gotten enough attention from corporations and their slimy marketing divisions, but that's a temporary state of affairs, as it has been for every past tech too. Like, OpenAI just closed another funding round and its valuation is now THREE HUNDRED BILLION. Do you REALLY think they, and by extension/as a result their competitors, are going to be thinking about editorial independence when existing established information institutions already can't?
by ToucanLoucan
4/1/2025 at 7:24:05 AM
Agreed; verbally asking for X might make it easier for Aunt "where's the Any key" Tillie to get a solution, but it doesn't necessarily give a better solution for everyone else.

Or, for that matter, solutions you can trust. Remember the pitch for Amazon Dash buttons, where you press it and it maybe-reorders a product for delivery, instantly and sight-unseen? What if the price changed? What if it's not exactly the same product anymore? Wait, did someone else already press it? Maybe I can get a better deal? Etc.
Actually, that spurs a random thought: Perhaps some of these smart-speaker ordering pitches land differently if someone is in a socioeconomic class where they're already accustomed to such tasks being done competently by human office-assistants, nannies, etc. Their default expectation might be higher, and they won't need to invest time pinching pennies like the rest of us.
by Terr_
4/1/2025 at 3:25:06 PM
Not to detract from your overall message: are there studies that say that millennials have more social anxiety? My wife is 9 months younger than me, and a millennial, whereas I am Gen X. I have no social anxiety at all; she and our kids do. Like, calling people on the phone requires a sit-down and breathing exercises; I'm always the one to "run in to the store", with them not wanting to attend non-concert-related venues that may be crowded.

My parents were way older than boomers, and hers were boomers, so maybe that's it?
by genewitch
3/31/2025 at 5:37:54 PM
It's no different than the old Amazon button thing. I'm not going to automatically pay whatever price Amazon is going to charge to push-button-replenish household goods. Especially in those days, when "The World's Biggest" fence would have pretty wild swings in price.

If I were rich enough to have some bot fly me somewhere, I'd have a real-life minion do it for me.
by Spooky23
3/31/2025 at 4:06:11 PM
Any customer service or tech support rep can tell you that even humans can't always understand what other humans are attempting to say.
by 3p495w3op495
3/31/2025 at 4:18:53 PM
It's so funny when people try to build robots imitating people. I mean, part funny, part the tragedy of the upcoming bust. The irony being, we would have been better off with an interoperable flight-booking API standard which a deterministic headless agent could use to make perfect bookings every single time. There is a reason current user interfaces stem from a scientific discipline once called "Human-Computer Interaction".
by hansmayer
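[Editor's note] The deterministic-agent idea above is worth making concrete: against a typed, standardized schema, a headless client validates every field up front, so nothing can be misheard or misphrased. No such standard exists; the field names and validation rules below are invented purely for illustration.

```python
# Sketch of a client for a hypothetical interoperable flight-booking
# standard: every field is typed and validated before any request is
# sent, so the same input always produces the same booking request.

from dataclasses import dataclass

@dataclass(frozen=True)
class FlightQuery:
    origin: str        # IATA airport code, e.g. "JFK"
    destination: str   # IATA airport code, e.g. "SFO"
    date: str          # ISO 8601 date, e.g. "2025-04-01"

    def validate(self):
        """Reject malformed fields deterministically, before anything runs."""
        if not (len(self.origin) == 3 and self.origin.isalpha()):
            raise ValueError(f"bad origin code: {self.origin!r}")
        if not (len(self.destination) == 3 and self.destination.isalpha()):
            raise ValueError(f"bad destination code: {self.destination!r}")
        if len(self.date.split("-")) != 3:
            raise ValueError(f"bad date: {self.date!r}")
        return self

query = FlightQuery("JFK", "SFO", "2025-04-01").validate()
# A headless agent would now submit `query` to the (hypothetical)
# standard endpoint; there is no natural language anywhere to go wrong.
```

Contrast this with the unbounded input space of a conversational agent, which is the thread's opening complaint.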
3/31/2025 at 4:56:22 PM
It's a business problem, not a tech problem. We don't have the solution you described because half of the air travel industry relies on things not being interoperable. AI is the solution at the limit: one set of companies selling users the ability to show a middle finger to a much wider set of companies; interoperability by literally having a digital human approximation pretending to be the user.
by TeMPOraL
3/31/2025 at 5:03:38 PM
I've been a sentient human for at least the last 15 years of tech advancement. Assuming this stuff actually works, it's only a matter of time before these AI services claw back all that value for themselves and hold users and businesses hostage to one another, just like social media and e-commerce before: https://en.wikipedia.org/wiki/Enshittification

Unless these tools can be run locally, independent of a service provider, we're just trading one boss for another.
by the_snooze
3/31/2025 at 7:01:16 PM
The difference is that social media isn't special because of its hardware, or even its software. People are stuck on Facebook because everyone else is on it. It's network effects. LLMs currently have no network effects. Your friends and family aren't "on" ChatGPT, so why use that over something else?

Once the performance of a local setup is on par with online ones, or good enough, that'll be game over for them.
by polishdude20
by polishdude20
4/1/2025 at 8:51:21 AM
All it takes is for the "omg AI slop!!111" and "would someone think of my copyrights?" crowd to get their way, resulting in a conventional or legal ban on using AI user-agents on the Internet without the express consent of a site/service provider. From there, it will be APIs all over again: much like today, you can't easily pipe your Facebook photo to your OneDrive and make a calendar invite, but you can use (for example) Zapier with Facebook Integration, OneDrive Integration, and Google Calendar Integration. We'll end up with LLM/chatbot companies whose main value is in the exclusive set of integrations they offer.

So true, it's not going to be "I use PolishDude20GPTBook because my family and friends are on it". It's going to be, "I use PolishDude20GPTBook because they have contracts with Gazeta.pl, Onet, TVN24, OLX and Allegro, so I can use it to get local news and find the best-priced products in a convenient way, whereas I can't use TeMPOraLxity for any of that".
Contracts over APIs, again.
As long as the "think of my copyright / AI slop oneoneone" crowd wins. It must not.
by TeMPOraL
4/2/2025 at 8:08:01 AM
The only reason there is an "AI-slop crowd" (as you call it) is that, well... there is a lot of (Gen-)AI slop. If the technology were as miraculous as it has been hyped up to be for several years now, there would be no such crowd. Everyone would just get on board; if a tech just does what it says it does, everyone gets on board. The Internet is a great example of this, as were smartphones after the iPhone moment. There was never an Anti-Internet Crowd; I wonder why that might be?
by hansmayer
4/2/2025 at 9:43:36 PM
> There was never an Anti-Internet-Crowd, I wonder why that might be?

You forgot the dotcom boom? :)
Existence of AI slop has nothing to do with whether the tech itself is exceeding or falling short of its hype. It exists because it's good enough for advertising, the cancer on modern society that metastasizes to every new medium and technology, defiling and destroying everything it touches.
by TeMPOraL
4/1/2025 at 5:08:08 AM
> Unless these tools can be run locally independent of a service provider, we're just trading one boss for another.

Not only that; we have to be careful about all the integrations being built around it. Thankfully the MCP standard is becoming mainstream (used by Anthropic and OpenAI, and next could be Google), and it's an open standard, even if started by Anthropic, so we won't have e.g. Anthropic-specific integrations.
by aledalgrande
4/1/2025 at 9:00:55 AM
See my replies to other comments parallel to yours. But in short: MCP doesn't help us any more than cURL lets you replicate Zapier in a shell script. The bad future is that, as with APIs, service providers get to differentiate between humans and AI user-agents, and restrict the latter to endpoints governed by B2B contracts.
by TeMPOraL
3/31/2025 at 5:16:59 PM
> Unless these tools can be run locally independent of a service provider, we're just trading one boss for another.

Many of them already can be. Many more existing models will become local options if/when RAM prices decline.
But this won't necessarily prevent enshittification, as there's always a possibility of a new model being tasked with pushing adverts or propaganda. And perhaps existing models already have been — certainly some people talk as if it's so.
by ben_w
4/1/2025 at 8:56:07 AM
People are worried about the wrong side of the equation. Other problems with them notwithstanding, it's not the browser wars that killed interoperability on the Web; it's everyone else. Any browser you ever used could issue the same HTTP calls (up to the standards of a given time, of course), but it helps you with nothing if the endpoint only works when you've signed a contract to access the private API.

The same fate may come to AI, and that worries me. It won't matter whether you're using OpenAI models, Anthropic models, or locally run models, any more than it matters whether you use Firefox, Chrome, or raw cURL. If businesses get to differentiate further between users and AI agents working as users, and especially if they get legal backing for doing that, you can kiss all the benefits of LLMs goodbye. They won't be yours as an end-user; they'll all accrue to capitalists, who in turn will lend slivers of them to you, for the price of a subscription.
by TeMPOraL
4/1/2025 at 3:03:06 PM
> Any browser you ever used could issue the same HTTP calls (up to standards of a given time, ofc.) - but it helps you with nothing if the endpoint only works when you've signed a contract to access the private API.

Oh, you mean like everyone who shows up to the Cloudflare submissions pointing out how they've been blocklisted from about 50% of the Internet, without recourse, due to the audacity of not running Chrome? In that circumstance, it's actually worse(?), because to the best of my knowledge I cannot subscribe to Cloudflare Verified to avoid the :fu:; I just have to hope the Eye of Sauron doesn't find me.
That reminds me, it's probably time for my semi-annual Google Takeout
by mdaniel
4/1/2025 at 7:21:20 PM
Yeah, that's just an extension of what I said. After all, it's not Google/Chrome that's creating this problem - it's Cloudflare and the people who buy this service from them, making the lazy/economically prudent assumption that anyone who has an opinion on how they consume services can be bucketed together with scammers and denied access.

It stems from the problem I described, though - blocking you for not using Chrome is just "only illegitimate users don't use Chrome", which is the next step after "only illegitimate users would want to use our API endpoints without starting a business and signing a formal contract with us".
by TeMPOraL
3/31/2025 at 8:00:21 PM
The airlines rely on things not interoperating for you. Their agents, however, interoperate all the time via code sharing. They don't want normal people to do this, but if something goes wrong with the airplane you were supposed to be on, they would prefer you get there than not.

by bluGill
4/1/2025 at 9:27:29 AM
> They don't want normal people to do this

That's the root of the problem. That's precisely why computers are not the "bicycles for the mind" they were imagined to be.
It's not a conspiracy theory, either. Most of the tech industry makes money inserting themselves between you and your problem and trying to make sure you're stuck with them.
by TeMPOraL
3/31/2025 at 4:44:41 PM
But that's the promise of AI, right? You can't put an API on everything, for human and technological reasons.

by jatins
3/31/2025 at 4:48:38 PM
You can’t put an API on everything because it’d take a ton of time and money to pull that off.

I can’t think of any technological reason why every digital system can’t have an API (barring security concerns, which would need to be handled case by case).
So instead, we put 100s of billions of dollars into statistical models hoping they could do it for us.
It’s kind of backwards.
by dartos
3/31/2025 at 6:12:41 PM
A web page is an Application/Human Interface. Outside of security concerns, companies can make more money if they control the Application/Human Interface, and it reduces the risk of a middleman / broker extorting them for margins.

If I run a flight aggregator that has a majority of flight bookings, I can start charging 'rents' by allowing featured/sponsored listings to be promoted above the 'best' result, leading to a prisoner's dilemma where airlines should pay up to their margins to keep market share.
If an AI company becomes the default application human interface, they can do the same thing. Pay OpenAI tribute or be ended as a going concern.
by datadrivenangel
4/1/2025 at 4:38:05 AM
LLMs as a natural language interface are fine.

What I’m saying is that if there were a standard protocol for making travel plans over the internet, we wouldn’t need an AI agent to book a trip.
We could just create great user experiences that expose those APIs like we do for pretty much everything on the web.
by dartos
3/31/2025 at 6:52:07 PM
Exactly. It should take around 10 parameters to book a flight. Not 30,000,000,000 and a dedicated nuclear power plant.

by daxfohl
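As a sketch of what such a minimal interface might look like - a hypothetical request shape, not any real airline's API, with all field names invented for illustration - roughly ten parameters really do cover a booking:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical sketch: the handful of parameters a flight-booking
# request plausibly needs. Field names are illustrative only.
@dataclass
class FlightBookingRequest:
    origin: str                  # IATA airport code, e.g. "ORD"
    destination: str             # IATA airport code, e.g. "LAX"
    depart_date: date
    return_date: Optional[date]  # None for a one-way trip
    passengers: int
    cabin_class: str             # "economy", "premium", "business", "first"
    max_price_usd: float
    refundable: bool
    preferred_alliance: Optional[str]
    max_stops: int

# Example request: one economy round trip, at most one stop.
request = FlightBookingRequest(
    origin="ORD", destination="LAX",
    depart_date=date(2025, 9, 17), return_date=date(2025, 9, 25),
    passengers=1, cabin_class="economy", max_price_usd=600.0,
    refundable=False, preferred_alliance=None, max_stops=1,
)
```

Ten scalar fields, no statistical model required - which is the commenter's point.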
3/31/2025 at 5:49:39 PM
You change who's paying.

by Scene_Cast2
3/31/2025 at 6:01:18 PM
Sure, as a biz it makes sense, but as a society, it’s obviously a big failure.

by dartos
3/31/2025 at 4:47:17 PM
It is a promise alright :)

by hansmayer
3/31/2025 at 7:32:11 PM
Your use of the word "perfect" is doing a lot of heavy lifting. "Perfect" is a word embedded in a high-dimensional space whose local maxima are different for every human on the planet.

by doug_durham
4/1/2025 at 10:22:06 AM
No, it's just the intuitively perfect that comes to mind in this context, i.e. reliable and guaranteed to produce a safe outcome - much like the Amazon checkout process. I am fine giving my credit card details to a near-perfect automaton like that. I will never give it to a statistical model, which may or may not hallucinate the sum it is supposed to enter into an interface built for humans, not computers.

by hansmayer
3/31/2025 at 7:53:21 PM
Yep, and AI agents essentially throw up a boundary blocking the user from understanding the capabilities of the system they're using. They're like the touch screens in cars that no one asked for, but for software.

by davesque
3/31/2025 at 3:21:08 PM
Case in point: look how long it’s taken for self-driving cars to mature. And many would argue they still have a ways to go until they’re truly reliable.

I think this highlights how we still haven’t cracked intelligence. Many of these issues come from the model’s very limited ability to adapt on the fly.
If you think about it every little action we take is a micro learning opportunity. A small-scale scientific process of trying something and seeing the result. Current AI models can’t really do that.
by CooCooCaCha
3/31/2025 at 6:41:30 PM
Even maps. I was driving to Chicago last week and Apple Maps insisted I take the exit for Danville. Fortunately I knew better; I only had the map on in case an accident required rerouting. I find it hard to drive with maps navigation because they are usually correct, but wrong often enough that I don't fully trust them. So I have to double check everything they tell me with the reality in front of me, and that takes more mental effort than it ideally should.

by SoftTalker
4/1/2025 at 2:38:52 AM
> double check everything they tell me with the reality in front of me

I believe that's a famous Army Ranger expression: "the map is not the terrain" (I tried to find an attribution for it, but it seems it comes in "the map is not the territory" flavors, too).
by mdaniel
3/31/2025 at 3:22:43 PM
Isn't the point he's making:

>> Yet too many AI projects consistently underestimate this, chasing flashy agent demos promising groundbreaking capabilities—until inevitable failures undermine their credibility.
This is the problem with the 'MCP for Foo' posts that have been appearing recently.
Adding a capability to your agent that the agent can't use just gives us exactly that:
> inevitable failures undermine their credibility
It should be relatively easy for everyone to agree that giving agents an unlimited set of arbitrary capabilities will just make them terrible at everything; and that promising that giving them these capabilities will make them better is:
A) false
B) undermining the credibility of agentic systems
C) undermining the credibility of the people making these promises
...I get it, it is hard to write good agent systems, but surely, a bunch of half-baked, function-calling wrappers that don't really work... like, it's not a good look right?
It's just vibe coding for agents.
I think it's quite reasonable to say, if you're building a system now, then:
> The key to navigating this tension is focus—choosing a small number of tasks to execute exceptionally well and relentlessly iterating upon them.
^ This seems like exceptionally good advice. If you can't make something that's actually good by iterating on it until it is good and it does work, then you're going to end up being a Devin (i.e. an over-promised, over-hyped failure).
by noodletheworld
3/31/2025 at 7:48:12 PM
> Yeah, the "book a flight" agent thing is a running joke now

I literally sat in a meeting with one of our board members who used this exact example of how "AI can do everything now!" and it was REALLY hard not to laugh.
by burnte
3/31/2025 at 7:52:33 PM
Can Google Flights find the best flight dates to a destination within a time frame? E.g. get flights to LA within an up-to-15-day window, ensuring attendance on 17 September. Fly with SkyAlliance airlines only. Flexible with any dates, but I need to be there on 17 Sept with a minimum stay of eight days.

I'd love it if it could help with that, but I haven't figured it out with Google Flights yet. My dream is to tell an AI agent the above and let it figure out the best deal.
by wdb
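The constraint search described above is mechanical once itineraries are structured data. A minimal sketch (all names hypothetical, candidate data invented): filter to itineraries that arrive by the must-attend date, satisfy the minimum stay, and match the alliance, then take the cheapest.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical sketch of the search described in the comment above.
@dataclass
class Itinerary:
    arrive: date       # arrival at the destination
    depart: date       # departure back home
    alliance: str
    price_usd: float

def best_itinerary(candidates, must_be_there, min_stay_days, alliance):
    # Keep only itineraries meeting all three hard constraints.
    ok = [
        it for it in candidates
        if it.arrive <= must_be_there
        and (it.depart - it.arrive).days >= min_stay_days
        and it.alliance == alliance
    ]
    # Among the feasible ones, pick the cheapest (None if infeasible).
    return min(ok, key=lambda it: it.price_usd) if ok else None

# Invented candidates illustrating each constraint:
candidates = [
    Itinerary(date(2025, 9, 16), date(2025, 9, 25), "SkyAlliance", 540.0),
    Itinerary(date(2025, 9, 18), date(2025, 9, 30), "SkyAlliance", 410.0),   # arrives too late
    Itinerary(date(2025, 9, 15), date(2025, 9, 20), "SkyAlliance", 380.0),   # stay too short
    Itinerary(date(2025, 9, 14), date(2025, 9, 24), "OtherAlliance", 300.0), # wrong alliance
]
pick = best_itinerary(candidates, date(2025, 9, 17), 8, "SkyAlliance")
# → the first itinerary: arrives 16 Sept, 9-day stay, $540.0
```

The hard part is not this filter but getting clean, structured fare data to run it over - which loops back to the thread's point about missing APIs.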