alt.hn

7/5/2026 at 6:19:34 AM

sqlite-utils 4.0rc2, mostly written by Claude Fable (for about $149.25)

https://simonwillison.net/2026/Jul/5/sqlite-utils-fable/

by ognyankulev

7/5/2026 at 10:39:13 AM

> I went out to enjoy the Half Moon Bay 4th of July parade, occasionally checking in and prompting the next step for Fable from my phone.

This intensification of work will not be good for workers’ health. Like, put your phone down man. You can’t be modeling this behavior to young people.

Further, the intensification of work is probably not even good for productivity in the long term. This periodic half-thinking about things without stepping away from the problems you are working to solve will lead to more half-assed solutions. Ideas need room to breathe and dedicated focus.

by avidphantasm

7/5/2026 at 12:01:35 PM

I've been reading articles where people tell me how they're prompting from their phone at 2am (and how they know that's not great) and when they're out driving their kids to their activities etc. They say all that as if it's something to be proud of. And what for? Oh, they released all these packages that no one is going to use.

Burnout is real and these people have lost track of what's important.

by scorpioxy

7/5/2026 at 2:34:25 PM

I worked for a startup in Canada, with Dutch founders, and operations in France and California.

That meme about how different countries treat out-of-office is very accurate.

by c-hendricks

7/5/2026 at 2:33:56 PM

Ideally it means a massive relaxation of the 9-5.

For my personal efforts yes I very much want on demand access, I want the thoughts to flow. It doesn't feel like that would be anything but great at my job too, if my job didn't define such narrow bounds of mandatory butts in chairs. What work says it wants is not effective, is not going to get my best work, is not going to really make sense. It didn't make sense before & it's absurdly out of pace with the Happy Warrior mode of software development today.

by jauntywundrkind

7/5/2026 at 8:08:08 PM

>>> I went out to enjoy the Half Moon Bay 4th of July parade, occasionally checking in and prompting the next step for Fable from my phone.

>> This intensification of work will not be good for workers’ health. Like, put your phone down man. You can’t be modeling this behavior to young people.

> Ideally it means a massive relaxation of the 9-5.

Why? That's the time they've purchased. They'll just demand that and more. "People like Simon Willison are prompting during their 'off time,' and that's expected under our new ways of working."

Also, being always on, always available doesn't sound like it would amount to "massive relaxation."

by palmotea

7/5/2026 at 10:58:20 AM

Too late

You either do it, or you are out

by Bombthecat

7/5/2026 at 10:59:05 AM

[dead]

by dan_i

7/5/2026 at 12:13:34 PM

Idk about you but this type of intensification of my work has been extremely good for my mental health, on all points:

1. I feel genuinely more productive, spending a lot less time on boilerplate and much more of my genuine time is spent thinking and communicating the thinking process.

2. I can take a ton of breaks, basically whenever I want. "Flow" is now entirely design flow and can be interrupted much more easily without damaging it.

3. If there is anything I actively dislike in my workflow or that makes me not enjoy my time... I can fix my workflow so that either I'm not the one doing it, or the item in question is no longer necessary.

AI is crazy. I get it, if you're at a shitty job that doesn't understand how to adapt well, it's tough... but if you're working on your own (like Simon does on this project), it's absolutely amazing and you're in full control of your life.

by scrollaway

7/5/2026 at 7:50:23 AM

The problem I have with this workflow is that the models are still too eager to please. If I ask it to scan a release and note possible issues, it absolutely will find issues. If I keep running the same prompt, it will keep finding issues. I’ve spammed GitHub PR reviews and it just keeps finding (or inventing?) new issues. There is never a “Nothing found, good to go!”. I have to keep reminding myself that the model will always give me what I ask for, regardless of the reality/truth.

by dreadnip

7/5/2026 at 8:00:48 AM

You didn’t do it enough. They stop finding bugs eventually. Also, different models can find different bugs (though they do find the same ones, too, which is good and expected). For best results you want to run multi model reviews in loops.

If you had multiple people look at your PRs multiple times on different days results would be very similar.

by baq

7/5/2026 at 8:46:06 AM

I've had it find a bug, I asked it to make a test to trigger the bug, and then it figured out it's not a bug. It will absolutely do wish fulfilment

by PunchyHamster

7/5/2026 at 8:50:55 AM

Yeah, when these models find a bug I like to ask them to write a test that will fail if the bug is real and pass when the bug is solved.

It’s not perfect but usually it works pretty well, and I’ve had the model come back to me with “oh, actually the test passed, the bug doesn’t exist”.

As a bonus, you’ve now got a test that can detect that bug if it comes up again.
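
A minimal sketch of that workflow in Python. The `paginate` helper and the claimed bug (that the final partial page gets dropped) are both hypothetical, made up purely to show the shape of such a test; they are not taken from sqlite-utils or from the article.

```python
def paginate(items, page_size):
    """Hypothetical function under review: split items into pages."""
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]


def test_last_partial_page_is_kept():
    # Claimed bug (hypothetical): the final, shorter page is silently dropped.
    # If the claim is real, this assertion fails. If the test passes,
    # the reported bug does not exist in the current code.
    pages = paginate([1, 2, 3, 4, 5], page_size=2)
    assert pages == [[1, 2], [3, 4], [5]]


if __name__ == "__main__":
    test_last_partial_page_is_kept()
    print("test passed: the claimed bug does not reproduce")
```

Either way the test stays in the suite, so it doubles as a regression check if that behaviour ever breaks for real.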

by left-struck

7/5/2026 at 9:08:32 AM

It'll find a non-existent bug - fix it - figure out it broke a previously working thing - try to fix again - etc..

The "keep improving the code base" prompt has been tried and it never works. The LLM has no consciousness of where to stop and where to draw the lines of reasonableness.

by csomar

7/5/2026 at 8:06:28 AM

No, depending on the complexity of the issue models can get into loops, where they go "this is definitely an issue and must be fixed", and then the resulting fixed code gets "this is definitely an issue and must be fixed", and then the resulting fixed code has the original 'issue'.

by MallocVoidstar

7/5/2026 at 8:47:38 AM

That's a different kind of loop.

For normal review loops you can ask the model to return with nothing found if nothing is found, and not to invent things, and it will do a better job of exiting without any findings.

by bfjvibybd6cuvu6

7/5/2026 at 10:00:53 AM

I've run into this in the before times with linters and static analyzers. Nothing new.

by baq

7/5/2026 at 8:47:08 AM

yeah, happened to me: "A is very wrong, you should do B", and on the next fresh review loop "B is very wrong, you should do A"

typically this means there is some ambiguity in the specification, and the model flips between alternative interpretations

by memoriyato3

7/5/2026 at 9:16:12 AM

I get this sometimes when I ask the agent on GitHub to suggest improvements to my Julia code. It's kind of fun to watch it struggle to please. I'm reminded of the old "Doctor" mode in Emacs.

by bluenose69

7/5/2026 at 11:50:18 AM

I’ve found that Claude is really good at picking up tone of voice in prompts / queries.

If I go “find issues in this code” it will hallucinate some, but if I say “can you check the recent change, there might be some things that introduced regressions, maybe?” Then it will be more cautious.

Also, Fable especially, but Opus too, can talk back and advise you against going in a direction it thinks unwise.

And I’ve had much more success spelling out why I think something is a better approach, or asking it to clarify itself: if I tell it my assumption, sometimes it self-corrects and starts doing what I needed in the first place; it was just coming at it from a different direction before. For example, assuming I don’t care about cost and providing “the best solution”, or trying to make something reusable when what I needed was something quick (or vice versa).

It really is best to think of it as a gradient plane where it might get stuck in local minima, or you can prime it to “teeter on the edge” and be able to flow in different directions.

by seer

7/5/2026 at 9:42:26 AM

> There is never a “Nothing found, good to go!”. I have to keep reminding myself that the model will always give me what I ask for, regardless of the reality/truth.

Tell it something like:

  Before doing any commits or producing a summary for the user, you must run a verification sub-agent.
  Its goal is to adversarially and critically check your supposed findings to look out for false positives and hallucinations.
  Doing so with a separate sub-agent with relatively clean context (but with all the relevant details of the problem space that appear to be facts) should improve our confidence in the findings.

Maybe also something like:

  Try to classify each found issue as either SERIOUS, CRITICAL or NITPICK, discard nitpicks, we only care about impactful issues.

It should somewhat cut down on the useless output.

I've largely found the same in regards to generating code - the initial pass will often have bugs that the model itself can find but only when run as a separate sub-agent without the confidence poisoning in its own previous output.
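
If the model is asked to prefix each finding with one of those labels, the "discard nitpicks" step can also be enforced outside the model. A rough sketch, assuming a hypothetical one-finding-per-line `LABEL: summary` report format (that format, and the names `Finding`, `parse_findings` and `impactful`, are illustrative assumptions, not part of any tool):

```python
from dataclasses import dataclass

LABELS = {"SERIOUS", "CRITICAL", "NITPICK"}
IMPACTFUL = {"SERIOUS", "CRITICAL"}


@dataclass
class Finding:
    severity: str
    summary: str


def parse_findings(report: str) -> list[Finding]:
    """Parse lines of the assumed form 'LABEL: summary'; skip anything else."""
    findings = []
    for line in report.splitlines():
        label, sep, summary = line.partition(":")
        label = label.strip().upper()
        if sep and label in LABELS:
            findings.append(Finding(label, summary.strip()))
    return findings


def impactful(findings: list[Finding]) -> list[Finding]:
    """Keep only the findings worth a human's attention."""
    return [f for f in findings if f.severity in IMPACTFUL]


if __name__ == "__main__":
    report = (
        "CRITICAL: rollback leaves the table half-written\n"
        "NITPICK: variable name could be clearer\n"
        "SERIOUS: missing index on the lookup column\n"
    )
    for finding in impactful(parse_findings(report)):
        print(finding.severity, "-", finding.summary)
```

This only filters on the label the model chose; whether a "CRITICAL" finding is real still needs the verification pass described above.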

by KronisLV

7/5/2026 at 11:16:55 AM

A second look is always useful when using these damn things.

by arcanemachiner

7/5/2026 at 8:04:47 AM

> There is never a “Nothing found, good to go!”.

Like when you do recursive programming, have you tried providing more/better stop conditions? If you literally just say "Continue until there are no more issues" then it'll do just that, but if you scope it better, like "Only mention issues related to X, Y or that leads to Z" and so on, you'll get less noise and more focus on issues that actually matter (to you).
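
The recursion analogy can be sketched as a loop with an explicit base case and a round budget. Here `review` stands in for whatever call actually runs the model; it is a hypothetical callable, as are the `in_scope` filter and the canned findings, so this shows the control flow only:

```python
from typing import Callable, Optional


def review_until_clean(
    review: Callable[[int], list[str]],
    in_scope: Callable[[str], bool],
    max_rounds: int = 5,
) -> tuple[Optional[int], list[str]]:
    """Run review rounds until one reports no in-scope issues.

    Returns (round number that came back clean, in-scope issues seen),
    or (None, issues) if the round budget ran out first.
    """
    seen: list[str] = []
    for round_no in range(1, max_rounds + 1):
        issues = [issue for issue in review(round_no) if in_scope(issue)]
        if not issues:  # base case: nothing that matters was found
            return round_no, seen
        seen.extend(issues)
    return None, seen  # second stop condition: budget exhausted


if __name__ == "__main__":
    # Canned reviewer output standing in for a real model call.
    canned = {
        1: ["SQL injection in search()", "typo in comment"],
        2: ["typo in docstring"],
    }
    clean_round, issues = review_until_clean(
        review=lambda n: canned.get(n, []),
        in_scope=lambda issue: "typo" not in issue,
    )
    print(clean_round, issues)
```

With the canned data the loop stops on round 2, because the only remaining finding is out of scope.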

by embedding-shape

7/5/2026 at 8:51:37 AM

also helps adding negative conditions like "do not nitpick", or specific bad attractors that you see "do not investigate/report anything related to symlinks, they are not a concern"

by memoriyato3

7/5/2026 at 8:48:47 AM

You need to create a review skill and define there what "issue" and "good" mean for you, to limit sensitivity. Otherwise you depend on the model's random threshold, or on none at all, and then you get perfection chasing.

Anyway, it will never match your judgement completely unless you upload a dump of your brain into the model.

by imhoguy

7/5/2026 at 9:45:39 AM

> There is never a “Nothing found, good to go!”

Not entirely true IME. Eventually the bug hunt will end with general design advice that may not suit your use case and that you can skip.

by JodieBenitez

7/5/2026 at 7:53:03 AM

You get the same result if you pay humans a good sum of money to find issues.

by threatripper

7/5/2026 at 8:01:06 AM

Definitely not. I've never seen a human trapped in that kind of infinite loop. Humans know that if they don't stop at the end of the day, they don't get to go home to their wife, and if they don't finalize their list of issues, they never get their contract paid out.

by nvme0n1p1

7/5/2026 at 8:07:31 AM

Pay people per hour of work and, even if there is no actual work, people will definitely find a way of spending hours doing things. If you've worked with contractors/outsourced roles before, you'll have seen this happen from time to time.

by embedding-shape

7/5/2026 at 7:57:52 AM

There is a point of diminishing returns though; the issues suggested will get speculative, or point out comment unclarity, or "defense in depth". But I agree it’s somewhat annoying to rarely get clear pushback in terms of "no, this looks good enough to me, release it"

by 9dev

7/5/2026 at 8:32:53 AM

I use Claude Code and one of the steps in my workflow is to do a review loop until no issues are found, and it never loops forever. So my experience is entirely different. Even if I say: fix all issues, not only the critical ones.

by starquake

7/5/2026 at 7:57:34 AM

I think this was true with older models, but at least with GPT 5.5 it can genuinely tell you "no issues found" after a few passes of finding real issues.

by Tiberium

7/5/2026 at 9:32:22 AM

You could ask the model to say "nothing found" if the improvement was stylistic, or other constraints.

by mejutoco

7/5/2026 at 8:09:40 AM

> If I keep running the same prompt, it will keep finding issues.

I've had the same experience, but whenever I've reviewed what it finds it's basically right. It's pedantic, and a lot of the problems aren't things I really care about, but they definitely are real problems.

I'm not sure you can blame the AI for always finding problems if a) you asked it to, and b) there are problems to find.

by onion2k

7/5/2026 at 9:11:03 AM

You need to run them in review loops, this is the only way to reduce or eliminate these issues.

by higeorge13

7/5/2026 at 8:41:38 AM

That's just plain wrong. The new models do not hallucinate as much as they used to (in my personal experience)

by Myzel394

7/5/2026 at 8:47:34 AM

> plain wrong

> (in my experience)

What are you even saying.

by TripolitianFish

7/5/2026 at 9:04:16 AM

That their vibes are more real than your vibes

by girvo

7/5/2026 at 9:07:34 AM

What do you mean? Are they valid flaws or not?

Would you like it to stop when there's still flaws in the code?

by knorker

7/5/2026 at 8:56:11 AM

It's not eagerness to please (that's anthropomorphising), rather it's a desire to bill you more money/use more tokens

(The fixed prices are just temporary discounts)

by gib444

7/5/2026 at 10:52:58 AM

[dead]

by dan_i

7/5/2026 at 9:53:43 AM

Great to see write-ups like this, in particular with the cost included.

At least 2 things the random LinkedIn post will ignore, on purpose or not:

- prices today remain low (even though they might feel higher than before), Uber is the business model, no secret there, it's a VC classic

- $150 spent by an expert, a software engineer with significant practical knowledge in AI, is not equivalent to the exact same amount spent by a novice.

Yet now that a number is out, you bet it will be used. Expect alarmist posts tomorrow morning in your feed claiming building software is now as cheap as dinner at a restaurant.

by utopiah

7/5/2026 at 9:59:07 AM

It's also about variance in the number.

Expert software engineers will still accidentally burn $500 or $5000 on tasks that don't work, or are not efficient. Amateurs will accidentally spend $100 to get something great.

So part of the change is a change in the risk structure of using frontier models. Before, you'd burn your quota; now, you can burn uncapped (less-capped) money.

by uniqueuid

7/5/2026 at 11:00:51 AM

[dead]

by dan_i

7/5/2026 at 9:46:45 AM

I'm kind of surprised that there is no test case that would have identified the fact that delete_where() leaves the state corrupted. There would be no need to ask Fable if the problem gets identified by the test. And having a test will also catch all future problems that might arise in the same function. So maybe instead of asking Claude what is wrong it would be wiser to invest in test coverage.

by hbplawinski

7/5/2026 at 1:38:17 PM

I also stopped reading the article there. This is major version 4 of a library about providing higher level db actions, i.e. transactions are paramount to make them atomic

by eska

7/5/2026 at 9:14:09 AM

I'm a big fan of sqlite-utils, but I really don't like how Python (particularly 3.12+) changes how sqlite's transactions work -- the native behavior explained in the sqlite docs is much better IMO. I understand why Python had to change it (to be compatible with other databases) but I don't think it's a good model for sqlite.

Therefore, I created apsw-utils, a port of sqlite-utils to the amazingly-awesome apsw lib -- which is a really idiomatic sqlite lib for python. It's here: https://answerdotai.github.io/apswutils/

I've used it in lots of projects including in significant production stuff, and it's always worked great for me. IMO if you're serious about doing sqlite in python, at some point you'll probably want to check out apsw.

by jph00

7/5/2026 at 9:45:10 AM

> changes how sqlite's transactions work

What specifically are you referring to? The apswutils website also does not explain.

by jmalicki

7/5/2026 at 10:42:36 AM

They're probably talking about the addition of the autocommit flag, which hides more fine-grained transaction control in favor of more uniform behavior across multiple databases:

https://docs.python.org/3/library/sqlite3.html#sqlite3.Conne...

You can still use previous behavior with "legacy" mode that lets you control when transactions are opened in which isolation level.
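
A small sketch of that legacy mode, assuming only the standard-library `sqlite3` module: with `isolation_level="IMMEDIATE"`, the module itself issues `BEGIN IMMEDIATE` before the first data-modifying statement, which a trace callback makes visible. The function name is just for illustration.

```python
import sqlite3


def legacy_immediate_demo():
    """Show when legacy transaction control opens a transaction, and how."""
    statements = []
    # Legacy mode (the default): isolation_level picks the implicit BEGIN flavour.
    conn = sqlite3.connect(":memory:", isolation_level="IMMEDIATE")
    conn.set_trace_callback(statements.append)  # record every statement SQLite runs

    conn.execute("CREATE TABLE t (x INTEGER)")
    open_after_ddl = conn.in_transaction        # DDL does not open a transaction

    conn.execute("INSERT INTO t VALUES (1)")    # sqlite3 issues the BEGIN first
    open_after_insert = conn.in_transaction

    conn.commit()                               # in legacy mode this really commits
    open_after_commit = conn.in_transaction
    conn.close()

    saw_begin_immediate = any(
        s.strip().upper().startswith("BEGIN IMMEDIATE") for s in statements
    )
    return open_after_ddl, open_after_insert, open_after_commit, saw_begin_immediate


if __name__ == "__main__":
    print(legacy_immediate_demo())
```

The transaction is opened implicitly by the INSERT, not by the CREATE TABLE, and `commit()` closes it again.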

by dxdm

7/5/2026 at 1:17:34 PM

> hides more fine-grained transaction control

In what way does having autocommit=False hide more fine-grained transaction control?

autocommit=False gives full control to the programmer to do whatever they want.

by jmalicki

7/5/2026 at 3:30:45 PM

From the link in my previous post[0]:

> False: Select PEP 249-compliant transaction behaviour, implying that sqlite3 ensures a transaction is always open.

This means you don't get to control the isolation level of the transaction, because [1]:

> sqlite3 uses BEGIN DEFERRED statements when opening transactions.

If you want to use `IMMEDIATE` or `EXCLUSIVE` isolation level[2] for your sqlite transaction using the new flag, you have to set `autocommit=True` to be able to open the transaction yourself with `.execute("BEGIN IMMEDIATE")`.

However, with `autocommit=True`, the connection's `.commit()` and `.rollback()` methods will *silently do nothing* and you have to execute the respective raw SQL yourself to commit or abort your manually-opened transaction. This also concerns the context-manager behavior of the connection object, which will not commit or abort manual transactions on context exit in this case.

So, the autocommit flag becomes a little complicated and foot-gunny if you want more precise control over when exactly other readers or writers should get blocked by sqlite.
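
The behaviour described above can be checked with a short sketch. It assumes Python 3.12 or newer (where the `autocommit` parameter exists) and an in-memory database; the function name is made up for illustration.

```python
import sqlite3


def manual_immediate_transaction():
    """Open BEGIN IMMEDIATE by hand under autocommit=True (Python 3.12+)."""
    conn = sqlite3.connect(":memory:", autocommit=True)
    conn.execute("CREATE TABLE t (x INTEGER)")

    # With autocommit=True, sqlite3 never opens a transaction for us,
    # so the isolation level is ours to choose.
    conn.execute("BEGIN IMMEDIATE")
    conn.execute("INSERT INTO t VALUES (1)")

    conn.commit()                               # documented as having no effect here
    open_after_commit_call = conn.in_transaction

    conn.execute("COMMIT")                      # the raw SQL is what ends it
    open_after_raw_commit = conn.in_transaction

    rows = conn.execute("SELECT count(*) FROM t").fetchone()[0]
    conn.close()
    return open_after_commit_call, open_after_raw_commit, rows
```

On Python 3.12+ the call returns `(True, False, 1)`: the transaction is still open after `conn.commit()` and only closes once `COMMIT` is executed as SQL.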

[0] https://docs.python.org/3/library/sqlite3.html#sqlite3.Conne...

[1] https://docs.python.org/3/library/sqlite3.html#sqlite3-trans...

[2] https://sqlite.org/lang_transaction.html#deferred_immediate_...

by dxdm

7/5/2026 at 3:44:42 PM

Thank you!

The link had some of the details, but not the intelligent thought linking them into a narrative, much appreciated! This is something super important I would have been foot-gunned on at some point.

by jmalicki

7/5/2026 at 10:46:14 AM

Pythons options are here: https://docs.python.org/3/library/sqlite3.html

SQLite behavior is here: https://sqlite.org/lang_transaction.html . The regular implicit transactions there plus explicit where needed aren’t supported in any python mode.

by jph00

7/5/2026 at 1:14:53 PM

> The regular implicit transactions there plus explicit where needed aren’t supported in any python mode

Specific examples would be extremely useful. You've done some work learning and deducing this stuff, others could learn if you would share and explain it.

by jmalicki

7/5/2026 at 7:44:44 AM

The title cost is only if this was raw API usage, but it was included in a subscription, so it's a small subset of the $200 plan:

> I upgraded to the Claude Max $200/month plan (I was previously on $100/month) to increase my Fable allowance for the remaining time until the July 7th Fablepocalypse, when even Claude Max subscribers will have to pay full API cost for the model.

I really wonder if Anthropic will stick with their decision to keep Fable on extra usage credits until they "get more compute", especially in the light of GPT 5.6 very likely coming out next week (it's confirmed to have the exact same pricing as GPT 5.5)

by Tiberium

7/5/2026 at 8:08:27 AM

> especially in the light of GPT 5.6 very likely coming out next week

Finally have an explanation why GPT 5.5 xhigh felt dumber and dumber these last few weeks, always the same thing when a new model release is about to come out...

by embedding-shape

7/5/2026 at 8:33:11 AM

Opus has been extremely stupid recently, reckon that's because Fable needs to look appealing?

by toxik

7/5/2026 at 9:00:10 AM

I have never noticed a degradation in either Claude or OpenAI models, and the benchmarks people set up have never shown a statistically significant deviation either: https://marginlab.ai/trackers/claude-code

Yet the same claim is being posted every single day, including new claims that the Fable 5 model has degraded compared to the initial release, guardrails aside.

by user43928

7/5/2026 at 9:02:46 AM

Almost slipping into conspiracy territory, but without insights into what the labs actually do internally, hard not to:

Anyways, heard about A/B testing before? ML people tend to like it a lot. It's hard to imagine that OpenAI and Anthropic aren't already deep into categorizing people into buckets and running a wild amount of A/B testing all over the place, in various ways, especially in the weeks leading up to new model releases.

by embedding-shape

7/5/2026 at 9:23:33 AM

Yes, and we can see A/B testing on the ChatGPT website all the time.

They are also testing the new models in their coding tools with select customers first.

People working at OpenAI have publicly denied that they are performing any kind of hidden routing or quantization of models after release for Codex. I tend to believe them.

by user43928

7/5/2026 at 10:55:56 AM

[dead]

by dan_i

7/5/2026 at 8:01:58 AM

This is to prevent Chinese labs distilling Claude again right? And free advertising again?

by andy_ppp

7/5/2026 at 8:46:52 AM

just a note. in most parts of the world 149.25 USD can cover utilities, water, and food for a month for 1 adult person or even a family.

by 5701652400

7/5/2026 at 9:11:45 AM

Had this been a corporate environment, the net saving from using one person part-time plus an agent, as opposed to one person full-time for as long as it would take to implement this, would be enough to cover utilities, water and food for an entire village.

It’s silly to act like this was an added cost in a vacuum, or that any costs translate directly into charity for arbitrary families. Also, in some places it would even cover rent for half a day.

by klustregrif

7/5/2026 at 11:50:36 AM

just hire someone in Bangladesh or some remote part of the Philippines/Indonesia/Vietnam or even eastern EU.

no need for charities of any sort. plenty of people in Software and plenty of people laid off every day.

by 5701652400

7/5/2026 at 8:53:17 AM

In Sydney, Australia it's < 2 days of median rent.

by xyzzy123

7/5/2026 at 11:50:57 AM

severely overpriced.

by 5701652400

7/5/2026 at 8:54:13 AM

That's my electricity bill for a year, okay

by Muromec

7/5/2026 at 11:51:32 AM

I meant in total. food + electricity/water + housing = 150USD / month.

yep. same. my electricity is ~100 USD / year.

by 5701652400

7/5/2026 at 11:39:10 AM

That's my electricity bill for a month

by TiredOfLife

7/5/2026 at 11:53:36 AM

do you have like 10 AC or something? or are you in Germany?

by 5701652400

7/5/2026 at 8:54:07 AM

In others it's pizza night for family or half a bill for sushi dinner, so what?

by mirekrusin

7/5/2026 at 11:52:38 AM

severely overpriced. that's what.

by 5701652400

7/5/2026 at 2:50:11 PM

it's not overpriced but high-priced looking from other countries and mostly not a markup, normal considering median wages.

by mirekrusin

7/5/2026 at 10:54:09 AM

[dead]

by dan_i

7/5/2026 at 8:37:20 AM

Glad to see others dual wielding: “I used to think that the idea of having one model review the work of another was somewhat absurd—it felt weirdly superstitious. The problem is it really does work”

by keizo

7/5/2026 at 10:14:52 AM

Did you check the cost calculation? I wouldn’t trust it to give the correct amount even if I put the amount in the prompt

by boesboes

7/5/2026 at 6:07:51 PM

Why not ask claude (or a cheaper model) to actually use this and report any unexpected outcomes? I mean, at least a few times, but better would be to do it a lot and have automated result tracking.

by spwa4

7/5/2026 at 11:54:59 AM

Brutal bugs for a release candidate.

by mike_hock

7/5/2026 at 9:07:27 AM

Skynet wants to make us poorer.

by shevy-java

7/5/2026 at 10:57:13 AM

[dead]

by dan_i

7/5/2026 at 9:29:18 AM

[flagged]

by tangsoupgallery

7/5/2026 at 7:59:17 AM

Fun fact: because AI written works don't have copyright (in the EU at least) and the level of prompting many people engage in doesn't suffice to create a copyrightable "work" and software licenses require you to actually be able to grant a license using rights you hold on a work, not only are many AI generated "works" not actually protected by copyright but by selling licenses you're actually in breach of contract law and may end up owing the licensee software you don't have.

by hnbad

7/5/2026 at 8:01:08 AM

And nothing happened and zero people got in trouble over it.

- Narrator

by vasco

7/5/2026 at 8:55:11 AM

...So far

by Muromec

7/5/2026 at 9:06:23 AM

IMO the real "fun fact" is that supposed "IP monopolists" like Microsoft and Oracle's lawyers are apparently totally fine with this stuff.

So obviously people are going to take their lead and not get legal advice from some greasy dweeb at the bottom of HN.

by Slothrop99