6/6/2026 at 4:40:34 PM
> [...] he had intervened at forces that were deploying commercially available AI tools before they had been properly assessed [...] “All forces have got a good policy on the use of Copilot,” Murray said. “All forces will have a policy that says, ‘Check everything that it produces’.”
Not only are they using AI tools before they've properly assessed them, they also end up using Copilot, which must be one of the worst AIs currently available, probably because of existing Microsoft relations. And on top of all that, they hope to be able to rely on "Please review the outputs" which obviously isn't an actual solution here, of course people will get complacent and throw stuff over the wall whenever they can.
by embedding-shape
6/6/2026 at 5:14:46 PM
> “All forces will have a policy that says, ‘Check everything that it produces’.”
Everyone I talk to (including outside of tech) is going through this phase at their companies. It’s not working.
Checking the output seems like a simple request, but the question becomes: Check against what? If the police are making a document that sources from another report that another officer used AI to produce from their notes which were also run through AI and on and on, an inconsistency that leaks in at a previous step will check out when someone reviews the output against the inputs.
We’re all also discovering that many people’s idea of reviewing the output is to skim it and verify that it looks convincing enough. Checking facts is hard and takes time. These people are using AI because they want to work less, not to give themselves extra work.
by Aurornis
6/6/2026 at 5:20:27 PM
One can ask, what is the practical difference between “Check everything that it produces” and “Do all the work yourself”?
It’s not typing that’s the bottleneck, at least not often, so this is essentially assuming that you can do all the needed work without actually doing it, which is obviously wishful thinking.
by prymitive
6/6/2026 at 5:29:54 PM
This is definitely the most interesting question in a ton of AI applications. I think folks should really be spending a lot of time on figuring out how to deterministically check AI outputs in a way that's reliable, in order to reduce the amount of work a human has to check, and to build tools that speed up the checking process.
Thinking about all of the fake citations in legal submissions that have come up of late, it seems pretty straightforward to set up a regex that captures all forms in which a cited case might be written (I could be wrong but I'd assume there's some standard variety of formats) and search those against a database (again assuming such a database exists) to ensure they all exist.
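A minimal sketch of that regex-plus-lookup idea, under loud assumptions: the pattern below only covers a few US reporter formats written as "volume reporter page", the `known_citations` set is a hypothetical stand-in for whatever real case database might exist, and the second citation in the sample text is made up purely for illustration.

```python
import re

# Assumed, simplified citation shape: "<volume> <reporter> <page>",
# e.g. "347 U.S. 483". Real citation formats are far more varied.
CITATION_RE = re.compile(r"\b(\d{1,4})\s+(U\.S\.|F\.2d|F\.3d)\s+(\d{1,5})\b")


def extract_citations(text):
    """Return every citation-shaped string found in the text, normalized."""
    return [" ".join(m.groups()) for m in CITATION_RE.finditer(text)]


def find_unverified(text, known_citations):
    """Return citations that do not appear in the (hypothetical) database."""
    return [c for c in extract_citations(text) if c not in known_citations]


# Hypothetical stand-in for a lookup against a real case-law database.
known_citations = {"347 U.S. 483"}

sample = (
    "See Brown v. Board of Education, 347 U.S. 483 (1954), "
    "and Smith v. Jones, 999 U.S. 999 (2031)."  # second case is invented
)

print(find_unverified(sample, known_citations))
```

This only flags citations that cannot be found at all; it says nothing about whether a real case actually supports the claim it is cited for.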
Then for the tougher problem of making sure that the cited cases say whatever the document citing them says they do, you could have an LLM run through the document, pull out the text with the case name and text about why it's being cited, then read the case and try to determine whether the reason for citing it is valid. Rather than just give a yes/no, you'd put the doc in front of the user and let them jump from citation to citation. On each citation, it'd pop up a card that shows the literal text of why it's being cited, a judgement from the LLM of whether it matches what the case says, and snippets of text from the case as evidence + deeplinks to that text within the case.
Or maybe you wouldn't even want to give the LLM's judgement since people might rely on that without reading, but there's definitely a way to speed up the review.
I believe OpenEvidence does something like this with medical papers. If you ask it a medical question, it doesn't answer so much as link you directly to the relevant papers so you can read them and determine if they're useful. Avoids all of the potential risks of using an LLM but still hugely valuable and time-saving for docs.
by idopmstuff
6/6/2026 at 5:47:18 PM
excellent point. it is like computers in the 90s.
remember how the bank giving your money to the wrong person was a crime? and then when "the computer" did it, it was just business as usual and you paid more for banking because now they had "computer fraud" insurance?
same thing. cop delivers a false report, jail (hah! i know). now, it was "the AI". so no jail, they will go back and put in rules for the cop to read or something.
and we are making everything worse by the minute. one gov pushes back on AI nonsense, and ibm/rh comes up with all sorts of lies that would make any engineer or researcher laugh in their faces (federated learning being for privacy, instead of cost cutting. or explainable AI being real, and not something bolted on after the inference with extra unexplainable inference. etc.) but that are good enough to fool the regulator.
by iririririr
6/6/2026 at 5:38:33 PM
[dead]
by sheepscreek
6/6/2026 at 5:02:15 PM
The mindset must be that if you use AI (which I happen to advocate for) you are also responsible for the output, if you use the output publicly. AI is obviously very powerful if used responsibly - the human is responsible for the output once it is used, however it’s used.
by kerabatsos
6/6/2026 at 5:32:03 PM
I think the problem is that this is, practically speaking, impossible-adjacent. I think generally speaking writing is way easier than editing, especially at scale. This isn’t binary or all or nothing, it’s not like “you can never use AI”. But I think we need to go back to augmentation over generation.
A person produces the content and AI removes barriers and contextually accelerates the process, keeping you in a flow state, rather than "AI generates, human edits".
by techblueberry
6/6/2026 at 5:00:24 PM
> on top of all that, they hope to be able to rely on "Please review the outputs" which obviously isn't an actual solution here, of course people will get complacent and throw stuff over the wall whenever they can.
This is honestly the fundamental problem of AI as I see it
When we offload our work to a different person we can calibrate our expectations to our past experiences with that person. With AI the experience is not very consistent. To use AI effectively you basically should treat it as a low trust, brand new coworker every single time you use it
That doesn't really scale, so people have two choices: be constantly hyper vigilant for mistakes the AI makes, or become complacent and trust it more than they should
People rightly point out that humans make mistakes too, not just AI. But humans have a pretty manageable cap on the amount of output they can produce. One human can pretty thoroughly review the outputs of a small team of other humans
One human can't possibly thoroughly review the volume of output that an LLM they are prompting can produce
by bluefirebrand
6/6/2026 at 5:16:36 PM
Yeah, it's like declaring self driving safe because people are told to remain alert with their hands on the wheel, ready to take over in an instant. It's a charade.
by gdulli