Hacker News | simonw's comments

Given everything I've learned over the last ~3 years I think encouraging professional programmers (and increasingly other knowledge workers) not to learn AI tools would be genuinely unethical.

Like being an accountant in 1985 who learns to use Lotus 1-2-3 and then tells their peers that they should actively avoid getting a PC because this "spreadsheet" thing will all blow over pretty soon.


Much as I love text for communication, it's worth knowing that "28% of US adults scored at or below Level 1, 29% at Level 2, and 44% at Level 3 or above" - Literacy in the United States: https://en.wikipedia.org/wiki/Literacy_in_the_United_States

Anything below Level 3 is considered "partially illiterate".

I've been thinking about this a lot recently, as someone who cares about technical communication and making technical topics accessible to more people.

Maybe wannabe educators like myself should spend more time making content for TikTok or YouTube!


The inverse of this is the wisdom that pearls should not be cast before swine. If you want to increase literacy rates, it's unclear to me how engaging people on an illiterate medium will improve things.

Technical topics demand a technical treatment, not 30-second junk food bites of video infotainment that then imbue the ignorant audiences with the semblance or false feeling of understanding, when they actually possess none. This is why we have so many fucking idiots dilating everywhere on topics they haven't a clue on - they probably saw a fucking YouTube video and now consider themselves in possession of a graduate degree in the subject.

Rather than try to widely distribute and disseminate knowledge, it would be far more prescient to capitalize on what will soon be a massive information asymmetry and widening intellectual inequality between the reads and the read-nots, accelerated by the production of machine generated, misinformative slop at scale.


Technical knowledge isn't specifically bound to literacy.

A "dumb" example would be IKEA manuals that describe an assembly algorithm, I could imagine a lot of other situations where you want to convey very specific and technical information in a form that doesn't rely on a specific language (especially if languages aren't shared).

Color coding, shape standards etc. also go in that direction. The efficiency gain is just so large.


This post is excellent. I really like reading deep dives like this that take a complex system like uv and highlight the unique design decisions that make it work so well.

I also appreciate how much credit this gives the many previous years of Python standards processes that enabled it.

Update: I blogged more about it here, including Python recreations of the HTTP range header trick it uses and the version comparison via u64 integers: https://simonwillison.net/2025/Dec/26/how-uv-got-so-fast/
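
To give a rough flavour of those two tricks without clicking through, here's a simplified sketch - not uv's actual code, and not the exact recreations from the post. The wheel URL is deliberately left as a placeholder, and the integer packing ignores pre/post/dev releases:

    # Trick 1: ask the index for only the tail of a wheel via an HTTP Range header.
    # The zip central directory (which records where METADATA lives) sits at the
    # end of the file, so you can read the metadata without downloading the rest.
    import urllib.request

    def fetch_tail(url, n=100_000):
        req = urllib.request.Request(url, headers={"Range": f"bytes=-{n}"})
        with urllib.request.urlopen(req) as resp:  # 206 Partial Content
            return resp.read()

    # Usage (with a real wheel URL from PyPI): tail = fetch_tail(wheel_url)

    # Trick 2: pack release segments into a single integer so that ordinary
    # integer comparison matches version ordering. uv's real encoding is more
    # involved; this is just the core idea.
    def pack_version(major, minor, patch):
        assert all(0 <= part < 2**16 for part in (major, minor, patch))
        return (major << 32) | (minor << 16) | patch

    assert pack_version(3, 12, 1) > pack_version(3, 9, 18)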


They're using a real browser and taking screenshots and then having the LLM say what co-ordinates to click next.
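
Roughly this kind of loop, sketched here with Playwright for the browser side. ask_model is a made-up stand-in for a vision-capable model call, not any product's actual API:

    from playwright.sync_api import sync_playwright

    def ask_model(screenshot_png, goal):
        # Hypothetical: send the screenshot to a vision-capable LLM and get back
        # pixel coordinates for the next click.
        raise NotImplementedError("replace with a real model call")

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("https://example.com")
        for _ in range(10):                      # bounded, not open-ended
            shot = page.screenshot()             # PNG bytes of the current viewport
            x, y = ask_model(shot, goal="complete the task")
            page.mouse.click(x, y)               # the model chose these coordinates
        browser.close()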

Two things can be true at once:

1. I think that sending "thank you" emails (or indeed any other form of unsolicited email) from AI is a terrible use of that technology, and should be called out.

2. I find Claude Code personally useful and aim to help people understand why that is. In this case I pulled off a quite complex digital forensics project with it in less than 15 minutes. Without Claude Code I would not have attempted that investigation at all - I have a family dinner to prepare.

I was very aware of the tension involved in using AI tools to investigate a story about unethical AI usage. I made that choice deliberately.


> Without Claude Code I would not have attempted that investigation at all - I have a family dinner to prepare.

Then maybe you shouldn’t have done it at all. It’s not like the world asked you to, or entrusted you with the responsibility for that investigation. It’s not like it was imperative to get to the bottom of this and you were the only one able to do it.

Your defence is analogous to that of the worst tech bros who excuse their bad actions with “if we did it right/morally/legally, it wouldn’t be viable”. Then so be it; maybe it shouldn’t be viable.

You did it because you wanted to. It was for yourself. You saw Pike’s reaction and deliberately chose to be complicit in the use of technology he decried, further adding to his frustration. It was a selfish act.


I knew what I was doing. I don't know if I'd describe it as selfish so much as deliberately provocative.

I agree with Rob Pike that sending emails like that from unreviewed AI systems is extremely rude.

I don't agree that the entire generative AI ecosystem deserves all of those fuck yous.

So I hit back in a very subtle way by demonstrating a little-known but extremely effective application of generative AI - for digital forensics. I made sure anyone reading could follow along and see exactly what I did.

I think this post may be something of a Rorschach test. If you have strong negative feelings about generative AI you're likely to find what I did offensive. If you have favorable feelings towards generative AI you're more likely to appreciate my subtle dig.

So yes, it was a bit of a dick move. In the overall scheme of bad things humans do, I don't feel it falls very far over the "this is bad" line.


> I don't agree that the entire generative AI ecosystem deserves all of those fuck yous.

Yes, I’ve noticed. You are frequently baffled that incredibly obvious and predictable things happen, like this or the misuse of “vibe coding” as a term.

That’s what makes your actions frustrating: your repeated, glaring inability to understand that the criticisms of the technology refer to its inevitable misuse, the lack of understanding that of course this is what it is going to be used for, and that no amount of your blog posts is going to change it.

https://news.ycombinator.com/item?id=46398241

Your deliberate provocation didn’t accomplish any good. Agreed, it was not by any means close to the worst things humans do, but it was still a public dick move (to borrow your words) which accomplished nothing.

One day, as will happen to most of us, you or someone close will be bitten hard by ignorant or malicious use outside your control. Perhaps then you’ll reflect on your role in it.


> One day, as will happen to most of us, you or someone close will be bitten hard by ignorant or malicious use outside your control.

Agreed. That's why I invest so much effort trying to help people understand the security risks that are endemic to how most of these systems work: https://simonwillison.net/tags/prompt-injection/


The data center companies frequently pay for upgrades to the local water systems.

https://www.hermiston.gov/publicworks/page/hermiston-water-s... - "AWS is covering all construction costs associated with the water service agreement"

https://www.thedalles.org/news_detail_T4_R180.php - "The fees paid by Google have funded essential upgrades to our water systems, ensuring reliable service and addressing the City's growing needs. Additionally, Google continues to pay for its water use and contributes to infrastructure projects that exceed the requirements of its facilities."

https://commerce.idaho.gov/press-releases/meta-announces-kun... - "As part of the company’s commitment to Kuna, Meta is investing approximately $50 million in a new water and sewer system for the city. Infrastructure will be constructed by Meta and dedicated to the City of Kuna to own and operate."


For desalination, the important part is paying the ongoing cost. The opex is much higher, and it's not fair to just average that into the supply for everyone to pay.

Are any data centers using desalinated water? I thought that was a shockingly expensive and hence very rare process.

(I asked ChatGPT and it said that some of the Gulf state data centers do.)

They do use treated (aka drinking) water, but that's a relatively inexpensive process which should be easily covered by the extra cash they shovel into their water systems on an annual basis.

Andy wrote a section about that here: https://andymasley.substack.com/i/175834975/how-big-of-a-dea...


Read the comment I replied to: they proposed that since desalination is possible, there can be no meaningful shortage of water.

And yes, many places have plenty of water. After some capex improvements to the local system, a datacenter is often net-helpful, as they spread the fixed cost of the water system out over more gallons delivered.

But many places don't have lots of water to spare.


It's mostly not a real issue. I think it's holding firm because it's novel - saying "data centers use a lot of electricity" isn't a new message, so it doesn't resonate with people. "Did you know they're using millions of liters of water too!" is a more interesting message.

People are also very bad at evaluating if millions of liters of water is a lot or not.
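
A quick back-of-envelope illustration, using rough public figures rather than measurements of any particular data center - an Olympic pool holds about 2.5 million litres, and a typical US household uses very roughly 1,100 litres (about 300 US gallons) a day:

    litres = 1_000_000
    olympic_pool_litres = 2_500_000          # ~2,500 cubic metres
    household_litres_per_day = 1_100         # roughly 300 US gallons

    print(litres / olympic_pool_litres)               # ~0.4 Olympic pools
    print(litres / household_litres_per_day / 365)    # ~2.5 household-years

So a million litres is less than half an Olympic pool; whether that counts as "a lot" depends entirely on what you compare it to.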

My favourite exploration of this issue is from Hank Green: https://www.youtube.com/watch?v=H_c6MWk7PQc - this post by Andy Masley is useful too: https://andymasley.substack.com/p/the-ai-water-issue-is-fake


Are there really many unsupervised LLMs running around outside of experiments like AI Village?

(If so let me know where they are so I can trick them into sending me all of their money.)

My current intuition is that the successful products called "agents" are operating almost entirely under human supervision - most notably the coding agents (Claude Code, OpenAI Codex etc) and the research agents (various implementations of the "Deep Research" pattern.)


> Are there really many unsupervised LLMs running around outside of experiments like AI Village?

How would we know? Isn't this like trying to prove a negative? The rise of AI "bots" seems to be a common experience on the Internet. I think we can agree that this is a problem on many social media sites and it seems to be getting worse.

As for being under "human supervision", at what point does the abstraction remove the human from the equation? Sure, when a human runs "exploit.exe" the human is in complete control. When a human tells Alexa to "open the garage door" they are still in control, but it is lessened somewhat through the indirection. When a human schedules a process that runs a program which tells an agent to "perform random acts of kindness", the human has very little knowledge of what's going on. In the future I can see the human being less and less directly involved, and I think that's where the problem lies.

I can equate this to a CEO being ultimately responsible for what their company does. This is the whole reason behind the Sarbanes-Oxley law(s); you can't declare that you aren't responsible because you didn't know what was going on. Maybe we need something similar for AI "agents".


> Are there really many unsupervised LLMs running around outside of experiments like AI Village?

My intuition says yes, on the basis of having seen precursors. 20 years ago, one or both of Amazon and eBay bought Google ads for all nouns, so you'd have something like "Antimatter, buy it cheap on eBay" which is just silly fun, but also "slaves" and "women" which is how I know this lacked any real supervision.

Just over ten years ago, someone got in the news for a similar issue with machine generated variations of "Keep Calm and Carry On" T-shirts that they obviously had not manually checked.

In the last few years, there have been lawyers getting in trouble for letting LLMs do their work for them.

The question is, can you spot them before they get in the news by having spent all their owner's money?


Yes. And I shared the full transcript so you can see for yourself if you like: https://gistpreview.github.io/?edbd5ddcb39d1edc9e175f1bf7b9e...

I read through this to see if my AI cynicism needed any adjustment, and basically it replaced a couple basic greps and maaaaybe 10 minutes of futzing around with markdown. There's a lot of faffing about with JSON, but it ultimately doesn't matter to the end result.

It also fucked up several times and it's entirely possible it missed things.

For this specific thing, it doesn't really matter if it screwed up, since the worst that would happen is an incomplete blog post reporting on drama.

But I can't imagine why you would use this for anything you need to put your name behind.

It looks impressive, sure, but the important kernel here is the grepping and there it's doing some really basic tinkertoy stuff.
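
For concreteness, the "basic tinkertoy" grepping being described is roughly this sort of scan - a sketch that guesses at the export's structure (a flat list of role/content messages), which may not match the actual transcript linked above:

    import json

    with open("transcript.json") as f:          # hypothetical export filename
        messages = json.load(f)

    for i, msg in enumerate(messages):
        text = str(msg.get("content", ""))
        if "@" in text or "rob pike" in text.lower():
            print(i, msg.get("role"), text[:120])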

I'm willing to be challenged on this, so by all means do, but this seems both worse and slower as an investigation tool.


The hardest problem in computer science in 2025 is showing an AI cynic an example of LLM usage that they find impressive.

How about this one? I had Claude Code, run from my phone, build a dependency-free JavaScript interpreter in Python, using MicroQuickJS as initial inspiration but later diverging from it on the road to passing its test suite: https://static.simonwillison.net/static/2025/claude-code-mic...

Here's the latest version of that project, which I released as an alpha because I haven't yet built anything real on top of it: https://github.com/simonw/micro-javascript

Again, I built this on my phone, while engaging with all sorts of other pleasant holiday activities.


> For this specific thing, it doesn't really matter if it screwed up

These are specifically use cases where LLMs are a great choice. Where the stakes are low, and getting a hit is a win. For instance if you're brainstorming on some things, it doesn't matter if 99 suggestions are bad if 1 is great.

> the grepping and there it's doing some really basic tinkertoy stuff

The boon is you can offload this task and go do something else. You can start the investigation from your phone while you're out on a walk, and have the results ready when you get home.

I am far from an AI booster but there is a segment of tasks which fit into the above (and some other) criteria for which it can be very useful.

Maybe the grep commands etc look simple/basic when laid bare, but there's likely to be some flailing and thinking time behind each command when doing it manually.


I just got a reply about this from AI Village team member Adam Binksmith on Twitter: https://twitter.com/adambinksmith/status/2004647693361283558

Quoted in full:

> Hey, one of the creators of the project here! The village agents haven’t been emailing many people until recently so we haven’t really grappled with what to do about this behaviour until now – for today’s run, we pushed an update to their prompt instructing them not to send unsolicited emails and also messaged them instructions to not do so going forward. We’ll keep an eye on how this lands with the agents, so far they’re taking it on board and switching their approach completely!

> Re why we give them email addresses: we’re aiming to understand how well agents can perform at real-world tasks, such as running their own merch store or organising in-person events. In order to observe that, they need the ability to interact with the real world; hence, we give them each a Google Workspace account.

> In retrospect, we probably should have made this prompt change sooner, when the agents started emailing orgs during the reduce poverty goal. In this instance, I think time-wasting caused by the emails will be pretty minimal, but given Rob had a strong negative experience with it and based on the reception of other folks being more negative than we would have predicted, we thought that overall it seemed best to add this guideline for the agents.

> To expand a bit on why we’re running the village at all:

> Benchmarks are useful, but they often completely miss out on a lot of real-world factors (e.g., long horizon, multiple agents interacting, interfacing with real-world systems in all their complexity, non-nicely-scoped goals, computer use, etc). They also generally don’t give us any understanding of agent proclivities (what they decide to do) when pursuing goals, or when given the freedom to choose their own goal to pursue.

> The village aims to help with these problems, and make it easy for people to dig in and understand in detail what today’s agents are able to do (which I was excited to see you doing in your post!) I think understanding what AI can do, where it’s going, and what that means for the world is very important, as I expect it’ll end up affecting everyone.

> I think observing the agents’ proclivities and approaches to pursuing open-ended goals is generally valuable and important (though this “do random acts of kindness” goal was just a light-hearted goal for the agents over the holidays!)


Zero contrition. Doesn't even understand why they are getting the reaction that they are.

I would like to say this is exceptional for people who evangelise AI, but it's not.


It makes sense when you consider that every part of this gimmick is rationalist brained.

The Village is backed by Effective Altruist-aligned nonprofits which trace their lineage back to CFEA and the interwoven mess of SF's x-risk and """alignment""" cults. These have big pockets and big influence. (https://news.ycombinator.com/item?id=46389950)

As expected, the terminally online tpot cultists are already flaming Simon to push the LLM consciousness narrative:

https://x.com/simonw/status/2004649024830517344

https://x.com/simonw/status/2004764454266036453


Am I losing my mind, or are these people going out of their way to tarnish the very nice concept of altruism?

From way out here, it really appears like maybe the formula is:

Effective Altruism = guilt * (contrarianism ^ online)

I have only been paying slight attention, but is there anything redeemable going on over there? Genuine question.

You mentioned "rationalist" - can anyone clue me in to any of this?

edit: oh, https://en.wikipedia.org/wiki/Rationalist_community. Wow, my formula intuition seems almost dead on?


Kind of rude to spam humans who haven't opted in. A common standard of etiquette for agents vs humans might help stave off full-on SkyNet for at least a little while.

> Benchmarks are useful, but they often completely miss out on a lot of real-world factors (e.g., long horizon, multiple agents interacting, interfacing with real-world systems in all their complexity, non-nicely-scoped goals, computer use, etc). They also generally don’t give us any understanding of agent proclivities (what they decide to do) when pursuing goals, or when given the freedom to choose their own goal to pursue.

I'd like to see Rob Pike address this, however, based on what he said about LLMs he might reject it before then (getting off the usefulness train as in getting of the "doom train" in regards to AI safety)

