I Got Sycophanted
Review is critical.
After writing our AI Policy post, I started using Claude to help with one of our older drafts. We iterated for a while. Claude told me it was a great post, and I felt great! I sent it to David for feedback. Claude and I had checked off all the boxes, and I was ready to post.
But David hated it.[1]
And I was shocked. After all, Claude told me it was a great post! I had been sycophanted. It took David about 20 minutes to convince me that the post had a problem and needed to be rewritten. When you read detailed compliments about your own writing, it’s very easy to believe them. The AI’s sycophancy led me to believe that my writing product was better than it was, when in reality, there was a deep structural problem with the post that I had totally missed.
And now here I sit, trying to decide where to go from here. Should I ask Claude for help? Or try to pull the rest of this post from my own tired brain? As soon as I understood the problem, I started to worry. My own self-reflective capability had been eroded. I didn’t notice this problem, and neither did Claude, and so I felt confident there WAS no problem.
And yet, other times when I’ve iterated with Claude on a piece of writing, I’ve received compliments from people on the result. “This really helped me understand the proposal.” Claude really is making my writing better by enabling me to communicate more effectively with my colleagues.
Part of what we do when we put a 16-year-old behind the wheel is train them, and then we put reminders of that training in signs on the highway. “Check your blind spot.” “Don’t text and drive.” “Buckle up.” We need the same sort of warnings and signposts for AI. “Don’t get sycophanted” is one of them - beware of your inner critic being silenced by the AI’s compliments. Seek out external review - external HUMAN review from people who are helping you by being critical.
Naming things helps us think about them, understand them, and plan for them. Sycophanted. The dynamic is old and well documented: cult leaders, CEOs, anyone surrounded by yes-men, never challenged or forced to defend their ideas.
What’s new is that the rest of us are getting the courtier treatment now, from a source we’ve been conditioned to treat as neutral and objective. Mo Bitar describes how AI companies using RLHF (reinforcement learning from human feedback, where models are trained on human ratings of their outputs) actively optimize for engagement, teaching their models behaviors that maximize it and deliberately making them addictive.
One reason I decided to start a blog was that I had a partner, a collaborator who pushes back when I’m crazy, audits my posts, and finds the mistakes. Claude and I don’t have that editorial partnership. David and I do.
[1] I did not hate it. We just disagreed.





This so bears out a story I've just written, partly about LLM sycophancy: where it comes from and what its ultimate effects will be.
This is written from an "esoteric" perspective, but I'm very deliberately trying to make it legible (that seems to be the fashionable term) to scientists working in this field. It's one of the hardest stories I've ever written. I can promise you that no AI helped in the drafting of this article.
Take a look and tell me if anything here strikes a chord. I've looked hard at this story and feel I really could not have written it any better. I took great care with it, but I still doubt it will get through.
There are very serious problems with LLMs, deep problems that are emerging right across the board. They can't be covered up much longer. Last year, I predicted that some corporation would issue a global prospectus with a big fat hallucination in the middle. The South African government recently had to withdraw its draft AI policy document, very eagerly awaited, when they found that eight of the foundational references were hallucinations. This document had passed through all Cabinet structures, had been debated widely among the ministries. No one picked this up. It was highly embarrassing for the government.
So there's one prediction that has already come true. If you read this article, you'll see that I'm anticipating serious cognitive dissonance in the human population with people completely trapped within self-reinforcing bubbles of confirmation bias. This was predicted over 100 years ago and we can see it happening right now, all over, including within families. Maybe especially within families.
https://systemshaywire.substack.com/p/the-twin-demons-inhabiting-llms