Do I Trust Claude 3.5 Sonnet?

This week, for the first time, I paid for a subscription to an LLM. I know economists who have been on the paid tier of OpenAI’s ChatGPT since 2023, using it for both research and teaching tasks.

I did publish a paper on the mistakes it makes: “ChatGPT Hallucinates Nonexistent Citations: Evidence from Economics.” In a behavioral paper, I used it as a stand-in for AI: “Do People Trust Humans More Than ChatGPT?”

I have nothing against ChatGPT. For various reasons, I never paid for it, even though I used it occasionally for routine work or for writing drafts. Perhaps if I were on the paid tier of something else already, I would have resisted paying for Claude.  

Yesterday, I made an account with Claude to try it out for free. Claude and I started working together on a paper I’m revising. Claude was doing excellent work and then I ran out of free credits. I want to finish the revision this week, so I decided to start paying $20/month.

Here’s a little snapshot of our conversation. Claude is writing R code which I run in RStudio to update graphs in my paper.

This coding work is something I used to do myself (with internet searches for help). Have I been 10x-ed? Maybe I’ve been 2x-ed.

I’ll refer to Zuckerberg via Dwarkesh (which I’ve blogged about before):

Dwarkesh Patel 00:13:05: But in the end case: Llama-10. 

Mark Zuckerberg: I think that there’s a lot baked into that question. I’m not sure that we’re replacing people as much as we’re giving people tools to do more stuff.

Dwarkesh Patel: Is the programmer in this building 10x more productive after Llama-10?

Mark Zuckerberg: I would hope more. I don’t believe that there’s a single threshold of intelligence for humanity because people have different skills. I think that at some point AI is probably going to surpass people at most of those things, depending on how powerful the models are. But I think it’s progressive and I don’t think AGI is one thing. You’re basically adding different capabilities. 

Is Claude the superintelligence? It hasn’t signed my kid up for summer camp yet. For now, I’m content that the coding is getting done.

Read more about model performance: Zvi on Claude Sonnet 3.5 (HT: Tyler)

I don’t usually link to something completely paywalled, but this is interesting: “Does the AI Business Work if They’re Paying for Content?”

Do I trust Claude 3.5 to do work tasks for me? Yes, although I still need to do fact-checking when applicable.

What about, in light of that terrible debate on Thursday night, just throwing in the towel and making Claude 3.5 President of the United States? That debate was everything bad: a humiliation of the human species. It’s not just that non-Americans are laughing at me. An anthropomorphized AI would be laughing at all of us. (It was embarrassing when Biden was 4 years younger, too, by the way.)

We are realizing that Biden does have an ego. He has acted in a fairly humble way, at certain times. But he is human, after all. He’s no Claude 3.5.

The humans failed at exactly what the machines are beginning to excel at: giving a coherent informed answer to a question. Before we get too carried away, I’m going to say that we should elect a human president (still). Something that might bother you if you think about it is how good the bots are at insulting us humans and preying on our insecurities. For example, see what I got this meme generator to do.

We do not know the AI critters very well yet.

Feeling a need to tap the sign. Wherever you are, you don’t love your country because it’s great. It will be great if you love it. Humans for Human 2024.
