Can researchers recruit human subjects online to take surveys anymore?

The experimental economics world still collects data in traditional physical labs with human subjects who show up in person. That remains the gold standard, but it is expensive per observation. Many researchers, including myself, also run projects with subjects recruited online because the cost per observation is much lower.

As I remember it, the first platform to be widely used was Mechanical Turk. Before 2022, the attitude toward MTurk had already changed: it became known in the behavioral research community that MTurk had too many bots and bad actors. MTurk was not designed for researchers, so perhaps it is not surprising that it did not serve our purposes.

The Prolific platform has had a good reputation for a few years. You have to pay to use Prolific, but the cost per observation is still much lower than running a traditional physical laboratory or paying Americans to show up for an appointment. Prolific is especially attractive if the experiment is short and does not require a long span of attention from human subjects.

Here is a new paper on whether supposedly human subjects are going to be reliably human in the future: “Detecting the corruption of online questionnaires by artificial intelligence.”


Literature Review is a Difficult Intellectual Task

As I was reading through What is Real?, it occurred to me that I’d like a literature review on an issue. I thought, “Experimental physics is like experimental economics. You can sometimes predict what groups or ‘markets’ will do. However, it’s hard to predict exactly what an individual human will do.” I would like to know who has written a short article on this topic.

I decided to feed the following prompt into several LLMs: “What economist has written about the following issue: Economics is like physics in the sense that predictions about large groups are easier to make than predictions about the smallest, atomic if you will, components of the whole.”
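(As an aside: I pasted the prompt into each chat window by hand, but if you wanted to script the same question against a model API, a minimal sketch with the OpenAI Python client would look something like the following. The model name is just an example, and you would need your own API key.)

```python
# Minimal sketch: send the same literature-review prompt to a model API.
# Assumes the openai package (>= 1.0) is installed and OPENAI_API_KEY is set.
from openai import OpenAI

client = OpenAI()

PROMPT = (
    "What economist has written about the following issue: Economics is like "
    "physics in the sense that predictions about large groups are easier to make "
    "than predictions about the smallest, atomic if you will, components of the whole."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name; substitute whatever you have access to
    messages=[{"role": "user", "content": PROMPT}],
)

print(response.choices[0].message.content)
```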

First, ChatGPT (free version; I think I was on “GPT-4o mini (July 18, 2024)”):

I get the sense from my experience that ChatGPT often references Keynes. Based on my research, I think that’s because there are a lot of mentions of Keynes’s books in the model training data. (See “ChatGPT Hallucinates Nonexistent Citations: Evidence from Economics.”)

Next, I asked ChatGPT, “What is the best article for me to read to learn more?” It gave me five items. Item 2 was “Foundations of Economic Analysis” by Paul Samuelson, which would likely be helpful, but it’s from 1947. I’d like something more recent that addresses the rise of empirical and experimental economics.

Item 5 was: “‘Physics Envy in Economics’ (various authors): You can search for articles or papers on this topic, which often discuss the parallels between economic modeling and physics.” Interestingly, ChatGPT is telling me to Google my question. That’s not bad advice, but I find it funny given the new competition between LLMs and “classic” search engines.

When I pressed it further for a current article, ChatGPT gave me a link to an NBER paper that was not very relevant. I could have tried harder to refine my prompts, but I was not immediately impressed. ChatGPT seemed to have a heavy bias toward starting with famous books and papers, as opposed to finding something for me to read that would answer my specific question.

I gave Claude (paid) a try. Claude recommended, “If you’re interested in exploring this idea further, you might want to look into Hayek’s works, particularly ‘The Use of Knowledge in Society’ (1945) and ‘The Pretense of Knowledge’ (1974), his Nobel Prize lecture.” Again, I might have gotten a better response if I had kept refining my prompt, but Claude also seemed to respond initially by tossing out famous old books.


“Writing with ChatGPT” Buchanan Seminar on YouTube

I was pleased to be a (virtual) guest speaker for Plateau State University in Nigeria. My host was (Emergent Ventures winner) Nnaemeka Emmanuel Nnadi. The talk is up on YouTube with the following timestamp breakdown:

During the first ten minutes of the video, Ashen Ruth Musa gives an overview called “The Bace People: Location, Culture, Tourist Attraction.”

Then I introduce LLMs and my topic.

Minutes 19:00–29:00 are a presentation of the paper “ChatGPT Hallucinates Nonexistent Citations: Evidence from Economics.”

Minutes 23:30–34:00 are a summary of my paper “Do People Trust Humans More Than ChatGPT?”


Human Capital is Technologically Contingent

The seminal paper in the theory of human capital is by Paul Romer. In it, he recognizes different types of human capital, such as physical skills, educational skills, and work experience. Subsequent macro papers in the literature often just clumped measures of human capital together as if it were a single substance. There were a lot of cross-country real GDP per capita comparison papers that included determinants like ‘years of schooling’, ‘IQ’, and the like.

But more recent papers have been more detailed. For example, the average biological difference between men and women in brawn has been shown to be a determinant of occupational choice. If we believe in comparative advantage, then occupational sorting by human capital is the theoretical outcome. That’s exactly what we see in the data.

Similarly, my own forthcoming paper on the 19th-century US deaf population shows that people with reduced or absent hearing were, on average, less likely to work in management and commercial occupations and were less commonly found in industries that required strong verbal skills.

Clearly, there are different types of human capital, and they matter differently for different jobs. Technology also changes which skills are necessary. This post shares some thoughts about how to think about human capital and technology. The easiest way to illustrate the points is with a simplified example.
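To preview the logic, here is a minimal sketch of my own (hypothetical numbers in Python, not taken from any of the papers above) showing how the same skill endowments can sort into different occupations once technology changes the return to brawn:

```python
# Two workers with fixed skill endowments choose between two jobs.
# Wages are skill-weighted sums; the weights stand in for technology.
workers = {
    "brawny_worker": {"brawn": 9, "verbal": 4},
    "verbal_worker": {"brawn": 4, "verbal": 9},
}

def wage(skills, weights):
    return skills["brawn"] * weights["brawn"] + skills["verbal"] * weights["verbal"]

def assign_jobs(job_weights):
    for name, skills in workers.items():
        best_job = max(job_weights, key=lambda job: wage(skills, job_weights[job]))
        print(f"  {name} -> {best_job}")

# Early technology: manual work pays a high premium on brawn.
print("Early technology:")
assign_jobs({
    "manual":     {"brawn": 1.0, "verbal": 0.2},
    "commercial": {"brawn": 0.2, "verbal": 1.0},
})

# Later technology: machines substitute for brawn, so its wage weight falls.
print("Later technology:")
assign_jobs({
    "manual":     {"brawn": 0.3, "verbal": 0.2},
    "commercial": {"brawn": 0.2, "verbal": 1.0},
})
```

Under the early weights the brawny worker takes the manual job; once technology devalues brawn, the same endowment of human capital sorts into the commercial job. Nothing about the worker changed, only the technology that prices the skills.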


Sources on AI Use of Information

1. “Consent in Crisis: The Rapid Decline of the AI Data Commons”

Abstract: General-purpose artificial intelligence (AI) systems are built on massive swathes of public web data, assembled into corpora such as C4, RefinedWeb, and Dolma. To our knowledge, we conduct the first, large-scale, longitudinal audit of the consent protocols for the web domains underlying AI training corpora. Our audit of 14,000 web domains provides an expansive view of crawlable web data and how consent preferences to use it are changing over time. We observe a proliferation of AI-specific clauses to limit use, acute differences in restrictions on AI developers, as well as general inconsistencies between websites’ expressed intentions in their Terms of Service and their robots.txt. We diagnose these as symptoms of ineffective web protocols, not designed to cope with the widespread re-purposing of the internet for AI. Our longitudinal analyses show that in a single year (2023-2024) there has been a rapid crescendo of data restrictions from web sources, rendering ~5%+ of all tokens in C4, or 28%+ of the most actively maintained, critical sources in C4, fully restricted from use. For Terms of Service crawling restrictions, a full 45% of C4 is now restricted. If respected or enforced, these restrictions are rapidly biasing the diversity, freshness, and scaling laws for general-purpose AI systems. We hope to illustrate the emerging crisis in data consent, foreclosing much of the open web, not only for commercial AI, but non-commercial AI and academic purposes.

AI is taking, out of a commons, information that was provisioned under a different set of rules and technology. See the discussion on Y Combinator.
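The robots.txt side of this is easy to spot-check yourself. Here is a minimal sketch using only the Python standard library (the domain and crawler names are examples I picked, not anything from the paper):

```python
# Check whether a site's robots.txt allows some well-known crawlers.
# Standard library only; the domain and user agents below are examples.
from urllib.robotparser import RobotFileParser

SITE = "https://www.nytimes.com"          # example domain
BOTS = ["GPTBot", "CCBot", "Googlebot"]   # example crawler user agents

parser = RobotFileParser()
parser.set_url(SITE + "/robots.txt")
parser.read()  # fetch and parse the live robots.txt

for bot in BOTS:
    allowed = parser.can_fetch(bot, SITE + "/")
    print(f"{bot:>10} may crawl {SITE}/ : {allowed}")
```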

2. “ChatGPT-maker braces for fight with New York Times and authors on ‘fair use’ of copyrighted works” (AP, January ’24)

3. Partly handy as a collection of references: “How Generative AI Turns Copyright Upside Down” by a law professor. “While courts are litigating many copyright issues involving generative AI, from who owns AI-generated works to the fair use of training to infringement by AI outputs, the most fundamental changes generative AI will bring to copyright law don’t fit in any of those categories…”

4. A new gated NBER paper by Josh Gans “examines this issue from an economics perspective.”

Joy: AI companies have money. Could we be headed toward a world where OpenAI has some paid writers on staff? Replenishing the commons is relatively cheap if done strategically, relative to the money being raised by AI companies. Jeff Bezos bought the Washington Post for a fraction of his tech fortune (about $250 million). Elon Musk bought Twitter. Sam Altman is rich enough to help keep the NYT churning out articles. Because there are several competing commercial models, however, the owners of LLM products face a commons problem: if Altman pays the NYT to keep operating, then Anthropic gets the benefit, too. Arguably, good writing is already under-provisioned, even aside from LLMs.

GLIF Social Media Memes

Wojak Meme Generator from Glif will build you a funny meme from a short phrase or single-word prompt. Note that it is built to be derogatory and cruel for sport, and it may hallucinate falsehoods. (See the tweet announcement.)

I am fascinated by this from the angle of modern anthropology. The AI has learned all of this by studying what we write online. Someone can build an AI to make jokes and call out hypocrisy.

Here are GLIFs of the different social media user stereotypes as of 2024. Most of our current readers probably don’t need any captions for these memes, but I’ll provide a bit of sincere explanation to help everyone understand the jokes.

Twitter user: Person who posts short messages and follows others on the microblogging platform.

Facebook user: Individual with a profile on the social network for connecting with friends and sharing content.

Bluesky user: Early adopter of a decentralized social media platform focused on user control.


Is the Universe Legible to Intelligence?

I borrowed the following from the posted transcript. Bold emphasis added by me. This starts at about minute 36 of the podcast “Tyler Cowen – Hayek, Keynes, & Smith on AI, Animal Spirits, Anarchy, & Growth” with Dwarkesh Patel from January 2024.

Patel: We are talking about GPT-5 level models. What do you think will happen with GPT-6, GPT-7? Do you still think of it like having a bunch of RAs (research assistants) or does it seem like a different thing at some point?

Cowen: I’m not sure what those numbers going up mean or what a GPT-7 would look like or how much smarter it could get. I think people make too many assumptions there. It could be the real advantages are integrating it into workflows by things that are not better GPTs at all. And once you get to GPT, say 5.5, I’m not sure you can just turn up the dial on smarts and have it, for example, integrate general relativity and quantum mechanics.

Patel: Why not?

Cowen: I don’t think that’s how intelligence works. And this is a Hayekian point. And some of these problems, there just may be no answer. Like maybe the universe isn’t that legible. And if it’s not that legible, the GPT-11 doesn’t really make sense as a creature or whatever.

Patel (37:43) : Isn’t there a Hayekian argument to be made that, listen, you can have billions of copies of these things. Imagine the sort of decentralized order that could result, the amount of decentralized tacit knowledge that billions of copies talking to each other could have. That in and of itself is an argument to be made about the whole thing as an emergent order will be much more powerful than we’re anticipating.

Cowen: Well, I think it will be highly productive. What tacit knowledge means with AIs, I don’t think we understand yet. Is it by definition all non-tacit or does the fact that how GPT-4 works is not legible to us or even its creators so much? Does that mean it’s possessing of tacit knowledge or is it not knowledge? None of those categories are well thought out …

It might be significant that LLMs are no longer legible to their human creators. More significantly, the universe might not be legible to intelligence, at least of the kind that is trained on human writing. I (Joy) gathered a few more notes for myself.

A co-EV-winner has commented on this at Don’t Worry About the Vase:

(37:00) Tyler expresses skepticism that GPT-N can scale up its intelligence that far, that beyond 5.5 maybe integration with other systems matters more, and says ‘maybe the universe is not that legible.’ I essentially read this as Tyler engaging in superintelligence denialism, consistent with his idea that humans with very high intelligence are themselves overrated, and saying that there is no meaningful sense in which intelligence can much exceed generally smart human level other than perhaps literal clock speed.

I (Joy) took it more literally. I don’t see “superintelligence denialism.” I took it to mean that the universe is not legible to our brand of intelligence.

There is one other comment I found, from YouTuber @trucid2, in response to a short clip posted by @DwarkeshPatel:

Intelligence isn’t sufficient to solve this problem, but isn’t for the reason he stated. We know that GR and QM are inconsistent–it’s in the math. But the universe has no trouble deciding how to behave. It is consistent. That means a consistent theory that combines both is possible. The reason intelligence alone isn’t enough is that we’re missing data. There may be an infinite number of ways to combine QM and GR. Which is the correct one? You need data for that.

I saved myself a little time by writing the following with ChatGPT. If the GPT got something wrong in here, I’m not qualified to notice:

Newtonian physics gave an impression of a predictable, clockwork universe, leading many to believe that deeper exploration with more powerful microscopes would reveal even greater predictability. Contrary to this expectation, the advent of quantum mechanics revealed a bizarre, unpredictable micro-world. The more we learned, the stranger and less intuitive the universe became. This shift highlighted the limits of classical physics and the necessity of new theories to explain the fundamental nature of reality.
General Relativity (GR) and Quantum Mechanics (QM) are inconsistent because they describe the universe in fundamentally different ways and are based on different underlying principles. GR, formulated by Einstein, describes gravity as the curvature of spacetime caused by mass and energy, providing a deterministic framework for understanding large-scale phenomena like the motion of planets and the structure of galaxies. In contrast, QM governs the behavior of particles at the smallest scales, where probabilities and wave-particle duality dominate, and uncertainty is intrinsic.

The inconsistencies arise because:

  1. Mathematical Frameworks: GR is a classical field theory expressed through smooth, continuous spacetime, while QM relies on discrete probabilities and quantized fields. Integrating the continuous nature of GR with the discrete, probabilistic framework of QM has proven mathematically challenging.
  2. Singularities and Infinities: When applied to extreme conditions like black holes or the Big Bang, GR predicts singularities where physical quantities become infinite, which QM cannot handle. Conversely, when trying to apply quantum principles to gravity, the calculations often lead to non-renormalizable infinities, meaning they cannot be easily tamed or made sense of.
  3. Scales and Forces: GR works exceptionally well on macroscopic scales and with strong gravitational fields, while QM accurately describes subatomic scales and the other three fundamental forces (electromagnetic, weak nuclear, and strong nuclear). Merging these scales and forces into a coherent theory that works universally remains an unresolved problem.

Ultimately, the inconsistency suggests that a more fundamental theory, potentially a theory of quantum gravity like string theory or loop quantum gravity, is needed to reconcile the two frameworks.
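To put the contrast in symbols (my own addition, a standard textbook statement rather than anything from the ChatGPT text above): the Einstein field equations set classical spacetime geometry equal to a definite stress-energy source, while the best we can do with quantum matter is feed in an expectation value, as in semiclassical gravity.

```latex
% Classical general relativity: deterministic geometry (left) sourced by
% a definite stress-energy tensor (right).
G_{\mu\nu} + \Lambda g_{\mu\nu} = \frac{8\pi G}{c^{4}} \, T_{\mu\nu}

% Semiclassical compromise: keep the classical geometry but source it with
% the expectation value of the quantum stress-energy operator.
G_{\mu\nu} + \Lambda g_{\mu\nu} = \frac{8\pi G}{c^{4}} \, \langle \hat{T}_{\mu\nu} \rangle
```

The left-hand side is smooth classical geometry; the right-hand side, taken seriously as a quantum object, fluctuates, and no one has a fully satisfactory way to make the two sides match at all scales.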

P.S. I published “AI Doesn’t Mimic God’s Intelligence” at The Gospel Coalition. For now, at least, there is some higher plane of knowledge that we humans are not on. Will AI get there? Take us there? We don’t know.

Do I Trust Claude 3.5 Sonnet?

This week, for the first time, I paid for a subscription to an LLM. I know economists who have been on the paid tier of OpenAI’s ChatGPT since 2023, using it for both research and teaching tasks.

I did publish a paper on the mistakes it makes: “ChatGPT Hallucinates Nonexistent Citations: Evidence from Economics.” In a behavioral paper, I used it as a stand-in for AI: “Do People Trust Humans More Than ChatGPT?”

I have nothing against ChatGPT. For various reasons, I never paid for it, even though I used it occasionally for routine work or for writing drafts. Perhaps if I were already on the paid tier of something else, I would have resisted paying for Claude.

Yesterday, I made an account with Claude to try it out for free. Claude and I started working together on a paper I’m revising. Claude was doing excellent work and then I ran out of free credits. I want to finish the revision this week, so I decided to start paying $20/month.

Here’s a little snapshot of our conversation. Claude is writing R code which I run in RStudio to update graphs in my paper.

This coding work is something I used to do myself (with internet searches for help). Have I been 10x-ed? Maybe I’ve been 2x-ed.

I’ll refer to Zuckerberg via Dwarkesh (which I’ve blogged about before):


Latest from Leopold on AGI

When I give talks about AI, I often present my own research on ChatGPT muffing academic references. By the end, I make sure to present some evidence of how good ChatGPT can be, so that the audience walks away with the correct overall impression of where the technology is heading. On the topic of rapid advances in LLMs, interesting new claims from a person on the inside can be found in Leopold Aschenbrenner’s new article (book?) called “Situational Awareness.”
https://situational-awareness.ai/
PDF: https://situational-awareness.ai/wp-content/uploads/2024/06/situationalawareness.pdf

He argues that AGI is near and LLMs will surpass the smartest humans soon.

AI progress won’t stop at human-level. Hundreds of millions of AGIs could automate AI research, compressing a decade of algorithmic progress (5+ OOMs) into ≤1 year. We would rapidly go from human-level to vastly superhuman AI systems. The power—and the peril—of superintelligence would be dramatic.

Based on this assumption that AIs will surpass humans soon, he draws conclusions for national security and for how we should conduct AI research. (No, I have not read all of it.)

I dropped in that question and I’m not sure if anyone has, per se, an answer.

You can also get the talking version of Leopold’s paper in his podcast with Dwarkesh.

I’m also not sure if anyone is going to answer this one:

I might offer to contract out my services in the future based on my human instincts shaped by growing up on internet culture (i.e. I know when they are joking) and having an acute sense of irony. How is Artificial General Irony coming along?

Zuckerberg wants to solve general intelligence

Why does Mark Zuckerberg want to solve general intelligence? Well, for one thing, if he doesn’t, one of his competitors will have a better chatbot. Zuckerberg wants to be the best (and good for him). At his core, he wants to build the best stuff (even the world’s best cattle on his ranch).

If AGI is possible, it will get built. I’m not the first person to point out that this is a new space race. If America takes a pause, then someone else will get there first. However, I thought the Zuck interview was an interesting microcosm for why AGI, if possible, will get made.

… We started FAIR about 10 years ago. The idea was that, along the way to general intelligence or whatever you wanna call it, there are going to be all these different innovations and that’s going to just improve everything that we do. So we didn’t conceive of it as a product. It was more of a research group. Over the last 10 years it has created a lot of different things that have improved all of our products. …
There’s obviously a big change in the last few years with ChatGPT and the diffusion models around image creation coming out. This is some pretty wild stuff that is pretty clearly going to affect how people interact with every app that’s out there. At that point we started a second group, the gen AI group, with the goal of bringing that stuff into our products and building leading foundation models that would power all these different products.
… There’s also basic assistant functionality, whether it’s for our apps or the smart glasses or VR. So it wasn’t completely clear at first that you were going to need full AGI to be able to support those use cases. But in all these subtle ways, through working on them, I think it’s actually become clear that you do. …
Reasoning is another example. Maybe you want to chat with a creator or you’re a business and you’re trying to interact with a customer. That interaction is not just like “okay, the person sends you a message and you just reply.” It’s a multi-step interaction where you’re trying to think through “how do I accomplish the person’s goals?” A lot of times when a customer comes, they don’t necessarily know exactly what they’re looking for or how to ask their questions. So it’s not really the job of the AI to just respond to the question.
You need to kind of think about it more holistically. It really becomes a reasoning problem. So if someone else solves reasoning, or makes good advances on reasoning, and we’re sitting here with a basic chat bot, then our product is lame compared to what other people are building. At the end of the day, we basically realized we’ve got to solve general intelligence… (emphasis mine)

Credit to Dwarkesh Patel for this excellent interview. Credit to M.Z. for sharing his thoughts on topics that affect the world.

“We’ve got to solve general intelligence.” If a competitor solves AGI first, then you are left behind. No one would turn down general intelligence on their team, on the assumption that it can be controlled.

I would like the AGI to do my chores for me, please. Unfortunately, it’s more likely to be able to write my blog posts first.