My Perfunctory Intern

A couple years ago, my co-blogger Mike described his productive but novice intern. The helper could summarize expert opinion but had no real understanding of their own. To boot, they were fast and tireless. Of course, he was talking about ChatGPT. Joy has also written in multiple places about the errors made by ChatGPT, including fake citations.

I use ChatGPT Pro, which has web access, and my experience is that it is not so tireless. Much like Mike, I have used ChatGPT to help me write Python code. I know the basics of Python and can read a lot of it. However, the multitude of methods and possible arguments are not nestled firmly in my skull. I’m much faster at reading Python code than at writing it. Therefore, ChatGPT has been amazing… Mostly.

I have found that ChatGPT is more like an intern than many suppose:

Continue reading

Counting Hallucinations by Web-Enabled LLMs

In 2023, we gathered the data for what became “ChatGPT Hallucinates Nonexistent Citations: Evidence from Economics.” Since then, LLM use has increased. A 2025 survey from Elon University estimates that half of Americans now use LLMs. In the Spring of 2025, we used the same prompts, based on the JEL categories, to obtain a comprehensive set of responses from LLMs about topics in economics.

Our new report on the state of citations is available at SSRN: “LLM Hallucination of Citations in Economics Persists with Web-Enabled Models.”

What did we find? Would you expect the models to have improved since 2023? LLMs have gotten better and are passing ever more of what used to be considered difficult tests. (Remember the Turing Test? Anyone?) ChatGPT can pass the bar exam for new lawyers. And yet, if you ask ChatGPT to write a document in the capacity of a lawyer, it will keep making the mistake of hallucinating fake references. Hence, we keep seeing headlines like, “A Utah lawyer was punished for filing a brief with ‘fake precedent’ made up by artificial intelligence.”

What we call GPT-4o WS (Web Search) in the figure below was queried in April 2025. This “web-enabled” language model is enhanced with real-time internet access, allowing it to retrieve up-to-date information rather than relying solely on static training data. This means it can answer questions about current events, verify facts, and provide live data—something traditional models, which are limited to their last training cutoff, cannot do. While standard models generate responses based on patterns learned from past data, web-enabled models can supplement that with fresh, sourced content from the web, improving accuracy for time-sensitive or niche topics.

At least one third of the references provided by GPT-4o WS were not real! Performance has not significantly improved to the point where AI can write our papers with properly incorporated attribution of ideas. We also found that the web-enabled model would pull from lower quality sources like Investopedia even when we explicitly stated in the prompt, “include citations from published papers. Provide the citations in a separate list, with author, year in parentheses, and journal for each citation.” Even some of the sources that were not journal articles were cited incorrectly. We provide specific examples in our paper.

In closing, consider this quote from an interview with Jack Clark, co-founder of Anthropic:

The best they had was a 60 percent success rate. If I have my baby, and I give her a robot butler that has a 60 percent accuracy rate at holding things, including the baby, I’m not buying the butler.

US State Growth Statistics 2005-2024

In macroeconomics we have basic tools to help us talk about economic growth, which is simply the percent change in RGDP per capita. What causes growth? Lots of things. All else constant, if more people are employed, then more will be produced. But the productivity of those workers matters too. That’s why we calculate average labor productivity (ALP), which is the GDP per worker. This tells us how much each worker produces. All else constant, more ALP means more GDP.*
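To make the definitions concrete, here is a minimal Python sketch of the growth arithmetic. All of the numbers are hypothetical and chosen only for illustration; they are not data for any actual state. The sketch also checks the accounting identity that RGDP per capita equals ALP times the employment-population ratio.

```python
import math

def pct_change(old, new):
    """Percent change from old to new."""
    return (new - old) / old * 100.0

# Hypothetical state-level figures (illustrative only, not real data).
rgdp_start, rgdp_end = 400e9, 520e9          # real GDP in constant dollars
pop_start, pop_end = 5.0e6, 5.5e6            # population
workers_start, workers_end = 2.4e6, 2.75e6   # employed workers

# RGDP per capita and its growth rate.
pc_start = rgdp_start / pop_start
pc_end = rgdp_end / pop_end
growth = pct_change(pc_start, pc_end)

# Average labor productivity: output per worker.
alp_start = rgdp_start / workers_start
alp_end = rgdp_end / workers_end

# Employment-population ratio.
epop_start = workers_start / pop_start
epop_end = workers_end / pop_end

# Accounting identity: RGDP per capita = ALP x employment-population ratio.
identity_holds = math.isclose(pc_end, alp_end * epop_end)

print(f"RGDP per capita growth: {growth:.1f}%")
print(f"ALP growth: {pct_change(alp_start, alp_end):.1f}%")
print(f"Employment-population ratio: {epop_start:.2f} -> {epop_end:.2f}")
```

In this made-up example, per-capita growth comes partly from higher output per worker and partly from a larger share of the population working.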

What affects ALP? Nearly everything: Technology, demographics, health, culture, and public policy. Most of these have long-term effects. So, it’s better to think in terms of regimes. After all, incurring debt now can result in a lot of investment and production, but there’s no guarantee that it can be sustained year after year. This is why I don’t get terribly excited about individual good or bad policies at any moment. There’s a lot of ruin in a nation. I care more about the long-run policy regime that is fostered over time.

Given the variety of inputs to economic growth, there’s always plenty of room for complaint about policy – even if the economy is doing well. In this post, I’m inspired by a YouTube video that a student shared with me. The OP laments poor policy in Massachusetts. But compared to some other nearby states, MA is doing just fine economically. This is not the same as saying that the OP is wrong about poor policies. Rather, a regime of policy, technology, interests, etc. is built over time and there can be a lot wrong in growing economies.

In the interest of being comprehensive, this post includes basic growth stats for all states from 2005 through 2024 (the years of FRED-state GDP).** First, let’s start with the basic building blocks of population, employment, and RGDP. Institutions matter. Policy affects whether people migrate to/from the state, fertility, how many people are employed, and what they can produce.

People like to talk about migration and the flocking to Texas & Florida. But that fails to catch the people who choose to stay in their state. Utah is 43% more populous than it was 20 years ago. But you don’t hear much clamoring for their state policies. Idaho and Nevada also beat Florida in terms of percent change. Where are the calls to be like Idaho? Employment largely tracks population, though not perfectly. The RGDP numbers can change quickly with commodity prices, reflected in the performance of North Dakota. But remember, these numbers cover a 20-year span. So, any one blockbuster or dour year won’t move the rankings much.

Of course, these figures just set the stage. What about the employment-population ratio, ALP, and RGDP per capita? Read on.

Continue reading

Per Capita Consumption: 1990 Vs 2024

This is an update to a previous post that I did on per-capita real consumption in 1990 vs 2021. As of 2021, we still weren’t sure after the pandemic what was transitory vs structural, and it was unclear whether incomes would keep up with inflation. We now have three more years of data through 2024. News flash: We’re even richer.

I like to use the BEA real quantity indices. Those track what is actually consumed in volumes rather than by deflating total spending by price indices. Divided by population, we can calculate the real quantities of goods and services that people actually consumed per capita.

Even after the pandemic policies have settled down, we are still SO MUCH RICHER – and even richer than we were with all of the pandemic-related stimulus. The worst consumption category since the pandemic has been food and beverage for off-premise consumption, and that is *up* 4.6% since 2020, increasing 31% since 1990. So, while I understand that people can’t enjoy the low prices of yesteryear, we are still better off in that category than pre-pandemic. In the other categories, everything is awesome.

Since 1990, our consumption of communication services has risen 332%, our houses are 254% better furnished, and we have 118% greater quality-adjusted clothing consumption. All of this is already adjusted for inflation and is per-capita. Since the pandemic, these numbers are still up by 20.4%, 9.8%, and 31.1% respectively. People didn’t like the post-pandemic inflation. I get that. But these improvements in average consumption are mind-boggling.

Continue reading

We’re All Magical

The widespread availability and easy user interface of artificial intelligence (AI) has put great power at everyone’s fingertips. We can do magical things.

Before the internet existed, we would use books to help us better interpret the world. Communication among humans is hard. Expressing logic and even phenomena is complex. This is why social skills matter. Among other things, they help us to communicate. The most obvious example of a communication barrier is language. I remember having a pocket-sized English-Spanish dictionary that I used to help me memorize or query Spanish words. The book helped me communicate with others and to translate ideas from one language to another.

Math books do something similar but the translation is English-Math. We can get broader and say that all textbooks are translation devices. They define field-specific terms and ideas to help a person translate among topic domains, usually with a base-language that reaches a targeted generalizability. We can get extreme and say that all books are translators, communicating the content of one person’s head to another.

But sometimes the field-to-general language translation doesn’t work because readers don’t have an adequate grasp of either language. It isn’t necessarily that readers are generally illiterate. It may be that the level of generality and degree of focus of the translation isn’t right for the reader. Anyone who has ever tried to teach anything with math has encountered this. Students say that the book doesn’t translate clearly, and the communication fails. The book gets the reader’s numeracy or understood definitions wrong. Therefore, there is diversity among readers about how ‘good’ a textbook is.

Search engines are so useful because you can enter some keywords and find your destination, even if you don’t know the proper nouns or domain-specific terms. People used to memorize URLs and that’s becoming less common. Wikipedia is so great because if you want to learn about an idea, they usually explain it in 5 different ways. They tell the story of who created something and who they interacted with. They describe the motivation, the math, the logic, the developments, and usually include examples. Wikipedia translates domain-specific ideas to multiple general languages of different cognitive aptitudes or interests. It scatters links along the way to help users level-up their domain-specific understanding so that they can contextualize and translate the part that they care about.

Historical translation technology was largely for the audience. More recently, translation technology has empowered the transmitters.

Continue reading

Manufacturing Jobs of the Past

This post is co-written with John Olis, History major at Ave Maria University.

There is a popular myth that manufacturing jobs of the past provided a leg-up to young people. The myth goes like this. Manufacturing jobs had low barriers to entry so anyone could join. Once there, the job paid well and provided opportunities for fostering skills and a path toward long-term economic success. There is more to the myth, but let’s stop there for the moment. Is the myth true?

One of my students, John Olis, did a case study on Connecticut in 1920-1930 using cross sectional IPUMS data of white working age individuals to evaluate the ‘Manufacturing Myth’. We are not talking causal inference here, but the weight of the evidence is non-zero. The story above has some predictions if not outright theoretical assertions.

  1. Manufacturing jobs paid better than non-manufacturing jobs for people with less human capital.
  2. Manufacturing jobs yielded faster income growth than non-manufacturing jobs.
  3. Implicitly, manufacturing jobs provided faster income growth for people with less human capital.

Using only one state and two decades of data obviously makes the analysis highly specific. Expanding the breadth or the timescale could confirm or falsify the results. But historical Connecticut is a particularly useful population because 1) it had a large manufacturing sector, 2) existed prior to the post WWII boom in manufacturing that resulted from the destruction of European capacity, and 3) had large identifiable populations with different levels of human capital.

Who had less human capital on average? There are two groups who are easy to identify: 1) immigrants and 2) illiterate people. Immigrants at the time often couldn’t speak English with native proficiency or lacked the social norms that eased commercial transactions in their new country (on average, not always). Illiterate people couldn’t read or write. Therefore, having a comparative advantage in manual labor, we’d expect these two groups to be well served by manufacturing employment vs the alternative.

Being cross-sectional, the individuals are not linked over time, so we can’t say what happened to particular people. But we can say how people differed by their time and characteristics. Interaction variables help to drill-down to the relevant comparisons. There are two specifications for explaining income*, one that interacts manufacturing employment with immigrant status and one that interacts the status of illiteracy. The baseline case is a 1920 non-operative native or literate person. Let’s start with the below snapshot of 1920. The term used in the data is ‘operative’ rather than ‘manufacturer’, referring to people who operate machines of one sort or another. So, it’s often the same as manufacturing, but can also be manufacturing-adjacent. The below charts illustrate the effect of lower human capital in pink and the additional subpopulation impacts of manufacturing in blue.

In the left-hand specification, native operatives made 2.2% less than the baseline population. That is, being an operative was slightly harmful to individual earnings. Being an immigrant lowered earnings a substantial 16.8%, but being an operative recovered most of the gap so that immigrant operatives made only 6.1pp less than the baseline population and only 3.9pp less than native operatives. In the right-hand specification, unsurprisingly, being illiterate was terrible for one’s earnings to the tune of 23.4pp. And while being an operative resulted in a 1.2% earnings boost among natives, being an operative entirely eliminated the harm that illiteracy imposed on earnings.
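The arithmetic behind the left-hand specification can be sketched in a few lines of Python. This assumes that the reported effects simply add up in percentage points, which is an assumption about how the figures combine and not the actual estimation code. The interaction term is not stated in the text; it is backed out here from the three numbers that are.

```python
# Effects relative to the baseline (native non-operative), taken from the text.
operative_effect = -2.2          # native operatives vs. baseline
immigrant_effect = -16.8         # immigrant non-operatives vs. baseline
immigrant_operative_total = -6.1 # immigrant operatives vs. baseline

# Assumption: total = operative + immigrant + interaction (additive in pp).
# Backing out the implied interaction (not reported in the text):
implied_interaction = (
    immigrant_operative_total - operative_effect - immigrant_effect
)

# Gap between immigrant operatives and native operatives.
gap_vs_native_operatives = immigrant_operative_total - operative_effect

print(f"Implied interaction: {implied_interaction:+.1f} pp")
print(f"Immigrant operatives vs. native operatives: {gap_vs_native_operatives:+.1f} pp")
```

Under that additive assumption, the interaction is large and positive, which is the sense in which operative work "recovered most of the gap" for immigrants.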

Both graphs show that manufacturing had tiny effects for a typical native or literate individual. But manufacturing mattered hugely for people who had less human capital. So, prediction 1) above is borne out by the data: Manufacturing is great for people with less-than-average human capital.

But what about earnings *growth*? See below.

Continue reading

Old Fashioned Function Keys

Your Function Keys Are Cooler Than You Think
by someone who used to press F1 by mistake

Ever notice the F keys on your keyboard? F1 through F12. Sitting at the top like unused shelf space. If you’re at a computer now, take a glance. I used to think they did nothing, or at least nothing for me. Maybe experts used them. Experts who know what BIOS and DOS are. But for me, just little space fillers with no purpose. I frequently pressed F1 by accident rather than escape. A help window would pop up, wasting half a second of my life until I closed it.

But the F keys (function keys) are sneaky useful. They can save you serious time. No clicking. No dragging. No fumbling with touchpad mis-clicks.

When using a web browser, F5 refreshes the web page. Windows has added the same functionality for folders too, updating recently edited files. Fast and easy. F11 changes your web browser view to full screen. Great for long reads or historical documents. F12 shows the guts of a webpage. That’s perfect if you web scrape or need to know what things are called behind the scenes. Ctrl + F4 closes a tab. Alt + F4 shuts the whole application instance down. That last one works for almost all applications.

Excel? F4 saves so much of your life. It toggles absolute cell, row, and column references. Have you ever watched someone try to click on the right spot with their touchpad and manually press the ‘$’ sign… twice? I can feel myself slowly creeping toward death as my life wastes away. Whereas pressing F4 lets you get on with your life. F12 in most Microsoft applications is ‘Save As’. No need to find the floppy disk image on that small laptop screen. PowerPoint has its own tricks—F5 begins the presentation. Shift + F5 starts it from the current slide. Not bad. And don’t forget F7! That’s the spellcheck hotkey. But now it’s been expanded to include grammar, clarity, concision, and inclusivity.

Continue reading

Join Joy to discuss Artificial Intelligence in May 2025

Podcasts are emerging as one of the key mediums for getting expert timely opinions and news about artificial intelligence. For example, EconTalk (Russ Roberts) has featured some of the most famous voices in AI discourse:

EconTalk: Eliezer Yudkowsky on the Dangers of AI (2023)

EconTalk: Marc Andreessen on Why AI Will Save the World 

EconTalk: Reid Hoffman on Why AI Is Good for Humans

If you would like to engage in a discussion about these topics in May, please sign up for the session I am leading. It is free, but you do need to sign up for the Liberty Fund Portal.

The event consists of two weeks when you can do a discussion-board-style conversation asynchronously with other interested listeners and readers. Lastly, there is a Zoom meeting to bring everyone together on May 21. You don’t have to do all three of the parts.

Further description for those who are interested:

Timeless: Artificial Intelligence: Doom or Bloom?

with Joy Buchanan

Time: May 5-9, 2025 and May 12-16, 2025

How will humans succeed (or survive) in the Age of AI? 

Russ Roberts brought the world’s leading thinkers about artificial intelligence to the EconTalk audience and was early to the trend. He hosted Nick Bostrom on Superintelligence in 2014, more than a decade before the world was shocked into thinking harder about AI after meeting ChatGPT. 

We will discuss the future of humanity by revisiting or discovering some of Roberts’s best EconTalk podcasts on this topic and reading complementary texts. Participants can join in for part or all of the series. 

Week 1: May 5-9, 2025

An asynchronous discussion, with an emphasis on possible negative outcomes from AI, such as unemployment, social disengagement, and existential risk. Participants will be invited to suggest special topics for a separate session that will be held on Zoom on May 21, 2025, 2:00-3:30 pm EDT. 

Required Readings: EconTalk: Eliezer Yudkowsky on the Dangers of AI (2023)

EconTalk: Erik Hoel on the Threat to Humanity from AI (2023) with an EconTalk Extra Who’s Afraid of Artificial Intelligence? by Joy Buchanan

“Trurl’s Electronic Bard” (1965) by Stanisław Lem. 

In this prescient short story, a scientist builds a poetry-writing machine. Sound familiar? (If anyone participated in the Life and Fate reading club with Russ and Tyler, there are parallels between Lem’s work and Vasily Grossman’s “Life and Fate” (1959), as both emerged from Eastern European intellectual traditions during the Cold War.)

Optional Readings: “Technological Singularity” by Vernor Vinge. Field Robotics Center, Carnegie Mellon U., 1993.

“‘I am Bing, and I Am Evil’: Microsoft’s new AI really does herald a global threat” by Erik Hoel. The Intrinsic Perspective Substack, February 16, 2023.

“Situational Awareness” (2024) by Leopold Aschenbrenner 

Week 2: May 12-16, 2025

An asynchronous discussion, emphasizing the promise of AI as the next technological breakthrough that will make us richer.
Required Readings: EconTalk: Marc Andreessen on Why AI Will Save the World 

EconTalk: Reid Hoffman on Why AI Is Good for Humans

Optional Readings: EconTalk: Tyler Cowen on the Risks and Impact of Artificial Intelligence (2023)

“ChatGPT Hallucinates Nonexistent Citations: Evidence from Economics” (2024) 

Joy Buchanan with Stephen Hill and Olga Shapoval. The American Economist, 69(1), 80-87.

What the Superintelligence can do for us (Joy Buchanan, 2024)

Dwarkesh Podcast “Tyler Cowen – Hayek, Keynes, & Smith on AI, Animal Spirits, Anarchy, & Growth”

Week 3: May 21, 2025, 2:00-3:30 pm EDT (Zoom meeting)
Pre-registration is required, and we ask you to register only if you can be present for the entire session. Readings are available online. We will get to talk in the same Zoom room!

Required Readings: Great Antidote podcast with Katherine Mangu-Ward on AI: Reality, Concerns, and Optimism

Additional readings will be added based partially on previous sessions’ participants’ suggestions

Optional Readings: Rediscovering David Hume’s Wisdom in the Age of AI (Joy Buchanan, EconLog, 2024)

“Professor tailored AI tutor to physics course. Engagement doubled” The Harvard Gazette. 2024. 

Please email Joy if you have any trouble signing up for the virtual event.

Now published: Human Capital of the US Deaf Population, 1850-1910

A student coauthor and I worked hard on our article that is now published in Social Science History. It’s the first modern statistical analysis of the historical deaf population. We bring an economic lens and statistical treatment to a topic that previously included much anecdotal evidence and case study. We hope that future authors can improve on our work in ways that meet and surpass the quantitative methods that we employed.

Our contributions include:

  • A human capital model of deafness that’s agnostic about its productivity implications and treats deaf individuals as if they made decisions rationally.
  • A better understanding of school attendance rates among deaf children and the ages at which they attended.
  • Deaf children were much more likely to be neither in school nor employed earlier in US history.
  • The negative impact of state ‘school for the deaf’ availability on subsequent economic outcomes among deaf adults. We speculate that they attended schools due to the social benefits of access to community.
  • Deaf workers did not avoid occupations where their deafness would be incidentally detectable by trade partners, implying that animus discrimination was not systemically important for economic outcomes.
Continue reading

RGDP Underestimates Welfare

Like many Principles of Macroeconomics courses, mine begins with an introduction to GDP. We motivate RGDP as a measure of economic activity and NGDP as an indicator of income or total expenditures. But how does more RGDP imply that we are better off, even materially? One entirely appropriate answer is that the quantities of output are greater. Given some population, greater output means more final goods and services per person. So, our real income increases. But what else can we say?

First, after adjusting for price changes, we can say that GDP underestimates the value that people place on goods and services that are transacted in markets. Given that 1) demand slopes down and 2) transactions are consensual, it stands to reason that everyone pays no more than their maximum value for things. This implies that people’s willingness to pay for goods surpasses their actual expenditures. Therefore, RGDP is a lower bound to the economic benefits that people enjoy. Without knowing the marginal value that people place on all quantities less than those that they actually buy, we have no idea how much more value is actually provided in our economy.
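A toy example shows why expenditure is only a lower bound. The Python sketch below assumes a hypothetical linear inverse demand curve, P = a - bQ, with made-up parameters; nothing here is estimated from real data. Expenditure is price times quantity, while total willingness to pay is the area under the demand curve up to the quantity purchased.

```python
# Hypothetical linear inverse demand: P = a - b * Q (illustrative parameters).
a = 10.0  # choke price: the most anyone would pay for the first unit
b = 1.0   # slope of the demand curve
price = 4.0  # market price

# Quantity purchased at the market price.
quantity = (a - price) / b

# What GDP-style accounting records: price times quantity.
expenditure = price * quantity

# Total willingness to pay: integral of (a - b*Q) from 0 to quantity.
willingness_to_pay = a * quantity - 0.5 * b * quantity**2

# Value enjoyed beyond what was actually paid (consumer surplus).
consumer_surplus = willingness_to_pay - expenditure

print(f"Expenditure: {expenditure:.1f}")
print(f"Total willingness to pay: {willingness_to_pay:.1f}")
print(f"Consumer surplus: {consumer_surplus:.1f}")
```

In this toy market, recorded spending misses a large share of the value that buyers place on what they consume. In practice the shape of the demand curve above the market price is unknown, which is why the size of the gap cannot be pinned down.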

Continue reading