GPT-4 Generates Fake Citations

I am happy to share my latest publication at The American Economist: ChatGPT Hallucinates Non-existent Citations: Evidence from Economics

Citation: Buchanan, J., Hill, S., & Shapoval, O. (2024). ChatGPT Hallucinates Non-existent Citations: Evidence from Economics. The American Economist, 69(1), 80-87. https://doi.org/10.1177/05694345231218454

Blog followers will know that we reported this issue earlier with the free version of ChatGPT using GPT-3.5 (covered in the WSJ). We have updated this new article by running the same prompts through the paid version using GPT-4. Did the problems go away with the more powerful LLM?

The error rate went down slightly, but our two main results held up. It’s important that any fake citations at all are being presented as real. The proportion of nonexistent citations was over 30% with GPT-3.5, and it is over 20% with our trial of GPT-4 several months later. See figure 2 from our paper below for the average accuracy rates. The proportion of real citations is always under 90%. GPT-4, when asked about a very specific narrow topic, hallucinates almost half of the citations (57% are real for level 3, as shown in the graph).

The second result from our study is that the error rate of the LLM increases significantly when the prompt is more specific. If you ask GPT-4 about a niche topic for which there is less training data, then a higher proportion of the citations it produces are false. (This has been replicated in different domains, such as knowledge of geography.)

What does Joy Buchanan really think?: I expect that this problem with the fake citations will be solved quickly. It’s very brazen. When people understand this problem, they are shocked. Just… fake citations? Like… it printed out references for papers that do not actually exist? Yes, it really did that. We were the only ones who quantified and reported it, but the phenomenon was noticed by millions of researchers around the world who experimented with ChatGPT in 2023. These errors are so easy to catch that I expect ChatGPT will clean up its own mess on this particular issue quickly. However, that does not mean that the more general issue of hallucinations is going away.

Not only can ChatGPT make mistakes, as any human worker can mess up, but it can make a different kind of mistake without meaning to. Hallucinations are not intentional lies (which is not to say that an LLM cannot lie). This paper will serve as bright clear evidence that GPT can hallucinate in ways that detract from the quality of the output or even pose safety concerns in some use cases. This generalizes far beyond academic citations. The error rate might decrease to the point where hallucinations are less of a problem than the errors that humans are prone to make; however, the errors made by LLMs will always be of a different quality than the errors made by a human. A human research assistant would not cite nonexistent citations. LLM doctors are going to make a type of mistake that would not be made by human doctors. We should be on the lookout for those mistakes.

ChatGPT is great for some of the inputs to research, but it is not as helpful for original scientific writing. As prolific writer Noah Smith says, “I still can’t use ChatGPT for writing, even with GPT-4, because the risk of inserting even a small number of fake facts…”

Follow-Up Research: Will Hickman and I have an incentivized experiment on trust that you can read on SSRN: Do People Trust Humans More Than ChatGPT?

@IMurtazashvili has pointed me to a great resource for AI-era literature review work. “AI-Based Literature Review Tools” from Texas A&M

The Greatest NBA Coach Is… Dan Issel?

Some economists love to write about sports because they love sports. Others love to write about sports because the data are so good compared to most other facets of the economy. What other industry constantly releases film of workers doing their jobs, and compiles and shares exhaustive statistics about worker performance?

This lets us fill the pages of the Journal of Sports Economics with articles on players’ performance and pay, and articles evaluating strategies that sometimes influence how sports are played in turn. But coaches always struck me as harder to evaluate than players or strategies. With players, the eye test often succeeds.

To take an extreme example, suppose an average high-school athlete got thrown into a professional football or basketball game; a fan asked to evaluate them could probably figure out that they don’t belong there within minutes, or perhaps even just by glancing at them and seeing they are severely undersized. But what if an average high-school coach were called up to coach at the professional level? How long would it take for a casual observer to realize they don’t belong? You might be able to observe them mismanaging games within a few weeks, but people criticize professional coaches for this all the time too; I think you couldn’t be sure until you see their record after a season or two. Even then it is much less certain than for a player: was their bad record due to their coaching, or were they just handed a bad roster to work with?

The sports economics literature seems to confirm my intuition that coaches are difficult to evaluate. This is especially true in football, where teams generally play fewer than 20 games in a season; a general rule of thumb in statistics is that you need at least 20 to 25 observations for statistical tests to start to work. This accords with general practice in the NFL, where it is considered poor form to fire a coach without giving him at least one full season. One recent article evaluating NFL coaches only tries to evaluate those with at least 3 seasons. If the article is to be believed, it wasn’t until 2020 that anyone published a statistical evaluation of NFL defensive coordinators, despite this being considered a vital position that is often paid over a million dollars a year:


House Rich, House Richer

The third quarter ‘All Transaction’ housing price data was just released this week. These numbers are interesting for a few reasons. One reason is that home prices are a big component of our cost of living. Higher home prices are relevant to housing affordability. This week’s release is especially interesting because it’s starting to look like the Fed might be pausing its 18-month streak of interest rate hikes. In case you don’t know, higher interest rates increase the cost of borrowing and decrease the price that buyers are willing to pay for a home. Nationally, we only had one quarter of falling home prices in late 2022, but the recent national growth rate in home prices is much slower than it was in 2021 through mid-2022.

Do you remember when there were a bunch of stories about remote workers and early retirees fleeing urban centers in the wake of Covid? We stopped hearing that story so much once interest rates started rising. The inflection point in the data was in Q2 of 2022. After that, price growth started slowing with the national average home price up 6.5%. But the national average masks some geographic diversity.  


Are You Better Off Than You Were Four Years Ago?

In the October 1980 Presidential debate, Ronald Reagan famously asked that question to the American voters. His next sentence made it clear he was talking about the relationship between prices and wages, or what economists call real wages: “is it easier for you to go and buy things in the stores than it was four years ago?”

Reagan was a master of political rhetoric, so it’s not surprising that many have tried to copy his question in the years since 1980. For example, Romney and Ryan tried to use this phrase in their 2012 campaign against Obama. But it’s a good question to ask! While the President may have less control over the economy than some observers think, the economy does seem to be a key factor in how voters decide (for example, Ray Fair has done a pretty good job of predicting election outcomes with a few major economic variables).

Voters in 2024 will probably be asking themselves a similar question, and both parties (at least for now) seem to be actively encouraging voters to make such a comparison. We still have 12 months of economic data to see before we can really ask the “4 years” question, but how would we answer that question right now? Here’s probably the best approach to see if people are “better off” in terms of being able to “go and buy things at the stores”: inflation-adjusted wages. This chart presents average wages for nonsupervisory workers, with two different inflation adjustments, showing the change over a 4-year time period.
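To make the idea of an inflation adjustment concrete, here is a minimal Python sketch. The wage and price-index numbers are made-up placeholders, not the data behind the chart, and the function name is just my own label for the calculation.

```python
def real_wage_change(wage_then, wage_now, cpi_then, cpi_now):
    """Percent change in inflation-adjusted (real) wages between two dates."""
    real_then = wage_then / cpi_then  # wage measured in units of the price index
    real_now = wage_now / cpi_now
    return (real_now / real_then - 1) * 100

# Hypothetical inputs: nominal wages up 22 percent, prices up 20 percent.
example = real_wage_change(wage_then=25.00, wage_now=30.50, cpi_then=100.0, cpi_now=120.0)
print(f"Real wage change: {example:.1f}%")
```

Swapping in a different price index for the `cpi_then` and `cpi_now` arguments is all it takes to get a second inflation adjustment for the same wage series.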


Growth of the Transfer State

I’ve written about government spending before. But not all spending is the same. Building a bridge, buying a stapler, and taking from Peter to pay Paul are all different types of spending. I want to illustrate that last category. Anytime that the government gives money to someone without purchasing a good or service or making an interest payment, it’s called a ‘transfer’. People get excited about transfers. Social Security is a transfer and so are unemployment insurance benefits. Those nice covid checks? Also transfers.

Here I’ll focus on Federal transfers, though the data on all transfers is very similar if you include states in the analysis. Let’s start with the raw numbers. Below is data on GDP, Federal spending, and federal transfers. Suffice it to say that they are bigger than they used to be. They’ve all been growing geometrically and they all exhibit bumps near recessions.


Let’s Be Thankful for Food Abundance

Despite recent increases in prices of food, we should still all be very thankful this Thanksgiving for the abundance of affordable food available in the modern world. Looking back at my past few blog posts, I notice that I have been very food-centric in my choice of topics! And last week I also showed how the Thanksgiving meal this year will be the second cheapest ever (only behind 2019). While it’s absolutely true that food prices are up a lot in the past 2 and 4 years, they probably aren’t up as much as you have heard.

It’s always my preference to take as long-term a perspective as possible when thinking about economic progress. So here’s the best way I’ve come up with to show how cheap and abundant food is today: food as a share of household spending fell dramatically in the 20th century.

Most of the data in this chart comes from the BLS Consumer Expenditure Surveys. This survey was done occasionally beginning in 1901, and then annually since 1984. I also use BEA data to estimate personal taxes paid as a percent of spending (the CEX Surveys have some tax data, but it’s neither reliable nor consistent). I picked as close to 30-year intervals as I could (with a preference for showing the earliest and latest years available), and I chose spending categories that are 90-100% of total expenditures in most of these years. Keep in mind also that these are consumer expenditures. As a nation, we spend a lot more on healthcare and education than this chart suggests, but most of that spending is not directly from households (of course, it is indirectly). Think of this chart as an average household budget.

I hope the thing that jumps out at you is that the share of their budget that households spend on food has fallen dramatically since 1901, from over 42 percent to under 13 percent of household expenditures. To be clear, this data includes both spending on food at home and at restaurants (after 1984 we can track them separately, and groceries are pretty consistently about 60 percent of food spending). And you may be wondering about very recent trends too, such as since before the pandemic. In 2022, households spent a slightly smaller share on food than they did in 2019, with the share falling from 13.5 to 12.8%.

You may also notice that taxes have increased, though not much since 1960. Housing costs have been consistently high, and are also a bit higher than in 1990, going from 27 percent then to 33 percent in 2022. Housing is now the single largest budget expenditure category, but for most of the first half of the 20th century, food was the largest. And since people aren’t changing their housing situation more than once a year (if that), it would also have been food that dominated weekly and monthly budget decisions and worries about price fluctuations.

This year there will be lots of complaining about prices around the Thanksgiving table. And much of that is warranted! But let’s also be thankful on this food-intensive holiday for how cheap the food is.

And if some smart-aleck youngster tries to tell you that they learned on TikTok that things were better during the Great Depression (yes, people are really saying this!), have them watch this video by Christopher Clarke. Or show them that in the mid-1930s an average family spent one-third of their budget on food in my chart above, or how much labor it would have taken to buy that turkey in the 1930s (about 40 times as much time spent working as today).

Delinquency Data

I keep reading and hearing people who are waiting for the shoe to drop on the next recession. They see high interest rates and… well, that’s what they see. Employment is ok and NGDP is chugging along.

One indicator of economic trouble is the delinquency rate on debt. That’s exactly what we would expect if people lose their job or discover that they are financially overextended. They’d fail to meet their debt obligations. But the broad measure of commercial bank loans is quiet. Not only is it quiet, it’s near historic lows in the data at only 1.25% in 2023Q2. Banks can lend with a confidence like never before.

But maybe that overall delinquency rate is obscuring some compositional items. After all, we know that many recessions begin with real-estate slowdowns. Below are the rates for commercial non-farmland loans, farmland loans, and residential mortgages. All are near historical lows, though there are hints that they might be on the rise. But one quarter doesn’t a recession make. I won’t show the graph for the sake of space, but all business loan delinquency rates have also been practically flat for the past five years.


Replication Funding for Development Economics

The RWI − Leibniz Institute for Economic Research has funding for researchers to replicate papers in development economics:

RWI invites applications for several positions of Replicator on a self-employed basis to conduct a robustness replication of a published microeconomic study in the field of Development Economics. The successful applicant will work with us on the project “Robustness and Replicability in Economics (R2E)”, funded by the German Science Foundation (DFG) Priority Programme “Meta-Rep”….

The ultimate goal is to contribute to the ongoing debate about replicability and replication rates in economics. We collaborate closely with the Institute for Replication (I4R). All robustness replications will contribute to a meta-paper summarizing the collective findings. We plan to publish this meta-paper by the end of 2024, and all replication fellows will be co-authors….

The position starts as soon as possible and is limited to six months. The work can be done fully remotely. The applicant will receive compensation of 2,500 € gross in total, possibly distributed in installments based upon predetermined deliverables. Additionally, replication fellows will be listed as co-authors on the meta-paper. At the conclusion of the project, it is foreseen to gather all fellows for a final workshop at RWI in Essen, Germany.

I don’t know the team here but I’m always happy to see more attempts to make economic research more reliable. The funding and the planned publication make this potentially a good deal for applied microeconomists, especially grad students. Full details are here (warning: PDF).

Food Prices Are Up, But Let’s not Overstate How Much

Last week I gave some advice on how to save money on food. Food prices are up a lot in the past 4 years, but especially since the beginning of 2021. Over the 32 months since January 2021, grocery prices (according to the CPI) are up 20 percent (keep that number in mind). To give you an idea of how unusual that is, in the 32 months before the pandemic (up to January 2020), grocery prices only rose 2 percent. Perhaps even more astonishingly, if we look at October 2019 grocery prices, they were slightly lower on average than 4 years earlier in October 2015. We went from 4 flat years to a 25 percent increase over the next 4 years. That’s a huge change for consumers.
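One way to compare those two 32-month periods is to convert each cumulative change into a compound annual rate. The short Python sketch below does that; the function name is my own, and the only inputs are the 20 percent and 2 percent figures quoted above.

```python
def annualized_rate(cumulative_pct, months):
    """Convert a cumulative percent change over `months` into a compound annual rate."""
    growth_factor = 1 + cumulative_pct / 100
    return (growth_factor ** (12 / months) - 1) * 100

# Figures quoted above: 20 percent over the 32 months since January 2021,
# versus 2 percent over the 32 months before the pandemic.
recent = annualized_rate(20, 32)
pre_pandemic = annualized_rate(2, 32)
print(f"Since January 2021: {recent:.1f}% per year")
print(f"Before the pandemic: {pre_pandemic:.1f}% per year")
```

That works out to roughly 7 percent a year recently, compared with under 1 percent a year before the pandemic.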

But we also shouldn’t overstate the price increases. As you might guess, the best place for overstatements is social media. You can find plenty of them. For example, the poster of this very viral video claims that her family’s grocery prices doubled (in fact, almost exactly doubled, to the penny, which is suspicious) in just one single year, from August 2021 to August 2022. According to the CPI data, grocery prices were up 13.5 percent over that period, which, don’t get me wrong, is a lot! But it’s not 100 percent. I’ll focus on this one example, but I’m sure you will believe me that you can find dozens of examples like this on social media every single day (for example, yesterday someone claimed bread prices had tripled since 2019).

Let’s leave aside for a moment that in that viral video she claims to spend $1,500 per month on groceries. This would be a massive outlier for 2022. A family in the middle income quintile spent $460 per month on groceries in 2022, and $713 on all food including restaurants. So even if this family eats every single meal at home, they are still spending twice as much as a middle-income family. Even a family with 5 or more people (the largest bucket BLS uses in that report) spent $755 per month on groceries ($1,232 on all food). According to the Consumer Expenditure Survey, the middle quintile grocery spending went up 16%, and the five-person household went up 19% from 2021 to 2022. Big increases, no doubt! But not 100%.
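To see how far out of line the $1,500 figure is, you can divide it by each of the benchmarks quoted above. This is just a quick Python sketch of that arithmetic; the dictionary labels are my own shorthand for the BLS categories.

```python
claimed_monthly = 1500  # monthly grocery spending claimed in the video

# 2022 monthly benchmarks quoted above from the Consumer Expenditure Survey
benchmarks = {
    "middle quintile, groceries": 460,
    "middle quintile, all food": 713,
    "5+ person household, groceries": 755,
    "5+ person household, all food": 1232,
}

# Ratio of the claimed spending to each benchmark
ratios = {label: claimed_monthly / amount for label, amount in benchmarks.items()}
for label, ratio in ratios.items():
    print(f"{label}: claim is {ratio:.1f}x the benchmark")
```

Even against the most generous comparison (all food for a household of five or more), the claimed grocery bill is still noticeably higher.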

So who are we to believe? Have prices roughly doubled since 2021? Or are they up about 20 percent? People are sometimes skeptical of the consumer price index, so let’s look at the actual price data that goes into the index. BLS has data on hundreds of individual food items, but here’s a summary chart with eight common food items. Here’s the change in the prices of those items since January 2021:


Finding Deals on Food

It’s the time of the year when we share ideas for things to buy, possibly as Christmas or other holiday gifts. But I’m going to share with you not a specific thing to buy, but instead a method for buying things. And probably not the kind of thing you might think of sticking in a wrapped present: food.

We’ve all heard about and felt inflation lately. But food prices have been especially noticeable to consumers, and not just because food is something you buy frequently and probably know the prices of. Food prices, both at home and at restaurants, have increased much more than the average price level.

On average, prices are up about 20 percent in the US over the past 4 years. But food prices are up about 25 percent, on average.

Wages (the purple line) have actually increased faster than the general price level over the past 4 years. That may shock you given what we constantly hear in traditional and social media about “price increases outpacing wage gains,” but that claim is true when we are talking about food. Your dollar doesn’t go quite as far as it used to for food.

In some sense these costs are hard to avoid: food is a necessity. But there are ways to reduce your costs, and you probably know the general tips. Eat less at restaurants. Buy generic. Buy in bulk. Etc. These are good tips, but they all involve some sacrifice or annoyance. Is there anything else a consumer can do?

Yes. Here are a few tips that can save you money, without the sacrifice. There is some thought involved, and perhaps a slight annoyance, but I’ve found that once you get in these habits, the mental and time cost is pretty low.

1. RESTAURANT APPS

You should always be ordering your food through restaurant apps when possible, especially for fast food. I try to track limited-time deals on Twitter, but most restaurants offer ongoing good deals. For example, McDonald’s usually has a 20% off coupon, just for using the app. Taco Bell has a $6 box you can build, which would cost around $10 to order as a combo or à la carte at the restaurant. That’s a 40% discount for using the app.

Using apps also means you are using the restaurant’s rewards programs. Valuations vary, but McDonald’s rewards are roughly worth 10% cash back.
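If you want to check a deal like this yourself, the arithmetic is simple. Here is a minimal Python sketch using the Taco Bell prices from above. The rewards step is only an illustration: I am assuming rewards worth about 10 percent of what you pay, borrowing the rough McDonald’s valuation, and both function names are my own.

```python
def discount_pct(regular_price, deal_price):
    """Percent saved relative to the regular price."""
    return (regular_price - deal_price) * 100 / regular_price

def net_cost(deal_price, rewards_rate=0.10):
    """Price paid minus the value of rewards earned on it.

    The 10% default is an assumption for illustration, not any
    particular restaurant's published rewards rate.
    """
    return deal_price * (1 - rewards_rate)

# Taco Bell example from above: a $6 app box versus roughly $10 at the counter.
box_discount = discount_pct(10.00, 6.00)
print(f"App discount: {box_discount:.0f}%")
print(f"Net cost after rewards: ${net_cost(6.00):.2f}")
```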

2. CHASE THE SALES AT GROCERY STORES

Clipping coupons is the classic way of saving money at the grocery store (we even have reality shows about it), but in the modern world grocery stores have expanded the ways to effectively save the same amount of money. The clearest example is, once again, the rise of apps. Stores will often have “digital only” coupons that you need to access through their app (which is also tied to your rewards account, just like restaurants).

While I’m a strong advocate of coupon clipping (and the virtual equivalent), it can be time consuming. Another strategy that can save you money is thinking ahead about seasonal and other cyclical prices. For example, my kids like M&M’s. We usually buy a bulk 62-ounce container at Sam’s Club (already a savings), but today I took the additional saving step of buying the Halloween-themed bulk container. It was 36 percent less than the identical Christmas-themed M&M’s container right next to it. And I was replacing the Easter-themed bulk container that we purchased back in April, which they just finished.

Of course, I had to be planning ahead and know that November 1st was a great day to buy M&M’s. That takes some mental effort, sure. And you might think these kinds of deals are fairly limited in nature. But holidays aren’t the only kind of seasonal deals. For example, even though most fruit is generally available year-round now, there are still predictable price cycles of when things are “in season” and when they have to be imported from expensive locations. Even if you are only able to find these cyclical deals for 10 percent of your purchases, saving 30-50% on cyclical goods will shave another 3-5% off your grocery bill — bringing it closer in line to the average increase in prices (and wages).
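The 3-5% figure is just the share of purchases bought on a cyclical deal multiplied by the discount on those purchases. A tiny Python sketch of that calculation, with a function name of my own choosing and the shares measured at regular prices:

```python
def bill_savings_pct(share_of_purchases, discount):
    """Percent shaved off the total bill when `share_of_purchases` (0 to 1)
    of spending, measured at regular prices, is bought at `discount` (0 to 1) off."""
    return share_of_purchases * discount * 100

# 10 percent of purchases, bought at 30 to 50 percent off
low = bill_savings_pct(0.10, 0.30)
high = bill_savings_pct(0.10, 0.50)
print(f"Savings on the whole bill: {low:.0f}% to {high:.0f}%")
```

If you can push the share of cyclical purchases higher than 10 percent, the savings scale up proportionally.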

3. CASH BACK CREDIT CARDS

I could write an entire post about credit card rewards. But let me focus here on credit cards that are especially good for buying food. At a minimum you should be getting 2 percent back on all of your purchases, as there are several no-annual-fee cards that give you 2 percent: the Citi Double Cash and Wells Fargo Active Cash are good examples.

But on food purchases, you should be able to beat 2 percent. For example, the Citi Custom Cash card gives you 5 percent back on your top spending category each month, up to $500 of spending. This can be on either groceries or restaurants. And since a family in the middle quintile spends $250 at restaurants and $460 on groceries per month, you should be getting 5 percent back on basically all of your purchases in one of these two categories. (Personally I stick to restaurants for this card, because I buy most of my groceries at Walmart and Sam’s Club, which don’t count towards the grocery cash back.) Or if you want a simple card that gives you 3 percent back on both groceries and restaurants, check out the Capital One SavorOne card (again, no annual fee).

There are also several cards that have rotating 5 percent cash back categories each quarter, and they often include either restaurants or groceries. How do I keep track of which card to use for what kind of purchase? Simple: put a strip of masking tape on the card with a label. This will get some chuckles from your friends or the server at the restaurant, but that’s just an opportunity to tell them how to save money too!
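How much is the extra effort worth? Here is a rough Python sketch comparing a flat 2 percent card with using a 5 percent card for restaurants and a 2 percent card for groceries, at the middle-quintile spending levels mentioned above. The function is my own simplification: spending above the cap earns nothing here, which may understate what a real card pays.

```python
def monthly_cash_back(spend, rate, cap=None):
    """Cash back on one category; spending above `cap` earns nothing in this sketch."""
    eligible = spend if cap is None else min(spend, cap)
    return eligible * rate

restaurants, groceries = 250, 460  # middle-quintile monthly spending quoted above

# Option 1: flat 2 percent card on everything
flat = monthly_cash_back(restaurants + groceries, 0.02)

# Option 2: 5 percent card (capped at $500 a month) for restaurants,
# 2 percent card for groceries
split = monthly_cash_back(restaurants, 0.05, cap=500) + monthly_cash_back(groceries, 0.02)

print(f"Flat 2% card: ${flat:.2f} per month")
print(f"5% + 2% split: ${split:.2f} per month")
print(f"Extra per year: ${(split - flat) * 12:.2f}")
```

Putting groceries on the 5 percent card instead would earn more, if your grocery store counts toward the category.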

Is There Really a Free Lunch?

Some of my economist friends are probably skeptical at this point. Aren’t I saying there is a free lunch here? Isn’t the extra hassle of the steps I suggested going to outweigh any discount you get?

The answer is No. And while economists are quick to bring up the concept of opportunity cost, I find that most people tend to overestimate their opportunity cost. But even if you don’t overestimate your opportunity cost, you can bring in another useful economic concept: price discrimination.

Restaurants are very much in the business of price discrimination, and always have been. Tuesday Night specials, happy hours, etc. Every consumer has a different willingness to pay, and since it’s hard to resell a restaurant meal, restaurants can potentially use this technique to their advantage (and yours, if you are willing to look for discrimination). Grocery stores don’t have as much of an opportunity to discriminate, but they still find ways.

Don’t be afraid of price discrimination: use it to your advantage!