Comparing ChatGPT and Bing for a research literature review in April 2023

We wrote “ChatGPT Cites Economics Papers That Do Not Exist.”

I expect that problem to go away any day, so I gave it another try this week. For the record, they are currently calling it “ChatGPT Mar 23 Version” on the OpenAI website.

First, I asked ChatGPT for help with the following prompt:

ChatGPT is at it again. There is no such paper, as I will verify by showing John Duffy’s publications from that year: 

ChatGPT makes up lies (“hallucinations”). It is also great for some tasks, and smart people are already using it to become more productive. My post last week was on how impressive ChatGPT seemed in the Jonathan Swift impersonation. I didn’t take any time to do fact checking, and I would bet money that at least something in there was made up.

I posed the same question to the Bing plug-in for the Edge browser (Microsoft). Yup, I have opened Edge for the first time in forever to use Bing.

Bing handles the prompt by linking to a useful relevant paper – so if you click the link you will get to a helpful and not misleading answer. Just being a smart search engine instead of hallucinating randomly is better, for my purposes.

The actual paper I wanted returned was this one, by the way:

Duffy, John. “Experimental macroeconomics.” Behavioural and Experimental Economics (2010): 113-119.

There is no reason that ChatGPT should be better than an expert in a subfield of a field of economics. But that’s the genius of a good search engine. You ask it “Can I repair a broken fiddlewhat?” The search engine does not claim to know but rather directs you to the blog of the world expert in fiddlewhats.

I can’t find the link to it, but I’m going to toss in one more thing here. Tyler Cowen did an interview this spring on AI. There was a newspaper reporter who had a “creepy” interaction with an AI that became the topic of a viral internet article. Tyler made a very contrarian point by saying that he interprets this as a case of AI alignment. The reporter wanted something sensational and he got what he wanted.

So, it will probably be true for a long time that if you want to find a failure of AI, you can get what you want. Still, I’m putting this on the record here because I wonder if this particular problem will get solved quickly.

Spending Like a…

Is the federal government spending at a faster rate? Your answer probably has more to do with your biases than with anything else. Most people don’t know the numbers or they imagine some more appropriate past. Below is logged current federal expenditures. (This includes only consumption, not government fixed investment or transfers. Yes, we can argue about measures.)

The line of best fit is about 1.6% per quarter or 6.4% per year. Golly! Our spending is rising so fast! But, US federal spending grew relatively slowly in the 90s – maybe due to that fiscal conservative, Bill Clinton. And our federal spending grew even more slowly between 2010 and 2016 – maybe due to that other fiscal conservative, Barack Obama.
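As a rough check on the annualizing arithmetic, here is a minimal sketch that fits a trend line to a logged series and converts the quarterly slope to an annual rate. The series is made up for illustration (it grows exactly 1.6% per quarter by construction and is not the actual expenditure data), and the helper name `ols_slope` is hypothetical. Multiplying by four gives the 6.4% figure; compounding gives a slightly higher number.

```python
import math

def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical series: 40 quarters growing exactly 1.6% per quarter.
quarters = list(range(40))
spending = [100.0 * 1.016 ** q for q in quarters]

# Fit the trend on the logged series, as in the chart described above.
log_spending = [math.log(s) for s in spending]
slope = ols_slope(quarters, log_spending)

quarterly_growth = math.exp(slope) - 1               # back to a percent rate
simple_annual = 4 * quarterly_growth                 # the "times four" shortcut
compound_annual = (1 + quarterly_growth) ** 4 - 1    # with compounding

print(f"quarterly: {quarterly_growth:.2%}")
print(f"annual (simple): {simple_annual:.2%}")
print(f"annual (compounded): {compound_annual:.2%}")
```

The gap between the two annual figures is small at these growth rates, so the shortcut is a reasonable approximation.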

But inflation varied over this period. What about real, inflation-adjusted federal spending? See below.

Why States Hate Nursing Homes

Medicaid is a health insurance program for those with low incomes, funded largely by states. Overall it accounts for less than 20% of US medical spending. But there is one area where it is the dominant payer: nursing homes. Nursing homes are expensive, and Medicare (the typical insurance for those over 65) won’t cover them after the first hundred days, so most nursing home residents end up paying out of pocket until they burn through all their savings and wind up on Medicaid. At which point, Medicaid pays about $100,000 per year to the nursing home for the rest of their life.

States are responsible for up to half of that cost, and so start looking for ways to save money. One idea they have is to make it harder to build nursing homes: if there aren’t beds available, potential nursing home patients will have to stay home instead, where they can’t rack up Medicaid spending the same way. In fact, some states go all the way to a complete moratorium on new nursing homes:

Source: Institute for Justice

Some other states allow new nursing homes, but only with a special permission slip called a Certificate of Need (CON). CON is often required for other types of health facilities as well, like hospitals or dialysis centers. Research by me and others has generally found that CON doesn’t work as a way to reduce spending, and in fact actually increases it. CON might reduce the number of facilities, but that reduction of supply and competition gives the remaining facilities more power to raise prices.

So which effect dominates: does the smaller number of facilities reduce total spending, or do the higher prices increase it? It depends on the elasticity of demand:

In health care demand is typically quite inelastic, so the price effect dominates, and spending goes up:

But nursing homes could be an exception here. Elasticity of demand could be relatively high because of the number of potential substitutes: home care or assisted living for those with relatively low medical needs, hospitals for those with relatively high medical needs. Plus this is the one type of health care where Medicaid is the dominant payer. They could be especially resistant to price increases here, both due to their market power and their willingness to keep prices so low that facilities won’t take Medicaid patients (another way to save money!).
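The elasticity logic above can be sketched numerically. This is an illustration under stated assumptions, not an estimate: it assumes a constant-elasticity demand curve Q = A * P^(-e), a 10% price increase, and elasticity values (0.3 and 1.5) chosen only to represent "inelastic" and "elastic" demand. The function name `spending_change` is hypothetical.

```python
def spending_change(price_increase, elasticity):
    """Fractional change in total spending (P * Q) after a price increase,
    assuming constant-elasticity demand Q = A * P ** (-elasticity)."""
    # Spending is P * Q = A * P ** (1 - elasticity), so A cancels in the ratio.
    return (1 + price_increase) ** (1 - elasticity) - 1

# Illustrative values only: a 10% price increase under three elasticities.
inelastic = spending_change(0.10, 0.3)   # price effect dominates
unit = spending_change(0.10, 1.0)        # the two effects exactly offset
elastic = spending_change(0.10, 1.5)     # quantity effect dominates

print(f"inelastic demand: spending changes by {inelastic:+.1%}")
print(f"unit elasticity:  spending changes by {unit:+.1%}")
print(f"elastic demand:   spending changes by {elastic:+.1%}")
```

Spending rises with price when elasticity is below one and falls when it is above one, which is why the nursing home case turns on how elastic demand really is.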

A new paper by Vitor Melo and Elijah Neilson finds that this is indeed the case. Indiana, Pennsylvania, and North Dakota repealed their nursing home CON requirements in the ’90s, and at least for IN and PA their Medicaid spending went way up. The paper uses a new “synthetic difference in difference” technique that seems appropriate, and creates figures that seem confusing at first but get a ton of information across:

They correctly note that they don’t evaluate the welfare effects of the policy; it’s possible that the extra nursing home beds following CON repeal bring huge benefits to seniors that are worth the higher spending. But nursing homes could be the exception to the general rule that CON fails to achieve the goals, like reduced spending, that advocates set for it.

Workers Finally Get a Real Annual Raise

Back in December I pointed out that, thanks to slowing inflation, real wages had been rising since June 2022 (using either the CPI or the PCEPI for inflation adjustments).

With the latest monthly data, we can now report more good news for wage earners: CPI-adjusted wages have increased over the past 12 months. That hadn’t happened since 2021. In the past 12 months, wages of production and non-supervisory workers are up 5.1%, just a hair more than the annual increase in the CPI of 5.0%. It’s not much, and we’re not back to our pre-pandemic norm of 2% real wage growth. But it is more good news that we may finally be getting past our post-COVID inflationary hangover.
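For readers who want to see how small "a hair" is, here is a minimal sketch of the real wage calculation using the two figures quoted above (5.1% nominal wage growth, 5.0% CPI inflation). The function name `real_growth` is hypothetical; the formula is the standard adjustment of a nominal growth rate for inflation.

```python
def real_growth(nominal, inflation):
    """Real growth rate implied by a nominal growth rate and an inflation rate."""
    return (1 + nominal) / (1 + inflation) - 1

nominal_wage_growth = 0.051  # 12-month wage growth quoted in the post
cpi_inflation = 0.050        # 12-month CPI increase quoted in the post

exact = real_growth(nominal_wage_growth, cpi_inflation)
approx = nominal_wage_growth - cpi_inflation  # common subtraction shortcut

print(f"real wage growth (exact):  {exact:.3%}")
print(f"real wage growth (approx): {approx:.3%}")
```

Either way, real wage growth comes out at roughly a tenth of a percentage point, well short of the 2% pre-pandemic norm mentioned above.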

Job Market Still Red-Hot; Inflation and High Rates Not Going Away Soon

As noted earlier, the main driver in inflation since 2021 has not been supply chain issues, but ongoing wage increases in (mainly) the service industry, fueled by a tight labor market. Some headlines note recent decreases in job openings, etc., suggesting that the end of inflation is near. The point of this post is that measures of labor market tightness remain at very high levels, and so it will be a while yet before the Fed can claim victory over inflation and start meaningfully reducing interest rates.

Below I will post a set of charts (courtesy of a Seeking Alpha article by Wolf Richter) which make the following point: most measures of labor tightness remain at least as high as they were in late 2019, just before the pandemic hit. It is true that things have loosened up in the past few months, but that just means the labor market has gone from white-hot to merely red-hot. Let the data speak:

We hold that the current tightness of the labor market is largely a result of pandemic policies which incentivized a whole tranche of experienced workers to take early retirement and also put lots of cash in our pockets which we are spending generously on services. Those workers are not coming back, but at some point in the next 1-2 years the excess Covid cash will run out and we may finally get the long-expected recession. But if the government rushes in with enhanced unemployment benefits to ease the recession pain, we would expect inflation to remain well above the nominal 2% target.

Let’s start an AEA Working Paper Series

For the purposes of this blog post we are the supreme chancellors of the American Economic Association and we are mad with power. The only question is what shall become of this power? What will be our legacy? For me, the answer is obvious.

An AEA Working Paper Series. A working paper series for all!

Wait, maybe we need to walk that last bit back. After all, a working paper series for all already exists. It’s called the internet. Or at least SSRN. What exactly is it that we want again?

If we’re being honest, what we want is to post our papers to the NBER working paper series (WPS), but some of us aren’t members. That’s still not a call to action, though. Are we just trying to replicate a club that we aren’t members of? What are the benefits of posting a paper to the NBER WPS that we desire for our own research? Better to work our way backwards from what we specifically want than complain about what we don’t have.

What I think we want to solve is the research dissemination problem in economics. The review process can last anywhere between 1 and 5 years, during which time research is, at best, fully internalized into the field before being formally accepted, but can also become stale, outmoded, or even leap-frogged by parallel researchers. Worse still, research can be missed. Fields, observations, questions, they all have moments. For a piece of research to have an impact, it is paramount that it be within the purview of researchers, policy makers, and journalists during those moments. The NBER has solved this problem through the prominence of its WPS, but only for members.

So we want the AEA to replicate the NBER WPS, but for everyone? Sure, but it’s not that easy, since solving the problem for everyone presents an actual paradox. The whole reason the NBER WPS has solved the research dissemination problem is exactly because it is not open to everyone. Researchers, policy makers, and journalists take the work posted within the NBER WPS seriously because they trust the filter. They trust the NBER to only grant membership to serious researchers. They trust members to only post papers that are sufficiently developed that they are ready for public consumption. They trust it because they need to trust it. There is far too much research produced to possibly be consumed without a massive quality filter. The journals used to provide this service, but they became too slow and the NBER filled the gap.

The problem we are trying to solve is not how to create just any research filter, but the source and nature of that filter. The American Economic Review, Journal of Political Economy, Quarterly Journal of Economics, they were (in theory) pure quality filters. They were meritocracies. Sure, there was always plenty of grumbling about journals being clubs (and there is, as it turns out, a lot of merit to those complaints), but the club component was at least nominally less dominant. The NBER, on the other hand, is a club, full stop. A useful and important club that has advanced the discipline in countless ways, but still a club.

So, to return to our motivating question, is there a way to offer the benefits of the NBER WPS outside of a pure club structure? If we have to apply a quality filter, can we do so in a more openly meritocratic way that grants at least the prospect of access to a wider group of researchers?

Your first thought might be “We can read and evaluate the papers individually”…but that’s just an academic journal, and we already know the problems there. For a WPS to matter we need both sufficient scale and speed. The NBER solved this by filtering people, not papers. That is, I think, exactly what an AEA WPS needs to replicate. And that, to my mind, brings us to the heart of the matter: can we create a WPS membership filter that is open and meritocratic, or is that a contradiction in terms?

Can the American Economic Association create a membership-only Working Paper Series that offers the benefits of the NBER WPS, but with less dependence on the path-dependent forces that currently shape NBER membership?

Let’s start with the block of granite that is the entire AEA membership. Can the AEA offer a functioning WPS that includes the papers of all of its members? I don’t think so. Too many people, too many papers. Management, including quality control, would be too costly. More problematic, the number of papers coming out in weekly/monthly digests would be too large to be useful to readers. We need to cut it down, but how?

1. Price

No reason we can’t charge members who want access to the WPS $100 a year on top of their dues to pay editors and managers. Discounts for graduate students and faculty in lower income countries would be important, but straightforward. That will lower the numbers, but not enough. Worse, a modest price would be insufficient, and might even backfire, as a quality filter.

2. Educational or institutional requirements

We can set up minimal standards for WPS membership. The simplest: a Ph.D. Stricter: faculty appointment. Strictest: faculty appointment in an R1 accredited institution. Is this fair? No, but I’m not sure what qualification requirements would be perfectly fair. You could, of course, have separate series for current graduate students, with the graduate series posting quarterly. Any of these would still include far too many papers, but we’re making progress.

3. Prior contributions

What if we limited WPS membership to established researchers? For example, limiting the WPS to individuals who have previously published in one of the six AEA journals? This strikes me as the most feasible option, particularly from the point of view of the AEA and its strong interest in maintaining the integrity of its journals. It would make the numbers manageable. The AEA could also partner with other top associations, such as the Econometric Society, European Economic Association, and Royal Economic Society to create a list of 10-20 journals. Publication in one of these outlets would grant WPS membership access for 20 years. This would still advantage graduate students co-authoring with stars to jump-start their careers, but do you honestly think there’s any world in which writing with brilliant scholars isn’t an enormous advantage? Marginal improvements, folks, that’s what we’re going for.

4. Citations

Perhaps research impact could be a channel for membership? It seems to me any researcher with 25 citations within the previously established list of journals has established their bona fides as a researcher. This seems good to me on its face, but difficult to implement.

5. Current member/editor sponsorship

This is basically what the NBER does. I am a bit hesitant to include qualification via the exact channel I am rebelling against, but there is also something to be said for allowing some discretion in opening the door for others, particularly when we consider the obstacles facing disadvantaged groups when trying to gain access to the academy. I’d be far more interested if current member/editor sponsorship for AEA WPS membership were exclusively limited to junior researchers from groups targeted for greater inclusion.

6. Two of the previous four

My guess is that having multiple channels to WPS access is the best way forward, but with the caveat that a researcher has to qualify on more than one dimension to establish themselves as an active, top-tier researcher. Limiting access in these manners will keep the monthly flow of papers limited to a manageable, consumable quantity. The filter imposed will ensure the necessary level of quality, while at the same time granting access on dimensions sufficiently orthogonal to the traditional high school -> undergraduate -> grad school prestige pipeline filter. Because that is the real goal here: to solve the economic research dissemination problem for a broader subset of scholars. This isn’t a perfect solution, but I think it would stand as a significant step forward.
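The "two of the previous four" rule can be sketched as a simple eligibility check. This is only an illustration under loud assumptions: it reads "the previous four" as criteria 2 through 5 (everything except price), picks the simplest version of criterion 2 (a Ph.D.), and uses the 20-year and 25-citation thresholds floated above. The `Researcher` fields and the `qualifies` function are hypothetical names, not anything the AEA has proposed.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Researcher:
    has_phd: bool                                    # criterion 2 (simplest version)
    years_since_listed_publication: Optional[float]  # criterion 3; None if no such publication
    citations_in_listed_journals: int                # criterion 4
    sponsored: bool                                  # criterion 5

def qualifies(r: Researcher, required: int = 2) -> bool:
    """True if the researcher meets at least `required` of the four criteria."""
    checks = [
        r.has_phd,
        r.years_since_listed_publication is not None
        and r.years_since_listed_publication <= 20,
        r.citations_in_listed_journals >= 25,
        r.sponsored,
    ]
    return sum(checks) >= required

# Hypothetical examples: a Ph.D. with a recent listed publication qualifies;
# a Ph.D. with neither a listed publication nor enough citations does not.
recent_author = Researcher(True, 5, 3, False)
phd_only = Researcher(True, None, 10, False)

print(qualifies(recent_author))
print(qualifies(phd_only))
```

Raising `required` or swapping in the stricter versions of criterion 2 would tighten the filter without changing the structure of the rule.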

So there it is folks. The seed from which the AEA WPS will surely spring forth in the next 5 years, surpassing the NBER WPS and becoming the go-to channel to stay abreast of research at the bleeding edge of economic knowledge. I look forward to paying my dues and posting my first paper.

GPT showing off on Conversations with Tyler

Since posting it in January of 2023, our post on a failure of ChatGPT has been one of the most popular articles of the year in terms of web traffic. Read “ChatGPT Cites Economics Papers That Do Not Exist.”

LLMs will make up nonsense lies, but the output might sound sophisticated and serious. As human editors, we are not used to detecting that kind of problem, because if something sounds serious, that usually means it was written by a serious, thoughtful person.

That’s a problem, but ChatGPT is really good at some tasks.

Instead of pointing out the fact that ChatGPT is not perfect, Tyler Cowen wants people to see the potential. He has done an entire podcast interview with ChatGPT, asking it to take on the voice and role of deceased writer Jonathan Swift.

I’ll share part of the transcript:

COWEN: In 1709, you, Jonathan Swift, still were calling yourself a Whig. What exactly induced you to switch to the Tory side of the ledger?

SWIFT: My shift from the Whigs to the Tories was a gradual and complex process, influenced by several factors, including personal connections, political developments, and ideological disagreements.

During my early years in politics, I was indeed more aligned with the Whigs, who championed principles such as constitutional government, religious toleration, and the supremacy of Parliament over the monarchy. However, over time, I became increasingly disillusioned with the Whigs, particularly as their internal factions and power struggles intensified.

The Whigs’ support for the war with France, which began in 1702, was a point of contention for me. I believed that the war was costly…

The replies seem nuanced and insightful. Nothing made me suspect that they were factually inaccurate, although I do not know enough about Irish history to judge.

Is there any human who could have produced this script? I think so, although it would have required a lot of work. If one of these replies is better than anything a human Swift scholar would produce, how would we know?

GPT4 can write good summaries for the work of a prolific author like Swift, because the model can train on lots of examples.

GPT4 could probably write a good biography of a modern figure by pulling together all of the writing by them and about them. Maybe GPT4 could efficiently scrape up all mentions of this figure online and synthesize them faster than a human scholar. However, we observed GPT3 completely making up citations when we tried to get it to do economics summaries.

I’m concerned that humans will use GPT4 to write but not do the requisite fact-checking. That could introduce a new corpus of work that the next LLMs will train on, which might be full of lies. Humans might not admit to using GPT, and therefore we wouldn’t have a mechanism for using extra scrutiny on AI-generated writing from 2023. Humans can make mistakes too… so the ultimate solution could be an all-powerful AI that somehow does begin with a fairly accurate map of the world and goes around fact-checking everything faster than human editors ever could.

Minor Investment

Gary Becker, the Nobel laureate in economics, applied economic reasoning to social circumstances and particularly to families. He argued that children are a normal consumption good, and people consume more children with higher incomes. However, he also emphasized a quantity-quality trade-off. More children in a family means fewer resources and less attention for each child. Higher-income couples may opt to invest in classes and training for a single child, and to spend more time with that child, rather than increasing the number of children.

However, goods have multiple attributes and children do not merely provide a stream of consumption value while in the household. They offer access to future resources when they become employed themselves. Having more children or higher-quality children increases the economic benefits that older parents can enjoy, such as more help with household activities and the ability to travel with their adult children. Old-age benefits such as social security now serve the function of insulating people from their prior investments in future consumption.

Hospitals Just Got Easier to Build in West Virginia

West Virginia just repealed their Certificate of Need requirement for hospitals and birthing centers. Until now anyone wanting to open or expand a hospital needed to apply to a state board for permission. The process took time and money and could result in the board saying “no thanks, we don’t think the state needs another hospital”.

Now anyone wanting to open or expand a hospital or birthing center can skip this step and get to work. This means more facilities and more competition, which in turn leads to lower health care spending relative to trend.

Of course, the rest of West Virginia’s Certificate of Need requirements remain in place; if you want to open many other types of health care facilities, or purchase major equipment like an MRI, you must still get the state board to approve its “necessity”. In some cases, you shouldn’t even bother applying; West Virginia has a moratorium on opioid treatment programs. Ideally West Virginia would join its neighbor Pennsylvania in a complete repeal of Certificate of Need requirements.

But making it easier to build hospitals and birthing centers is a major step. Hospitals are the largest single component of health spending in the US, and improved facilities might help reduce West Virginia’s infant mortality from its current level as the 4th worst state.

Update 4/7/23: A knowledgeable correspondent suggests that the law may only allow existing hospitals to expand without CON (while totally new hospitals would still require one), citing this article. The text of the bill itself seems ambiguous to me. The section “Exemptions from certificate of need” adds “Hospital services performed at a hospital”. For birthing centers, by contrast, new construction is clearly now allowed by right: exemptions from CON now include “Constructing, developing, acquiring, or establishing a birthing center”.

Whiteboard Macroeconomics

There’s nothing that economists love more than a good blackboard (or in modern times, a whiteboard) to work out some basic models of how we think the world works. Supply and demand rules in microeconomics, but macroeconomics has a few good blackboard models too.

So I was excited to see a member of Congress using a whiteboard to work through some basic economic logic, as Rep. Katie Porter did in this video she tweeted, using the textbook macroeconomics model of aggregate demand and aggregate supply:

However, while I haven’t taught macroeconomics in about a decade, it seems there are a few flaws in her analysis. Flaws enough that this probably wouldn’t get a passing grade on an oral exam. I could detail them myself, but… I will leave this to the readers as an exercise! For fun, even if you don’t think this is the best model in the world, just assume it’s a good model. What did Rep. Porter miss? Leave a comment.