The Welfare-Productivity Tradeoff in US-China Trade

Who benefits from trade between the US and China? If China subsidizes their exporting industries, should the US see this as a threat that undermines our industries, or thank China for lowering prices for US consumers? Does it matter that China runs a persistent trade surplus (exporting more than they import), while the US runs a persistent trade deficit?

Everyone has a take on these questions, but the answers I hear even among economists rarely draw from the leading modern models in the international trade literature. Krugman (1980) (10k citations) shows how large home markets matter for industries with increasing returns to scale. In a simple increasing returns model, unlike with Econ 101 comparative advantage, temporary subsidies can permanently flip which country an industry efficiently operates in.

Melitz (2003) (20k citations) extends the Krugman model to include firm-level productivity differences. Rubini (2014) extends the Melitz model to include innovation. Now Xiao (2025) has extended the Rubini model to include unbalanced trade, then calibrated the model with data from the US and China. Now that the mathematical models are able to incorporate more and more features of the real world, what do they show?

China’s trade surplus and the US trade deficit have tradeoffs. Specifically, China’s trade surplus makes it more productive than it otherwise would be but lowers its welfare, because so much of the fruit of its production is enjoyed by other countries. Conversely, the US trade deficit leads us to produce less than we otherwise would, but to have higher welfare thanks to consumers enjoying the cheaper foreign goods.

In one sense this recapitulates some of the same debates people had without the math. Some people like trade because it benefits US consumers and overall present-day US wellbeing. Some don’t like it because it harms US manufacturing and our resiliency in any potential future conflict.

One advantage of the models is that they put numbers on the tradeoffs. In this case, the welfare benefit to the US may be small relative to China’s welfare loss and relative to both countries’ productivity changes:

the average productivity increase caused by trade surplus ranges from 1.2 percentage points to 5.46 percentage points when the innovation cost changes. These results explain China’s long-term export promotion policies and align with its new policy goal of developing “new productivity forces”. I also identify a negative effect on China’s trade partners’ productivity (namely, the US), of between -2.74 percentage points and -5.89 percentage points. This comes at a welfare cost, equivalent to between 3 percentage points and 5.7 percentage points of consumption units. Correspondingly, China’s cheaper goods increase welfare in the US by between 0.26 percentage points and 1.22 percentage points

In addition to the big complex model, Xiao’s paper shares nice background on the sheer size of Chinese export subsidies, noting that they account for 2/3 of all manufacturing subsidies in G20 countries, and that export tax rebates are almost 2/3 as large as Chinese net exports. In short, China’s trade surplus is not simply driven by differing preferences and production capabilities across countries, but is largely driven by deliberate policy choices.

P.S. The paper’s author, Aochen Xiao, is on the econ job market.

arXiv will ban authors who submit papers with LLM mistakes

In the world of academic preprints, arXiv has long been the go-to platform for researchers to share work quickly. But with the explosion of generative AI tools, the repository is drawing a line in the sand.

On May 14, 2026, arXiv moderator Thomas Dietterich announced a clarified enforcement policy. If a submission contains incontrovertible evidence that authors didn’t properly check LLM-generated content, all listed authors face serious consequences.

What counts as “Incontrovertible Evidence”? The policy targets clear signs of unchecked AI output, including:

  • Hallucinated or fake references
  • Meta-comments left by the model (e.g., “Here is a 200-word summary; would you like me to make any changes?” or placeholder instructions like “fill in the real numbers from your experiments”)
  • Other obvious errors, plagiarized text, biased content, or misleading claims generated by AI

arXiv’s Code of Conduct already holds every author fully responsible for the entire paper’s contents.

The Penalty

  • One-year ban from submitting new papers to arXiv.
  • After the ban, future submissions must first be accepted at a reputable peer-reviewed venue before arXiv will host them.

At first, researchers discussing the policy online seemed happy about the one-year ban, but when I pointed out that it is essentially a lifetime ban from using arXiv as a preprint venue, some people became nervous.

Why now? arXiv has been overwhelmed by low-effort “AI slop.” These papers are marked by fabricated citations and shallow summaries. This erodes trust in the entire preprint ecosystem.

In response to the complaints (someone like me would be worried that I’ll somehow let an error slip through and then be banned for life from posting working papers), Scientific Director Steinn Sigurðsson shared:

on the whole @arxiv flap about hallucinated references etc

you don’t see the stuff we reject… some of it is really really egregious

the decision to impose additional consequences is largely to throttle that stuff so n00bs and bad actors don’t trash us trying repeatedly

This is the problem that we face with every internet forum. A few bad actors ruin it for good people.

In 2022 I wrote “Content moderation strategy”:

Elon Musk buying Twitter is the big news this week. He wants to enhance free speech on the site and, according to him, make it more open and fun. Some fans are hoping that he will make the content moderation and ban policy more transparent. Maybe that’s possible. 

If no one can be banned, then bad actors will bring the whole platform down. Inevitably, good people get caught in the net, and it’s devastating to be locked out of a platform where your peers are sharing.

However, if you want to be taken seriously by tech folk then ask for a system that is possible. A substantially better experience might be incompatible with the site being free to users.

Part of the problem that I don’t hear people talking about is that a free platform is not easily compatible with good customer service.

For some not-fake work and citations: Buchanan et al. (2024) provided early clear evidence that a mark of LLM-written work is fake citations. And, Buchanan and Hickman (2024) show that certain framings can prompt people to be more suspicious of AI-generated writing, such that they are pushed toward doing a fact-check before believing all claims.

Buchanan, Joy, and William Hickman. “Do people trust humans more than ChatGPT?.” Journal of Behavioral and Experimental Economics 112 (2024): 102239.

Buchanan, Joy, Stephen Hill, and Olga Shapoval. “ChatGPT hallucinates non-existent citations: Evidence from economics.” The American Economist 69.1 (2024): 80-87.

Most Published Research Findings Are Directionally Correct

As a new quick rule of thumb inspired by the Nature papers, you could do worse than “cut estimated effect sizes in half”. If a published paper says that a college degree raises wages 100%, then chances are the degree really does raise wages, but more like 40–50%. In 2005, John Ioannidis said that “most published research findings are false”. By 2026, we seem to have improved to “most published research findings are exaggerated.”

That’s the conclusion of my piece out today at Econlog: “Is Economics Finally Becoming Trustworthy?”

There’s plenty of both good and bad news for economics and the social sciences in both my piece and the Nature special issue it describes. It’s kind of like the Our World in Data motto:

In short, our attempt to replicate hundreds of papers showed that published social science results shouldn’t be trusted precisely today, but they seem to be getting more reliable over time, and they are much more reliable than chance. Economics and political science look the best, though we are still very far from perfect:

You can read the full piece here.

EWED cited in Top Demography Journal

We’ve been cited in top newspapers, such as The Financial Times, before, but this might be a first. Our blog has been cited in Demography, a top-ranked journal in the field of demographics and population studies.

The internet is fun sometimes, and that is why we are here (almost) every day. Jeremy’s work is mostly about wealth, and this paper is mostly about income:

Has Generational Progress Stalled? Income Growth Over Five Generations of Americans

I was able to download the PDF directly from the journal website linked above, so it must be open-access. Instead of trying to restate all of their findings here, I’ll just quote:

At ages 36–40, Millennials’ mean net worth was about $95,000 higher than that of Generation X. Their home equity was $30,000 higher and nonhousing wealth was about $65,000 higher. Thus, although homeownership among Millennials has declined, home values have increased enough among those who own homes to increase mean home equity, while their nonhousing wealth has grown as well. Our findings of generational increases in wealth echo those previously found by Horpedahl (2021, 2024).

Horpedahl, J. (2021, September 1). Who is the wealthiest generation? Economist Writing Every Day. Retrieved from https://economistwritingeveryday.com/2021/09/01/who-is-the-wealthiest-generation/

Horpedahl, J. (2024, January 24). Young people have a lot more wealth than we thought. Economist Writing Every Day. Retrieved from https://economistwritingeveryday.com/2024/01/24/young-people-have-a-lot-more-wealth-than-we-thought/

Perhaps people will forget why this finding was such a big deal in 2021. It was the opposite of what many were saying!

Should Practicing Economists Read Tyler’s New Marginalism Book?

Tyler Cowen’s new (free online) book entitled The Marginal Revolution: Rise and Decline, and the Pending AI Revolution is going to be “interesting,” but should you read it?

Mike Makowsky explained that “Academic economists are overcommitted”.

If you are already struggling to meet your deadlines for referee reports you owe to editors, should you take the time? If you don’t have time to indulge your curiosity about the 18th century and dead thinkers, right in the middle of the semester, should you look at it now or maybe browse it over the summer?

I think it’s worth going straight to the last chapter right now.

“Chapter 4: Why Marginalism Will Dwindle, and What Will Replace It?”

It was written for you and released quickly for this moment. Tyler does not personally have to worry about his job, but you might.

This link will take you straight to an in-browser e-reader: https://tylercowen.com/marginal-revolution-generative-book/app/

Or you can download the PDF at https://tylercowen.com/wp-content/uploads/2026/03/TheMarginalRevolution-Tyler_Cowen.pdf

You might face mental resistance to reading this chapter, because you don’t want to hear the message. If that’s you, then it’s especially useful to read this chapter. He’s not correct about everything. Develop your counter argument, to go forth and save marginalism. You can only do that if you understand and name the threats. This is more about methods/professions and less about ideology than you might think from the title.

Here are some quotes that stood out to me:

The ties of empirical work in economics to economic theory are evolving, and in particular the explicit ties to intuitive microeconomic reasoning, and marginalist thinking, are being cut. In much of traditional econometrics, the emphasis is on testing pre-existing models…

in machine learning, we let the algorithm build the “theory” for us, noting it may have tens of millions of variables and thus not count as a theory…

So much for prediction, what about hypothesis generation? Well, there is a new approach to that too, using machine learning.

A lot of economists do not regularly describe what they actually do for work. Yes, we are saving the world by writing papers, but what exactly do you do? Do you generate hypotheses? Is that what you are teaching your students to do?

It’s not fun to think of how the econ profession might need to reposition, but we owe it to students. Who better to work on this than tenured professors? 

I think the case for undergraduate students to major in economics is strong. I also think the case for doing 4 years of college is strong for students who want to learn.

Last summer I wrote: Students still need to learn principles

If economics is “more interesting” than hard science, then it might serve to scoop up good thinkers at the undergraduate level and get them doing something more technical than what they would end up doing in a humanities program. When I graduated from college, the fact that most econ students had accidentally learned to code was a benefit to them.

College graduate humans ought to be able to read and pass the Turing Test if they are going to be effective complements to AI.

Economists championing marginalism for students, today, write: For Gen Z, Economics May Be the Key to Success in the New AI World

Let me plug Mike as well for thinking about what research econs do in 2026: The actual AI problem in academic economics. “Oh, what shall all the candlemakers do now that the sun has risen?” made me laugh.

How Much To Trust Research Papers? My Rules Of Thumb

  1. Trust literatures over single papers
  2. Common sense and Bayes’ Rule agree: extraordinary claims require extraordinary evidence
  3. Trust more when papers publicly share their data and code
  4. Trust higher-ranked journals more up to the level of top subfields (e.g. Journal of Health Economics, Journal of Labor Economics), but top general-interest journals can be prone to relaxing standards for sensationalist or ideologically favored claims (e.g. The Lancet, PNAS, Science/Nature when covering social science)
  5. More recent is better for empirical papers, since data and methods have tended to improve with time
  6. Overall effects are more trustworthy than interaction or subgroup effects; the latter two are easier to p-hack and necessarily have lower statistical power
  7. Trust large experiments most, then quasi-experiments, then small experiments, then traditional regression (add some controls and hope for the best)
  8. The real effect size is half what the paper claims
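Rule 2 can be made concrete with a few lines of arithmetic. This is only a sketch: the prior probabilities, the 80% statistical power, and the 5% false-positive rate are assumptions I picked for illustration, not numbers from any paper. It asks how likely a claim is to be true after one statistically significant study, for an ordinary claim versus an extraordinary one.

```python
def posterior_true(prior, power=0.8, alpha=0.05):
    """P(claim is true | one significant result), via Bayes' Rule.

    prior: assumed probability the claim is true before seeing the study
    power: assumed chance a study detects the effect when it is real
    alpha: assumed chance of a significant result when the effect is not real
    """
    true_positive = prior * power
    false_positive = (1 - prior) * alpha
    return true_positive / (true_positive + false_positive)


# Illustrative priors only: a coin-flip claim vs. a 1-in-100 long shot
for label, prior in [("ordinary claim", 0.5), ("extraordinary claim", 0.01)]:
    post = posterior_true(prior)
    print(f"{label}: prior {prior:.2f} -> posterior {post:.2f}")
```

Under these assumed numbers, the same significant result takes the ordinary claim to roughly 94% but leaves the extraordinary claim at roughly 14%, which is why a single paper is rarely enough for a surprising finding.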

That last is inspired by a special issue of Nature out today on the replicability of social science research. An exception to rule #4, this is an excellent project I will write more about soon.

Experimental Banking Reveals the Value of Leisure

In 2014 India required banks to offer no-cost accounts. This led hundreds of millions of people to open bank accounts for the first time, and more than doubled the number of Indian women who had a bank account:

This increased households’ collective ability to save and borrow, but didn’t shift decision-making power towards women despite the larger change for them. That is the finding of a paper by Tarana Chauhan, a Brown University postdoc who is currently on the job market. The paper is a well-executed example of a difference-in-difference analysis of observational data- that is, carefully examining data that other people generated to examine events that help establish causality. But the validity of difference-in-difference strategies in separating correlation from causation can always be questioned, and always is in economics seminars.
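For readers who haven’t seen one, the core of a difference-in-difference estimate is just two subtractions. The sketch below uses made-up numbers and a hypothetical outcome (the share of people with a bank account); it is not the paper’s data or specification, which would also involve controls and standard errors.

```python
def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change in the treated group minus change in the comparison group."""
    treated_change = treated_after - treated_before
    control_change = control_after - control_before
    return treated_change - control_change


# Made-up shares with a bank account, before and after a policy change
estimate = diff_in_diff(
    treated_before=0.30, treated_after=0.55,   # group more affected by the policy
    control_before=0.40, control_after=0.50,   # group less affected by the policy
)
print(round(estimate, 2))  # 0.15
```

The estimate is only causal if the two groups would have followed parallel trends without the policy, which is exactly the assumption that gets questioned in seminars.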

So Dr. Chauhan, this time with coauthors Berber Kramer, Patrick Ward, and Subhransu Pattnaik, followed up by directly running an experiment. They got a company to offer subsidized loans to hundreds of randomly selected Indian farmers, then surveyed the farmers to see if they behaved differently than a control group that didn’t get loans. The loans carried a 14% interest rate, which seems high to Americans but was apparently 10pp lower than the other options available in India. They wanted to know whether farmers would use the loans to improve farm productivity, and whether this would have any differential effects on women.

The first stage of the experiment worked: households took the loans and got more engaged with the financial system.

Some used the money for smartphones:

But for the most part they seem not to have spent the money on farming- they didn’t buy significantly more land, seeds, fertilizer, or farm equipment. They did spend more on “non-farm business equipment” and “large consumer durables”. Despite not producing more food themselves, they reported higher food security. Income stayed flat, but women were able to shift some time away from work and toward leisure:

I find these results surprising given how poor the households receiving the loans are. They earn the equivalent of about $1,000/yr, putting them around the global “extreme poverty” line. At that income level I’d think they would value additional income highly relative to leisure, and yet when they get the loan, work time goes down and leisure time increases. Could it really be the case that they’ve already hit their income target, and are on the backward bending part of the labor supply curve? Some other possibilities are that they don’t expect that investing in farming would increase yields enough to be worthwhile, or that they worry any increased income would be taken away through explicit or implicit taxes. But the households generally seem better off as a result of the loan.

The other surprise: enough of the loans were paid back that the lenders made a profit despite the research pushing the interest rate below market.

Does Broadband Bring Jobs?

No, according to a new paper from the University of Georgia’s Michael Kotrous.

Many people expected it to, partly by thinking about the jobs that could benefit from faster internet, and partly by looking at the experience of Chattanooga, Tennessee. Chattanooga was the first major city to get gigabit-speed broadband, and they did see a huge improvement in the labor market right afterwards:

But as the graph shows, the introduction of broadband there coincides with the end of the nationwide Great Recession. Was the boom in jobs after 2009 because of the broadband, or would it have happened anyway as part of the recovery from recession? A synthetic control strategy shows that Chattanooga’s recovery was pretty typical for cities like it, so the broadband angle probably didn’t do much:
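The idea behind synthetic control is to build a weighted blend of comparison cities that tracks the treated city before the policy, then see whether the treated city pulls away from that blend afterwards. The toy sketch below is my own simplification, not the paper’s method: it uses only two hypothetical donor cities, made-up employment index values, and a simple grid search, whereas real applications use many donors and a constrained optimizer.

```python
def synthetic_weight(treated_pre, donor_a_pre, donor_b_pre, steps=100):
    """Grid-search the weight on donor A (donor B gets 1 - w) that best
    matches the treated city's pre-period outcomes."""
    best_w, best_sse = 0.0, float("inf")
    for i in range(steps + 1):
        w = i / steps
        sse = sum(
            (t - (w * a + (1 - w) * b)) ** 2
            for t, a, b in zip(treated_pre, donor_a_pre, donor_b_pre)
        )
        if sse < best_sse:
            best_w, best_sse = w, sse
    return best_w


def post_period_gaps(w, treated_post, donor_a_post, donor_b_post):
    """Treated city minus its synthetic version, period by period."""
    return [
        t - (w * a + (1 - w) * b)
        for t, a, b in zip(treated_post, donor_a_post, donor_b_post)
    ]


# Made-up employment index values (hypothetical, for illustration only)
w = synthetic_weight([100, 98, 95], [102, 100, 96], [98, 96, 94])
gaps = post_period_gaps(w, [97, 101], [98, 103], [96, 100])
print(w, gaps)  # 0.5 [0.0, -0.5]
```

Gaps near zero after the policy date, as in this made-up example, are what “the recovery was pretty typical” looks like in this framework.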

This might seem like a historical curiosity about one city, but the federal government is currently trying to spend $42 billion to expand broadband to more places, partly motivated by the idea of bringing jobs. I thought the Broadband Equity Access and Deployment Program’s big problem was how slow it is: Congress created it with the Infrastructure Investment and Jobs Act of 2021, but money didn’t start getting sent out until late 2025, and it could be many more years before it leads to any usable broadband. Even then it now seems unlikely to bring jobs, though there could be other benefits.

This paper’s author Michael Kotrous is currently on the economics job market. As his former professor and coauthor, I recommend hiring him if your school gets the chance.

Humanity’s Last Exam in Nature

Last July I wrote here about “Humanity’s Last Exam”:

When every frontier AI model can pass your tests, how do you figure out which model is best? You write a harder test.

That was the idea behind Humanity’s Last Exam, an effort by Scale AI and the Center for AI Safety to develop a large database of PhD-level questions that the best AI models still get wrong.

The group initially released an arXiv working paper explaining how we created the dataset. I was surprised to see a version of that paper published in Nature this year, with the title changed to the more generic “A benchmark of expert-level academic questions to assess AI capabilities.”

On the one hand, it makes sense that the core author groups at the Center for AI Safety and Scale AI didn’t keep every coauthor in the loop, given that there were hundreds of us. On the other hand, I’m part of a different academic mega-project that currently is keeping hundreds of coauthors in the loop as it works its way through Nature. On the third, invisible hand, I’m never going to complain if any of my coauthors gets something of ours published in Nature when I’d assumed it would remain a permanent working paper.

AI is now getting close to passing the test:

What do we do when it can answer all the questions we already know the answer to? We start asking it questions we don’t know the answer to. How do you cure cancer? What is the answer to life, the universe, and everything? When will Jesus return, and how long until a million people are convinced he’s returned as an AI? Where is Ayatollah Khamenei right now?

Is This the End of the Largest Refugee Crisis in the Americas?

Our 2024 post on the Venezuelan election provides context for this week’s dramatic events:

Venezuela held an election this week; President Maduro says he won, while the opposition and independent observers say he lost. Disputed elections like this are fairly common across the world, but where Venezuela really stands out is not how people vote at the ballot box- it is how they vote with their feet.

Reuters notes that “A Maduro win could spur more migration from Venezuela, once the continent’s wealthiest country, which in recent years has seen a third of its population leave.”

This makes Venezuela the largest refugee crisis in the history of the Americas, and depending on how you count the partition of India, perhaps the largest refugee crisis in human history that was not triggered by an invasion or civil war.

Instead, it has been triggered by the Maduro regime choosing terrible policies that have needlessly and dramatically impoverished the country

Plus some foreshadowing:

I hope that the Venezuelan government will soon come to represent the will of its people. I’m not sure how that is likely to happen, though I guess positive change is most likely to come from Venezuelans themselves (perhaps with help from Colombia and Brazil); when the US tries to play a bigger role we often make things worse. But what has happened in Venezuela for the past 10 years is clearly much worse than the “normal” bad economic policies and even democratic backsliding that we see elsewhere.

Here’s an update on the chart I shared then, showing that the diaspora has continued to swell:

I hope that Venezuela will soon become the sort of country people don’t want to flee. I don’t necessarily expect that it will, but it’s not now a crazy hope: