Ricardian Equivalence: Reasonable Assumption #2

There are several requirements for Ricardian Equivalence:

  1. Individuals or their families act as infinitely lived agents.
  2. All governments and agents can borrow and lend at a single rate.
  3. The path of government expenditures is independent of financing choices.

Assumption 2) appears patently absurd on its face. I certainly cannot borrow at the same interest rate that the US Treasury can. QED. Do not pass go, do not collect $200. The yield on 1-year US treasuries is 3.58%. I can’t borrow at that rate… Or can I?

Let’s do some casuistry.

What is a loan?

It’s a contract that:

  • Provides the borrower with access to spending
  • with or without collateral
  • with a promise to repay the lender at defined times, usually with interest.

So, when you borrow $5 from a friend and pay it back on the same day, it’s a loan. The contract is verbal, there is no collateral, the repayment time is ‘soon’ with flexibility, and the interest rate is zero.

A mortgage is a collateralized loan. You borrow from a bank, make monthly payments for the term of the loan, and accrue interest on the principal. The contract is written, the house or a portion of its value is the collateral, and the interest rate is positive.

What about a pawnshop loan? Most of us are probably unfamiliar with these. In this circumstance, a person has a valuable non-money asset and the pawnshop has money. They engage in a contractual asset swap. The borrower lends the non-money asset to the pawnshop as collateral and borrows money from the pawnshop. The pawnshop borrows the non-money asset and lends the money to the borrower. The borrower can use the money as they please, but the pawnshop cannot use the non-money asset – they can simply hold it. The pawnshop collects interest to cover its opportunity costs.

One outcome is that the borrower repays the loan and interest by the maturity date and reclaims their non-money asset. Another outcome is that the borrower retains the option to default without any further obligation. But they lose the right to reclaim their property according to the repayment terms. If the borrower exercises the option to default, then the pawnshop acquires full rights to the non-money asset. The pawnshop often resells the asset at a profit. The profit is relatively reliable because the illiquidity of the non-money asset allows the pawnshop to lend much less than its retail value. That illiquidity is also why the borrower is willing to accept the terms.
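One way to see why the pawnshop can live with the default option is a quick expected-return calculation. All figures below are hypothetical, chosen only to illustrate the mechanics described above:

```python
# Sketch of a pawnshop's expected return on one loan (all numbers hypothetical).
retail_value = 500.0   # what the pawnshop could resell the item for
loan_amount = 150.0    # lent well below retail because the item is illiquid
interest = 0.20        # interest due at maturity (20% of principal)
p_default = 0.30       # assumed probability the borrower walks away

# If repaid: the pawnshop earns the interest.
repay_profit = loan_amount * interest        # 30.0
# If defaulted: the pawnshop keeps and resells the item, net of the cash lent.
default_profit = retail_value - loan_amount  # 350.0

expected_profit = (1 - p_default) * repay_profit + p_default * default_profit
print(round(expected_profit, 2))  # 126.0
```

Because the loan is so far below the item's retail value, the default branch is actually the more profitable one in this toy example, which is why the pawnshop can tolerate borrowers exercising the option.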

If we accept that the pawnshop contract is just a collateralized loan with a mostly standard default option, then get ready for this.

Humanity’s Last Exam in Nature

Last July I wrote here about “Humanity’s Last Exam”:

When every frontier AI model can pass your tests, how do you figure out which model is best? You write a harder test.

That was the idea behind Humanity’s Last Exam, an effort by Scale AI and the Center for AI Safety to develop a large database of PhD-level questions that the best AI models still get wrong.

The group initially released an arXiv working paper explaining how we created the dataset. I was surprised to see a version of that paper published in Nature this year, with the title changed to the more generic “A benchmark of expert-level academic questions to assess AI capabilities.”

On the one hand, it makes sense that the core author groups at the Center for AI Safety and Scale AI didn’t keep every coauthor in the loop, given that there were hundreds of us. On the other hand, I’m part of a different academic mega-project that currently is keeping hundreds of coauthors in the loop as it works its way through Nature. On the third, invisible hand, I’m never going to complain if any of my coauthors gets something of ours published in Nature when I’d assumed it would remain a permanent working paper.

AI is now getting close to passing the test:

What do we do when it can answer all the questions we already know the answer to? We start asking it questions we don’t know the answer to. How do you cure cancer? What is the answer to life, the universe, and everything? When will Jesus return, and how long until a million people are convinced he’s returned as an AI? Where is Ayatollah Khamenei right now?

Learning the Bitter Lesson at EconLog

I’m in EconLog with:

Learning the Bitter Lesson in 2026

At the link, I speculate on doom, hardware, human jobs, the jagged edge (via a Joshua Gans working paper), and the Manhattan Project. The fun thing about being 6 years late to a seminal paper is that you can consider how its predictions are doing.

Sutton draws from decades of AI history to argue that researchers have learned a “bitter” truth. Researchers repeatedly assume that computers will make the next advance in intelligence by relying on specialized human expertise. Recent history shows that methods that scale with computation outperform those reliant on human expertise. For example, in computer chess, brute-force search on specialized hardware triumphed over knowledge-based approaches. Sutton warns that researchers resist learning this lesson because building in knowledge feels satisfying, but true breakthroughs come from computation’s relentless scaling. 

The article has been up for a week and some intelligent comments have already come in. Folks are pointing out that I might be underrating the models’ ability to improve themselves going forward.

Second, with the frontier AI labs driving toward automating AI research the direct human involvement in developing such algorithms/architectures may be much less than it seems that you’re positing.

If that commenter is correct, there will be less need for humans than I said.

Also, Jim Caton over on LinkedIn (James, are we all there now?) pointed out that more efficient models might not need more hardware. If the AIs figure out ways to make themselves more efficient, then is “scaling” even going to be the right word anymore for improvement? The fun thing about writing about AI is that you will probably be wrong within weeks.

Between the time I proposed this to EconLog and publication, Ilya Sutskever suggested on Dwarkesh that “We’re moving from the age of scaling to the age of research.”

Vaccine Variety

The flu and covid-19 vaccines don’t work super well. Both vaccines permit infection and transmission at quite high rates. The benefits from these vaccines come largely from reductions in mortality or severe symptoms conditional on infection. The covid-19 vaccine is itself especially risky or ineffective depending on the age and health of the individual. Plenty of people eschew vaccines.

I live in Collier County, Florida, where there have been 61 confirmed cases of measles so far this year. I have since learned that measles is EXTREMELY contagious. It floats around in the air and on items and just sort of hangs out and waits for a place to replicate. I’ve also learned that symptoms include a fever, eye irritation, possible brain swelling, severe dehydration, and a characteristic rash. The severe dehydration easily puts people in the hospital, the eye irritation can lead to permanent vision loss, and the brain swelling can be acute, or a symptom delayed by 5-6 years, which can also be fatal. I’ve also learned that having the vaccine, which is usually administered in two doses, provides about 97% immunity. The vaccine works so well that the department of health recommends no behavioral change among the vaccinated population when there is a measles outbreak. Barring unique circumstances, measles immunity can persist for a lifetime.

Unfortunately, a large segment of the anti-vaccine mood affiliation retains the salience of the covid-19 vaccine characteristics. Other vaccines and diseases in the typical pediatric schedule are not similar. Most of these prevent infection >90% of the time (TDAP is low at 73%), prevent transmission, reduce mortality when there are breakthrough infections, are effective for years or decades, and are extremely safe for all age groups.

The risks of a disease versus its corresponding vaccine are orders of magnitude apart. The tables below summarize the data (with sources). I did not double-check the source on every single figure. If you glance below, then you’ll see why: even if the numbers are off by a factor of 10 or 100, vaccines still look really good.

First, mortality: The data is divided by disease and age group, and provides mortality rates for both the disease and for the vaccine. The numbers are proportions, conditional on infection or vaccination. There are a lot of zeros in the vaccine mortality rates and certainly more than for the diseases. For example, a measles infection is 10,000 times more lethal than the MMR vaccine which prevents it. In fact, all of those zeros in the vaccine rates reflect mortality so uncommon that the estimate of one out of every 10 million is just a round-up – researchers simply don’t think that the risk is exactly zero.
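The "10,000 times more lethal" comparison is just a ratio of two conditional mortality rates. A sketch with illustrative numbers (these are stand-ins consistent with the text, not figures copied from the tables):

```python
# Relative risk as a ratio of conditional mortality rates (illustrative values).
measles_mortality = 1 / 1_000       # assumed ~0.1% death rate conditional on infection
mmr_mortality = 1 / 10_000_000      # the "rounded up" one-in-ten-million vaccine figure

relative_risk = measles_mortality / mmr_mortality
print(round(relative_risk))  # 10000
```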

Broad Slump in Tech and Other Stocks: Fear Over AI Disruption Replaces AI Euphoria

Tech stocks (e.g. QQQ) roared up and up and up for most of 2023-2025, more than doubling in those three years. A big driving narrative was how AI was going to make everything amazing – productivity (and presumably profits) would soar, and robust investments in computing capacity (chips and buildings), and electric power infrastructure buildout, would goose the whole economy.

Will the Enormous AI Capex Spending Really Pay Off?

But in the past few months, a different narrative seems to have taken hold. Now the buzz is “the dark side of AI”. First, there is growing angst among investors over how much money the Big Tech hyperscalers (Google, Meta, Amazon, Microsoft, plus Oracle) are pouring into AI-related capital investments. These five firms alone are projected to spend over $0.6 trillion (!) in 2026. When some of these companies announced greater-than-expected spending in recent earnings calls, analysts threw up all over their balance sheets. These are just eye-watering amounts, and investors have gotten a little wobbly in their support. These spends have an immediate effect on cash flow, driving it in some cases to around zero. And the depreciation on all that capex will come back to bite GAAP earnings in the coming years, driving nominal price/earnings even higher.

The critical question here is whether all that capex will pay off with mushrooming earnings three or four years down the road, or whether the lifeblood of these companies is just being flushed down the drain. This is viewed as an existential arms race: benefits are not guaranteed for this big spend, but if you don’t do this spending, you will definitely get left behind. Firms like Amazon have a long history of investing for years at little profit, in order to achieve some ultimately profitable, wide-moat quasi-monopoly status. If one AI program can manage to edge out everyone else, it could become the default application, like Amazon for online shopping or Google/YouTube for search and videos. The One AI could in fact rule us all.

Many Companies May Get Disrupted By AI

We wrote last week on the crash in enterprise software stocks like Salesforce and ServiceNow (“SaaSpocalypse”). The fear is that cheaper AI programs can do what these expensive services do for managing corporate data. The fear is now spreading more broadly (“AI Scare Trade”);  investors are rotating out of many firms with high-fee, labor-driven service models seen as susceptible to AI disruption. Here are two representative examples:

  • Wealth management companies Charles Schwab and Raymond James dropped 10% and 8% last week after a tech startup announced an AI-driven tax planning tool that could customize strategies for clients
  • Freight logistics firms C.H. Robinson and Universal Logistics fell 11% and 9% after some little AI outfit announced freight handling automation software

These AI disruption scenarios have been known for a long time as possibilities, but in the present mood, each new actual, specific case is feeding the melancholy narrative.

All is not doom and gloom here: as investors flee software companies, they are embracing old-fashioned makers of consumer goods and other “stuff”:

The narrative last week was very clearly that “physical” was a better bet than “digital.” Physical goods and resources can’t be replaced by AI like digital goods and services can be at an alarming rate

As I write this (Monday), U.S. markets are closed for the holiday. We will see in the coming week whether fear or greed will have the upper hand.

Telephone Classroom Game for Teaching Large Language Models

Use the above game to generate interaction in a class setting. Students collectively form an LLM and have fun seeing the final sentence that gets produced. I call this game “LLM Telephone” based on the classic game of telephone. I suggest downloading the file LLM_Telephone_Game_Sheet and handing out printed copies. However, this game could be adapted to a virtual setting.

The nice thing about passing papers in the classroom is that you can have several sheets circulating in a quiet room, so when the final sentence is read aloud it comes as a surprise to most people.

If you’d like to have a handout to follow the game with a more technical explanation, you can use this two-page PDF:

The game relies on each player presenting two candidate tokens, from which the next player selects their favorite. Participants should be bound by the rules of grammar and logic when making their selection and presenting two tokens to the next player.
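The mechanic can be simulated in a few lines. The token pairs below are hypothetical stand-ins; in class, of course, the students invent the candidate tokens themselves:

```python
import random

# Toy simulation of "LLM Telephone": at each turn the current player offers
# two candidate next tokens, and the next player picks one. Here both the
# candidates and the selection are random stand-ins for the students.
candidate_pairs = [
    ("The", "A"),
    ("cat", "dog"),
    ("quietly", "loudly"),
    ("ate", "chased"),
    ("the", "my"),
    ("homework.", "sandwich."),
]

sentence = []
for pair in candidate_pairs:
    sentence.append(random.choice(pair))  # the next player's selection

print(" ".join(sentence))
```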

This game works as a fun ice breaker for any type of class that touches on the topic of artificial intelligence. It is suitable for many ages and academic disciplines.

Truth: The Strength and Weakness of AI Coding

There was a seismic shift in the AI world recently. In case you didn’t know, a Claude Code update was released just before the Christmas break. It could code awesomely and had a bigger context window, which is sort of like memory and attention span. Scott Cunningham wrote a series of posts demonstrating the power of Claude Code in ways that made economists take notice. Then, ChatGPT Codex was updated and released in January as if to say ‘we are still on the frontier’. The battle between Claude Code and Codex is active as we speak.

The differentiation is becoming clearer, depending on who you talk to. Claude Code feels architectural. It designs a project or system and thrives when you hand it the blueprint and say “Design this properly.” It’s your amazingly productive partner. Codex feels like it’s for the specialist. You tell it exactly what you want. No fluff. No ornamental abstraction unless you request it.

Codex flourishes with prompts like “Refactor this function to eliminate recursion” or “Take this response data and apply the Bayesian Dawid-Skene method.” It does exactly that. It assumes competence on your part and does not attempt to decorate the output. It assumes that you know what you’re doing. It’s like an RA that can do amazing things if you tell it what task you want completed. Having said all of this, I’ve heard the inverse evaluations too. It probably matters a lot what the programmer brings to the table.

Both Claude Code and Codex are remarkably adept at catching code and syntax errors. That is not mysterious. Code is valid or invalid. The AI writes something, and the environment immediately reveals whether it conforms to the rules. Truth is embedded in the logical structure. When a single error appears, correction is often trivial.

When multiple errors appear, the problem becomes combinatorial. Fix A? Fix B? Change the type? Modify the loop? There are potentially infinite branching possibilities. Even then, the space is constrained. The code must run, or time out. That constraint disciplines the search. The reason these models code so well is that the code itself is the truth. So long as the logic isn’t violated, the axioms lead to the result. The AI anchors on the code to be internally consistent. The model can triangulate because the target is stable and verifiable.
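The disciplined search described here can be sketched as generate-and-test, with the runtime environment acting as the oracle that accepts or rejects each candidate fix. The candidates and the check below are hypothetical:

```python
# Sketch of "the code itself is the truth": a candidate fix survives only if
# it actually executes and passes a verifiable check.
def is_valid(src: str) -> bool:
    """The environment as oracle: does the candidate run and double 3 to 6?"""
    namespace = {}
    try:
        exec(src, namespace)  # compiling + running rejects invalid syntax
        return namespace["double"](3) == 6
    except Exception:
        return False

candidates = [
    "def double(x): return x + x",   # correct
    "def double(x): return x * x",   # runs, but fails the check (3*3 == 9)
    "def double(x) return 2 * x",    # syntax error: rejected immediately
]

survivors = [c for c in candidates if is_valid(c)]
print(len(survivors))  # 1
```

However large the branching space of possible fixes, only candidates that run and verify remain, which is the sense in which the constraint disciplines the search.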

AI struggles when the anchor disappears

SaaSmageddon: Will AI Eat the Software Business?

A big narrative for the past fifteen years has been that “software is eating the world.” This described a transformative shift where digital software companies disrupted traditional industries, such as retail, transportation, entertainment and finance, by leveraging cloud computing, mobile technology, and scalable platforms. This prophecy has largely come true, with companies like Amazon, Netflix, Uber, and Airbnb redefining entire sectors. Who takes a taxi anymore?

However, the narrative is now evolving. As generative AI advances, a new phase is emerging: “AI is eating software.”  Analysts predict that AI will replace traditional software applications by enabling natural language interfaces and autonomous agents that perform complex tasks without needing specialized tools. This shift threatens the $200 billion SaaS (Software-as-a-Service) industry, as AI reduces the need for dedicated software platforms and automates workflows previously reliant on human input. 

A recent jolt here has been the January 30 release by Anthropic of plug-in modules for Claude, which allow a relatively untrained user to enter plain English commands (“vibe coding”) that direct Claude to perform role-specific tasks like contract review, financial modeling, CRM integration, and campaign drafting.  (CRM integration is the process of connecting a Customer Relationship Management system with other business applications, such as marketing automation, ERP, e-commerce, accounting, and customer service platforms.)

That means Claude is doing some serious heavy lifting here. Currently, companies pay big bucks yearly to “enterprise software” firms like SAP and ServiceNow (NOW) and Salesforce to come in and integrate all their corporate data storage and flows. This must-have service is viewed as really hard to do, requiring highly trained specialists and proprietary software tools. Hence, high profit margins for these enterprise software firms.

Until recently, these firms have been darlings of the stock market. For instance, as of June 2025, NOW was up nearly 2000% over the past ten years. Imagine putting $20,000 into NOW in 2015, and seeing it mushroom to nearly $400,000. (AI tells me that $400,000 would currently buy you a “used yacht in the 40 to 50-foot range.”)

With the threat of AI, and probably with some general profit-taking in the overheated tech sector, the share price of these firms has plummeted. Here is a six-month chart for NOW:

Source: Seeking Alpha

NOW is down around 40% in the past six months. Most analysts seem positive, however, that this is a market overreaction. A key value-add of an enterprise software firm is the custody of the data itself, in various secure and tailored databases, and that seems to be something that an external AI program cannot replace, at least for now. The capability to pull data out and crunch it (which AI is offering) is kind of icing on the cake.

Firms like NOW are adjusting to the new narrative by offering pay-per-usage as an alternative to pay-per-user (“seats”). So far, none of this seems to be hurting their revenues. These firms claim that they can harness the power of AI (either generic AI or their own software) to do pretty much everything that AI claims for itself. Earnings of these firms do not seem to be slowing down.

With the recent stock price crash, the P/E for NOW is around 24, with a projected earnings growth rate of around 25% per year. Compared to, say, Walmart with a P/E of 45 and a projected growth rate of around 10%, NOW looks pretty cheap to me at the moment.
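One crude way to make that comparison concrete is the PEG ratio (P/E divided by the expected growth rate in percent), using the rough figures quoted above rather than live market data:

```python
# P/E-vs-growth comparison as a PEG ratio (figures are the post's rough
# estimates, not live market data).
def peg(pe: float, growth_pct: float) -> float:
    """PEG ratio: price/earnings divided by expected annual growth in percent."""
    return pe / growth_pct

now_peg = peg(24, 25)      # ServiceNow: P/E ~24, growth ~25%/yr
walmart_peg = peg(45, 10)  # Walmart: P/E ~45, growth ~10%/yr

print(round(now_peg, 2), round(walmart_peg, 2))  # 0.96 4.5
```

By this measure, a PEG near 1 is conventionally read as fairly priced relative to growth, which is the sense in which NOW looks cheap next to Walmart.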

(Disclosure: I just bought some NOW. Time will tell if that was wise.)

Usual disclaimer: Nothing here should be considered advice to buy or sell any security.

Economic Impacts of Weather Apps Exaggerating Storm Dangers

Snowmageddon!! Over 20 inches of snow!!! That is what we in the mid-Atlantic should expect on Sat-Sun Jan 24-25 according to most weather apps, as of 9-10 days ahead of time.  Of course, that kept us all busy checking those apps for the next week. As of Wednesday, I was still seeing numbers in the high teens in most cases, using Washington, D.C. as a representative location. But my Brave browser AI search proved its intelligence on Wednesday by telling me, with a big yellow triangle warning sign:

 Note: Apps and social media often display extreme snow totals (e.g., 23 inches) that are not yet supported by consensus models. Experts recommend preparing for 6–12 inches as a realistic baseline, with the potential for more.

“Huh,” thought I. Well, duh, the more scared they make us, the more eyeballs they get and the more ad revenue they generate. Follow the money…

Unfortunately, I did not log exactly who said what when last week. My recollection is that weather.com was still predicting high teens snowfall as of Thursday, and the Apple weather app was still saying that as of Friday. The final total for D.C. was about 7.5 inches for winter storm Fern. In fairness, some very nearby areas got 9-10 inches, and it ended up being dense sleet rather than light fluffy snow. But there was still a pretty big mismatch.

Among the best forecasters I found was AccuWeather. They showed a short table of probabilities that centered on (as I recall) 6-10”, with some chances for higher and for lower, that let you decide whether to prepare for a low probability/high impact scenario. It seems that the Apple weather app is notoriously bad: instead of integrating several different forecast models like some other apps (and like your local talking head meteorologist), it apparently just spits out the results of one model:

The core issue is that many weather apps, including Apple Weather, display raw data from individual forecast models without the context and analysis that professional meteorologists provide. While meteorologists at the National Weather Service balance multiple computer models, dozens of simulations and their own expertise to create forecasts, apps often pull from a single source and deliver it directly to users.

“Everything that catches attention is mostly nonsense,” said Eric Fisher, chief meteorologist for WBZ-TV in Boston. He points to the viral snowfall maps that spread on social media, noting that extreme forecasts generate the most attention even when they may not be the most accurate.

Anyway, I tried to poke around and find out in dollar terms how much it benefits the weather apps to exaggerate storm dangers. I was unsuccessful there, but by playing with query wording, I was able to coax out of ChatGPT some numbers on how much these exaggerations may cost the rest of us. (A more qualitative search noted, among other things, that the “crying wolf” cost of people becoming jaded to alarmist forecasts may lead them to discount genuine warnings; and I will add it is inconvenient to find the stores out of things because of panic buying).

Everything below the line of asterisks is just a straight dump from the AI, since I could not readily improve on its presentation:

****************************************************************

The economic impact of weather apps exaggerating storm dangers isn’t easy to quantify precisely because most formal studies focus on weather warnings overall (from official sources like the National Weather Service), not specifically on private weather apps exaggerating risk. However, economics research on false alarms, over-warning, and exaggerated alerts offers concrete dollar figures that we can use to approximate the kinds of costs involved. Below are the key economic impacts with as many dollar figures as available:


💸 1. Opportunity Costs of False or Overbroad Storm Warnings

Time cost of false warnings:

  • Traditional county-wide tornado warnings (over-broad alerts) have historically forced many people to shelter unnecessarily, resulting in ≈234 million person-hours per year spent under warnings (1996–2004) — valued at roughly $2.7 billion annually in lost time/productivity in the U.S. alone.
  • Improved storm-based warnings that reduce false warning area could save up to $750 million–$1.9 billion/year in avoided lost time.
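As a sanity check on those figures, dividing the dollar estimate by the person-hours gives the implied value of an hour of sheltering time:

```python
# Implied value of an hour under warning, from the figures quoted above.
person_hours = 234e6  # person-hours per year spent under warnings (1996-2004)
total_cost = 2.7e9    # estimated annual cost of that lost time, in dollars

implied_hourly_value = total_cost / person_hours
print(round(implied_hourly_value, 2))  # 11.54
```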

This demonstrates that exaggerated alerts — such as those sometimes amplified or presented without clear context by weather apps — can force unnecessary economic shutdowns of work and business activity.


🏙️ 2. Evacuation and Preparedness Costs from False Alarms

Evacuation costs:

  • A NOAA analysis estimates that evacuation costs for hurricanes, when done broadly for storm-warnings, can reach about $1.4 billion annually in the U.S., with roughly $1 billion being unnecessary cost due to false alarms (people evacuating where no damaging winds ultimately occur).

Evacuations are one of the most expensive forms of response — from lost wages to lodging and fuel costs — and exaggerated risk messaging can trigger them prematurely or more frequently.


🏭 3. Business Productivity Losses from False Alerts

Example from private sector:

  • A building management firm reported that by reducing false storm alerts (better targeted data rather than broad app warnings), it saved ~$10 million in lost production time in one year alone.

This is a real dollar example showing how over-warning impacts businesses directly when they preemptively shut down operations.


💼 4. Broader Firm-Level Economic Avoidance

Cost avoidance from better warning info:

  • A study of nearly 500 firms found that switching from deterministic (binary “storm/no storm”) warnings to probabilistic information could produce $2.3 billion to $7.6 billion per year in avoided costs for businesses, thanks to better decision-making.

While this doesn’t measure exaggeration specifically, it underscores the economic penalty of imprecise or alarmist warnings, which many weather apps unintentionally echo.


📉 5. Insurance & Property Impacts

Although not directly about exaggerated warnings, inaccurate perceptions of storm danger influence behavior and pricing:

  • Insurance premiums rising: In storm-prone states (e.g., Florida), homeowners’ premiums have risen by ~57% or more because of risk perceptions and loss patterns tied to severe weather.
  • Losses from storms: Billion-dollar disasters are common; for example, Hurricane Milton in 2024 had estimated total costs of $21–$85 billion (depending on source).

While not directly caused by exaggeration, inflated perceptions of danger (from apps or media) can affect insurance markets and property values, contributing indirectly to economic impacts.


📈 6. Net Benefits vs. Exaggeration

It’s important to balance:

  • Early warnings overall clearly save money and lives. Improved forecasts and early alert systems have been estimated to avoid $300 million to $2 billion per year in asset losses and provide up to $32 billion/year in broader economic benefits (through optimized activities and prep).

This highlights that the problem isn’t warnings per se, but inaccurate or exaggerated warnings that cause over-response.

The Hot Social Network Is… LinkedIn?

So says the Wall Street Journal. They have data to back it up:

Plus quotes from yours truly:

Even before Elon Musk gutted X’s content moderation, James Bailey was tired of the shouting. “It’s like a cursed artifact that gives you great power to keep up with what’s going on, but at the cost of subtly corrupting your soul,” said the 38-year-old Providence College economics professor.

He retreated. This year, he realized he was spending five to 10 minutes a day on a site he used to ignore.

The WSJ reporter contacted me after seeing my previous post about LinkedIn here, explaining how I think LinkedIn has improved as a way to share and read articles, and was always good as a way to keep up with former students. Just in the short time since the WSJ article came out, I finally used LinkedIn for one of its official purposes, hiring, where it worked wonders helping to fill a last-minute vacancy.

If you don’t trust me or the WSJ to identify the hot social network, let’s see what the actual cool kids are up to.