Truth: The Strength and Weakness of AI Coding

There was a seismic shift in the AI world recently. In case you didn’t know, a Claude Code update was released just before the Christmas break. It could code awesomely and had a bigger context window, which is sort of like memory and attention span. Scott Cunningham wrote a series of posts demonstrating the power of Claude Code in ways that made economists take notice. Then, ChatGPT Codex was updated and released in January as if to say ‘we are still on the frontier’. The battle between Claude Code and Codex is active as we speak.

The differentiation is becoming clearer, depending on who you talk to. Claude Code feels architectural. It designs a project or system and thrives when you hand it the blueprint and say “Design this properly.” It’s your amazingly productive partner. Codex feels like it’s for the specialist. You tell it exactly what you want. No fluff. No ornamental abstraction unless you request it.

Codex flourishes with prompts like “Refactor this function to eliminate recursion” or “Take this response data and apply the Bayesian Dawid-Skene method.” It does exactly that. It assumes competence on your part and does not attempt to decorate the output. It assumes that you know what you’re doing. It’s like an RA who can do amazing things if you tell it what task you want completed. Having said all of this, I’ve heard the inverse evaluations too. It probably matters a lot what the programmer brings to the table.

Both Claude Code and Codex are remarkably adept at catching code and syntax errors. That is not mysterious. Code is valid or invalid. The AI writes something, and the environment immediately reveals whether it conforms to the rules. Truth is embedded in the logical structure. When a single error appears, correction is often trivial.

When multiple errors appear, the problem becomes combinatorial. Fix A? Fix B? Change the type? Modify the loop? There are potentially infinite branching possibilities. Even then, the space is constrained. The code must run, or time out. That constraint disciplines the search. The reason these models code so well is that the code itself is the truth. So long as the logic isn’t violated, the axioms lead to the result. The AI anchors on the code to be internally consistent. The model can triangulate because the target is stable and verifiable.
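As a rough illustration of "the environment immediately reveals whether it conforms to the rules," here is a minimal Python sketch. The helper name `check_syntax` and the two sample snippets are made up for this example; it only uses the standard library's `ast.parse`, and it checks syntax only, not whether the code does what you wanted.

```python
import ast


def check_syntax(source: str) -> tuple[bool, str]:
    """Return (True, "") if `source` parses as Python, else (False, message)."""
    try:
        ast.parse(source)
    except SyntaxError as err:
        # The parser reports where the rules were violated.
        return False, f"line {err.lineno}: {err.msg}"
    return True, ""


# Two hypothetical candidate snippets: one valid, one missing a colon.
good = "def add(a, b):\n    return a + b\n"
bad = "def add(a, b)\n    return a + b\n"

good_ok, good_msg = check_syntax(good)
bad_ok, bad_msg = check_syntax(bad)

print(good_ok, good_msg)
print(bad_ok, bad_msg)
```

The verdict is binary and arrives instantly, which is the stable, verifiable target described above. Nothing comparable exists for a claim that cannot be executed.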

AI struggles when the anchor disappears


Commodity Sports

I’m trying to coin “Commodity Sports” as the term to refer to sports betting that takes place on exchanges regulated by the US Commodity Futures Trading Commission, as opposed to sports betting that takes place through casinos regulated by state gaming commissions. So far it seems to be working alright: I haven’t convinced Gemini, but I have got the top spot in traditional Google search:

That article, Will Commodity Sports Last?, is my first at EconLog. I’m happy to get a piece onto one of the oldest economics blogs, one where I was reading Arnold Kling’s takes on the Great Recession in real time, where I was introduced to Bryan Caplan’s writing before I read his books, and where Scott Sumner wrote for many years (though I started reading him at The Money Illusion before that).

The key idea of the piece, other than the legal oddity of sports betting sharing a legal category with corn futures, is that the Commodity Sports category is being pioneered by prediction markets like Kalshi. As readers here will know, I like prediction markets:

I love that CFTC-regulated exchanges like Kalshi and Polymarket are bringing prediction markets to the mainstream. The true value of prediction markets is to aggregate information dispersed across the world into a single number that represents the most accurate forecast of the future.

But I’m not so excited to see them expanding into sports:

Although I see huge value in prediction markets when they are offering more accurate forecasts on important issues that help policymakers, businesses, and individuals make more informed plans for our future (e.g., Which world leaders will leave office this year?, or Which countries will have a recession?)… I see much less value in having a more accurate forecast of how many receptions Jaxon Smith-Njigba will have.

Like Robin Hanson, I worry that the legal battles against Commodity Sports and the brewing cultural backlash against sports betting risk taking the most informative prediction markets down along with it.

The full piece is here.

SaaSmageddon: Will AI Eat the Software Business?

A big narrative for the past fifteen years has been that “software is eating the world.” This described a transformative shift where digital software companies disrupted traditional industries, such as retail, transportation, entertainment and finance, by leveraging cloud computing, mobile technology, and scalable platforms. This prophecy has largely come true, with companies like Amazon, Netflix, Uber, and Airbnb redefining entire sectors. Who takes a taxi anymore?

However, the narrative is now evolving. As generative AI advances, a new phase is emerging: “AI is eating software.”  Analysts predict that AI will replace traditional software applications by enabling natural language interfaces and autonomous agents that perform complex tasks without needing specialized tools. This shift threatens the $200 billion SaaS (Software-as-a-Service) industry, as AI reduces the need for dedicated software platforms and automates workflows previously reliant on human input. 

A recent jolt here has been the January 30 release by Anthropic of plug-in modules for Claude, which allow a relatively untrained user to enter plain English commands (“vibe coding”) that direct Claude to perform role-specific tasks like contract review, financial modeling, CRM integration, and campaign drafting.  (CRM integration is the process of connecting a Customer Relationship Management system with other business applications, such as marketing automation, ERP, e-commerce, accounting, and customer service platforms.)

That means Claude is doing some serious heavy lifting here. Currently, companies pay big bucks yearly to “enterprise software” firms like SAP and ServiceNow (NOW) and Salesforce to come in and integrate all their corporate data storage and flows. This must-have service is viewed as really hard to do, requiring highly trained specialists and proprietary software tools. Hence, high profit margins for these enterprise software firms.

Until recently, these firms have been darlings of the stock market. For instance, as of June 2025, NOW was up nearly 2000% over the past ten years. Imagine putting $20,000 into NOW in 2015 and seeing it mushroom to nearly $400,000. (AI tells me that $400,000 would currently buy you a “used yacht in the 40 to 50-foot range.”)

With the threat of AI, and probably with some general profit-taking in the overheated tech sector, the share price of these firms has plummeted. Here is a six-month chart for NOW:

Source: Seeking Alpha

NOW is down around 40% in the past six months. Most analysts seem positive, however, that this is a market overreaction. A key value-add of an enterprise software firm is the custody of the data itself, in various secure and tailored databases, and that seems to be something that an external AI program cannot replace, at least for now. The capability to pull data out and crunch it (which AI is offering) is kind of icing on the cake.

Firms like NOW are adjusting to the new narrative, by offering pay-per-usage, as an alternative to pay-per-user (“seats”). But this does not seem to be hurting their revenues. These firms claim that they can harness the power of AI (either generic AI or their own software) to do pretty much everything that AI claims for itself. Earnings of these firms do not seem to be slowing down.

With the recent stock price crash, the P/E for NOW is around 24, with a projected earnings growth rate of around 25% per year. Compared to, say, Walmart with a P/E of 45 and a projected growth rate of around 10%, NOW looks pretty cheap to me at the moment.
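One common way to put that comparison on a single scale is the PEG ratio (P/E divided by the projected annual earnings growth rate in percent). The post does not name this metric, so treat the sketch below as my own shorthand for the comparison, using only the rough figures quoted above.

```python
def peg_ratio(pe: float, growth_pct: float) -> float:
    """P/E divided by projected annual earnings growth (in percent)."""
    return pe / growth_pct


# Approximate figures from the paragraph above (not live market data).
now_peg = peg_ratio(pe=24, growth_pct=25)  # ServiceNow
wmt_peg = peg_ratio(pe=45, growth_pct=10)  # Walmart

print(f"NOW PEG: {now_peg:.2f}")
print(f"WMT PEG: {wmt_peg:.2f}")
```

On these inputs NOW comes out a little under 1 and Walmart at 4.5, which is the sense in which NOW "looks pretty cheap." The ratio is only as good as the growth projection that goes into it.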

(Disclosure: I just bought some NOW. Time will tell if that was wise.)

Usual disclaimer: Nothing here should be considered advice to buy or sell any security.

Forecasting An Eventful 2026

May you live in interesting times – apocryphal Chinese curse

In early 2025 I shared forecasts about the economy that turned out to be pretty good. This year, economic forecasts center around a boringly decent year (2.6% GDP growth, inflation below 3%, unemployment stays below 5%, no recession), though with high variance. But forecasts about politics and war foretell a turbulent year.

In the US, midterm elections have a 78% chance to flip control of the House and a 35% chance to flip the Senate despite a tough map for Democrats. A midterm wave for the out-of-power party is typical in the US, given that the party in power always seems to overplay its hand and voters quickly get sick of them. More surprising is that forecasters give a 44% chance that Donald Trump leaves office before his term is up, and a 16% chance that he leaves office this year. Markets give a 20% chance that he will be removed from office through the impeachment process, so the rest of the 44% would be from health issues or voluntary resignation.

Forecasters at Kalshi predict a greater than even chance that 4 notable world leaders leave office this year:

I find this especially notable because Viktor Orban is the only one who would be removed through regularly scheduled elections. In the UK, Keir Starmer was just elected Prime Minister in 2024 and doesn’t have to face reelection until 2029; but he is so unpopular that his own Labour Party is likely to kick him out of office if local elections in May go as badly as polls indicate. If so, he would join Boris Johnson and Liz Truss as the third British PM in four years to leave office without directly losing an election. The leaders of Cuba and Iran don’t face real elections and would presumably be pushed out by a popular uprising or US military action.

Some other important world leaders will probably stay in office this year, but forecasters still think there is a significant chance they leave: Israel’s Netanyahu (49%), Ukraine’s Zelenskyy (32%), and Russia’s Putin (14%). For the latter two, this belief could be tied to the surprisingly high odds given to a ceasefire in the Russia-Ukraine war this year (45%). Orban leaving office could be tied into this, as Hungary has often vetoed EU support for Ukraine.

Myself, I find most of these market odds to be high, and I’m tempted to make the “nothing ever happens” trade and bet that everyone stays in office. But even if all these markets are 10pp high, it still implies quite an eventful year ahead. Prepare accordingly.

2025 Was A Pretty Good Year for the US Economy — But So Was 2024

While some of the 2025 data is still coming in (such as GDP), we already have much of the core economic data to evaluate the year.

In a recent op-ed, President Trump claimed, “Just over one year ago, we were a ‘DEAD’ country. Now, we are the ‘HOTTEST’ country anywhere in the world!” Of course, every President claims they are doing great things, and Americans are almost exactly evenly divided over whether the economy was better under Biden or Trump, but this is mostly just partisanship (Independents are close to evenly split, though).

So what is the truth? I have put together what I think are the best economic indicators to judge how the economy is doing. And what does it tell us? I think the fairest read is that 2025 was a pretty good year, but based on most economic data it was almost identical to 2024.

The only indicator that is clearly better is private-sector job growth in 2024. We might add S&P 500 growth in 2024 too, although some other assets such as gold have performed better in 2025. Inflation in 2025 is a tad lower, but not the massive improvement Trump suggests. This is especially the case for one of his favorite prices, gasoline. Yes, 2025 is a little lower than 2024… just like 2024 was a little lower than 2023.

And what of that greatest of all macroeconomic indicators, GDP? We don’t yet have Q4 data for GDP, which means we don’t have full-year 2025 data yet. But the growth rate of real GDP in 2024 was 2.8%, and betting markets are currently predicting 2.3% for 2025. Betting markets could be wrong! But it seems unlikely it would be much above 2.8% (those same betting markets only think there is a 4% chance it will be over 3.0%).

None of this is to say that the 2024 and 2025 economies are exactly the same. Certainly there is more uncertainty due to the shifting tariff policy, but on the other hand even with that uncertainty the economy is still performing fairly well. And my table above only includes economic outcomes, not any changes to government budgets, nor important social indicators such as crime. These are important too, but my focus in this post is only on the economic data.

It seems that in those surveys about whether the economy is better now or under Biden, it would be useful to offer an “about the same” option. Of course, in 2021-2022 inflation was much worse under Biden — but job growth was much better. A lot of this was baked in from the pandemic, 2020 monetary and fiscal stimulus, etc. Once we were back to a semi-normal economy in 2024, it was a decent year. Not blockbuster, but decent. So was 2025.

Unweighted Bayesians get Eaten By Wolves

A village charges a boy with watching the flock and raising the alarm if wolves show up. The boy decides to have a little fun and shout out false alarms, much to the chagrin of the villagers. Then an actual wolf shows up, the boy shouts his warning, but the villagers are proper Bayesians who, having learned from their mistakes, ignore the boy. The wolves have a field day, eating the flock, the boy, and his entire village.

I may have augmented Aesop’s classic fable with that last bit.

The boy is certainly a crushing failure at his job, but here’s the thing: the village is equally foolish, if not more so. The boy revealed his type, he’s bad at his job, but the village failed to react accordingly. They updated their beliefs but not their institutions. “We were good Bayesians” will look great on their tombstones.

They had three options.

A) Update their belief about the boy and ignore him.

This is what they did and look where that got them. Nine out of ten wolves agree that Good Bayesians are nutritious and delicious.

B) Update their beliefs about the boy, but continue to check on the flock when the boy raises the alarm.

They should have weighted their responses. Much like Pascal taking religion seriously because eternal torment was such a big punishment, you have to weigh your expected probability of truth in the alarm against the scale of the downside if it is true. You can’t risk being wrong when it comes to existential threats.

C) Update their beliefs about the boy and immediately replace him with someone more reliable.

It’s all fine and good to be right about the boy being a lying jerk but that doesn’t fix your problem. You need to replace him with someone who can reliably do the job.
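The weighting in option B can be sketched as a simple expected-loss rule: respond whenever the probability the alarm is true, times the loss if you ignore a true alarm, exceeds the cost of checking. The function name and every number below are hypothetical, chosen only to show the shape of the calculation.

```python
def should_check(p_true: float, loss_if_ignored: float, cost_of_checking: float) -> bool:
    """Respond when the expected loss from ignoring exceeds the cost of checking."""
    return p_true * loss_if_ignored > cost_of_checking


# Hypothetical numbers: the villagers now think the boy is right only 2% of the time.
p_alarm_true = 0.02
cost_walk_to_pasture = 5   # nuisance cost of checking, in arbitrary units
loss_whole_village = 10_000  # flock, boy, and village eaten
loss_one_lamb = 100          # a much smaller downside, for contrast

check_when_existential = should_check(p_alarm_true, loss_whole_village, cost_walk_to_pasture)
check_when_minor = should_check(p_alarm_true, loss_one_lamb, cost_walk_to_pasture)

print(check_when_existential)  # expected loss 200 vs. cost 5
print(check_when_minor)        # expected loss 2 vs. cost 5
```

The same low probability justifies ignoring the alarm when the stakes are small and checking when they are existential. Updating the belief and updating the response are two separate steps.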

So this is a post about fascism. Some think that fascism is already here, others dismiss this as alarmism, and still others split the difference, claiming that we are in some state of semi- or quasi-fascism. Within the claims that it is all alarmism, what I hear are the echoes of villagers annoyed by 50 years of claims that conservative politics were riddled with fascism, that Republicans were fascists, that everything they didn’t like was neoliberalism, fascism, or neoliberal fascism. Get called a wolf enough times and you might stop believing that wolves even exist.

Even if I am sympathetic, that doesn’t get you off the hook. “It hasn’t been fascism for 50 years” will look pretty on your tombstone.

Let’s return to our options:

  • A) Don’t believe the people who have been shouting about fascism for years, but take seriously new voices raising the alarm.
  • B) Find a set of people who, exogenous to current events, you would and do trust and take their warnings seriously.
  • C) Don’t believe anyone who shouts fascism, because shouting fascism is itself evidence they are non-serious people.
  • D) Start monitoring the world yourself

Both A) and B) are sensible choices! If you’ve Bayesian updated yourself into not trusting claims of fascism from wide swaths of the commentariat, political leaders, and broader public, that’s fine, but you’ve got to find someone you trust. And if that leads you to a null set, then D) you’re going to have to do it yourself. Good luck with that. It takes a lot of time, expertise, and discipline not to end up the fascism-equivalent of an anti-vaxxer who “did their own research.”

Because let me tell you, C) is the route to perdition in all things Bayesian. Once your beliefs are mired in a recursive loop of confirmation bias, it’s all downhill. Every day will be just a little dumber than the one before. And that’s the real Orwellian curse of fascism.

Drawbacks of Long Term Thinking

This post is just some thoughts about perspective. I apologize for any lack of organization.

My academic influences include North, Weingast, Coase, Hayek, the field of Public Choice, and others. I’m not an ‘adherent’ to any school of thought. Those guys just provided some insights that I find myself often using.

What lessons did they teach? Plenty. When I see the world of firms, governments, and other institutions, I maintain a sharp distinction between intention and outcome. Any given policy that’s enacted is probably not the welfare maximizing one, but rather must keep special interests relatively happy. So, the presence of special interests is a given and doesn’t get me riled up. When I see an imperfect policy outcome, I think about who had to be enticed to vote for it. We live in a world where ‘first bests’ aren’t usually on the table.

Historically, or in lower income countries, I think about violence. Their rules and laws are not operating in a vacuum of peaceful consent. There is always the threat of violence. Laws are enforced (or not) conditional on whether and what type of violence that may result. All of the ideal legislation is irrelevant if theft and fraud are the lay of the land.

I think about institutional evolution with both internal and external pressures. I’m a bit worried about the persistence of the US republic, or at least worried for its pro-growth policies. I’m not worried about China in the long run. I don’t think they have the institutions that get them to ‘high income’ status. I do think that they are a tactical concern in the short run and that the government does/will have access to great volumes of resources in the medium run. That’s a bit of a concern. But like I said, I’m not super worried in the long run.


The Wealth Ladder

The Wealth Ladder is a 2025 personal finance book from data blogger Nick Maggiulli. The core idea is good: that the best financial strategies will be different based on your current wealth level. Maggiulli divides people into 6 net worth levels based on orders of magnitude, from less than $10K to over $100M. The middle of the book has separate chapters with advice for people in each level, so a book that is already a fairly quick and easy read as a whole could be even quicker if you skipped the chapters about levels other than your own.

The beginning of the book tries to develop some simple rules phrased in a way that they can apply across every level, because they are based on a percentage of your net worth. I like the idea but don’t think it really worked. His “1% Rule” says you should only accept an opportunity to earn money if it will increase your net worth by at least 1%. But in practice, whether an earning opportunity is worth your time depends less on how many absolute dollars it generates as a % of your net worth, and more on how many $ per hour it generates. The “0.01% Rule” (don’t worry about spending money on anything that costs less than 0.01% of your net worth) is better. But whether it is a good rule for you will depend on your age and income.
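To make the two rules concrete, here is a minimal sketch. The level cutoffs follow the order-of-magnitude description above (how exact boundary values are assigned is my assumption), and the $500,000 household and the side gig are hypothetical.

```python
# Lower bounds of levels 2 through 6; level 1 is anything under $10K.
LEVEL_FLOORS = [10_000, 100_000, 1_000_000, 10_000_000, 100_000_000]


def wealth_level(net_worth: float) -> int:
    """Return the ladder level (1-6) for a given net worth."""
    return 1 + sum(net_worth >= floor for floor in LEVEL_FLOORS)


def one_percent_rule(net_worth: float) -> float:
    """Smallest earning opportunity the 1% Rule says is worth accepting."""
    return net_worth / 100


def point_oh_one_percent_rule(net_worth: float) -> float:
    """Largest purchase the 0.01% Rule says you need not worry about."""
    return net_worth / 10_000


net_worth = 500_000  # hypothetical household
level = wealth_level(net_worth)
earn_floor = one_percent_rule(net_worth)
spend_ceiling = point_oh_one_percent_rule(net_worth)

# Hypothetical side gig: $2,000 for 10 hours of work.
gig_pay, gig_hours = 2_000, 10
passes_one_percent = gig_pay >= earn_floor
gig_hourly = gig_pay / gig_hours

print(level, earn_floor, spend_ceiling)
print(passes_one_percent, gig_hourly)
```

The gig fails the 1% Rule because $2,000 is below the $5,000 floor, yet it pays $200 an hour, which is the objection above: dollars per hour is often the more useful yardstick.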

In short, while tailoring his advice in 6 different ways for the 6 wealth levels of his ladder is an improvement on one-size-fits-all personal finance books, even this much tailoring isn’t enough. Having a $1 million net worth is normal for a household in their 60s but would be exceptional for one in their 20s; and vice versa for a household with under $10k net worth. Chapter 10 explains the data on this well, but it kind of undermines the ideas of the previous chapters. Households with the same net worth should be making very different decisions in their 20s vs 60s.

The strongest part of the book is the use of data from the Survey of Consumer Finances and the Panel Study of Income Dynamics to show how people differ by wealth level and how people move from one level to another. For instance, he shows that the poor have most of their wealth in cash and vehicles; the middle class in homes; the wealthy in retirement accounts and stocks; the very rich in private businesses. Americans tend to climb the wealth ladder slowly but steadily; over 10 years they are twice as likely to move up the ladder as to move down; over 20 years, 3 times as likely. The median person who made it to one of the top 3 rungs (i.e. the median millionaire) is in their 60s.

If you get ahold of a copy of the book it’s definitely worthwhile to flip through all the tables and figures, but I won’t be adding it to my short list of the best personal finance books. The core metaphor of the ladder carries the implicit assumption that everyone should be trying to get to the top of the ladder. But if someone is satisfied with less than $10 million, why should they take on lots of time and effort and risk to start a business for a small chance to go over $100 million?


Economic Impacts of Weather Apps Exaggerating Storm Dangers

Snowmageddon!! Over 20 inches of snow!!! That is what we in the mid-Atlantic should expect on Sat-Sun Jan 24-25 according to most weather apps, as of 9-10 days ahead of time.  Of course, that kept us all busy checking those apps for the next week. As of Wednesday, I was still seeing numbers in the high teens in most cases, using Washington, D.C. as a representative location. But my Brave browser AI search proved its intelligence on Wednesday by telling me, with a big yellow triangle warning sign:

 Note: Apps and social media often display extreme snow totals (e.g., 23 inches) that are not yet supported by consensus models. Experts recommend preparing for 6–12 inches as a realistic baseline, with the potential for more.

“Huh,” thought I. Well, duh, the more scared they make us, the more eyeballs they get and the more ad revenue they generate. Follow the money…

Unfortunately, I did not log exactly who said what when last week. My recollection is that weather.com was still predicting high teens snowfall as of Thursday, and the Apple weather app was still saying that as of Friday. The final total for D.C. was about 7.5 inches for winter storm Fern. In fairness, some very nearby areas got 9-10 inches, and it ended up being dense sleet rather than light fluffy snow. But there was still a pretty big mismatch.

Among the best forecasters I found was AccuWeather. They showed a short table of probabilities that centered on (as I recall) 6-10”, with some chances for higher and for lower, that let you decide whether to prepare for a low probability/high impact scenario. It seems that the Apple weather app is notoriously bad: instead of integrating several different forecast models like some other apps (and like your local talking head meteorologist), it apparently just spits out the results of one model:

The core issue is that many weather apps, including Apple Weather, display raw data from individual forecast models without the context and analysis that professional meteorologists provide. While meteorologists at the National Weather Service balance multiple computer models, dozens of simulations and their own expertise to create forecasts, apps often pull from a single source and deliver it directly to users.

“Everything that catches attention is mostly nonsense,” said Eric Fisher, chief meteorologist for WBZ-TV in Boston. He points to the viral snowfall maps that spread on social media, noting that extreme forecasts generate the most attention even when they may not be the most accurate.

Anyway, I tried to poke around and find out in dollar terms how much it benefits the weather apps to exaggerate storm dangers. I was unsuccessful there, but by playing with query wording, I was able to coax out of ChatGPT some numbers on how much these exaggerations may cost the rest of us. (A more qualitative search noted, among other things, that the “crying wolf” cost of people becoming jaded to alarmist forecasts may lead them to discount genuine warnings; and I will add it is inconvenient to find the stores out of things because of panic buying).

Everything below the line of asterisks is just a straight dump from the AI, since I could not readily improve on its presentation:

****************************************************************

The economic impact of weather apps exaggerating storm dangers isn’t easy to quantify precisely because most formal studies focus on weather warnings overall (from official sources like the National Weather Service), not specifically on private weather apps exaggerating risk. However, economics research on false alarms, over-warning, and exaggerated alerts offers concrete dollar figures that we can use to approximate the kinds of costs involved. Below are the key economic impacts with as many dollar figures as available:


💸 1. Opportunity Costs of False or Overbroad Storm Warnings

Time cost of false warnings:

  • Traditional county-wide tornado warnings (over-broad alerts) have historically forced many people to shelter unnecessarily, resulting in ≈234 million person-hours per year spent under warnings (1996–2004) — valued at roughly $2.7 billion annually in lost time/productivity in the U.S. alone.
  • Improved storm-based warnings that reduce false warning area could save up to $750 million–$1.9 billion/year in avoided lost time.

This demonstrates that exaggerated alerts — such as those sometimes amplified or presented without clear context by weather apps — can force unnecessary economic shutdowns of work and business activity.


🏙️ 2. Evacuation and Preparedness Costs from False Alarms

Evacuation costs:

  • A NOAA analysis estimates that evacuation costs for hurricanes, when done broadly for storm-warnings, can reach about $1.4 billion annually in the U.S., with roughly $1 billion being unnecessary cost due to false alarms (people evacuating where no damaging winds ultimately occur).

Evacuations are one of the most expensive forms of response — from lost wages to lodging and fuel costs — and exaggerated risk messaging can trigger them prematurely or more frequently.


🏭 3. Business Productivity Losses from False Alerts

Example from private sector:

  • A building management firm reported that by reducing false storm alerts (better targeted data rather than broad app warnings), it saved ~$10 million in lost production time in one year alone.

This is a real dollar example showing how over-warning impacts businesses directly when they preemptively shut down operations.


💼 4. Broader Firm-Level Economic Avoidance

Cost avoidance from better warning info:

  • A study of nearly 500 firms found that switching from deterministic (binary “storm/no storm”) warnings to probabilistic information could produce $2.3 billion to $7.6 billion per year in avoided costs for businesses, thanks to better decision-making.

While this doesn’t measure exaggeration specifically, it underscores the economic penalty of imprecise or alarmist warnings, which many weather apps unintentionally echo.


📉 5. Insurance & Property Impacts

Although not directly about exaggerated warnings, inaccurate perceptions of storm danger influence behavior and pricing:

  • Insurance premiums rising: In storm-prone states (e.g., Florida), homeowners’ premiums have risen by ~57% or more because of risk perceptions and loss patterns tied to severe weather.
  • Losses from storms: Billion-dollar disasters are common; for example, Hurricane Milton in 2024 had estimated total costs of $21–$85 billion (depending on source).

While not directly caused by exaggeration, inflated perceptions of danger (from apps or media) can affect insurance markets and property values, contributing indirectly to economic impacts.


📈 6. Net Benefits vs. Exaggeration

It’s important to balance:

  • Early warnings overall clearly save money and lives. Improved forecasts and early alert systems have been estimated to avoid $300 million to $2 billion per year in asset losses and provide up to $32 billion/year in broader economic benefits (through optimized activities and prep).

This highlights that the problem isn’t warnings per se, but inaccurate or exaggerated warnings that cause over-response.