SaaSmageddon: Will AI Eat the Software Business?

A big narrative for the past fifteen years has been that “software is eating the world.” This described a transformative shift where digital software companies disrupted traditional industries, such as retail, transportation, entertainment and finance, by leveraging cloud computing, mobile technology, and scalable platforms. This prophecy has largely come true, with companies like Amazon, Netflix, Uber, and Airbnb redefining entire sectors. Who takes a taxi anymore?

However, the narrative is now evolving. As generative AI advances, a new phase is emerging: “AI is eating software.”  Analysts predict that AI will replace traditional software applications by enabling natural language interfaces and autonomous agents that perform complex tasks without needing specialized tools. This shift threatens the $200 billion SaaS (Software-as-a-Service) industry, as AI reduces the need for dedicated software platforms and automates workflows previously reliant on human input. 

A recent jolt here has been the January 30 release by Anthropic of plug-in modules for Claude, which allow a relatively untrained user to enter plain English commands (“vibe coding”) that direct Claude to perform role-specific tasks like contract review, financial modeling, CRM integration, and campaign drafting.  (CRM integration is the process of connecting a Customer Relationship Management system with other business applications, such as marketing automation, ERP, e-commerce, accounting, and customer service platforms.)

That means Claude is doing some serious heavy lifting here. Currently, companies pay big bucks yearly to “enterprise software” firms like SAP, ServiceNow (NOW), and Salesforce to come in and integrate all their corporate data storage and flows. This must-have service is viewed as really hard to do, requiring highly trained specialists and proprietary software tools. Hence the high profit margins for these enterprise software firms.

Until recently, these firms have been darlings of the stock market. For instance, as of June 2025, NOW was up nearly 2000% over the past ten years. Imagine putting $20,000 into NOW in 2015 and seeing it mushroom to nearly $400,000.  (AI tells me that $400,000 would currently buy you a “used yacht in the 40 to 50-foot range.”)
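
For a rough sense of what that return implies, here is a minimal back-of-the-envelope sketch. The 20x multiple and ten-year horizon come from the figures above; everything else is illustrative rather than verified market data.

```python
# Back-of-the-envelope check on the ServiceNow return quoted above.
# A gain of "nearly 2000%" corresponds to roughly a 20x multiple
# (a 20x multiple is a +1,900% gain).

start_value = 20_000   # hypothetical 2015 stake, in dollars
multiple = 20          # approximate ten-year price multiple from the text
years = 10

end_value = start_value * multiple
cagr = multiple ** (1 / years) - 1   # implied compound annual growth rate

print(f"Ending value: ${end_value:,.0f}")     # Ending value: $400,000
print(f"Implied CAGR: {cagr:.1%} per year")   # Implied CAGR: 34.9% per year
```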

With the threat of AI, and probably with some general profit-taking in the overheated tech sector, the share price of these firms has plummeted. Here is a six-month chart for NOW:

Source: Seeking Alpha

NOW is down around 40% in the past six months. Most analysts remain positive, however, viewing this as a market overreaction. A key value-add of an enterprise software firm is custody of the data itself, in various secure and tailored databases, and that seems to be something an external AI program cannot replace, at least for now. The capability to pull data out and crunch it (which AI is offering) is kind of icing on the cake.

Firms like NOW are adjusting to the new narrative by offering pay-per-usage pricing as an alternative to pay-per-user (“seats”), and so far this does not seem to be hurting their revenues. These firms claim that they can harness the power of AI (either generic AI or their own software) to do pretty much everything that AI claims for itself, and their earnings do not seem to be slowing down.

With the recent stock price crash, the P/E for NOW is around 24, with a projected earnings growth rate of around 25% per year. Compared to, say, Walmart with a P/E of 45 and a projected growth rate of around 10%, NOW looks pretty cheap to me at the moment.
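
One way to make that comparison concrete is the PEG ratio (P/E divided by expected annual earnings growth, with lower meaning cheaper relative to growth). Here is a minimal sketch using the snapshot figures quoted above; they are rough estimates, not live market data.

```python
# Rough PEG (price/earnings-to-growth) comparison using the snapshot
# figures quoted in the text. Lower PEG = cheaper relative to expected growth.

def peg_ratio(pe: float, growth_pct: float) -> float:
    """PEG = P/E divided by projected annual earnings growth (in percent)."""
    return pe / growth_pct

now_peg = peg_ratio(pe=24, growth_pct=25)   # ServiceNow: 24 / 25 ~ 0.96
wmt_peg = peg_ratio(pe=45, growth_pct=10)   # Walmart:    45 / 10 = 4.50

print(f"NOW PEG: {now_peg:.2f}")
print(f"WMT PEG: {wmt_peg:.2f}")
```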

(Disclosure: I just bought some NOW. Time will tell if that was wise.)

Usual disclaimer: Nothing here should be considered advice to buy or sell any security.

Joy on the Anthropic Copyright Settlement

I’m at Econlog this week with:

The Anthropic Settlement: A $1.5 Billion Precedent for AI and Copyright

There are two main questions. First, will AI companies need to pay compensation to the authors whose work they are currently training on? Second, how important is it for human writing to remain a paying career in the future, if AI continues to need good new material to train from?

There is more at the link but here are some quotes:

If human writing ceases to be a viable career due to inadequate compensation, will LLMs lose access to fresh, high-quality training data? Could this create a feedback loop where AI models, trained on degraded outputs, stagnate?

This case also blurs the traditional divide between copyright and patents. Copyrighted material, once seen as static, now drives “follow-on” innovation derived from the original work. That is, the copyright protection in this case affects AI-content influenced by the copyrighted material in a way that previously applied to new technology that built on patented technical inventions. Thus, “access versus incentives” theory applies to copyright as much as it used to apply to patents. The Anthropic settlement signals that intellectual property law, lagging behind AI’s rapid evolution, must adapt.

Writing Humanity’s Last Exam

When every frontier AI model can pass your tests, how do you figure out which model is best? You write a harder test.

That was the idea behind Humanity’s Last Exam, an effort by Scale AI and the Center for AI Safety to develop a large database of PhD-level questions that the best AI models still get wrong.

The effort has proven popular: the paper summarizing it has already been cited 91 times since its release on March 31st, and the main AI labs have been testing their new models on the exam. xAI announced today that its new Grok 4 model has the highest score yet on the exam, 44.4%.

Current leaderboard on the Humanity’s Last Exam site, not yet showing Grok 4

The process of creating the dataset is a fascinating example of a distributed academic mega-project, a growing trend that has also been important in efforts to replicate previous research. The organizers of Humanity’s Last Exam let anyone submit a question for their dataset, offering co-authorship to anyone whose question they accepted, and cash prizes to those who had the best questions accepted. In the end they wound up with just over 1000 coauthors on the paper (including yours truly as one very minor contributor), and gave out $500,000 to contributors of the very best questions (not me), which seemed incredibly generous until Scale AI sold a 49% stake in their company to Meta for $14.8 billion in June.

Source: Figure 4 of the paper

Here’s what I learned in the process of trying to stump the AIs and get questions accepted into this dataset:

  1. The AIs were harder to stump than I expected, because the exam used frontier models rather than the free-tier models I was used to using on my own. If you think AI can’t answer your question, try a newer model.
  2. It was common for me to try a question that several models would get wrong, but at least one would still get right. For me this was annoying, because questions could only be accepted if every model got them wrong. But of course, if you want a correct answer, this means trying more models is good, even if they are all in the same tier. If you can’t tell what a correct answer looks like and your question is important, make sure to try several models and see if they give different answers.
  3. Top models are now quite good at interpreting regression results, even when you try to give them unusually tricky tables.
  4. AI still has weird weaknesses and blind spots; it can outperform PhDs in the relevant field on one question, then do worse than 3rd graders on the next. This exam specifically wanted PhD-level questions, where a typical undergrad not only couldn’t answer the question, but probably couldn’t even understand what was being asked. But it specifically excluded “simple trick questions”, “straightforward calculation/computation questions”, and questions “easily answerable by everyday people”, even if all the AIs got them wrong. My son had the idea to ask them to calculate hyperfactorials (defined in the sketch after this list); we found some relatively low numbers that stumped all the AI models, but the human judges ruled that our question was too simple to count. On a question I did get accepted, I included an explanation for the human judges of why I thought it wasn’t too simple.
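
For readers unfamiliar with the term, the hyperfactorial of n is the product 1^1 · 2^2 · … · n^n. Here is a minimal sketch of the definition; the particular values that stumped the models are not reproduced here.

```python
# Hyperfactorial H(n) = 1**1 * 2**2 * ... * n**n, mentioned in item 4.
# Exact values grow extremely fast, which is part of what makes even
# modest n awkward for a model that doesn't compute carefully.

def hyperfactorial(n: int) -> int:
    """Return H(n), the product of k**k for k = 1..n."""
    result = 1
    for k in range(1, n + 1):
        result *= k ** k
    return result

for n in range(1, 6):
    print(n, hyperfactorial(n))
# 1 1
# 2 4
# 3 108
# 4 27648
# 5 86400000
```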

I found this to be a great opportunity to observe the strengths and weaknesses of frontier models, and to get my name on an important paper. While the AI field is being driven primarily by the people with the chops to code frontier models, economists still have a lot we can contribute here, as Joy has shown. Any economist looking for the next way to contribute should check out Anthropic’s new Economic Futures Program.