What Will End The AI Bull Market?

It’s feeling like the late ’90s, with an impressive new technology pushing tech stocks and the broader US market to all-time highs. Retail investors are using new platforms to get in on the action, tech companies are doing more IPOs to take advantage of the higher stock prices, and other companies are trying to boost their stocks by saying they are pivoting to the new technology (though often they aren’t really changing).

The excitement drives valuations to record levels:

Shiller CAPE Ratio

In the ’90s, the internet really was a transformational new technology that would enable lots of profitable new companies. But the market got ahead of itself, inflating a bubble that led to a crash: the S&P fell by almost half, while the tech-heavy NASDAQ fell by over 3/4 and took 15 years to recover.
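Part of why recovery took so long is simple arithmetic: losses and gains are not symmetric. A minimal sketch, rounding the falls above to 50% and 75% (my rounding; the function name is just illustrative):

```python
def gain_needed_to_recover(drawdown: float) -> float:
    """Fractional gain needed to return to the prior peak after a drawdown."""
    if not 0.0 <= drawdown < 1.0:
        raise ValueError("drawdown must be in [0, 1)")
    return 1.0 / (1.0 - drawdown) - 1.0

# Rounded inputs (assumptions): S&P down about 50%, NASDAQ down about 75%.
sp500_gain = gain_needed_to_recover(0.50)
nasdaq_gain = gain_needed_to_recover(0.75)

print(f"S&P needs a {sp500_gain:.0%} gain; NASDAQ needs a {nasdaq_gain:.0%} gain")
```

An index that loses half its value has to double to get back; one that loses three-quarters has to quadruple.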

History rhymes, but it doesn’t repeat exactly. I don’t currently expect a big crash driven by AI stocks; it helps that unlike in the ’90s, many of the big players are currently profitable. But I also don’t expect the NASDAQ to keep posting 20+% returns every year.

If the AI bull market doesn’t end in a dramatic crash, how will it end? It’s already shrugged off a war. A US recession is unlikely this year, though plausible next year.

The end I see slowly approaching comes from crowding out. What Robert Solow said about computers in 1987 is true about AI today: you see the AI age everywhere except the productivity statistics. There’s only so much money to go around in markets when productivity growth is unexceptional and savings rates are falling.

We’re already seeing the war hit certain markets (if not US stocks). Iran’s Gulf neighbors are now putting lots of money into missile defense, money they now won’t be spending on data centers or gold (down 16% from pre-war), and everyone else has to spend more on oil.

Interest rates have been rising- partly due to central bank attempts to fight inflation, partly due to ongoing high rates of government borrowing, and partly due to financing the AI buildout itself. Higher rates make it more expensive for companies to invest in the physical AI buildout, and make investors discount future AI revenues more while making bonds a more attractive substitute for stocks today. 10-year TIPS now yield 2% over the inflation rate, a sharp contrast to the 2021 stock boom when they yielded less than inflation. If I were older I’d be loading up on TIPS, and even at 38 I’m starting to get tempted.
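To see how much the discount rate matters, here is a rough sketch. It assumes a hypothetical $100 of inflation-adjusted AI revenue arriving ten years from now, and uses flat real rates of 2% (roughly today's TIPS yield) and -1% (an illustrative stand-in for 2021, when TIPS yielded less than inflation). The cash flow, horizon, and the -1% figure are my assumptions, not market data.

```python
def present_value(cash_flow: float, real_rate: float, years: int) -> float:
    """Discount a single future real cash flow back to today."""
    return cash_flow / (1.0 + real_rate) ** years

# Hypothetical $100 of real revenue, 10 years out.
pv_today = present_value(100.0, 0.02, 10)    # assumed 2% real rate
pv_2021 = present_value(100.0, -0.01, 10)    # assumed -1% real rate

print(f"Value at a 2% real rate:  ${pv_today:.2f}")
print(f"Value at a -1% real rate: ${pv_2021:.2f}")
print(f"Haircut from higher rates: {1 - pv_today / pv_2021:.1%}")
```

The further out the revenue, the bigger the haircut, which is why higher real rates weigh most on companies whose profits are expected far in the future.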

Trying to call the top exactly is a fool’s errand, but if I were feeling foolish, I’d point to the big upcoming IPOs. SpaceX just filed for an IPO that would be the biggest ever both for the amount of money raised ($75 billion) and the total company valuation ($1.77 trillion). This shatters the previous records for the biggest overall raise ($29 billion raised by Saudi Aramco when it went public in 2019) and the biggest raise by an American company ($18 billion raised by Visa in 2008). OpenAI and Anthropic are likely to follow with IPOs that would also break the previous records- making 3 companies each trying to raise more than the $45 billion raised by the entire US IPO market in 2025. Even if the process of going public doesn’t reveal any flaws in the companies, that money has to come from somewhere- and it takes up a substantial proportion of all net inflows to US stocks in a typical year (IPOs plus new money into existing stocks).

In short- where will the money come from? What are investors going to sell in order to buy into these IPOs? Technically they could do it all with cash, but I think it’s at least plausible that they start selling other stocks. The selling pressure will continue after the IPOs as employees of the newly-public companies see their stocks vest and other early investors become able to sell off.

I’m not trying to time the market. Even if this is a ’90s re-run, we could easily still be in the 1998 buildup, not the 2000 peak and crash. But I am diversifying. US stocks are currently the world’s most expensive. Investors value US stocks that highly because there’s a real chance that US companies are profitably building the technologies that will drive the future. But there’s also a real chance they aren’t– and if that state of the world comes to pass, I’d prefer to own a significant chunk of bonds, foreign stocks, and real assets.

Claude Mythos Is Such a Dangerous Hacker Engine That Anthropic Has Withheld Broad Release

The latest AI model from Anthropic is so powerful that they don’t dare release it to the public. It is such a threat that Jay Powell and Scott Bessent summoned the major bank CEOs to a meeting last week to warn them about it. In line with Anthropic’s “helpful, honest, and harmless” motto, they have released it only to their Project Glasswing partners. These are organizations like AWS, Apple, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks, who have been granted access to the model to identify and patch vulnerabilities in critical software.

Mythos is designed to identify and exploit vulnerabilities in software systems when prompted. Its specialty is identifying critical software vulnerabilities and bugs, but it can also assemble sophisticated exploits.

What makes Mythos particularly unsettling is that its most dangerous capabilities were not deliberately engineered. Anthropic’s team made it clear that they did not explicitly train Mythos to have these capabilities. Instead, they “emerged.”

Internal testing revealed that Mythos has already uncovered thousands of weak points in “every major operating system and web browser.” The implications are disturbing: these are zero-day vulnerabilities that Mythos discovered autonomously, flaws that human security researchers, working for years, had never detected (see also here and here for examples).

Mythos can rapidly uncover hidden flaws in the code of organizations and software development firms, but it also raises the fear that attackers could find those vulnerabilities first. Much of the underlying software that Mythos can scan supports banking, retail, airlines, hospitals, and critical utilities. Regulators worry that if Mythos, or models like it, fell into the wrong hands, “systemically important” banks and even entire financial networks could be compromised before institutions even knew they were exposed.

Anthropic launched Project Glasswing in April 2026 to collaborate with tech giants and banks to identify and fix vulnerabilities before they can be exploited. This year, organizations should expect a large influx of AI-discovered vulnerabilities in critical software. The game plan is to use AI tools to patch the vulnerabilities they discover. Your venerable legacy system is no longer safe. What AI can expose, it can also fix. We hope.

Ray Kurzweil predicted The Singularity (when artificial intelligence growth accelerates beyond human control) would arrive in 2045, but we might be closing in on it ahead of schedule.

Oops: Anthropic Accidentally Leaked the Entire Code for Its “Claude Code” Program

One of Anthropic’s biggest wins has been its wildly popular Claude Code program, which can do nearly all the grunt work of programming. Properly prompted, it can build new features, migrate databases, fix errors, and automate workflows.

So, it was big news in the AI world last week when an Anthropic employee accidentally exposed a link that allowed folks to download the source code for this crown jewel: the entire code, all 512,000 lines of it, which revealed the complete logic flow of the program, down to the tiniest features. For instance, Claude Code scans for profanity and negative phrases like “this sucks” to discern user sentiment, and tries to adjust for user frustration.

Gleeful researchers, competitors, and hackers promptly downloaded zillions of copies. Anthropic issued broad copyright takedown requests, but the damage was done. Researchers quickly used AI to rewrite the original TypeScript source code into Python and Rust, claiming to get around copyright laws on the original code. Oh, the irony: for years, AI purveyors have been arguing that when they ingest the contents of every published work (including copyrighted works) and repackage them, that’s OK. So now Anthropic is tasting the other side of that claim.

The leak has been damaging to Anthropic to some degree. Competitors don’t have to work to try to reverse engineer Claude Code, since now they know exactly how it works. Hackers have been quick to exploit vulnerabilities revealed by the leak. And Anthropic’s claim to be all about “Safety First” has been tarnished.

On the other hand, the model weights weren’t exposed, so you can’t just run the leaked code and get Claude’s results. Also, no customer data was revealed. Power users have been able to discern from the source how to run Claude Code most advantageously. This YouTube video by Nick Puru discusses such optimizations, which he summarized in this roadmap:

There have actually been a number of unexpected benefits of the leak for Anthropic. Per AI:

Brand resonance and community engagement have surged, with some observers calling the incident “peak anthropic energy” that generated significant hype and validated the product’s technical impressiveness.  The leak has acted as a massive free marketing campaign, reinforcing the narrative of a fast-moving, innovative company while bouncing the brand back among developers despite the security lapse. 

Accelerated ecosystem adoption and bug fixing are also potential benefits, as the exposure allowed engineers to dissect the agentic harness and create open-source versions or “harnesses” that keep users within the Anthropic ecosystem. Additionally, the public scrutiny likely helps identify and patch vulnerabilities faster, while the leaked source maps provide a roadmap for competitors to build “Claude-like” agents, potentially standardizing the market around Anthropic’s architectural patterns.

The leak also revealed hidden roadmap features that build anticipation, such as:

  • Kairos: A persistent background daemon for continuous operation. 
  • Proactive Mode: A feature allowing the AI to act without explicit user prompts. 
  • Terminal Pets: Playful, personality-driven interfaces to increase user engagement.

Because of these benefits, conspiracy theorists have proposed that Anthropic leaked the code on purpose, or even (April Fools!) leaked fake code. Fact checkers have come to the rescue to debunk the conspiracy claims. But in the humans vs. AI competency debate, this whole kerfuffle doesn’t make humans look so great.

SaaSmageddon: Will AI Eat the Software Business?

A big narrative for the past fifteen years has been that “software is eating the world.” This described a transformative shift where digital software companies disrupted traditional industries, such as retail, transportation, entertainment and finance, by leveraging cloud computing, mobile technology, and scalable platforms. This prophecy has largely come true, with companies like Amazon, Netflix, Uber, and Airbnb redefining entire sectors. Who takes a taxi anymore?

However, the narrative is now evolving. As generative AI advances, a new phase is emerging: “AI is eating software.”  Analysts predict that AI will replace traditional software applications by enabling natural language interfaces and autonomous agents that perform complex tasks without needing specialized tools. This shift threatens the $200 billion SaaS (Software-as-a-Service) industry, as AI reduces the need for dedicated software platforms and automates workflows previously reliant on human input. 

A recent jolt here has been the January 30 release by Anthropic of plug-in modules for Claude, which allow a relatively untrained user to enter plain English commands (“vibe coding”) that direct Claude to perform role-specific tasks like contract review, financial modeling, CRM integration, and campaign drafting.  (CRM integration is the process of connecting a Customer Relationship Management system with other business applications, such as marketing automation, ERP, e-commerce, accounting, and customer service platforms.)

That means Claude is doing some serious heavy lifting here. Currently, companies pay big bucks yearly to “enterprise software” firms like SAP, ServiceNow (NOW), and Salesforce to come in and integrate all their corporate data storage and flows. This must-have service is viewed as really hard to do, requiring highly trained specialists and proprietary software tools. Hence, high profit margins for these enterprise software firms.

Until recently, these firms had been darlings of the stock market. For instance, as of June 2025, NOW was up nearly 2000% over the past ten years. Imagine putting $20,000 into NOW in 2015, and seeing it mushroom to nearly $400,000. (AI tells me that $400,000 would currently buy you a “used yacht in the 40 to 50-foot range.”)
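The arithmetic behind that example, as a quick sketch: a gain of g percent multiplies the starting stake by (1 + g/100). The 1,900% and 2,000% inputs below are my own bracketing assumptions for “nearly 2000%,” and the annualized figure assumes a clean ten-year holding period.

```python
def final_value(initial: float, gain_pct: float) -> float:
    """Ending value of a stake after a total percentage gain."""
    return initial * (1.0 + gain_pct / 100.0)

def annualized_return(initial: float, final: float, years: float) -> float:
    """Compound annual growth rate implied by a start and end value."""
    return (final / initial) ** (1.0 / years) - 1.0

low = final_value(20_000, 1_900)    # a 1,900% gain is a 20x multiple
high = final_value(20_000, 2_000)   # a 2,000% gain is a 21x multiple
cagr = annualized_return(20_000, low, 10)

print(f"1,900% gain: ${low:,.0f}; 2,000% gain: ${high:,.0f}")
print(f"Implied annualized return on the 20x case: {cagr:.1%}")
```

So “nearly $400,000” corresponds to a gain of a bit under 1,900%, or roughly a third compounded every year for a decade.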

With the threat of AI, and probably with some general profit-taking in the overheated tech sector, the share prices of these firms have plummeted. Here is a six-month chart for NOW:

Source: Seeking Alpha

NOW is down around 40% in the past six months. Most analysts seem positive, however, that this is a market overreaction. A key value-add of an enterprise software firm is the custody of the data itself, in various secure and tailored databases, and that seems to be something that an external AI program cannot replace, at least for now. The capability to pull data out and crunch it (which AI is offering) is kind of icing on the cake.

Firms like NOW are adjusting to the new narrative, by offering pay-per-usage, as an alternative to pay-per-user (“seats”). But this does not seem to be hurting their revenues. These firms claim that they can harness the power of AI (either generic AI or their own software) to do pretty much everything that AI claims for itself. Earnings of these firms do not seem to be slowing down.

With the recent stock price crash, the P/E for NOW is around 24, with a projected earnings growth rate of around 25% per year. Compared to, say, Walmart with a P/E of 45 and a projected growth rate of around 10%, NOW looks pretty cheap to me at the moment.
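One rough way to put that comparison on a single scale is the PEG ratio: P/E divided by expected annual earnings growth in percent. This is a common rule of thumb that I am swapping in for illustration, not a calculation made above, and the inputs are just the approximate figures quoted in this post.

```python
def peg_ratio(pe: float, growth_pct: float) -> float:
    """Price/earnings-to-growth ratio; lower suggests cheaper relative to growth."""
    if growth_pct <= 0:
        raise ValueError("PEG is only meaningful for positive expected growth")
    return pe / growth_pct

# Approximate figures from the text above, not live market data.
now_peg = peg_ratio(pe=24, growth_pct=25)
wmt_peg = peg_ratio(pe=45, growth_pct=10)

print(f"NOW PEG: {now_peg:.2f}")
print(f"Walmart PEG: {wmt_peg:.2f}")
```

On these numbers NOW comes in under 1 and Walmart above 4. PEG ignores risk and the reliability of the growth forecast, so treat it as a first pass rather than a verdict.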

(Disclosure: I just bought some NOW. Time will tell if that was wise.)

Usual disclaimer: Nothing here should be considered advice to buy or sell any security.

Joy on the Anthropic Copyright Settlement

I’m at Econlog this week with:

The Anthropic Settlement: A $1.5 Billion Precedent for AI and Copyright

There are two main questions. First, will AI companies need to pay compensation to the authors whose work they are currently training on? Second, how important is it for human writing to be a paying career in the future, if AI continues to need good new material to train from?

There is more at the link but here are some quotes:

If human writing ceases to be a viable career due to inadequate compensation, will LLMs lose access to fresh, high-quality training data? Could this create a feedback loop where AI models, trained on degraded outputs, stagnate?

This case also blurs the traditional divide between copyright and patents. Copyrighted material, once seen as static, now drives “follow-on” innovation derived from the original work. That is, the copyright protection in this case affects AI-content influenced by the copyrighted material in a way that previously applied to new technology that built on patented technical inventions. Thus, “access versus incentives” theory applies to copyright as much as it used to apply to patents. The Anthropic settlement signals that intellectual property law, lagging behind AI’s rapid evolution, must adapt.

Writing Humanity’s Last Exam

When every frontier AI model can pass your tests, how do you figure out which model is best? You write a harder test.

That was the idea behind Humanity’s Last Exam, an effort by Scale AI and the Center for AI Safety to develop a large database of PhD-level questions that the best AI models still get wrong.

The effort has proven popular- the paper summarizing it has already been cited 91 times since its release on March 31st, and the main AI labs have been testing their new models on the exam. xAI announced today that its new Grok 4 model has the highest score yet on the exam, 44.4%.

Current leaderboard on the Humanity’s Last Exam site, not yet showing Grok 4

The process of creating the dataset is a fascinating example of a distributed academic mega-project, a growing trend that has also been important in efforts to replicate previous research. The organizers of Humanity’s Last Exam let anyone submit a question for their dataset, offering co-authorship to anyone whose question they accepted, and cash prizes to those who had the best questions accepted. In the end they wound up with just over 1000 coauthors on the paper (including yours truly as one very minor contributor), and gave out $500,000 to contributors of the very best questions (not me), which seemed incredibly generous until Scale AI sold a 49% stake in their company to Meta for $14.8 billion in June.

Source: Figure 4 of the paper

Here’s what I learned in the process of trying to stump the AIs and get questions accepted into this dataset:

  1. The AIs were harder to stump than I expected because the exam tested questions against frontier models rather than the free-tier models I was used to using on my own. If you think AI can’t answer your question, try a newer model.
  2. It was common for me to try a question that several models would get wrong, but at least one would still get right. For me this was annoying because questions could only be accepted if every model got them wrong. But of course if you want to get a correct answer, this means trying more models is good, even if they are all in the same tier. If you can’t tell what a correct answer looks like and your question is important, make sure to try several models and see if they give different answers.
  3. Top models are now quite good at interpreting regression results, even when you try to give them unusually tricky tables.
  4. AI still has weird weaknesses and blind spots; it can outperform PhDs in the relevant field on one question, then do worse than 3rd graders on the next. This exam specifically wanted PhD-level questions, where a typical undergrad not only couldn’t answer the question, but probably couldn’t even understand what was being asked. But it specifically excluded “simple trick questions”, “straightforward calculation/computation questions”, and questions “easily answerable by everyday people”, even if all the AIs got them wrong. My son had the idea to ask them to calculate hyperfactorials; we found some relatively low numbers that stumped all the AI models, but the human judges ruled that our question was too simple to count. On a question I did get accepted, I included an explanation for the human judges of why I thought it wasn’t too simple.
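Point 2 above has a simple probabilistic core. As a rough sketch, suppose each model independently gets a hard question right with probability p; both the independence and the 30% figure are my simplifying assumptions (real models share training data, so their errors are correlated). Then the chance that at least one of n models is right is 1 - (1 - p)^n.

```python
def prob_at_least_one_correct(p_correct: float, n_models: int) -> float:
    """Chance that at least one of n independent models answers correctly."""
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must be a probability")
    if n_models < 0:
        raise ValueError("n_models must be non-negative")
    return 1.0 - (1.0 - p_correct) ** n_models

# Hypothetical 30% per-model success rate on a hard question.
for n in (1, 2, 4, 8):
    print(n, round(prob_at_least_one_correct(0.3, n), 3))
```

The same math explains both sides of my experience: more models make it easier to find a right answer somewhere, and harder to write a question that every model gets wrong.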

I found this to be a great opportunity to observe the strengths and weaknesses of frontier models, and to get my name on an important paper. While the AI field is being driven primarily by the people with the chops to code frontier models, economists still have a lot we can contribute here, as Joy has shown. Any economist looking for the next way to contribute here should check out Anthropic’s new Economic Futures Program.