Claude Mythos Is Such a Dangerous Hacker Engine That Anthropic Has Withheld Broad Release

The latest AI model from Anthropic is so powerful that they don’t dare release it to the public. It is such a threat that Jay Powell and Scott Bessent summoned the major bank CEOs to a meeting last week to warn them about it. In line with Anthropic’s “helpful, honest, and harmless” motto, they have released it only to their Project Glasswing partners. These are organizations like AWS, Apple, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks, which have been granted access to the model to identify and patch vulnerabilities in critical software.

Mythos is designed, when prompted, to identify and exploit vulnerabilities in software systems. Its specialty is finding critical vulnerabilities and bugs, but it can also assemble sophisticated exploits from what it finds.

What makes Mythos particularly unsettling is that its most dangerous capabilities were not deliberately engineered. Anthropic’s team made clear that they did not explicitly train Mythos for these abilities; instead, they “emerged.”

Internal testing revealed that Mythos has already autonomously uncovered thousands of zero-day vulnerabilities in “every major operating system and web browser,” flaws that human security researchers, working for years, had never detected. The implications are disturbing. (See also here and here for examples.)

Mythos can rapidly uncover hidden flaws in the codebases of organizations and software development firms, but it also raises the fear that attackers could find those vulnerabilities first. Much of the underlying software that Mythos can scan supports banking, retail, airlines, hospitals, and critical utilities. Regulators worry that if Mythos, or models like it, fell into the wrong hands, “systemically important” banks and even entire financial networks could be compromised before institutions even knew they were exposed.

Anthropic launched Project Glasswing in April 2026 to collaborate with tech giants and banks to identify and fix vulnerabilities before they can be exploited. This year, organizations should expect a large influx of AI-discovered hack points in critical software. The game plan is to use AI tools to patch the vulnerabilities that Mythos discovers. Your venerable legacy system is no longer safe. What AI can expose, it can also fix. We hope.

Ray Kurzweil predicted The Singularity (when artificial intelligence growth accelerates beyond human control) would arrive in 2045, but we might be closing in on it ahead of schedule.

Oops: Anthropic Accidentally Leaked the Entire Code for Its “Claude Code” Program

One of Anthropic’s biggest wins has been its wildly popular Claude Code program, which can do nearly all the grunt work of programming. Properly prompted, it can build new features, migrate databases, fix errors, and automate workflows.

So, it was big news in the AI world last week when an Anthropic employee accidentally exposed a link that allowed folks to download the source code for this crown jewel: the entire code, all 512,000 lines of it, which revealed the complete logic flow of the program, down to the tiniest features. For instance, Claude Code scans for profanity and negative phrases like “this sucks” to discern user sentiment, and tries to adjust for user frustration.
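For readers curious what that kind of check might look like, here is a minimal sketch of keyword-based frustration detection, written in TypeScript (the language the leaked source was reportedly in). The phrase lists and function names are my own illustrative guesses, not the actual leaked implementation.

```typescript
// Hypothetical sketch of sentiment/frustration detection via keyword scanning.
// Phrase lists and names are illustrative guesses, not Anthropic's actual code.
const NEGATIVE_PHRASES = ["this sucks", "not working", "still broken", "useless"];
const PROFANITY = ["damn", "wtf"]; // abbreviated placeholder list

type Sentiment = "neutral" | "frustrated";

function detectSentiment(message: string): Sentiment {
  const text = message.toLowerCase();
  const hit = [...NEGATIVE_PHRASES, ...PROFANITY].some((phrase) => text.includes(phrase));
  // Any negative phrase or profanity counts as a frustration signal, which could
  // then nudge the assistant toward a more careful, apologetic reply style.
  return hit ? "frustrated" : "neutral";
}

console.log(detectSentiment("this sucks, the build is still broken")); // "frustrated"
```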

Gleeful researchers, competitors, and hackers promptly downloaded zillions of copies. Anthropic issued broad copyright takedown requests, but the damage was done. Researchers quickly used AI to rewrite the original TypeScript source code into Python and Rust, claiming that this sidesteps the copyright on the original code. Oh, the irony: for years, AI purveyors have argued that when they ingest the contents of every published work (including copyrighted works) and repackage them, that’s OK. So now Anthropic is tasting the other side of that claim.

The leak has been damaging to Anthropic to some degree. Competitors no longer have to reverse engineer Claude Code, since they now know exactly how it works. Hackers have been quick to exploit vulnerabilities revealed by the leak. And Anthropic’s claim to be all about “Safety First” has been tarnished.

On the other hand, the model weights weren’t exposed, so you can’t just run the leaked code and get Claude’s results. Also, no customer data was revealed. Power users have been able to discern from the source how to run Claude Code most advantageously. This YouTube video by Nick Puru discusses such optimizations, which he summarized in this roadmap:

There have actually been a number of unexpected benefits of the leak for Anthropic. Per AI:

Brand resonance and community engagement have surged, with some observers calling the incident “peak anthropic energy” that generated significant hype and validated the product’s technical impressiveness.  The leak has acted as a massive free marketing campaign, reinforcing the narrative of a fast-moving, innovative company while bouncing the brand back among developers despite the security lapse. 

Accelerated ecosystem adoption and bug fixing are also potential benefits, as the exposure allowed engineers to dissect the agentic harness and create open-source versions or “harnesses” that keep users within the Anthropic ecosystem. Additionally, the public scrutiny likely helps identify and patch vulnerabilities faster, while the leaked source maps provide a roadmap for competitors to build “Claude-like” agents, potentially standardizing the market around Anthropic’s architectural patterns.

The leak also revealed hidden roadmap features that build anticipation, such as:

  • Kairos: A persistent background daemon for continuous operation. 
  • Proactive Mode: A feature allowing the AI to act without explicit user prompts. 
  • Terminal Pets: Playful, personality-driven interfaces to increase user engagement.

Because of these benefits, conspiracy theorists have proposed that Anthropic leaked the code on purpose, or even (April Fools!) leaked fake code. Fact checkers have come to the rescue to debunk the conspiracy claims. But in the humans vs. AI competency debate, this whole kerfuffle doesn’t make humans look so great.

SaaSmageddon: Will AI Eat the Software Business?

A big narrative for the past fifteen years has been that “software is eating the world.” This described a transformative shift where digital software companies disrupted traditional industries, such as retail, transportation, entertainment and finance, by leveraging cloud computing, mobile technology, and scalable platforms. This prophecy has largely come true, with companies like Amazon, Netflix, Uber, and Airbnb redefining entire sectors. Who takes a taxi anymore?

However, the narrative is now evolving. As generative AI advances, a new phase is emerging: “AI is eating software.”  Analysts predict that AI will replace traditional software applications by enabling natural language interfaces and autonomous agents that perform complex tasks without needing specialized tools. This shift threatens the $200 billion SaaS (Software-as-a-Service) industry, as AI reduces the need for dedicated software platforms and automates workflows previously reliant on human input. 

A recent jolt here has been the January 30 release by Anthropic of plug-in modules for Claude, which allow a relatively untrained user to enter plain English commands (“vibe coding”) that direct Claude to perform role-specific tasks like contract review, financial modeling, CRM integration, and campaign drafting.  (CRM integration is the process of connecting a Customer Relationship Management system with other business applications, such as marketing automation, ERP, e-commerce, accounting, and customer service platforms.)

That means Claude is doing some serious heavy lifting here. Currently, companies pay big bucks yearly to “enterprise software” firms like SAP and ServiceNow (NOW) and Salesforce to come in and integrate all their corporate data storage and flows. This must-have service is viewed as really hard to do, requiring highly trained specialists and proprietary software tools. Hence, high profit margins for these enterprise software firms.

Until recently, these firms have been darlings of the stock market. For instance, as of June 2025, NOW was up nearly 2000% over the past ten years. Imagine putting $20,000 into NOW in 2015, and seeing it mushroom to nearly $400,000. (AI tells me that $400,000 would currently buy you a “used yacht in the 40 to 50-foot range.”)

With the threat of AI, and probably with some general profit-taking in the overheated tech sector, the share price of these firms has plummeted. Here is a six-month chart for NOW:

Source: Seeking Alpha

NOW is down around 40% in the past six months. Most analysts seem confident, however, that this is a market overreaction. A key value-add of an enterprise software firm is custody of the data itself, in various secure and tailored databases, and that seems to be something an external AI program cannot replace, at least for now. The capability to pull the data out and crunch it (which is what AI is offering) is icing on the cake.

Firms like NOW are adjusting to the new narrative by offering pay-per-usage pricing as an alternative to pay-per-user (“seats”), and this does not seem to be hurting their revenues. These firms claim they can harness the power of AI (either generic AI or their own software) to do pretty much everything that standalone AI tools claim for themselves, and their earnings do not seem to be slowing down.

With the recent stock price crash, the P/E for NOW is around 24, with a projected earnings growth rate of around 25% per year. Compared to, say, Walmart with a P/E of 45 and a projected growth rate of around 10%, NOW looks pretty cheap to me at the moment.
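One quick way to make that comparison concrete is the PEG ratio (P/E divided by expected annual earnings growth). Here is a rough back-of-envelope calculation using the approximate figures above; the numbers are rounded, not live market data, and will be stale by the time you read this.

```typescript
// Back-of-envelope PEG ratio: price/earnings divided by expected annual earnings growth (%).
// Inputs are the rough approximations quoted above, not precise or current figures.
function peg(peRatio: number, growthPct: number): number {
  return peRatio / growthPct;
}

console.log(peg(24, 25).toFixed(2)); // NOW:     ~0.96
console.log(peg(45, 10).toFixed(2)); // Walmart: ~4.50
// A lower PEG means you are paying less per unit of expected earnings growth.
```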

(Disclosure: I just bought some NOW. Time will tell if that was wise.)

Usual disclaimer: Nothing here should be considered advice to buy or sell any security.

Joy on the Anthropic Copyright Settlement

I’m at EconLog this week with:

The Anthropic Settlement: A $1.5 Billion Precedent for AI and Copyright

There are two main questions. First, will AI companies need to pay compensation to the authors whose work they are currently training on? Second, how important is it for human writing to remain a paying career in the future, if AI continues to need good new material to train from?

There is more at the link but here are some quotes:

If human writing ceases to be a viable career due to inadequate compensation, will LLMs lose access to fresh, high-quality training data? Could this create a feedback loop where AI models, trained on degraded outputs, stagnate?

This case also blurs the traditional divide between copyright and patents. Copyrighted material, once seen as static, now drives “follow-on” innovation derived from the original work. That is, the copyright protection in this case affects AI content influenced by the copyrighted material, in the way that previously applied to new technology built on patented technical inventions. Thus, the “access versus incentives” theory applies to copyright as much as it used to apply to patents. The Anthropic settlement signals that intellectual property law, lagging behind AI’s rapid evolution, must adapt.

Writing Humanity’s Last Exam

When every frontier AI model can pass your tests, how do you figure out which model is best? You write a harder test.

That was the idea behind Humanity’s Last Exam, an effort by Scale AI and the Center for AI Safety to develop a large database of PhD-level questions that the best AI models still get wrong.

The effort has proven popular: the paper summarizing it has already been cited 91 times since its release on March 31st, and the main AI labs have been testing their new models on the exam. xAI announced today that its new Grok 4 model has the highest score yet on the exam, 44.4%.

Current leaderboard on the Humanity’s Last Exam site, not yet showing Grok 4

The process of creating the dataset is a fascinating example of a distributed academic mega-project, a growing trend that has also been important in efforts to replicate previous research. The organizers of Humanity’s Last Exam let anyone submit a question for their dataset, offering co-authorship to anyone whose question they accepted, and cash prizes to those who had the best questions accepted. In the end they wound up with just over 1000 coauthors on the paper (including yours truly as one very minor contributor), and gave out $500,000 to contributors of the very best questions (not me), which seemed incredibly generous until Scale AI sold a 49% stake in their company to Meta for $14.8 billion in June.

Source: Figure 4 of the paper

Here’s what I learned in the process of trying to stump the AIs and get questions accepted into this dataset:

  1. The AIs were harder to stump than I expected, because the exam used frontier models rather than the free-tier models I was used to using on my own. If you think AI can’t answer your question, try a newer model.
  2. It was common for me to try a question that several models would get wrong, but at least one would still get right. For me this was annoying, because questions could only be accepted if every model got them wrong. But of course, if you want a correct answer, this means trying more models is good, even if they are all in the same tier. If you can’t tell what a correct answer looks like and your question is important, make sure to try several models and see if they give different answers.
  3. Top models are now quite good at interpreting regression results, even when you try to give them unusually tricky tables.
  4. AI still has weird weaknesses and blind spots; it can outperform PhDs in the relevant field on one question, then do worse than 3rd graders on the next. This exam specifically wanted PhD-level questions, where a typical undergrad not only couldn’t answer the question, but probably couldn’t even understand what was being asked. But it specifically excluded “simple trick questions”, “straightforward calculation/computation questions”, and questions “easily answerable by everyday people”, even if all the AIs got them wrong. My son had the idea to ask them to calculate hyperfactorials (see the short sketch after this list); we found some relatively low numbers that stumped all the AI models, but the human judges ruled that our question was too simple to count. On a question I did get accepted, I included an explanation for the human judges of why I thought it wasn’t too simple.
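For the curious, the hyperfactorial of n is simply 1^1 × 2^2 × … × n^n, which grows fast enough that exact integer arithmetic matters. Here is a minimal sketch of the calculation in TypeScript using BigInt; this is just the standard definition, not the specific numbers we submitted.

```typescript
// Hyperfactorial H(n) = 1^1 * 2^2 * ... * n^n, computed exactly with BigInt.
function hyperfactorial(n: number): bigint {
  let result = 1n;
  for (let k = 1n; k <= BigInt(n); k++) {
    result *= k ** k; // multiply in k raised to the k-th power
  }
  return result;
}

console.log(hyperfactorial(4)); // 27648n  (1 * 4 * 27 * 256)
```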

I found this to be a great opportunity to observe the strengths and weaknesses of frontier models, and to get my name on an important paper. While the AI field is being driven primarily by the people with the chops to code frontier models, economists still have a lot we can contribute here, as Joy has shown. Any economist looking for the next way to contribute should check out Anthropic’s new Economic Futures Program.