Claude Mythos Is Such a Dangerous Hacker Engine That Anthropic Has Withheld Broad Release

The latest AI model from Anthropic is so powerful that they don’t dare release it to the public. It is such a threat that Jay Powell and Scott Bessent summoned the major bank CEOs to a meeting last week to warn them about it. In line with Anthropic’s “helpful, honest, and harmless” motto, they have released it only to their Project Glasswing partners. These are organizations like AWS, Apple, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks, who have been granted access to the model to identify and patch vulnerabilities in critical software.

Mythos is designed to identify and exploit vulnerabilities in software systems when prompted. Its specialty is identifying critical software vulnerabilities and bugs, but it can also assemble sophisticated exploits.

What makes Mythos particularly unsettling is that its most dangerous capabilities were not deliberately engineered. Anthropic’s team made it clear that they did not explicitly train Mythos to have these capabilities. Instead, they “emerged.”

Internal testing revealed that Mythos has already uncovered thousands of weak points in “every major operating system and web browser.” The implications are disturbing. Claude Mythos has autonomously discovered thousands of zero-day vulnerabilities in major operating systems and web browsers, flaws that human security researchers, working for years, had never detected (see also here and here for examples).

Mythos can rapidly uncover hidden flaws in the code of organizations and software development firms, but it also raises the fear that attackers could find those vulnerabilities first. Much of the underlying software that Mythos can scan supports banking, retail, airlines, hospitals, and critical utilities. Regulators worry that if Mythos, or models like it, fell into the wrong hands, “systemically important” banks and even entire financial networks could be compromised before institutions even knew they were exposed.

Anthropic launched Project Glasswing in April 2026 to collaborate with tech giants and banks to identify and fix vulnerabilities before they can be exploited. This year, organizations should expect a large influx of AI-discovered hack points in critical software. The game plan is to use AI tools to patch the vulnerabilities they discover. Your venerable legacy system is no longer safe. What AI can expose, it can also fix. We hope.

Ray Kurzweil predicted The Singularity (when artificial intelligence growth accelerates beyond human control) would arrive in 2045, but we might be closing in on it ahead of schedule.

A Rant about Long Run Problems and Passe Solutions

If you listen to or read major economists discussing what they think are big-picture problems, then their list usually includes three topics: Fertility, Culture, & Fiscal Health. On the wonkier side, you’ll also hear that housing scarcity and affordability is a problem, but let’s stick with the first three.

Fertility

People are deciding to have fewer children for a variety of reasons. In no particular order, the reasons include greater access to financial institutions, more popular female education, higher female wages, lower infant mortality, and falling religiosity. Some also speculate that housing affordability, safety regulations, and social safety nets contribute too.

What’s wrong with lower fertility? In an objective sense, there is nothing wrong. But, in the sense that people value similar things, we are in somewhat uncharted territory. Realized fertility is dropping across the globe. We know that economies of scale increase productivity and real wages. We also know that technological innovation comes from having more minds engaged with economic problems. It’s possible that labor productivity rises faster than the productivity that we lose with smaller scale, but it’s an open question. What happens to the liberal societies and polities when the liberals fail to persist? These are big geopolitical concerns.

Culture

People seem to be more fragmented religiously and culturally. Social scientists used to discuss Judeo-Christian norms more often. Sometimes you’d hear about English or Roman legal tradition or enlightenment values. But now, there seems to be very little in terms of common social cohesion. In the USA, the general common culture seems to be ‘smile and be nice’. That’s not the worst common rule, but it’s not enough to hang our hat on for a capable liberal state.

The lack of cultural cohesion isn’t my own particular concern – public intellectuals in economics and elsewhere feel like there is a problem. There is a mix of reasoning behind the concern. Some people are worried about transmitting values to the next generation, some are worried about how people behave when no one’s watching, and still others are worried about simply lacking a Schelling point that coordinates large-scale economic cooperation.

Fiscal Health

Oops: Anthropic Accidentally Leaked the Entire Code for Its “Claude Code” Program

One of Anthropic’s biggest wins has been its wildly popular Claude Code program, which can do nearly all the grunt work of programming. Properly prompted, it can build new features, migrate databases, fix errors, and automate workflows.

So, it was big news in the AI world last week when an Anthropic employee accidentally exposed a link that allowed folks to download the source code for this crown jewel: the entire code, all 512,000 lines of it, which revealed the complete logic flow of the program, down to the tiniest features. For instance, Claude Code scans for profanity and negative phrases like “this sucks” to discern user sentiment, and tries to adjust for user frustration.

Gleeful researchers, competitors, and hackers promptly downloaded zillions of copies. Anthropic issued broad copyright takedown requests, but the damage was done. Researchers quickly used AI to rewrite the original TypeScript source code into Python and Rust, claiming to get around copyright laws on the original code. Oh, the irony: for years, AI purveyors have been arguing that when they ingest the contents of every published work (including copyrighted works) and repackage them, that’s OK. So now Anthropic is tasting the other side of that claim.

The leak has been damaging to Anthropic to some degree. Competitors don’t have to work to try to reverse engineer Claude Code, since now they know exactly how it works. Hackers have been quick to exploit vulnerabilities revealed by the leak. And Anthropic’s claim to be all about “Safety First” has been tarnished.

On the other hand, the model weights weren’t exposed, so you can’t just run the leaked code and get Claude’s results. Also, no customer data was revealed. Power users have been able to discern from the source how to run Claude Code most advantageously. This YouTube video by Nick Puru discusses such optimizations, which he summarized in this roadmap:

There have actually been a number of unexpected benefits of the leak for Anthropic. Per AI:

Brand resonance and community engagement have surged, with some observers calling the incident “peak anthropic energy” that generated significant hype and validated the product’s technical impressiveness.  The leak has acted as a massive free marketing campaign, reinforcing the narrative of a fast-moving, innovative company while bouncing the brand back among developers despite the security lapse. 

Accelerated ecosystem adoption and bug fixing are also potential benefits, as the exposure allowed engineers to dissect the agentic harness and create open-source versions or “harnesses” that keep users within the Anthropic ecosystem. Additionally, the public scrutiny likely helps identify and patch vulnerabilities faster, while the leaked source maps provide a roadmap for competitors to build “Claude-like” agents, potentially standardizing the market around Anthropic’s architectural patterns.

The leak also revealed hidden roadmap features that build anticipation, such as:

  • Kairos: A persistent background daemon for continuous operation. 
  • Proactive Mode: A feature allowing the AI to act without explicit user prompts. 
  • Terminal Pets: Playful, personality-driven interfaces to increase user engagement.

Because of these benefits, conspiracy theorists have proposed that Anthropic leaked the code on purpose, or even (April Fools!) leaked fake code. Fact checkers have come to the rescue to debunk the conspiracy claims. But in the humans vs. AI competency debate, this whole kerfuffle doesn’t make humans look so great.

How to Install Drywall

Nearly every interior wall and ceiling in every home in America is covered with sheetrock = drywall = gypsum board. Sheetrock (a brand name for drywall) consists of an interior layer of rigid gypsum (a mineral composed of calcium sulfate dihydrate) plus some additives, with outside layers of strong paper or fiberglass. It normally comes in 4 ft x 8 ft sheets.

Normal houses have a framework of mainly 2×4 or larger wood lumber. Each wall has vertical 2×4 studs, spaced every 16”. Sheetrock is trimmed to size, and nailed or (these days) screwed into the studs.

That is the theory, anyway.

I have never done this stuff at large scale before, other than clumsily patching occasional small dings in a wall. A little while ago, I got to experience the process, hands-on. I was part of a team that helped someone whose basement had flooded. We cut out the lower ~4 ft of drywall, and replaced it with fresh drywall.

First, how do you cut drywall? A long, straight cut is accomplished by drawing a straight line and cutting along it, all the way through one layer of the facing paper. Then you hang the drywall sheet on the edge of a table, and crack the interior gypsum layer. Then you cut the other side of the paper. The end result of such a cut is like this:

Typically, you install drywall on the ceiling first. Then the top 4 ft of the walls, then the bottom 4 ft of the walls. You butt the pieces close to each other. For the lowest piece of drywall, you insert a curved metal wedge under it, and step on the wedge with your foot to lift that drywall piece to butt its top edge up against the upper piece. If you look carefully near the middle of the following photo, you can see the red wedge I used to jack up that small lower piece of drywall. It’s OK to leave a gap between the floor and the lower edge of the bottom drywall, since that gap will be covered by baseboard.

This was in a bathroom. I cut the lower green pieces with a little hand power saw, and screwed them into the studs, using the green and black driver visible on the stand in the left foreground.

The next two photos are before and after of a bedroom wall, again showing the bottom course of sheetrock we installed.

Filling in Cracks and Holes

As you can see, at this stage, there are like ¼” cracks between the installed sheets of sheetrock, and the mounting screw holes are visible. These imperfections are filled in with goo called joint compound, or “mud.” The mud is applied with a “knife” like this:

Cracks are covered with paper or fiberglass tape, with mud smeared over the tape. Typically, three layers of mud are needed to achieve perfect, smooth coverage. Each layer must dry hard before applying the next layer. Each layer may be sanded lightly as needed.

A key technique is to tilt the knife so the mud is maybe 1/16” thick over the tape or over a screw, but taper the mud to zero thickness on the wall away from the tape or screw. This feathering is essential; if your mud layer ends with appreciable thickness instead of feathering, you have to do a lot of sanding to get a smooth blending into the plain drywall at that edge. Pro tip: carefully stir more water into the joint compound as needed to keep it wet and flowing, especially for overnight storage. This video from Vancouver Carpenter displays mudding technique.

That is mainly it. For perspective and confidence building, it is helpful to work with an expert, as I was able to do.

What is an AI Skill?

If you’ve been on LinkedIn recently, then you may have seen the chatter about teaching your artificial intelligence to have various skills. I saw one post by a guy who claimed to have created several skills, each representing a tech billionaire.

At first, I thought “I am behind the 8-ball. What is this new thing?” Obviously I know what the word “skill” is and how people use it, but I had not encountered its use in the context of AI having it. What does it mean for an AI to have a skill? I somewhat dreaded the work of learning the new skill of teaching my AI skills.

Then I had lunch with a computer scientist and I learned that skills are nothing new.

Does Broadband Bring Jobs?

No, according to a new paper from the University of Georgia’s Michael Kotrous.

Many people expected it to, partly by thinking about the jobs that could benefit from faster internet, and partly by looking at the experience of Chattanooga, Tennessee. Chattanooga was the first major city to get gigabit-speed broadband, and they did see a huge improvement in the labor market right afterwards:

But as the graph shows, the introduction of broadband there coincides with the end of the nationwide Great Recession. Was the boom in jobs after 2009 because of the broadband, or would it have happened anyway as part of the recovery from recession? A synthetic control strategy shows that Chattanooga’s recovery was pretty typical for cities like it, so the broadband angle probably didn’t do much:

This might seem like a historical curiosity about one city, but the federal government is currently trying to spend $42 billion to expand broadband to more places, partly motivated by the idea of bringing jobs. I thought the Broadband Equity Access and Deployment Program’s big problem was how slow it is: Congress created it with the Infrastructure Investment and Jobs Act of 2021, but money didn’t start getting sent out until late 2025, and it could be many more years before it leads to any usable broadband. Even then it now seems unlikely to bring jobs, though there could be other benefits.

This paper’s author Michael Kotrous is currently on the economics job market. As his former professor and coauthor, I recommend hiring him if your school gets the chance.

A Bull Case for Tech Stocks

Negative headlines tend to get more attention than bland positive titles. We have seen a lot of angst in the past few months over the massive capex spend by big tech companies, with questions over whether there will be adequate returns on these investments.

There was a genuine untethered bubble in tech stocks circa 1997-2000. Companies with no earnings and no moats were given billion-dollar valuations, on the strength of a business plan sketched on a cocktail napkin. After the brutal bursting of that bubble, tech stocks repriced and then steadily strengthened for the next 25 years.

Nevertheless, it seems there is always some negative narrative to be found regarding tech stock valuations and prospects. Seeking Alpha author Beth Kindig writes that investors who were spooked by all those bubble warnings lost out big time:

Investors have been hearing “tech bubble” warnings for more than a decade — but instead of collapsing, the Nasdaq‑100 has gained 550%. If we look back ten years ago to 2015, headlines such as “Sell everything! 2016 will be a cataclysmic year” confronted investors with calls for an imminent recession. The bears made repeated claims that a “tech bubble” was about to burst with some of the world’s most prominent venture capitalists drawing parallels to the dot-com era.

What followed tells a very different story, with not only the Nasdaq-100 up 550% over a 10-year period but also high-flying stocks like Shopify returning as much as 5200% and Nvidia returning 22,000% over the same period.

It’s true that capturing those gains does not come easy. Investors had to hold through five drawdowns that were greater than 20%, including two declines greater than 30%, while tuning out a constant stream of bearish commentary – often from reputable sources – proclaiming the long-awaited tech bubble has finally “popped.” Despite these strong convictions, the long-term trend remained intact.
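To put the quoted ten-year figures on a per-year footing, here is a minimal sketch of the compound annual growth rate arithmetic. The function name is my own, and it assumes the quoted percentages are total gains over exactly ten years; a 550% gain works out to roughly 20% per year compounded.

```python
def annualized_return(total_gain_pct: float, years: float) -> float:
    """Convert a total percentage gain into a compound annual growth rate (in %)."""
    # A 550% gain means the position ends at 6.5x its starting value.
    growth_multiple = 1.0 + total_gain_pct / 100.0
    # Take the years-th root to get the per-year multiple, then convert back to %.
    return (growth_multiple ** (1.0 / years) - 1.0) * 100.0


if __name__ == "__main__":
    # Total gains as quoted above; the 10-year horizon is taken from the quote.
    quoted_gains = {"Nasdaq-100": 550, "Shopify": 5200, "Nvidia": 22000}
    for name, gain in quoted_gains.items():
        print(f"{name}: {annualized_return(gain, 10):.1f}% per year")
```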

She presented this graphic which illustrates many of the negative headlines over the past decade:

While she acknowledges that traditional cloud computing applications are slowing in growth rates, and there will be general market price volatility, she contends that AI is still in an acceleration phase:

The dot-com era was defined by oversupply and fragile fundamentals; today’s AI buildout is being led by the world’s strongest operators, backed by real revenues and profits, and constrained by hard limits in compute, memory, networking, and power.

The more important question isn’t whether we’ll see a pullback — it’s where we are in the cycle. AI is still transitioning from the training phase into the inference phase, where monetization will accelerate and the “capex with no revenue” narrative will begin to fade. In other words, the loudest bubble debates are arriving before the most important revenue engine fully turns on.

Those of us who are long tech stocks hope she is correct.

How a Protective Options Collar Cushioned a Loss in Korean Stock Fund EWY

After being convinced by a series of favorable articles, I bought a few shares last month of the EWY fund, which holds shares of major South Korean companies. The narrative seemed compelling: the vast production of compute processing chips for AI has led to a structural supply shortage of fast memory chips. South Korean firms excel in making these chips, and so high, growing profits seemed assured. What could possibly go wrong?

What I didn’t know was that thousands of other retail investors were thinking the exact same thing, and hence had bid the price of EWY up to possibly unreasonable levels. Somehow, my bullish analysts missed that point. In particular, the South Korean market is driven by an unusually high level of margin trading, where investors borrow money on margin to buy shares. A market drop leads to margin calls, which leads to forced selling, which really crashes prices.

The other thing I did not know was that, two days after my purchase, the attacks on Iran would commence. Oops. Among other things, this would drive up the world price of oil, which impacts energy importers like South Korea. This seems to have been the trigger for the sharp stock drop.

Here is the six-month price chart for EWY:

As it happened, I bought pretty much at the top, and as of Monday midday when I am writing this, EWY was down about 17%. That doesn’t look like much of a drop on the chart, because of the long run-up to this point, but it is an unpleasant development if you just bought in two weeks ago.

Fortunately, when I bought the EWY shares, I set up a protective options collar, since this was not a high conviction buy. First, I bought a put with a strike price about 7% below my purchase price, which would limit my maximum loss on the EWY shares to 7%. A problem is that this put cost serious money (about 11% of the share price), so my maximum loss could actually be 7% plus 11% = 18%. Therefore, I offset nearly all the cost of the put by selling a call with a strike price about 17% above the current EWY share price. That meant that I could profit from a rise in EWY share price by up to 17%, while being protected against a drop of more than 7%. That seemed like a favorable asymmetry (7% max loss vs 17% max gain).

This arrangement (buying a protective put to limit downside, financed by selling a call which limits upside) is called an options “collar”. I’d rather accept a limited upside than have to worry about doing clever trading to mitigate a big loss.
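The payoff of this structure at option expiration can be sketched in a few lines. This is a simplified illustration, not my brokerage math: it measures everything as a percentage of the purchase price, uses the strikes described above (put 7% below, call 17% above), and treats the net premium as a parameter that defaults to zero, which is an assumption since the call offset nearly, not exactly, all of the put's cost. It also ignores time value, so it will not reproduce the mid-life option values discussed below.

```python
def collar_pnl_pct(
    price_change_pct: float,
    put_strike_pct: float = -7.0,
    call_strike_pct: float = 17.0,
    net_premium_pct: float = 0.0,
) -> float:
    """P&L at expiration (as % of purchase price) for long stock + long put + short call."""
    stock_pnl = price_change_pct
    # The long put pays off once the stock falls below the put strike.
    put_payoff = max(put_strike_pct - price_change_pct, 0.0)
    # The short call costs us once the stock rises above the call strike.
    call_payoff = -max(price_change_pct - call_strike_pct, 0.0)
    return stock_pnl + put_payoff + call_payoff - net_premium_pct


if __name__ == "__main__":
    for move in (-30, -17, -7, 0, 10, 17, 30):
        print(f"stock {move:+d}% -> collar {collar_pnl_pct(move):+.1f}%")
    # Put only, paying the full ~11% premium and selling no call:
    put_only = collar_pnl_pct(-17, call_strike_pct=float("inf"), net_premium_pct=11.0)
    print(f"put only, stock -17% -> {put_only:+.1f}%")
```

With these assumptions the position's result is pinned between -7% and +17% no matter how far the stock moves, and the put-only case reproduces the 7% plus 11% = 18% worst case.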

As of Monday, my collar was working well to protect the overall position. As might be expected, the value of my put increased, with the drop in EWY share price. But also, the value of my call decreased, which further helps me, since I am short that call. The net result was that about 75% of the loss in the stock price was compensated by the changes in values of the two options.

This is just a small, experimental position, but it was nice to see practical outcomes line up with theory.

Disclaimer: As usual, nothing here should be considered advice to buy or sell any security.

Ricardian Equivalence: Reasonable Assumption #2

There are several requirements for Ricardian Equivalence:

  1. Individuals or their families act as infinitely lived agents.
  2. All governments and agents can borrow and lend at a single rate.
  3. The path of government expenditures is independent of financing choices.

Assumption 2) appears patently absurd on its face. I certainly cannot borrow at the same interest rate that the US Treasury can. QED. Do not pass go, do not collect $200. The yield on 1-year US treasuries is 3.58%. I can’t borrow at that rate… Or can I?

Let’s do some casuistry.

What is a loan?

It’s a contract that:

  • Provides the borrower with access to spending
  • with or without collateral
  • with a promise to repay the lender at defined times, usually with interest.

So, when you borrow $5 from a friend and pay it back on the same day, it’s a loan. The contract is verbal, there is no collateral, the repayment time is ‘soon’ with flexibility, and the interest rate is zero.

A mortgage is a collateralized loan. You borrow from a bank, make monthly payments for the term of the loan, and accrue interest on the principal. The contract is written, the house or a portion of its value is the collateral, and the interest rate is positive.

What about a pawnshop loan? Most of us are probably unfamiliar with these. In this circumstance, a person has a valuable non-money asset and the pawnshop has money. They engage in a contractual asset swap. The borrower lends the non-money asset to the pawnshop as collateral and borrows money from the pawnshop. The pawnshop borrows the non-money asset and lends the money to the borrower. The borrower can use the money as they please, but the pawnshop cannot use the non-money asset – they can simply hold it. They collect interest in order to cover their opportunity costs.

One outcome is that the borrower repays the loan and interest by the maturity date and reclaims their non-money asset. Another outcome is that the borrower retains the option to default without any further obligation. But they lose the right to reclaim their property according to the repayment terms. If the borrower exercises the option to default, then the pawnshop acquires full rights to the non-money asset. The pawnshop often resells the asset at a profit. The profit is relatively reliable because the illiquidity of the non-money asset allows the pawnshop to lend much less than its retail value. That illiquidity is also why the borrower is willing to accept the terms.

If we accept that the pawnshop contract is a loan, which is just a collateralized loan with a mostly standard default option, then get ready for this.

Humanity’s Last Exam in Nature

Last July I wrote here about “Humanity’s Last Exam”:

When every frontier AI model can pass your tests, how do you figure out which model is best? You write a harder test.

That was the idea behind Humanity’s Last Exam, an effort by Scale AI and the Center for AI Safety to develop a large database of PhD-level questions that the best AI models still get wrong.

The group initially released an arXiv working paper explaining how we created the dataset. I was surprised to see a version of that paper published in Nature this year, with the title changed to the more generic “A benchmark of expert-level academic questions to assess AI capabilities.”

On the one hand, it makes sense that the core author groups at the Center for AI Safety and Scale AI didn’t keep every coauthor in the loop, given that there were hundreds of us. On the other hand, I’m part of a different academic mega-project that currently is keeping hundreds of coauthors in the loop as it works its way through Nature. On the third, invisible hand, I’m never going to complain if any of my coauthors gets something of ours published in Nature when I’d assumed it would remain a permanent working paper.

AI is now getting close to passing the test:

What do we do when it can answer all the questions we already know the answer to? We start asking it questions we don’t know the answer to. How do you cure cancer? What is the answer to life, the universe, and everything? When will Jesus return, and how long until a million people are convinced he’s returned as an AI? Where is Ayatollah Khamenei right now?