Humanity’s Last Exam in Nature

Last July I wrote here about “Humanity’s Last Exam”:

When every frontier AI model can pass your tests, how do you figure out which model is best? You write a harder test.

That was the idea behind Humanity’s Last Exam, an effort by Scale AI and the Center for AI Safety to develop a large database of PhD-level questions that the best AI models still get wrong.

The group initially released an arXiv working paper explaining how we created the dataset. I was surprised to see a version of that paper published in Nature this year, with the title changed to the more generic “A benchmark of expert-level academic questions to assess AI capabilities.”

On the one hand, it makes sense that the core author groups at the Center for AI Safety and Scale AI didn’t keep every coauthor in the loop, given that there were hundreds of us. On the other hand, I’m part of a different academic mega-project that currently is keeping hundreds of coauthors in the loop as it works its way through Nature. On the third, invisible hand, I’m never going to complain if any of my coauthors gets something of ours published in Nature when I’d assumed it would remain a permanent working paper.

AI is now getting close to passing the test:

What do we do when it can answer all the questions we already know the answer to? We start asking it questions we don’t know the answer to. How do you cure cancer? What is the answer to life, the universe, and everything? When will Jesus return, and how long until a million people are convinced he’s returned as an AI? Where is Ayatollah Khamenei right now?

My First Exit

I invested in my first private company in 2022; my first opportunity to cash out of a private investment came this year when Our Bond did an IPO, now trading on Nasdaq as OBAI.

I’m happy to get a profitable exit less than 4 years after my first investment, given that I’m investing in early-stage companies. Venture funds tend to run for 10 years to give their companies time to IPO or get acquired, and WeFunder (the private investment platform I used) says that “On average, companies on Wefunder that earn a return take around 7 years to do so.” The speed here is especially striking given that I didn’t invest in Our Bond itself until April 2025.

Most private companies that raise money from individual investors are very early stage, what venture capitalists would call “pre-seed” or “seed-stage” companies looking for angel investors. Later-stage companies often find it simpler to raise their later stages (Series B, etc.) from a few large institutional investors. But a few choose to do “community rounds” and allow individuals to invest later. This is what Our Bond did right before their IPO, allowing me to exit in less than a year.

This helps calm my biggest concern with equity crowdfunding- adverse selection:

The companies themselves have a better idea of how well they are doing, and the best ones might not bother with equity crowdfunding; they could probably raise more money with less hassle by going to venture funds or accredited angel investors.

My guess is that the reason some good companies bother with this is marketing. Why did Substack bother raising $7.8 million from 6000 small investors on WeFunder in 2023, when they probably could have got that much from a single VC firm like A16Z? They got the chance to explain how great their company and product is to an interested audience, and to give thousands of investors an incentive to promote the company. Getting one big check from VCs is simpler, but it doesn’t directly promote your product in the same way.

All this is enough to convince me that the equity crowdfunding model enabled by the 2012 JOBS Act will continue to grow.

Still, things could have easily gone better for me, as these markets are clearly inefficient and have complexities I’m still learning to navigate. Profitability is not just about choosing the right companies to invest in, but about managing exits. I expected the typical IPO roadshow would give me months of heads-up, but Our Bond surprised its investors with a direct listing. The first thing I heard about the IPO was a February 4th email from “VStockTransfer” that I thought was a scam at first, since it was a 3rd-party company I’d never heard of asking me to pay them money to access my shares. But Our Bond confirmed it was real- VStockTransfer was the custodian for the private shares, and charges $120 to “DRS transfer” them to a brokerage of your choice where they can be sold.

I submitted the request to move the shares to Schwab the same day, but Schwab estimated it would take a week to move them. Neither Schwab nor VStockTransfer ever sent me a notification that the shares had been transferred, and by the time I noticed they had moved a week later, the stock price had fallen dramatically:

As I write this on February 18th, the OBAI price represents a 1.3x return on the price I paid for the private shares last April. When I was first able to sell some stock on February 11th, the price represented a 3x return; if I’d been able to sell right away on the 4th without waiting for the brokerage transfer process, it would have been a 10x return.

By the Efficient Market Hypothesis this timing shouldn’t be so critical, but I knew there would be a rush for the exits as lots of private investors would want to unload their shares at the first opportunity, an opportunity some would have waited years for. Sometimes old-fashioned supply and demand analysis is a better guide to markets than the EMH: demand for OBAI stock had no big reason to change in February, but freely floating supply saw a big increase as private shares got unlocked and moved to brokerages.

Getting a 10x return vs a 1.3x return on one of your winners is the difference between a great early investor and a bad one. I always thought such differences would be driven by who picks the best companies to invest in, but at least in this case it could be driven by who is fastest on the draw with brokerage transfers.

If I ever find myself holding shares in another company that does a direct listing, I’ll be doing whatever I can to make sure the transfer goes as fast as possible (pick the fastest brokerage, check on the transfer status every day, etc.). This process also seems like one reason to do fewer, larger private investments- a fixed $120 transfer fee is a big deal if the initial investment was in the low hundreds but wouldn’t matter much for a larger one.
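To put rough numbers on that last point, here is a minimal sketch of how a fixed transfer fee eats into the return multiple at different position sizes. The position sizes are hypothetical, and the function name `net_multiple` is just for illustration; the $120 fee and the 1.3x, 3x, and 10x multiples are the ones mentioned above.

```python
def net_multiple(invested: float, gross_multiple: float, fee: float = 120.0) -> float:
    """Return multiple on the original investment after paying a fixed fee."""
    proceeds = invested * gross_multiple - fee
    return proceeds / invested


if __name__ == "__main__":
    # Position sizes are hypothetical; the multiples are the ones quoted above.
    for invested in (200, 1_000, 10_000):
        for gross in (1.3, 3.0, 10.0):
            net = net_multiple(invested, gross)
            print(f"${invested:>6,} at {gross:>4}x gross -> {net:.2f}x net")
```

Under these assumptions, a $200 position at 1.3x gross ends up below 1x once the fee is paid, while a $10,000 position barely notices it.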

Being accredited would help there, allowing access to additional later-stage, less-risky companies. But I’ll call OBAI a win for equity crowdfunding, and a big win for asset pricing theories based on liquidity and flows over efficient estimation of the present discounted value of future cashflows.

Disclaimer: I still hold some OBAI

Commodity Sports

I’m trying to coin “Commodity Sports” as the term to refer to sports betting that takes place on exchanges regulated by the US Commodity Futures Trading Commission, as opposed to sports betting that takes place through casinos regulated by state gaming commissions. So far it seems to be working alright. I haven’t convinced Gemini, but I have got the top spot in traditional Google search:

That article- Will Commodity Sports Last?– is my first at EconLog. I’m happy to get a piece onto one of the oldest economics blogs, one where I was reading Arnold Kling’s takes on the Great Recession in real time, where I was introduced to Bryan Caplan’s writing before I read his books, and where Scott Sumner wrote for many years (though I started reading him at The Money Illusion before that).

The key idea of the piece, other than the legal oddity of sports betting sharing a legal category with corn futures, is that the Commodity Sports category is being pioneered by prediction markets like Kalshi. As readers here will know, I like prediction markets:

I love that CFTC-regulated exchanges like Kalshi and Polymarket are bringing prediction markets to the mainstream. The true value of prediction markets is to aggregate information dispersed across the world into a single number that represents the most accurate forecast of the future.

But I’m not so excited to see them expanding into sports:

Although I see huge value in prediction markets when they are offering more accurate forecasts on important issues that help policymakers, businesses, and individuals make more informed plans for our future (e.g., Which world leaders will leave office this year?, or Which countries will have a recession?)… I see much less value in having a more accurate forecast of how many receptions Jaxon Smith-Njigba will have.

Like Robin Hanson, I worry that the legal battles against Commodity Sports and the brewing cultural backlash against sports betting risk taking the most informative prediction markets down along with it.

The full piece is here.

Forecasting An Eventful 2026

May you live in interesting times – apocryphal Chinese curse

In early 2025 I shared forecasts about the economy that turned out to be pretty good. This year, economic forecasts center around a boringly decent year (2.6% GDP growth, inflation below 3%, unemployment stays below 5%, no recession), though with high variance. But forecasts about politics and war foretell a turbulent year.

In the US, midterm elections have a 78% chance to flip control of the House and 35% chance to flip the Senate despite a tough map for Democrats. A midterm wave for the out-of-power party is typical in the US, given that the party in power always seems to over-play their hand and voters quickly get sick of them. More surprising is that forecasters give a 44% chance that Donald Trump leaves office before his term is up, and a 16% chance that he leaves office this year. Markets give a 20% chance that he will be removed from office through the impeachment process, so the rest of the 44% would be from health issues or voluntary resignation.

Forecasters at Kalshi predict a greater than even chance that 4 notable world leaders leave office this year:

I find this especially notable because Viktor Orban is the only one who would be removed through regularly scheduled elections. In the UK, Keir Starmer was just elected Prime Minister in 2024 and doesn’t have to face reelection until 2029; but he is so unpopular that his own Labour Party is likely to kick him out of office if local elections in May go as badly as polls indicate. If so, he would join Boris Johnson and Liz Truss as the third British PM in four years to leave office without directly losing an election. The leaders of Cuba and Iran don’t face real elections and would presumably be pushed out by a popular uprising or US military action.

Some other important world leaders will probably stay in office this year, but forecasters still think there is a significant chance they leave: Israel’s Netanyahu (49%), Ukraine’s Zelenskyy (32%), and Russia’s Putin (14%). For the latter two, this belief could be tied to the surprisingly high odds given to a ceasefire in the Russia-Ukraine war this year (45%). Orban leaving office could be tied into this, as Hungary has often vetoed EU support for Ukraine.

Myself, I find most of these market odds to be high, and I’m tempted to make the “nothing ever happens” trade and bet that everyone stays in office. But even if all these markets are 10pp high, it still implies quite an eventful year ahead. Prepare accordingly.
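As a back-of-the-envelope check on the “10pp high” point, here is a small sketch using only the three leaders whose odds are quoted above (Netanyahu, Zelenskyy, Putin). It assumes the three events are independent, which is a simplification and almost certainly not true given how the Ukraine-related markets are linked; the function name and the `haircut` parameter are mine, not anything the markets publish.

```python
from math import prod

# Market odds of leaving office this year, as quoted in the paragraph above.
market_odds = {"Netanyahu": 0.49, "Zelenskyy": 0.32, "Putin": 0.14}


def prob_all_stay(odds: dict[str, float], haircut: float = 0.0) -> float:
    """Chance that nobody leaves, ASSUMING independence between leaders.

    `haircut` is subtracted from every market probability (floored at 0)
    to model markets that are uniformly too high.
    """
    return prod(1 - max(p - haircut, 0.0) for p in odds.values())


if __name__ == "__main__":
    for haircut in (0.0, 0.10):
        stay = prob_all_stay(market_odds, haircut)
        print(f"haircut {haircut:.0%}: all stay {stay:.1%}, at least one leaves {1 - stay:.1%}")
```

Even with the 10pp haircut, this toy calculation still puts the chance that at least one of the three leaves above even, which is the sense in which the year still looks eventful.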

The Wealth Ladder

The Wealth Ladder is a 2025 personal finance book from data blogger Nick Maggiulli. The core idea is good: that the best financial strategies will be different based on your current wealth level. Maggiulli divides people into 6 net worth levels based on orders of magnitude, from less than $10K to over $100M. The middle of the book has separate chapters with advice for people in each level, so a book that is already a fairly quick and easy read as a whole could be even quicker if you skipped the chapters about levels other than your own.

The beginning of the book tries to develop some simple rules phrased in a way that they can apply across every level, because they are based on a percentage of your net worth. I like the idea but don’t think it really worked. His “1% Rule” says you should only accept an opportunity to earn money if it will increase your net worth by at least 1%. But in practice, whether an earning opportunity is worth your time depends less on how many absolute dollars it generates as a % of your net worth, and more on how many $ per hour it generates. The “0.01% Rule” (don’t worry about spending money on anything that costs less than 0.01% of your net worth) is better. But whether it is a good rule for you will depend on your age and income.
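A quick sketch of why I think dollars per hour is the better yardstick. The two gigs and the $500,000 net worth below are made-up numbers for illustration, and the function names are mine rather than anything from the book.

```python
def passes_one_percent_rule(payout: float, net_worth: float) -> bool:
    """The book's 1% Rule as described above: payout is at least 1% of net worth."""
    return payout >= 0.01 * net_worth


def hourly_rate(payout: float, hours: float) -> float:
    """Dollars earned per hour of work."""
    return payout / hours


if __name__ == "__main__":
    net_worth = 500_000  # hypothetical household
    gigs = {
        "long side project": (6_000, 400),   # (payout in $, hours of work)
        "short consulting call": (2_000, 10),
    }
    for name, (payout, hours) in gigs.items():
        ok = passes_one_percent_rule(payout, net_worth)
        print(f"{name}: passes 1% Rule = {ok}, ${hourly_rate(payout, hours):.0f}/hour")
```

With these made-up numbers the rule accepts the $15/hour project and rejects the $200/hour call, which is backwards from how most people would want to spend their time.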

In short, while tailoring his advice in 6 different ways for the 6 wealth levels of his ladder is an improvement on one-size-fits all personal finance books, even this much tailoring isn’t enough. Having a $1 million net worth is normal for a household in their 60s but would be exceptional for one in their 20’s; and vice-versa for a household with under $10k net worth. Chapter 10 explains the data on this well, but it kind of undermines the ideas of the previous chapters. Households with the same net worth should be making very different decisions in their 20s vs 60s.

The strongest part of the book is the use of data from the Survey of Consumer Finances and the Panel Study of Income Dynamics to show how people differ by wealth level and how people move from one level to another. For instance, he shows that the poor have most of their wealth in cash and vehicles; the middle class in homes; the wealthy in retirement accounts and stocks; the very rich in private businesses. Americans tend to climb the wealth ladder slowly but steadily; over 10 years they are twice as likely to move up the ladder as to move down; over 20 years, 3 times as likely. The median person who made it to one of the top 3 rings (i.e. the median millionaire) is in their 60s.

If you get ahold of a copy of the book it’s definitely worthwhile to flip through all the tables and figures, but I won’t be adding it to my short list of the best personal finance books. The core metaphor of the ladder carries the implicit assumption that everyone should be trying to get to the top of the ladder. But if someone is satisfied with less than $10 million, why should they take on lots of time and effort and risk to start a business for a small chance to go over $100 million?

How to Make a Few Billion Dollars

The title is excellent, given that the author Brad Jacobs did in fact make a few billion dollars.

The book itself is fine to read, but also fine to skip if you aren’t yourself burning to build a billion dollar company through excellent management and mergers and acquisitions. I certainly don’t care to, which Jacobs says would make me a bad hire for one of his companies:

I only hire people who are motivated to make a lot of money…. If a candidate says to me ‘I’m not motivated by money’, I suspect either they’re not being candid or they lack the hunger that’s necessary to succeed

The book has plenty of hard-driving sentiments like this that you’d expect from a self-made billionaire:

Fire C players

For the first time ever, an American company, Exxon, had reported quarterly earnings in excess of $1 billion. The words “obscene profits” flashed on my TV screen, and I remember thinking “That sounds pretty good! Maybe I ought to check out the oil sector.” [This part I agree with, economic theory predicts that entrepreneurs will enter the sectors with the highest profits and it’s what I’d do if I wanted to make money, though in practice I think it is surprisingly rare for would-be entrepreneurs to choose this way -JB]

“The CEO trait most closely correlated with organizational success is high IQ” [specifically more important than EQ]

But Jacobs balances these ideas with some surprisingly hippy-like attitudes. Jacobs went to Bennington College and almost had a career as a jazz keyboardist. Chapter 1 is titled “How to Rearrange Your Brain”, and emphasizes the importance of meditation. Page 21 is basically “have you ever really looked at your hands, man… do it, it’s a trip”

I don’t want to spend even one hour around people who are unkind. An organization is like a party. You only want to invite people who bring the vibe up

Though perhaps this hippy/anti-hippy balance shouldn’t be surprising for someone who says one of the main things he asks about potential hires is “can this person think dialectically”.

Strongly recommend the book if you want to follow Jacobs’ path; weakly recommend it as a general management/self-help book or way to learn about markets.

The Hot Social Network Is… LinkedIn?

So says the Wall Street Journal. They have data to back it up:

Plus quotes from yours truly:

Even before Elon Musk gutted X’s content moderation, James Bailey was tired of the shouting. “It’s like a cursed artifact that gives you great power to keep up with what’s going on, but at the cost of subtly corrupting your soul,” said the 38-year-old Providence College economics professor.

He retreated. This year, he realized he was spending five to 10 minutes a day on a site he used to ignore.

The WSJ reporter contacted me after seeing my previous post about LinkedIn here, explaining how I think LinkedIn has improved as a way to share and read articles, and was always good as a way to keep up with former students. Just in the short time since the WSJ article came out, I finally used LinkedIn for one of its official purposes, hiring, where it worked wonders helping to fill a last-minute vacancy.

If you don’t trust me or the WSJ to identify the hot social network, let’s see what the actual cool kids are up to

Is This the End of the Largest Refugee Crisis in the Americas?

Our 2024 post on the Venezuelan election provides context for this week’s dramatic events:

Venezuela held an election this week; President Maduro says he won, while the opposition and independent observers say he lost. Disputed elections like this are fairly common across the world, but where Venezuela really stands out is not how people vote at the ballot box- it is how they vote with their feet.

Reuters notes that “A Maduro win could spur more migration from Venezuela, once the continent’s wealthiest country, which in recent years has seen a third of its population leave.”

This makes Venezuela the largest refugee crisis in the history of the Americas, and depending on how you count the partition of India, perhaps the largest refugee crisis in human history that was not triggered by an invasion or civil war.

Instead, it has been triggered by the Maduro regime choosing terrible policies that have needlessly and dramatically impoverished the country

Plus some foreshadowing:

I hope that the Venezuelan government will soon come to represent the will of its people. I’m not sure how that is likely to happen, though I guess positive change is most likely to come from Venezuelans themselves (perhaps with help from Colombia and Brazil); when the US tries to play a bigger role we often make things worse. But what has happened in Venezuela for the past 10 years is clearly much worse than the “normal” bad economic policies and even democratic backsliding that we see elsewhere.

Here’s an update on the chart I shared then, showing that the diaspora has continued to swell:

I hope that Venezuela will soon become the sort of country people don’t want to flee. I don’t necessarily expect that it will, but it’s not now a crazy hope:

How Good Were 2025 Forecasts?

Last January I shared a roundup of forecasts for the year from markets and professional economists. Were they any good? Here was their prediction for the US economy:

WSJ’s survey of economists reports that inflation expectations for 2025 were around 2% before the election, but are closer to 3% now. Their economists expect GDP growth slowing to 2%, unemployment ticking up slightly but staying in the low 4% range, with no recession. The basic message is that 2025 will be a typical year for the US macroeconomy, but with inflation being slightly elevated, perhaps due to tariffs.

The verdicts (based on current data, which isn’t yet final for all of 2025):

Inflation: Nailed it exactly (2.7%)

GDP: We’re still waiting on Q4, but 2025 as a whole is on track to be a bit above the 2.0% forecast.

Unemployment: 4.6% as of November 2025, a bit above the 4.3% forecast

Recession: Didn’t happen, making the 22% chance forecast look fine

So the professional forecasters were probably a bit low on GDP and unemployment, but overall I’d say they had a good year. What about prediction markets?

For those who hope for DOGE to eliminate trillions in waste, or those who fear brutal austerity, the message from markets is that the huge deficits will continue, with the federal debt likely climbing to over $38 trillion by the end of the year. This is one reason markets see a 40% chance that the US credit rating gets downgraded this year.

While the US has only a 22% chance of a recession, China is currently at 48%, Britain at 80%, and Germany at 91%. The Fed probably cuts rates twice to around 4.0%.

Deficits: Nailed it, the federal debt is currently around $38.4 trillion.

US Credit Downgrade: It’s hard to score a prediction of a 40% chance of a binary event happening, but in any case Moody’s downgraded the US’ credit rating in May, so that all three major agencies now rate it as not perfect.

The Fed: Cut rates a bit more than expected.

Foreign Recessions: China and Britain avoided recessions. Germany had a recession by the technical definition of Kalshi’s market, but not really in practice (FRED shows -0.2% Real GDP growth in Q2 followed by 0.00000% growth in Q3). Britain avoiding recession when markets showed an 80% chance was the biggest miss among the forecasts I highlighted.

Overall though, I’d say forecasters did fairly well in predicting how 2025 turned out, in spite of curveballs like the April tariff shock.
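For anyone who wants a number rather than my eyeball verdicts, one standard way to score probabilistic forecasts of yes/no events is the Brier score: the squared gap between the forecast probability and what happened (1 or 0), averaged across questions. Here is a rough sketch using only the five market probabilities quoted above. Coding Germany as a “yes” follows the technical definition of Kalshi’s market described above, which is a judgment call, and the variable and function names are just for illustration.

```python
# (forecast probability, outcome) where outcome is 1 if the event happened, else 0.
# Probabilities are the ones quoted in this post; Germany is coded as 1 based on
# the technical definition of Kalshi's market, which is an assumption.
forecasts = {
    "US recession": (0.22, 0),
    "US credit downgrade": (0.40, 1),
    "China recession": (0.48, 0),
    "Britain recession": (0.80, 0),
    "Germany recession": (0.91, 1),
}


def brier(prob: float, outcome: int) -> float:
    """Squared error of a single probability forecast (0 is perfect, 1 is worst)."""
    return (prob - outcome) ** 2


def mean_brier(items: dict[str, tuple[float, int]]) -> float:
    """Average Brier score across a set of forecasts."""
    return sum(brier(p, o) for p, o in items.values()) / len(items)


if __name__ == "__main__":
    for name, (p, o) in forecasts.items():
        print(f"{name}: {brier(p, o):.4f}")
    print(f"mean: {mean_brier(forecasts):.4f}")
```

Lower is better, and as a reference point, answering 50% to every question always scores 0.25. Five questions is a tiny sample, so treat this as a way to see which misses hurt the most rather than a verdict on the markets.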

If you think the forecasters are no good and you can do better, you have more options than ever. Prediction markets are getting more questions and more liquidity if you’re up for putting your money where your mouth is; if you don’t want to put your own money at risk, there are forecasting contests with prizes for predicting 2026.

What to Do Before the New Year?

Merry Christmas! I’m gifting you a couple ideas for money things to do in the remaining six days of 2025.

Ways to Help Yourself

Money in US Flexible Spending Accounts (FSAs) often disappears if not requested by New Year’s Day. Don’t forget to draw these down- especially if it is a Dependent Care FSA, which can’t carry any money over to the new year. The money goes back to your employer if you don’t spend it, which means they don’t have an incentive to remind you themselves; so I’ll remind you to save you from having to go Krieger.

The next few days are also your last chance to do most tax-deductible spending in 2025, which could be business expenses, or contributing to tax-deductible accounts that don’t expire like a 401k or HSA (not FSA). See a more detailed list of tax ideas here. Depending on your situation (especially whether you itemize), this might also be a good time to make tax-deductible donations, which would:

Help Others

There are many good causes to donate to, but funding high-value low-cost health interventions in poor countries was probably the cheapest reliable way to save a life even before this year. When one of the largest funders of global health, USAID, was shut down this year, the marginal benefit of donations to global health likely went even higher. GiveWell does the cost-effectiveness calculations to identify good options for specific charities in this area, like Helen Keller International. I like that I’ve been donating to these charities for years via GiveWell’s donation portal and none of them have ever called me (since they don’t require a phone number) or mailed me anything.

This picture shows all the remains of the website of USAID, an agency that spent $32 billion in FY 2024

See you all in 2026