Meta Is Poaching AI Talent With $100 Million Pay Packages; Will This Finally Create AGI?

This month I have run across articles noting that Meta’s Mark Zuckerberg has been making mind-boggling pay offers (like $100 million/year for 3-4 years) to top AI researchers at other companies, plus the promise of huge resources and even (gasp) personal access to Zuck, himself. Reports indicate that he is succeeding in hiring around 50 brains from OpenAI (home of ChatGPT), Anthropic, Google, and Apple. Maybe this concentration of human intelligence will result in the long-craved artificial general intelligence (AGI) being realized; there seems to be some recognition that the current Large Language Models will not get us there.

There are, of course, other interpretations being put on this maneuver. Some talking heads on a Bloomberg podcast speculated that Zuckerberg was deliberately using Meta’s mighty cash flow to starve competitors of top AI talent. They also speculated that, since there is a limit to how much money you can possibly, pleasurably spend, if you pay some guy $100 million in a year, a rational outcome would be for him to quit and spend the rest of his life hanging out at the beach. (That, of course, is what Bloomberg finance types might think, who measure worth mainly in terms of money, not in the fun of doing cutting-edge R&D.)

I found a thread on Reddit to be insightful and amusing, so I post chunks of it below. Here is the earnest, optimistic OP:

andsi2asi

Zuckerberg’s ‘Pay Them Nine-Figure Salaries’ Stroke of Genius for Building the Most Powerful AI in the World

Frustrated by Yann LeCun’s inability to advance Llama to where it is seriously competing with top AI models, Zuckerberg has decided to employ a strategy that makes consummate sense.

To appreciate the strategy in context, keep in mind that OpenAI expects to generate $10 billion in revenue this year, but will also spend about $28 billion, leaving it in the red by about $18 billion. My main point here is that we’re talking big numbers.

Zuckerberg has decided to bring together 50 ultra-top AI engineers by enticing them with nine-figure salaries. Whether they will be paid $100 million or $300 million per year has not been disclosed, but it seems like they will be making a lot more in salary than they did at their last gig with Google, OpenAI, Anthropic, etc.

If he pays each of them $100 million in salary, that will cost him $5 billion a year. Considering OpenAI’s expenses, suddenly that doesn’t sound so unreasonable.

I’m guessing he will succeed at bringing this AI dream team together. It’s not just the allure of $100 million salaries. It’s the opportunity to build the most powerful AI with the most brilliant minds in AI. Big win for AI. Big win for open source

And here are some wry responses:

kayakdawg

counterpoint 

a. $5B is just for those 50 researchers, loootttaaa other costs to consider

b. zuck has a history of burning big money on r&d with theoretical revenue that doesnt materialize

c. brooks law: creating agi isn’t an easily divisible job – in fact, it seems reasonable to assume that the more high-level experts enter the project the slower it’ll progress given the communication overhead

7FootElvis

Exactly. Also, money alone doesn’t make leadership effective. OpenAI has a relatively single focus. Meta is more diversified, which can lead to a lack of necessary vision in this one department. Passion, if present at the top, is also critical for bleeding edge advancement. Is Zuckerberg more passionate than Altman about AI? Which is more effective at infusing that passion throughout the organization?

….

dbenc

and not a single AI researcher is going to tell Zuck “well, no matter how much you pay us we won’t be able to make AGI”

meltbox

I will make the AI by one year from now if I am paid $100m

I just need total blackout so I can focus. Two years from now I will make it run on a 50w chip.

I promise
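The numbers traded in this thread are easy to sanity-check. Here is a rough Python sketch that uses only figures quoted above: 50 hires at $100 million each, and OpenAI’s reported $10 billion in revenue against $28 billion in spending. The pairwise-channel count n(n-1)/2 is the usual back-of-the-envelope way to illustrate Brooks’s law; it is my addition, not something kayakdawg computed, and the function names are made up for illustration.

```python
# Back-of-the-envelope check of the figures quoted in the thread.
# Function names are hypothetical, chosen only for this illustration.

def payroll_cost(headcount: int, salary: int) -> int:
    """Total annual salary bill for `headcount` people paid `salary` each."""
    return headcount * salary


def communication_channels(n: int) -> int:
    """Number of distinct person-to-person pairs in a team of n people.

    This n*(n-1)/2 count is a common illustration of Brooks's law:
    coordination overhead grows much faster than headcount.
    """
    return n * (n - 1) // 2


BILLION = 1_000_000_000

team_cost = payroll_cost(50, 100_000_000)       # the OP's 50 x $100M
openai_shortfall = 28 * BILLION - 10 * BILLION  # spending minus revenue

print(f"Salary bill: ${team_cost / BILLION:.0f} billion per year")
print(f"OpenAI shortfall: ${openai_shortfall / BILLION:.0f} billion")
print(f"Pairwise channels among 50 people: {communication_channels(50)}")
print(f"Pairwise channels among 10 people: {communication_channels(10)}")
```

The salary bill matches the OP’s $5 billion, and the channel count shows why kayakdawg’s point (c) has teeth: going from 10 people to 50 multiplies the headcount by five but the number of pairs that may need to talk by more than twenty-five.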

The option to leave

The US, like every geopolitical entity to ever exist, has produced global public goods (e.g., international security, defeating the Nazis) and global public bads (e.g., greenhouse gases, failed interference in other countries).

I would like to posit something very simple: the greatest public good the United States has ever produced is the option to leave where you are and emigrate to the United States. If a country and its leadership are failing, a non-trivial fraction of its population has the viable option to pack their bags and walk out the door. Perhaps unfairly, this is doubly true for its best, brightest, and most endowed with resources, making the threat all the more salient. It’s voting with your feet, i.e., Tiebout effects writ large.

If you are a failing nation, your options become to watch your population dissipate or put up a wall blocking exit. Either that or, you know, actively take steps to improve your country so that fewer people wish to leave their home and start over elsewhere. The ramifications of stifled immigration to the United States will be felt for decades, and not just in the United States in the form of an enervated economy and betrayal of our core civic values, but globally in weakened constraints on every failing regime.

Chesterton Right about the History of Patriotism

Unexpectedly, Chesterton on Patriotism from 2021 is one of my all-time top performing posts due to a slow but steady drip of Google Search hits.

In 1908, G.K. Chesterton published the following line in Orthodoxy,

This, as a fact, is how cities did grow great. Go back to the darkest roots of civilization and you will find them knotted round some sacred stone or encircling some sacred well.

By 1908, Chesterton had likely been exposed to Victorian early anthropological thinkers like Tylor and Frazer. Maybe I shouldn’t be impressed that he’d get it right, but I don’t think of Chesterton as having access to the best and latest evidence for how human civilization evolved.

I was browsing the book Sapiens (2011) this week and came across:

In the conventional picture, pioneers first built a village, and when it prospered, they set up a temple in the middle. But Göbekli Tepe suggests that the temple may have been built first, and that a village later grew up around it. (pg 102)

Today’s post is dedicated to congratulating Chesterton on making a conjecture that turns out to line up with the best we now know and archeological evidence that was only discovered in 1995.

Chesterton wrote,

The only way out of it seems to be for somebody to love Pimlico; to love it with a transcendental tie and without any earthly reason. If there arose a man who loved Pimlico, then Pimlico would rise into ivory towers and golden pinnacles… If men loved Pimlico as mothers love children, arbitrarily, because it is theirs, Pimlico in a year or two might be fairer than Florence.

Also this month I witnessed Americans celebrating the 4th of July. People here love this country “because it is theirs.”

I’ve heard a lot of panicking in the past 10 years about the fate of the nation, and I think we should always be in a partial state of paranoia. But, if love of country is needed in the recipe, we’ve still got it. (You might need an Instagram account to view Mark Zuckerberg wakeboarding in a bald eagle suit.)

The Simple Utility Function Vs. Socialism

I’m a big fan of Friedrich Hayek. I first read his work in an academic setting. But many people first encounter him via The Road to Serfdom, his book that outlines the political and social consequences of state economic controls. I always meant to go back and read it, but it usually took a back seat to other works. Now, I’m slowly making my way through.

A lovely snippet includes Hayek explaining the popular sentiment that “it’s only money” or that money-related concerns are base or superficial. Such an attitude is especially common when people recount their childhood or family life during times of financial difficulty. The story often goes “times were hard, but we had each other”. Similarly, a popularly derisive trope is that economists ‘only care about money’ [, rather than the more important things].


Writing Humanity’s Last Exam

When every frontier AI model can pass your tests, how do you figure out which model is best? You write a harder test.

That was the idea behind Humanity’s Last Exam, an effort by Scale AI and the Center for AI Safety to develop a large database of PhD-level questions that the best AI models still get wrong.

The effort has proven popular: the paper summarizing it has already been cited 91 times since its release on March 31st, and the main AI labs have been testing their new models on the exam. xAI announced today that its new Grok 4 model has the highest score yet on the exam, 44.4%.

Current leaderboard on the Humanity’s Last Exam site, not yet showing Grok 4

The process of creating the dataset is a fascinating example of a distributed academic mega-project, a format that is becoming a trend and has also been important in efforts to replicate previous research. The organizers of Humanity’s Last Exam let anyone submit a question for their dataset, offering co-authorship to anyone whose question they accepted, and cash prizes to those who had the best questions accepted. In the end they wound up with just over 1000 coauthors on the paper (including yours truly as one very minor contributor), and gave out $500,000 to contributors of the very best questions (not me). That seemed incredibly generous until Scale AI sold a 49% stake in their company to Meta for $14.8 billion in June.

Source: Figure 4 of the paper

Here’s what I learned in the process of trying to stump the AIs and get questions accepted into this dataset:

  1. The AIs were harder than I expected to stump because the organizers used frontier models rather than the free-tier models I was used to using on my own. If you think AI can’t answer your question, try a newer model.
  2. It was common for me to try a question that several models would get wrong, but at least one would still get right. For me this was annoying because questions could only be accepted if every model got them wrong. But of course if you want to get a correct answer, this means trying more models is good, even if they are all in the same tier. If you can’t tell what a correct answer looks like and your question is important, make sure to try several models and see if they give different answers.
  3. Top models are now quite good at interpreting regression results, even when you try to give them unusually tricky tables
  4. AI still has weird weaknesses and blind spots; it can outperform PhDs in the relevant field on one question, then do worse than 3rd graders on the next. This exam specifically wanted PhD-level questions, where a typical undergrad not only couldn’t answer the question, but probably couldn’t even understand what was being asked. But it specifically excluded “simple trick questions”, “straightforward calculation/computation questions”, and questions “easily answerable by everyday people”, even if all the AIs got them wrong. My son had the idea to ask them to calculate hyperfactorials; we found some relatively low numbers that stumped all the AI models, but the human judges ruled that our question was too simple to count. On a question I did get accepted, I included an explanation for the human judges of why I thought it wasn’t too simple.
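For readers who haven’t met them: the hyperfactorial of n is the product 1^1 · 2^2 · … · n^n. The minimal Python sketch below just implements that definition, to show why a human judge would file it under “straightforward calculation” even though the values balloon very quickly. The function name is my own, and this is only an illustration of the definition, not the question we submitted.

```python
def hyperfactorial(n: int) -> int:
    """Return H(n) = 1**1 * 2**2 * ... * n**n.

    H(0) is the empty product, so it equals 1.
    """
    if n < 0:
        raise ValueError("hyperfactorial is only defined here for n >= 0")
    result = 1
    for k in range(1, n + 1):
        result *= k ** k  # Python ints are arbitrary precision, so no overflow
    return result


# Print the first few values and how many digits each one has.
for n in range(1, 11):
    value = hyperfactorial(n)
    print(f"H({n}) has {len(str(value))} digits: {value}")
```

A few lines of code handle this exactly for any small n, which is presumably why the judges considered it a computation question rather than a PhD-level one.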

I found this to be a great opportunity to observe the strengths and weaknesses of frontier models, and to get my name on an important paper. While the AI field is being driven primarily by the people with the chops to code frontier models, economists still have a lot we can contribute here, as Joy has shown. Any economist looking for the next way to contribute here should check out Anthropic’s new Economic Futures Program.

23 MSAs Produce Half of US GDP

The 23 blue-shaded MSAs in this map produce half of US GDP:

You might be tempted to think this map, like so many maps, is just a map of US population. It kind of is, but not completely. These 23 MSAs have 133 million people (as of the 2020 Census), or about 40% of the US population. That’s a lot, but it’s much less than half, which is the share of GDP they account for. In other words, these MSAs also tend to have above-average per capita income.
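To put a rough number on “above-average,” here is a short Python sketch using only the shares above: half of GDP, and 133 million people out of the national total. The 331.4 million figure for the 2020 Census US population is my addition, and the function and variable names are purely illustrative.

```python
def relative_per_capita(gdp_share: float, pop_share: float) -> float:
    """Per capita GDP of a group relative to the national average.

    If a group produces `gdp_share` of output with `pop_share` of the
    people, its per capita GDP is gdp_share / pop_share times the average.
    """
    return gdp_share / pop_share


top_pop = 133e6    # population of the 23 MSAs (from the text)
us_pop = 331.4e6   # 2020 Census US population (assumption added here)
gdp_share = 0.5    # the 23 MSAs produce half of US GDP

pop_share = top_pop / us_pop

# The 23 MSAs compared with the national average.
top_vs_national = relative_per_capita(gdp_share, pop_share)

# Everywhere else: the other half of GDP spread over the remaining people.
rest_vs_national = relative_per_capita(1 - gdp_share, 1 - pop_share)

# The 23 MSAs compared directly with the rest of the country.
top_vs_rest = top_vs_national / rest_vs_national

print(f"Population share of the 23 MSAs: {pop_share:.1%}")
print(f"Per capita GDP vs. national average: {top_vs_national:.2f}x")
print(f"Per capita GDP vs. rest of country:  {top_vs_rest:.2f}x")
```

Under these shares, GDP per person in the 23 MSAs comes out to roughly 1.25 times the national average, and roughly 1.5 times that of the rest of the country.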

The three largest MSAs by population (NY, LA, Chicago) are also the three largest by GDP. But after the first three there are some interesting discrepancies. The San Francisco MSA is the 4th largest by GDP, but only the 12th largest by population — San Fran has a population similar to the Phoenix MSA, but almost double the GDP. San Francisco MSA has a very high GDP per capita (the third highest).

The San Jose MSA is also among these 23 largest MSAs for GDP, and also sticks out — it is the 13th largest by total GDP, but only the 36th largest by population. San Jose has a population similar to Cleveland and Nashville, but well over double the GDP of these two MSAs individually. In fact, there are 12 MSAs larger in population than San Jose, but that aren’t among these 23 MSAs that produce half of US GDP: places like St. Louis, Orlando, San Antonio, Pittsburgh, and Columbus. Silicon Valley really pulls up San Jose: it has the 2nd largest GDP per capita among MSAs, only beaten by much smaller Midland, Texas and its oil income.

Here is the full list of those 23 MSAs:

  1. New York-Newark-Jersey City, NY-NJ-PA
  2. Los Angeles-Long Beach-Anaheim, CA
  3. Chicago-Naperville-Elgin, IL-IN-WI
  4. San Francisco-Oakland-Berkeley, CA
  5. Dallas-Fort Worth-Arlington, TX
  6. Washington-Arlington-Alexandria, DC-VA-MD-WV
  7. Houston-The Woodlands-Sugar Land, TX
  8. Boston-Cambridge-Newton, MA-NH
  9. Seattle-Tacoma-Bellevue, WA
  10. Atlanta-Sandy Springs-Alpharetta, GA
  11. Philadelphia-Camden-Wilmington, PA-NJ-DE-MD
  12. Miami-Fort Lauderdale-Pompano Beach, FL
  13. San Jose-Sunnyvale-Santa Clara, CA
  14. Phoenix-Mesa-Chandler, AZ
  15. Minneapolis-St. Paul-Bloomington, MN-WI
  16. Detroit-Warren-Dearborn, MI
  17. San Diego-Chula Vista-Carlsbad, CA
  18. Denver-Aurora-Lakewood, CO
  19. Baltimore-Columbia-Towson, MD
  20. Austin-Round Rock-Georgetown, TX
  21. Charlotte-Concord-Gastonia, NC-SC
  22. Riverside-San Bernardino-Ontario, CA
  23. Tampa-St. Petersburg-Clearwater, FL

Economic Impact of Agricultural Worker Deportations Leads to Administration Policy Reversals

Here is a chart of the evolution of U.S. farm workforce between 1991 and 2022:

Source: USDA

A bit over 40% of current U.S. farm workers are illegal immigrants. In some regions and sectors, the percentage is much higher. The work is often uncomfortable and dangerous, and far from the cool urban centers. This is work that very few U.S.-born workers would consider doing unless the pay was very high, so it would be difficult to replace the immigrant labor on farms in the near term. I don’t know how much the need for manpower would change if cheap illegal workers were not available and farms instead supplemented productivity with automation.

It apparently didn’t occur to some members of the administration that deporting a lot of these workers (and frightening the rest into hiding) would have a crippling effect on American agriculture. Sure enough, there have recently been reports in some areas of workers not showing up and crops going unharvested.

It is difficult for me as a non-expert to determine how severe and widespread the problems actually are so far. Anti-Trump sources naturally emphasize the genuine problems that do exist and predict apocalyptic melt-down, whereas other sources are more measured. I suspect that the largest agribusinesses have kept better abreast of the law, while smaller operations have cut legal corners and may have that catch up to them. For instance, a small meat packer in Omaha reported operating at only 30% capacity after ICE raids, whereas the CEO of giant Tyson Foods claimed that “every one who works at Tyson Foods is authorized to do so,” and that the company “is in complete compliance” with all the immigration regulations.

With at least some of these wholly predictable problems from mass deportations now becoming reality, the administration is undergoing internal debates and policy adjustments in response. On June 12, President Trump very candidly acknowledged the issue, writing on Truth Social, “Our great Farmers and people in the hotel and leisure business have been stating that our very aggressive policy on immigration is taking very good, long-time workers away from them, with those jobs being almost impossible to replace…. We must protect our Farmers, but get the CRIMINALS OUT OF THE USA. Changes are coming!” 

The next day, ICE official Tatum King wrote regional leaders to halt investigations of the agricultural industry, along with hotels and restaurants. That directive was apparently walked back a few days later, under pressure from outraged conservative supporters and from Deputy White House Chief of Staff Stephen Miller. Miller, an immigration hard-liner, wants to double the ICE deportation quota, up to 3,000 per day.

This issue could go in various ways from here. Hard-liners on the left and on the right have a way of pushing their agendas to unpalatable extremes. It can be argued that the Democrats could easily have won in 2024 had their policies been more moderate. Similarly, if immigration hard-liners get their way now, I predict that the result will be their worst nightmare: a public revulsion against enforcing immigration laws in general. If farmers and restaurateurs start going bust, and food shortages and price spikes appear in the supermarket, public support for the administration and its project of deporting illegal immigrants will reverse in a big way. Some right-wing pundits would not be bothered by an electoral debacle, since their style is to stay constantly outraged, and (as the liberal news outlets currently demonstrate), it is easier to project non-stop outrage when your party is out of power.

An optimist, however, might see in this controversy an opening for some sort of long-term, rational solution to the farm worker issue. Agriculture Secretary Brooke Rollins has proposed expansion of the H-2A visa program, which allows for temporary agricultural worker residency to fill labor shortages. This is somewhat similar to the European guest worker programs, though with significant differences. H-2A requires the farmer to provide housing and take legal responsibility for his or her workers. H-2B visas allow for temporary non-agricultural workers, without as much employer responsibility. A bill was introduced into Congress with bipartisan support to modernize the H-2A program, so that legislative effort may have legs. Maybe there can be a (gasp!) compromise.

President Trump last week came out strongly in favor of this sort of solution, with a surprisingly positive take on the (illegal) workers who have worked diligently on a farm for years. By “put you in charge” he seems to refer to the responsibilities that H-2A employers undertake for their employees, and perhaps to extending those to H-2B employers. He acknowledges that the far right will not be happy, but hopes “they’ll understand.” From Newsweek:

“We’re working on legislation right now where – farmers, look, they know better. They work with them for years. You had cases where…people have worked for a farm, on a farm for 14, 15 years and they get thrown out pretty viciously and we can’t do it. We gotta work with the farmers, and people that have hotels and leisure properties too,” he said at the Iowa State Fairgrounds in Des Moines on Thursday.

“We’re gonna work with them and we’re gonna work very strong and smart, and we’re gonna put you in charge. We’re gonna make you responsible and I think that that’s going to make a lot of people happy. Now, serious radical right people, who I also happen to like a lot, they may not be quite as happy but they’ll understand. Won’t they? Do you think so?”

We shall see.

It’s not AGI if it has a dial you can adjust to produce your preferred falsehoods

It’s not AGI, it’s barely even regular AI, when an LLM is this heavily directed. This appears to be very real over on Twitter. What’s most telling is the thinness of the prompts that yield very specific responses that, suffice it to say, Grok would not have provided even 3 months ago.

Musk has adjusted Grok’s algorithm so it’s now a neo-Nazi. Pretty cool that almost every progressive commentator, elected official and organization still uses Musk’s X algorithm to communicate with the public! Good job guys.

Max Berger (@maxberger.bsky.social) 2025-07-06T17:39:08.885Z

I’ll simply say this: no one has declined more in my estimation in my entire life than Elon Musk. I thought he was an engineering genius not even 5 years ago, perhaps awkward in some ways, but earnest. Now he is (or is working very diligently to project an identity of) a white supremacist desperate to play off of traditional racist and antisemitic fears to maintain his own status and influence. His ambition and resources have been combined with a monstrous agenda, and the world is much worse for it. It’s tragic in every way.

With regard to AI, there needs to be more discussion of the market for AIs, plural. I think a lot of people are operating off the assumption that AI will be like Google or VHS. A natural monopoly; one AI to rule them all and bind them. I’m not so sure. I think there is a very real chance that AIs will find niches. That different algorithms will create different families of bespoke AIs. It feels like the world is already siloed into echo chambers of entertainment- and identity-based news feeds. If AI allows us each to get bespoke answers, serving our own personal confirmation biases, to each and every question, is that better or worse? In a counterintuitive way, it could actually be better. You can’t get communities and cults of one. It might be better for the world if the news became something you couldn’t create effective propaganda out of.

Hallucination as a User Error

You don’t use a flat head screwdriver to drill a hole in a board. You should know to use a drill.

I appreciate getting feedback on our manuscript, “LLM Hallucination of Citations in Economics Persists with Web-Enabled Models,” via X/Twitter. @_jannalulu wrote: “that paper only tested 4o (which arguably is a bad enough model that i almost never use it).”

Since the scope and frequency of hallucinations came as a surprise to many LLM users, they have often been used as a ‘gotcha’ to criticize AI optimists. People, myself included, have sounded the alarm that hallucinations could infiltrate articles, emails, and medical diagnoses.

The feedback I got from power users on Twitter this week made me think that there might be a cultural shift in the medium term. (Yes, we are always looking for someone to blame.) Hallucinations will be considered the fault of the human user who should have:

  1. Used a better model (learn your tools)
  2. Written a better prompt (learn how to use your tools)
  3. Not assigned the wrong task to LLMs (it’s been known for over 2 years that general-purpose LLMs hallucinate citations). What did you expect from “generative” AI? LLMs are telling you what literature ought to exist as opposed to what does exist.

My Perfunctory Intern

A couple of years ago, my co-blogger Mike described his productive, but novice intern. The helper could summarize expert opinion, but they had no real understanding of their own. To boot, they were fast and tireless. Of course, he was talking about ChatGPT. Joy has also written in multiple places about the errors made by ChatGPT, including fake citations.

I use ChatGPT Pro, which has Web access, and my experience is that it is not so tireless. Much like Mike, I have used ChatGPT to help me write Python code. I know the basics of Python, and how to read a lot of it. However, the multitude of methods and possible arguments are not nestled firmly in my skull. I’m much faster at reading Python code than at writing it. Therefore, ChatGPT has been amazing… Mostly.

I have found that ChatGPT is more like an intern than many suppose:
