Understanding Data: A Chart on Gen Z and Alcohol

Have you seen this chart?

The chart originates from Statista, as you can see from the label in the image. But it is very frequently shared on social media, Reddit, and elsewhere (often with the Statista label clipped), occasionally generating millions of views and lots of heated comments.

But it’s a bad graph. In so many ways. Let’s break them down.

The data comes from the BLS’s Consumer Expenditure Survey. I use this data frequently, as regular readers probably know. The data in the viral chart is from 2021 (more on that in a moment), but if I create a similar chart using the most recent data (from 2023) and also include spending by those older than Baby Boomers (primarily the Silent Generation), you will notice a curious thing:

Continue reading

Saving Money by Ordering Car Parts from Amazon or eBay

Here is a personal economics anecdote from this week. A medium-sized dead branch fell from a tall tree and ripped the driver-side mirror off my old Honda. My local repair shop said it would cost around $600 to replace. That is a significant percentage of what the old clunker is worth. Ouch.

They kindly noted that most of that cost was for ordering a replacement mirror assembly from Honda, which would run over $400 and take several days to arrive. I asked if I could get a mirror from a junkyard to save money. The repair guy said they would be willing to install a part I brought in, but suggested eBay or Amazon instead.

Twenty years ago, before online commerce was so established, my local repair shop would routinely save us money by getting used parts through some sort of junkyard network.
So I started looking into that route. First, junkyards are not junkyards anymore; they are “salvage yards.” Second, it turns out that removing a side mirror from a Honda is not a simple matter. You have to remove the whole inside plastic door panel to get at the mirror mounting screws, and removing that panel has some complications. Also, I could not find a clear online resource for locating parts at regional salvage yards. It looks like you have to drive to a salvage yard and perhaps have them search some sort of database to find a comparable vehicle somewhere that might have the part you want.


All this seemed like a lot of hassle, so I went to eBay and found a promising-looking new replacement part there for about $56, including shipping. It would take about a week to arrive (probably being drop-shipped from China). On Amazon, I found essentially the same part for about $63, and it would arrive the next day. For the small difference in price, I went the Amazon route, partly for the no-hassle returns if the part turned out to be defective and partly because I get 5% back there on my Amazon credit card.

I just got the car back from the repair shop with the replacement mirror, and it works fine. The total cost, with labor, was about $230, which is much better than the original $600+ estimate.


I’m not sure how broadly to generalize this experience. Some further observations:

(1) For a really critical car part, I’d have to consider carefully whether the Chinese knock-off would perform appreciably worse than some name-brand part, although I believe many repair shops often use parts that are not strictly original.

(2) Commonly replaced parts like oil and air filters are typically cheaper to buy online than from your local AutoZone or other local merchant. I like supporting local shops, so sometimes I eat the few extra dollars and the shopping time and buy from brick and mortar.

(3) Some repair shops make significant money on their parts markup, so they might not be happy about you bringing in your own parts. They also might decline to warrant the operation of that part. And many big-box franchise repair shops may simply refuse to install customer-supplied parts.

(4) For a newish car still under warranty, the manufacturer warranty might be affected by using non-original parts.

(5) Back to junk/salvage yards: there are some car parts, so-called hard parts, that are expected to last the life of the car, such as the mounting brackets for engine parts. Typically, no spares of these are manufactured. So if one of those parts gets dinged up in an accident, your only option may be used parts taken from a junker.

Illusions of Illusions of Reasoning

Just since Scott’s post on Tuesday of this week, a new response has appeared, titled “The Illusion of the Illusion of the Illusion of Thinking.”

Abstract (emphasis added by me): A recent paper by Shojaee et al. (2025), The Illusion of Thinking, presented evidence of an “accuracy collapse” in Large Reasoning Models (LRMs), suggesting fundamental limitations in their reasoning capabilities when faced with planning puzzles of increasing complexity. A compelling critique by Opus and Lawsen (2025), The Illusion of the Illusion of Thinking, argued these findings are not evidence of reasoning failure but rather artifacts of flawed experimental design, such as token limits and the use of unsolvable problems. This paper provides a tertiary analysis, arguing that while Opus and Lawsen correctly identify critical methodological flaws that invalidate the most severe claims of the original paper, their own counter-evidence and conclusions may oversimplify the nature of model limitations. By shifting the evaluation from sequential execution to algorithmic generation, their work illuminates a different, albeit important, capability. We conclude that the original “collapse” was indeed an illusion created by experimental constraints, but that Shojaee et al.’s underlying observations hint at a more subtle, yet real, challenge for LRMs: a brittleness in sustained, high-fidelity, step-by-step execution. The true illusion is the belief that any single evaluation paradigm can definitively distinguish between reasoning, knowledge retrieval, and pattern execution.

As I am writing a new manuscript about hallucination in web-enabled models, this is close to what I am working on. Conjuring up fake academic references might point to a lack of true reasoning ability.

Do Pro and Dantas believe that LLMs can reason? What they are saying, at least, is that evaluating AI reasoning is difficult. In their words, the whole back-and-forth “highlights a key challenge in evaluation: distinguishing true, generalizable reasoning from sophisticated pattern matching of familiar problems…”

The fact that the first sentence of the paper contains the bigram “true reasoning” is interesting in itself. No one doubts that LLMs are reasoning anymore, at least within their own sandboxes. Hence there have been Champagne jokes going around of this sort:

If you’d like to read a response coming from o3 itself, Tyler pointed me to this:

Salty SALT in the OBBB

The Republicans hold a majority in both chambers of Congress and they are the party of the president. They want to use that opportunity to pass substantial legislation that addresses their priorities. Hence, the One, Big, Beautiful Bill (OBBB). But, just like the Democratic party, Republican congressmen are a coalition with various and sometimes divergent policy agendas. There are ‘Trump’ Republicans, who want tariffs, executive orders, and deportations. There are more classically liberal members who want freer markets. You can also find the odd ‘crypto bro’, blue-state representatives, and deficit hawks. Given the slim majority in the House of Representatives, they all have to get something out of the legislation. Put them together, and what have you got?* You get a signature piece of legislation that no one is happy about but everyone touts.

One example of such compromise is the State and Local Tax federal income tax deduction, or SALT deduction. The idea behind it is that income shouldn’t be taxed twice. If you pay a part of your income to your state government in the form of taxes, then the argument goes that you shouldn’t be taxed on that part of your income because you never actually saw it in your bank account. The state took it and effectively lowered your income. The state and local taxes get deducted from the taxable income that you report to the federal government.  The reasoning is that you shouldn’t need to pay taxes on your taxes.

Paying taxes on your taxes sounds bad. And plenty of people don’t like one tax, much less two. The Tax Foundation has done a lot of good work to cut through the chaff and has published many pieces on the SALT deduction over the years.**

Cut and Dry SALT Deduction Facts:

  • It’s a tax cut
  • It reduces federal tax revenue
  • It adds tax code complication
  • It is used by people who itemize rather than take the standard income tax deduction
  • Prior to the 2017 Tax Cuts and Jobs Act, there was no limit on the SALT deduction. Afterward, the limit was $10,000.
  • The current OBBB increases the SALT deduction.
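The mechanics of the cap can be sketched in a few lines of Python. This is a minimal sketch with hypothetical income and tax figures, and it ignores the choice between itemizing and taking the standard deduction:

```python
def federal_taxable_income(income, salt_paid, salt_cap=10_000):
    """Subtract state and local taxes (up to the cap) from federal
    taxable income, so the same dollars aren't taxed twice."""
    deduction = min(salt_paid, salt_cap)
    return income - deduction

# Hypothetical itemizer in a high-tax state: $300k income, $30k in SALT
federal_taxable_income(300_000, 30_000)                # capped: 290_000
federal_taxable_income(300_000, 30_000, float("inf"))  # no cap: 270_000
```

With the cap, only $10,000 of the $30,000 tax bill is deductible; removing the cap shields the full amount, which is why the fight over the limit matters most to heavy itemizers.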

Those are the basics. Everything else is analysis. The Grover Norquist Republicans never see a tax cut that looks bad, so they’d like to see the SALT limit raised or disappear. Tax think tanks that like simplicity don’t like the SALT deduction because it adds complication. Plenty of others say they don’t like complication, but often change their mind when it comes to the details (much like cutting government waste). Think tanks tend to be a bit lonely on this point.

People mostly care about the SALT deduction due to the distributional effects. Who ends up benefiting from the deduction? The short answer is people who 1) itemize & 2) have heavy state and local tax bills. Who is that? Rich people of course! They have high incomes and lots of wealth and real estate – on which they pay taxes. But not all rich people pay loads of state taxes. So the SALT deduction is a tax cut that primarily benefits rich people who live in high-tax districts. Where’s that? See below.

Continue reading

LIFE Survey Comes Alive

Last year I posted that the Philly Fed had started a new quarterly survey on Labor, Income, Finances, and Expectations (LIFE). I thought it looked promising but had yet to achieve its potential:

It will be interesting to see if this ends up taking a place in the set of Fed surveys that are always driving economic discussions, like the Survey of Consumer Finances and the Survey of Professional Forecasters. If they keep it up and start putting out some graphics to summarize it, I think it will. My quick impression (not yet having spoken to Fed people about it) is that it will be the “quick hit” version of the Survey of Consumer Finances. It asks a smaller set of questions on somewhat similar topics, but is released quickly after each quarter instead of slowly after each year. If they stick with the survey it will get more useful over time, as there is more of a baseline to compare to.

But a year later the survey now has what I hoped for: a solid baseline for comparisons, and pre-made graphics to summarize the results. It continues to show complex and mixed economic performance in the US. People think the economy is getting worse:

They are cutting discretionary (but not necessity) spending at record levels:

They are worried about losing their jobs at record levels:

But key areas like housing, childcare, and transportation are stabilizing:

Overall I think we can synthesize these seemingly contradictory pictures by saying that Americans’ finances are fine now, but they are quite worried that things are about to get worse, perhaps due to the tariffs taking effect. You can find the rest of the LIFE survey results (including all the non-record-setting ones) here.

Household Formation and Generational Wealth

Last week I tried to address whether rising wealth for younger generations was primarily driven by rising home values. My analysis suggested that it was a cause, but not the only cause. Here’s another chart on that topic, showing median net worth excluding home equity for recent generations:

Two things are notable in the chart. Millennials, even excluding home equity, are well ahead of past generations, though of course their net worth is much smaller excluding this category of wealth (the total median net worth for Millennials in 2022 was $93,800). But Gen X in 2022 (the last data in that chart) is slightly behind Boomers, never having recovered from the decline in wealth after 2007 (primarily from the stock market decline, since we’re excluding housing).

But today I want to address another general objection to the wealth data found in the Fed’s SCF and DFA programs. That objection has to do with household formation. Specifically, these surveys are calculated for households, and the age/generation indicators are for the household head (or “householder” as it is now called). And we know that household formation has been declining over time, as more young people live with parents, with roommates, etc. So the Millennial data we see in the chart above is excluding any Millennials that have not yet formed their own household.

Here’s a general picture of the decline, which has been happening gradually since about 1980. Note: I use the age group 26-41, because this is the age of Millennials in 2022 (the most recent SCF survey year). The highlighted years on the chart are when the Silent, Baby Boomer, Gen X, and Millennial generations were about the same age (26-41).

What this means is that when we are looking at households in these wealth surveys (or any survey that focuses on households) we aren’t quite comparing apples to apples. Does this mean the surveys are worthless? No! With the microdata in the SCF, we can look at not only the median value, but the entire distribution. Since the household formation rate has fallen by about 11 percentage points between Boomers in 1989 and Millennials in 2022, one solution is to look up or down the distribution for a rough comparison.

For example, if we assume all of the 11 percent of non-householders among Millennials have wealth below the median, we can make a rough correction by looking at the 39th percentile for Millennials — the 39th percentile would be the median if you included all of those 11 percent of non-householders as households. Similarly, for Gen X we would move down 5 percentage points in the distribution to the 45th percentile in 2007.

The household-formation-adjusted chart does paint a more pessimistic picture than just looking at the median for each generation: the 39th percentile Millennial has about 20% less wealth than the median Boomer did at roughly the same age. Seems like generational decline! Is there any silver lining?

First, you should interpret the chart above as a worst case scenario for Millennial wealth. It assumes all non-householders have low wealth. But likely not all of them do. If instead we use the 43rd percentile of Millennials in 2022, their net worth is $61,000, slightly above Boomers at the same age. (The household formation problem isn’t going away anytime soon as generations age — even if we look at Gen Xers, with a median age of 50 in 2022, their household formation is still 6 percentage points behind Boomers at that age.)

Second, my worst case scenario almost certainly overstates the problem. If all of those 11 percent fewer Millennials not yet forming households were to get married to other millennials, it would only add half of that many households to the aggregate distribution (when two non-householders get married, it becomes one household). So instead of moving down 11 percentage points to the 39th percentile, we should only move down 5 or 6 percentiles. The 44th percentile of Millennial net worth in 2022 was $63,060 — again, compare this to Boomers in the chart above.
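The rough percentile corrections described above can be expressed in a few lines of Python (a minimal sketch; the 11-point formation gap and the pairing-up assumption are taken from the discussion above):

```python
def adjusted_percentile(formation_gap_pts=11, pair_up=False):
    """Shift the comparison point down from the median (50th percentile)
    by the household-formation gap. If the missing non-householders would
    pair up into couples, each pair adds only one household, so shift by
    half the gap instead."""
    shift = formation_gap_pts / 2 if pair_up else formation_gap_pts
    return 50 - shift

adjusted_percentile()              # worst case: 39.0
adjusted_percentile(pair_up=True)  # pairing up: 44.5 (~the 44th percentile)
```

The same function with a gap of 5 points gives the 45th percentile used for Gen X in 2007.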

Finally, if we combine both of the adjustments discussed in this post, looking at wealth excluding home equity and also adjusting for the decline in household formation, we get the following chart (here I once again use the 39th percentile for Millennials and the 45th percentile for Gen X, i.e., the worst case scenario):

With this final adjustment, we get a slightly different picture. The wealth of these three generations is roughly the same at the same age. No increase in wealth, but no decline either. You could read this as pessimistic, if your assumption is that wealth should rise over time, but the general vibes out there are that young people are worse off than in the past. This wealth data suggests, once again, that the kids are doing all right.

Did Apple’s Recent “Illusion of Thinking” Study Expose Fatal Shortcomings in Using LLMs for Artificial General Intelligence?

Researchers at Apple last week published a paper with the provocative title, “The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity.” This paper has generated an uproar in the AI world. Having “The Illusion of Thinking” right there in the title is pretty in-your-face.

Traditional Large Language Model (LLM) artificial intelligence programs like ChatGPT train on massive amounts of human-generated text to be able to mimic human outputs when given prompts. A recent trend (mainly starting in 2024) has been the incorporation of more formal reasoning capabilities into these models. The enhanced models are termed Large Reasoning Models (LRMs). Now some leading LLMs like OpenAI’s GPT, Claude, and the Chinese DeepSeek exist both in regular LLM form and also as LRM versions.

The authors applied both the regular (LLM) and “thinking” (LRM) versions of Claude 3.7 Sonnet and DeepSeek to a number of mathematical puzzles. OpenAI’s o-series models were used to a lesser extent. An advantage of these puzzles is that researchers can dial the complexity up or down while keeping the basic form of the puzzle.

They found, among other things, that the LRMs did well up to a certain point, then suffered “complete collapse” as complexity was increased. Also, at low complexities, LLMs actually outperform LRMs. And (perhaps the most vivid evidence of lack of actual understanding on the part of these programs), when they were explicitly offered an efficient direct solution algorithm in the prompt, the programs did not take advantage of it, but instead just kept grinding away in their usual fashion.

As might be expected, AI skeptics were all over the blogosphere, saying, in effect: I told you so; LLMs are just massive exercises in pattern matching and cannot extrapolate outside their training set. This has massive implications for what we can expect in the near or intermediate future. Among other things, the optimism about AI progress is largely what is fueling the stock market, and also capital investment in this area: companies like Meta and Google are spending ginormous sums trying to develop artificial “general” intelligence, paying for ginormous amounts of compute power, with those dollars flowing to firms like Microsoft and Amazon building out data centers and buying chips from Nvidia. If the AGI emperor has no clothes, all this spending might come to a screeching halt.

Ars Technica published a fairly balanced account of the controversy, concluding that, “Even elaborate pattern-matching machines can be useful in performing labor-saving tasks for the people that use them… especially for coding and brainstorming and writing.”

Comments on this article included ones like:

LLMs do not even know what the task is, all it knows is statistical relationships between words.   I feel like I am going insane. An entire industry’s worth of engineers and scientists are desperate to convince themselves a fancy Markov chain trained on all known human texts is actually thinking through problems and not just rolling the dice on what words it can link together.

And

if we equate combinatorial play and pattern matching with genuinely “generative/general” intelligence, then we’re missing a key fact here. What’s missing from all the LLM hubris and enthusiasm is a reflexive consciousness of the limits of language, of the aspects of experience that exceed its reach and are also, paradoxically, the source of its actual innovations. [This is profound, he means that mere words, even billions of them, cannot capture some key aspects of human experience]

However, the AI bulls have mounted various comebacks to the Apple paper. The most effective I know of so far was published by Alex Lawsen, a researcher at Open Philanthropy. Lawsen’s rebuttal, titled “The Illusion of the Illusion of Thinking,” was summarized by Marcus Mendes. To summarize the summary, Lawsen claimed that the models did not in general “collapse” in some crazy way. Rather, the models in many cases recognized that they would not be able to solve the puzzles given the constraints input by the Apple researchers. Therefore, they (rather intelligently) did not waste compute power grinding away at a necessarily incomplete solution, but just stopped. Lawsen further showed that the way Apple ran the LRM models did not allow them to perform as well as they could. When he made a modest, reasonable change in the operation of the LRMs,

Models like Claude, Gemini, and OpenAI’s o3 had no trouble producing algorithmically correct solutions for 15-disk Hanoi problems, far beyond the complexity where Apple reported zero success.

Lawsen’s conclusion: When you remove artificial output constraints, LRMs seem perfectly capable of reasoning about high-complexity tasks. At least in terms of algorithm generation.

And so, the great debate over the prospects of artificial general intelligence will continue.

Kayfabe in the political marketplace

Kayfabe: the tacit agreement to behave as if something is real, sincere, or genuine when it is not

The term comes from wrestling, which is fitting because for years I’ve been stealing Dana Gould’s line “Politics is just professional wrestling in suits.” There are limits to my casual theft, though. I will not pretend that I am the first to observe that politics has deeply internalized the kayfabe code of vehemently declaring beliefs or expectations in no way actually held while simultaneously understanding that your rivals are doing the same.

I am curious whether you can undermine your own political agenda or influence by going too far, by committing too much to a fictional worldview. Much like Serpico, if you go too deep you can lose track of the real you. Grandstanding in front of the cameras is one thing, but if you want to trade in the political marketplace, you need to be able to credibly do so in good faith. I can’t help but wonder if part of the inability of Congress to assert its constitutional power in the face of Executive overreach is a political market failure. Has the information and signal quality between representatives been so eroded that prices and, in turn, exchanges can’t emerge?

On a more optimistic note, however, I expect that at some point a political party will find advantage in having better norms of credible signaling, if only because they will have an easier time solving their own collective action problem. The question remains, though: at what point will the political advantage of superior collective action dominate the electoral advantage of earnestly lying to voters?

Waymo and Pictures Online

Memes on Twitter are at least as important today as the political cartoons of the 19th and 20th centuries were. Two memes went viral this week surrounding the protests in Los Angeles.

The first meme reflects how Lord of the Rings is deeply embedded in American culture. Two million views means that most people don’t need the joke explained. (Part of the Lord of the Rings story is that the wise and powerful elves abandon chaotic Middle Earth for the safety of the Grey Havens. Similarly, the Waymo cars quietly drove themselves to safety away from Los Angeles after several had been vandalized and burned.)

I worry that The Lord of the Rings has made us too optimistic. Americans rarely tolerate stories that do not have happy endings. Has that made it hard for us to understand global events, or impaired our ability to accurately predict how most battles will end?

The next viral meme about the Waymos has two tweets to track. The originator of the joke got half a million views. Someone who added AI-generated images to the original text got half a million views as well.

The four-quadrant grid is a common meme format. Starting in the upper left, perhaps the best way to explain this is that a rigid socialist might resent Waymo as a symbol of Big Tech. At the very least, a socialist who wields state control might want to nationalize Waymo if there are going to be autonomous cars.

“Protect the Waymos” voices support for the police and traditional property rules. This is a joke, keep in mind, so it’s not meant to make perfect sense. The Libertarian Right might be more likely to support individuals protecting themselves with their own weapons as opposed to relying on the police state.

Compare the meme world to the news. I feel like this might become quaint soon, so here’s what it looks like to use the Google News search function.

The Los Angeles Times reports

The autonomous ride-hailing service Waymo has suspended operations in downtown Los Angeles after several of its vehicles were set on fire during protests against immigration raids in the area.

At least five Waymo vehicles were destroyed over the weekend, the company said. Waymo removed its vehicles from downtown but continues to operate in other parts of Los Angeles.

The flaming Waymo image is all over the news and internet. For one thing, people are interested in it because it’s real. This was supposed to be the year of deepfakes, and yet it’s mostly real images and real gaffes that are still making the news.

At this moment in time, most Americans do not yet have Waymo in their cities. Oddly enough, losing an inventory of several cars might be well worth the publicity the company is receiving. Two million viewers of the video of the little driverless cars making an orderly exit from Los Angeles might come away thinking driverless cars are safe and sensible.

Here are more posts from my long-standing interest in cartoons and internet culture.

Kyla Scanlon also felt the little cars on fire were worth a newsletter post. She wrote:

We scroll past the burning car to see arguments about whether the burning was justified, then scroll past those to see memes about the arguments, then scroll past those to see counter-memes. The cycle feeds itself!