Excluding “Non-Excludable” Goods

Intro microeconomics classes teach that some goods are “non-excludable”, meaning that people who don’t pay for them can’t be stopped from using them. This can lead to a “tragedy of the commons”, where the good gets overused because people don’t personally bear the cost of using it and don’t care about the costs they impose on others. Overgrazing land and overfishing the seas are classic examples.

Source: Microeconomics, by Michael Parkin

Students sometimes get the impression that “excludability” is an inherent property of a good. But in fact, which goods are excludable is a function of laws, customs, and technologies, and these can change over time. Land might be legally non-excludable (and so over-grazed) when it is held in common, but become excludable when the land is privatized or when barbed wire makes enclosing it cheap. Over time, such changes have turned over-grazing into a relatively minor issue.

Overfishing remains a major problem, but this could be starting to change. Legal and technological changes have allowed for enclosed, private aquaculture on some coasts, which provides a large and growing share of all fish eaten by humans. Permitting systems put limits on catches in many countries’ waters, though the high seas remain a true tragedy of the commons for now.

While countries have tried to enforce limits on catches in their national waters, monitoring how many fish every boat is taking has been challenging, so illegal overfishing has remained widespread. But technology is in the process of changing this. For instance, ThayerMahan is developing hydrophone arrays that use sound to track boats:

Technologies like hydrophones and satellites, if used well, will increasingly make public waters more “excludable” and reduce “tragedy of the commons” overfishing.

Saving Money by Ordering Car Parts from Amazon or eBay

Here is a personal money-saving anecdote from this week. A medium-sized dead branch fell from a tall tree and ripped the driver’s-side mirror off my old Honda. My local repair shop said it would cost around $600 to replace it. That is a significant percentage of what the old clunker is worth. Ouch.

They kindly noted that most of that cost was for ordering a replacement mirror assembly from Honda, which would run over $400 and take several days to arrive. I asked if I could get a mirror from a junkyard instead, to save money. The repair guy said they would be willing to install a part I brought in, but suggested eBay or Amazon instead.

Back 20 years ago, before online commerce was so established, my local repair shop would routinely save us money by getting used parts from some sort of junkyard network.
So, I started looking into that route. First, junkyards are not junkyards anymore; they are “salvage yards.” Second, it turns out that removing a side mirror from a Honda is not a simple matter. You have to remove the whole inside plastic door panel to get at the mirror mounting screws, and removing that panel has some complications. Also, I could not find a clear online resource for locating parts at regional salvage yards. It looks like you have to drive to a salvage yard, and perhaps have them search some sort of database to find a comparable vehicle somewhere that might have the part you want.


All this seemed like a lot of hassle, so I went to eBay and found a promising-looking new replacement part there for about $56, including shipping. It would take about a week to arrive (probably direct-shipped from China). On Amazon, I found essentially the same part for about $63, and it would arrive the next day. For the small difference in price, I went the Amazon route, partly for the no-hassle returns if the part turned out to be defective and partly because I get 5% back with my Amazon credit card.

I just got the car back from the repair shop with the replacement mirror, and it works fine. The total cost, with labor, was about $230, which is much better than the original $600+ estimate.


I’m not sure how broadly to generalize this experience. Some further observations:

(1) For a really critical car part, I’d have to consider carefully whether a Chinese knock-off would perform appreciably worse than a name-brand part, although I believe many repair shops often use parts that are not strictly original.

(2) Commonly replaced parts like oil and air filters are typically cheaper to buy online than from your local AutoZone or other local merchant. I like supporting local shops, so sometimes I eat the few extra dollars and shopping time, and buy from brick-and-mortar stores.

(3) Some repair shops make significant money on their markup on parts, so they might not be happy about you bringing in your own. They also might decline to warranty the operation of that part. And many big-box franchise repair shops may simply refuse to install customer-supplied parts.

(4) For a newish car still under warranty, the manufacturer warranty might be affected by using non-original parts.

(5) Back to junk/salvage yards: some car parts, so-called hard parts, are expected to last the life of the car. Think of the mounting brackets for engine parts. Typically, no spares of these are manufactured, so if one of them gets dinged up in an accident, your only option may be used parts taken from a junker.

Illusions of Illusions of Reasoning

Just since Scott’s post on Tuesday of this week, a new response has been launched, titled “The Illusion of the Illusion of the Illusion of Thinking.”

Abstract (emphasis added by me): A recent paper by Shojaee et al. (2025), The Illusion of Thinking, presented evidence of an “accuracy collapse” in Large Reasoning Models (LRMs), suggesting fundamental limitations in their reasoning capabilities when faced with planning puzzles of increasing complexity. A compelling critique by Opus and Lawsen (2025), The Illusion of the Illusion of Thinking, argued these findings are not evidence of reasoning failure but rather artifacts of flawed experimental design, such as token limits and the use of unsolvable problems. This paper provides a tertiary analysis, arguing that while Opus and Lawsen correctly identify critical methodological flaws that invalidate the most severe claims of the original paper, their own counter-evidence and conclusions may oversimplify the nature of model limitations. By shifting the evaluation from sequential execution to algorithmic generation, their work illuminates a different, albeit important, capability. We conclude that the original “collapse” was indeed an illusion created by experimental constraints, but that Shojaee et al.’s underlying observations hint at a more subtle, yet real, challenge for LRMs: a brittleness in sustained, high-fidelity, step-by-step execution. The true illusion is the belief that any single evaluation paradigm can definitively distinguish between reasoning, knowledge retrieval, and pattern execution.

As I am writing a new manuscript about hallucination in web-enabled models, this is close to what I am working on. Conjuring up fake academic references might point to a lack of true reasoning ability.

Do Pro and Dantas believe that LLMs can reason? What they are saying, at least, is that evaluating AI reasoning is difficult. In their words, the whole back-and-forth “highlights a key challenge in evaluation: distinguishing true, generalizable reasoning from sophisticated pattern matching of familiar problems…”

The fact that the first sentence of the paper contains the bigram “true reasoning” is interesting in itself. No one doubts that LLMs are reasoning anymore, at least within their own sandboxes. Hence there have been Champagne jokes going around of this sort:

If you’d like to read a response coming from o3 itself, Tyler pointed me to this:

Did Apple’s Recent “Illusion of Thinking” Study Expose Fatal Shortcomings in Using LLM’s for Artificial General Intelligence?

Researchers at Apple last week published a paper with the provocative title, “The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity.” This paper has generated an uproar in the AI world. Having “The Illusion of Thinking” right there in the title is pretty in-your-face.

Traditional Large Language Model (LLM) artificial intelligence programs like ChatGPT train on massive amounts of human-generated text to be able to mimic human outputs when given prompts. A recent trend (mainly starting in 2024) has been the incorporation of more formal reasoning capabilities into these models. The enhanced models are termed Large Reasoning Models (LRMs). Now some leading models like OpenAI’s GPT, Anthropic’s Claude, and the Chinese DeepSeek exist both in regular LLM form and as LRM versions.

The authors applied both the regular (LLM) and “thinking” (LRM) versions of Claude 3.7 Sonnet and DeepSeek to a number of mathematical-type puzzles. OpenAI’s o-series models were used to a lesser extent. An advantage of these puzzles is that researchers can dial in more or less complexity while keeping the basic form of the puzzle.

They found, among other things, that the LRMs did well up to a certain point, then suffered “complete collapse” as complexity was increased. Also, at low complexities, LLMs actually outperformed LRMs. And (perhaps the most vivid evidence of a lack of actual understanding on the part of these programs), when they were explicitly offered an efficient direct solution algorithm in the prompt, the programs did not take advantage of it, but instead just kept grinding away in their usual fashion.

As might be expected, AI skeptics were all over the blogosphere saying, in effect: I told you so; LLMs are just massive exercises in pattern matching and cannot extrapolate outside of their training set. This has massive implications for what we can expect in the near or intermediate future. Among other things, optimism about AI progress is largely what is fueling the stock market, and also capital investment in this area: companies like Meta and Google are spending ginormous sums trying to develop artificial “general” intelligence, paying for ginormous amounts of compute power, with those dollars flowing to firms like Microsoft and Amazon building out data centers and buying chips from Nvidia. If the AGI emperor has no clothes, all this spending might come to a screeching halt.

Ars Technica published a fairly balanced account of the controversy, concluding that, “Even elaborate pattern-matching machines can be useful in performing labor-saving tasks for the people that use them… especially for coding and brainstorming and writing.”

Comments on this article included ones like:

LLMs do not even know what the task is, all it knows is statistical relationships between words.   I feel like I am going insane. An entire industry’s worth of engineers and scientists are desperate to convince themselves a fancy Markov chain trained on all known human texts is actually thinking through problems and not just rolling the dice on what words it can link together.

And

if we equate combinatorial play and pattern matching with genuinely “generative/general” intelligence, then we’re missing a key fact here. What’s missing from all the LLM hubris and enthusiasm is a reflexive consciousness of the limits of language, of the aspects of experience that exceed its reach and are also, paradoxically, the source of its actual innovations. [This is profound, he means that mere words, even billions of them, cannot capture some key aspects of human experience]

However, the AI bulls have mounted various comebacks to the Apple paper. The most effective I know of so far was published by Alex Lawsen, a researcher at Open Philanthropy. Lawsen’s rebuttal, titled “The Illusion of the Illusion of Thinking,” was summarized by Marcus Mendes. To summarize the summary, Lawsen claimed that the models did not in general “collapse” in some crazy way. Rather, the models in many cases recognized that they would not be able to solve the puzzles given the constraints imposed by the Apple researchers. Therefore, they (rather intelligently) did not waste compute power grinding away toward a necessarily incomplete solution, but just stopped. Lawsen further showed that the way Apple ran the LRMs did not allow them to perform as well as they could. When he made a modest, reasonable change in the operation of the LRMs:

Models like Claude, Gemini, and OpenAI’s o3 had no trouble producing algorithmically correct solutions for 15-disk Hanoi problems, far beyond the complexity where Apple reported zero success.

Lawsen’s conclusion: When you remove artificial output constraints, LRMs seem perfectly capable of reasoning about high-complexity tasks. At least in terms of algorithm generation.
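The distinction between generating an algorithm and transcribing its output can be made concrete. The minimal Tower of Hanoi solution for n disks takes 2^n − 1 moves, so a 15-disk solution is 32,767 moves long, while the recursive procedure that produces it fits in a few lines. Here is a minimal sketch (my own illustration, not code from either paper):

```python
def hanoi(n, src="A", aux="B", dst="C", moves=None):
    """Recursively generate the minimal move list for n disks."""
    if moves is None:
        moves = []
    if n > 0:
        hanoi(n - 1, src, dst, aux, moves)  # clear n-1 disks onto the spare peg
        moves.append((src, dst))            # move the largest disk to the target
        hanoi(n - 1, aux, src, dst, moves)  # restack the n-1 disks on top of it
    return moves

print(len(hanoi(15)))  # 2**15 - 1 = 32767 moves
```

Asking a model to emit this procedure is a very different test from asking it to transcribe all 32,767 moves within a fixed token budget, which is the crux of the token-limit critique.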

And so, the great debate over the prospects of artificial general intelligence will continue.

The Comeback of Gold as Money

According to Merriam-Webster, “money” is: “something generally accepted as a medium of exchange, a measure of value, or a means of payment.” Money, in its various forms, also serves as a store of value. Gold has maintained the store-of-value function all through the past centuries, including our own times; as an investment, gold has done well in the past couple of decades. I plan to write more later on the investment aspect, but here I focus on the use of physical gold as a means of payment or exchange, or as backing for a means of exchange.

Gold, typically in the form of standardized coins, served the means-of-exchange function for thousands of years. Starting in the Renaissance, however, banks began issuing paper certificates that were exchangeable for gold. For daily transactions, the public found it more convenient to handle these bank notes than the gold pieces themselves, and so the notes were used instead of gold as money.

In the late nineteenth and early twentieth centuries, leading paper currencies like the British pound and the U.S. dollar were theoretically backed by gold; one could turn in a dollar and convert it to the precious metal. Most countries dropped the convertibility to gold during the Great Depression of the 1930s, so their currencies became entirely “fiat” money, not tied to any physical commodity. For the U.S. dollar, there was limited convertibility to gold after World War II as part of the Bretton Woods system of international currencies, but even that convertibility ended in 1971. In fact, it was illegal for U.S. citizens to own much in the way of physical gold from FDR’s (infamous?) executive order in 1933 until Gerald Ford’s repeal of that order in 1974.

So gold has been essentially extinct as active money for nearly a hundred years. The elite technocrats who manage national financial affairs have been only too happy to dance on its grave. Keynes famously denounced the gold standard as a “barbarous relic”, standing in the way of purposeful management of national money matters.

However, gold seems to be making something of a comeback, on several fronts. Most notably, several U.S. states have promoted the use of gold in transactions. Deep-red Utah has led the way.  In 2011, Utah passed the Legal Tender Act, recognizing gold and silver coins issued by the federal government as legal tender within the state. This legislation allows individuals to transact in gold and silver coins without paying state capital gains tax.  The Utah House and Senate passed bills in 2025 to authorize the state treasurer to establish a precious metals-backed electronic payment platform, which would enable state vendors to opt for payments in physical gold and silver. The Utah governor vetoed this bill, though, claiming it was “operationally impractical.” 

Meanwhile, in Texas:

The new legislation, House Bill 1056, aims to give Texans the ability, likely through a mobile app or debit card system, to use gold and silver they hold in the state’s bullion depository to purchase groceries or other standard items.

The bill would also recognize gold and silver as legal tender in Texas, with the caveat that the state’s recognition must also align with currency laws laid out in the U.S. Constitution.

“In short, this bill makes gold and silver functional money in Texas,” Rep. Mark Dorazio (R-San Antonio), the main driving force behind the effort, said during one 2024 presentation. “It has to be functional, it has to be practical and it has to be usable.”

Arkansas and Florida have also passed laws allowing the use of gold and silver as legal tender. A potential problem is that under current IRS law, gold and silver are generally classified as collectibles and subject to potential capital gains taxes when transactions occur. Texas legislator Dorazio has argued that liability would go away if the metals are classified as functional money, although he’s also acknowledged the tax issue “might end up being decided by the courts.”

But as Europeans found back in the day, carrying around actual clinking gold coins for purchasing and making change is much more of a hassle than paper transactions. And so, various convenient payment or exchange methods, backed by physical gold, have recently arisen.

Since it is relatively easy and lucrative to spawn a new cryptocurrency (which is why there are thousands of them), it is not surprising that there are now several coins supposedly backed by bullion. These include Paxos Gold (PAXG) and Tether Gold (XAUT). The gold behind Paxos is stored in the worldwide vaults of Brink’s and is regularly audited by a credible third party. Tether’s gold supposedly resides somewhere in Switzerland; the firm itself is incorporated in the British Virgin Islands. Tether in general does not conduct regular audits, and its official statements dance around that fact. These crypto coins, like bullion itself or funds like GLD that hold gold, are in practice probably mainly an investment vehicle (store of value), rather than an active medium of exchange.

However, getting down to the consumer level of payment convenience, we now have a gold-backed credit card (Glint) and debit card (VeraCash Mastercard). Both of these hold their gold in Swiss vaults. The funds you place with these companies have gold allocated to them, so these are a (seemingly cost-effective) means to own gold. If you get nervous, you can actually (subject to various rules) redeem your funds for actual shiny yellow metal.

Papers about Economists Using LLMs

1. The most recent (published in 2025) is this piece about doing data analytics that would have been too difficult or costly before. Link and title: Deep Learning for Economists

Considering how much of frontier economics revolves around getting new data, this could be important. On the other hand, people have been doing computer-aided data mining for a while. So it’s more of a progression than a revolution, in my expectation.

2. Using LLMs to actually generate original data and/or test hypotheses like experimenters: Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus? and Automated Social Science: Language Models as Scientist and Subjects

3. Generative AI for Economic Research: Use Cases and Implications for Economists

Korinek has a new supplemental update as current as December 2024: LLMs Learn to Collaborate and Reason: December 2024 Update to “Generative AI for Economic Research: Use Cases and Implications for Economists,” Published in the Journal of Economic Literature 61 (4)

4. For being comprehensive and early: How to Learn and Teach Economics with Large Language Models, Including GPT

5. For giving people proof of a phenomenon that many people had noticed and wanted to discuss: ChatGPT Hallucinates Non-existent Citations: Evidence from Economics

Alert: We will soon have an update for current web-enabled models! It would seem that hallucination rates are going down but the problem is not going away.

6. This was published back in 2023. “ChatGPT ranked in the 91st percentile for Microeconomics and the 99th percentile for Macroeconomics when compared to students who take the TUCE exam at the end of their principles course.” (note the “compared to”): ChatGPT has Aced the Test of Understanding in College Economics: Now What?

References

Buchanan, J., Hill, S., & Shapoval, O. (2024). ChatGPT Hallucinates Non-existent Citations: Evidence from Economics. The American Economist, 69(1), 80–87. https://doi.org/10.1177/05694345231218454

Cowen, Tyler and Tabarrok, Alexander T., How to Learn and Teach Economics with Large Language Models, Including GPT (March 17, 2023). GMU Working Paper in Economics No. 23-18, Available at SSRN: https://ssrn.com/abstract=4391863 or http://dx.doi.org/10.2139/ssrn.4391863

Dell, M. (2025). Deep Learning for Economists. Journal of Economic Literature, 63(1), 5–58. https://doi.org/10.1257/jel.20241733

Geerling, W., Mateer, G. D., Wooten, J., & Damodaran, N. (2023). ChatGPT has Aced the Test of Understanding in College Economics: Now What? The American Economist, 68(2), 233–245. https://doi.org/10.1177/05694345231169654

Horton, J. J. (2023). Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus? arXiv Preprint arXiv:2301.07543.

Korinek, A. (2023). Generative AI for Economic Research: Use Cases and Implications for Economists. Journal of Economic Literature, 61(4), 1281–1317. https://doi.org/10.1257/jel.20231736

Manning, B. S., Zhu, K., & Horton, J. J. (2024). Automated Social Science: Language Models as Scientist and Subjects (Working Paper No. 32381). National Bureau of Economic Research. https://doi.org/10.3386/w32381

“Final Notice” Traffic Ticket Smishing Scam

Yesterday I got a scary-sounding text message, claiming that I have an outstanding traffic ticket in a certain state, and threatening me with the following if I did not pay within two days:

We will take the following actions:

1. Report to the DMV Breach Database

2. Suspend your vehicle registration starting June 2

3. Suspension of driving privileges for 30 days…

4. You may be sued and your credit score will suffer

Please pay immediately before execution to avoid license suspension and further legal disputes.

Oh, my!

A link (which I did NOT click on) was provided for “payment”.

I also got an almost (not quite) identical text a few days earlier. I was almost sure these were scams, but it was comforting to confirm that by going to the web and reading that, yes, these sorts of texts are the flavor of the month in remote rip-offs; as a rule, states do not send out threatening texts with payment links in them.

These texts are examples of “smishing”, which is phishing (to collect identity or bank/credit card information) via SMS text messaging. It must be a lucrative practice. According to spam blocker Robokiller, Americans received 19.2 billion spam robo texts in May 2025. That’s nearly 63 spam texts for every person in the U.S.

Besides these traffic ticket scams, I often get texts asking me to click to track delivery of some package, or to prevent misuse of my credit card, etc. I have been spared text messages from the Nigerian prince who needs my help to claim his rightful inheritance; I did get an email from him some years back.

The FTC keeps a database called Sentinel on fraud complaints made to the FTC and to law enforcement agencies. People reported losing a total of $12 billion to fraud in 2024, an increase of $2 billion over the previous year. That is a LOT of money (and a commentary on how wealthy Americans are, if that much can get skimmed off with little net impact on society). The biggest single category for dollar loss was investment scams; the number of victims was smaller than for other categories, but the median loss per victim ($9,200) was quite high. Other areas with high median losses per victim were Business and Job Opportunities ($2,250) and Mortgage Foreclosure Relief and Debt Management ($1,500).

Imposter scams like the texts I have gotten (sender pretending to be from state DMV, post office, bank, credit card company, etc.) were by far the largest category by number reported (845,806 in 2024). Of those imposter reports, 22% involved actual losses ($800 median loss), totaling a hefty $2,952 million. That is a juicy enough haul to keep those robo frauds coming.

How not to get scammed: Be suspicious of every email or text, especially ones that prey on emotions like fear, greed, or curiosity and try to draw you into making payments or giving up personal information. If a message purports to come from some known entity like Bank of America or your state DMV, contact that entity directly to check it out. If you don’t click on anything (or reply in any way to the text, even with a Y or N), it can’t hurt you.

I’m not sure how much they can do, considering the bad guys tend to hijack legitimate phone numbers for their dirty work, but you can mark these texts as spam to help your phone carrier improve its spam detection algorithms. Also, reporting scam texts to the U.S. Federal Trade Commission and/or the FBI’s Internet Crime Complaint Center can help build their data sets, and perhaps lead to law enforcement actions.

Later add: According to EZPass, here is how to report text scams:

You can report smishing messages to your cell carrier by following this FCC guidance.  This service is provided by most cell carriers.

  1. Hold down the spam TXT/SMS message with your finger
  2. Select the “Forward” option
  3. Enter 7726 as the recipient and press “Send”

Additionally, to report the message to the FBI, visit the FBI’s Internet Crime Complaint Center (ic3.gov) and select ‘File a Complaint’ to do so.  When completing the complaint, include the phone number where the smishing text originated, and the website link listed within the text.

We’re All Magical

The widespread availability and easy user interface of artificial intelligence (AI) has put great power at everyone’s fingertips. We can do magical things.

Before the internet existed, we would use books to help us better interpret the world. Communication among humans is hard. Expressing logic and even phenomena is complex. This is why social skills matter. Among other things, they help us to communicate. The most obvious example of a communication barrier is language. I remember having a pocket-sized English-Spanish dictionary that I used to help me memorize or query Spanish words. The book helped me communicate with others and to translate ideas from one language to another.

Math books do something similar but the translation is English-Math. We can get broader and say that all textbooks are translation devices. They define field-specific terms and ideas to help a person translate among topic domains, usually with a base-language that reaches a targeted generalizability. We can get extreme and say that all books are translators, communicating the content of one person’s head to another.

But sometimes the field-to-general language translation doesn’t work because readers don’t have an adequate grasp of either language. It isn’t necessarily that readers are generally illiterate. It may be that the level of generality and degree of focus of the translation isn’t right for the reader. Anyone who has ever tried to teach anything with math has encountered this.  Students say that the book doesn’t translate clearly, and the communication fails. The book gets the reader’s numeracy or understood definitions wrong. Therefore, there is diversity among readers about how ‘good’ a textbook is.

Search engines are so useful because you can enter some keywords and find your destination, even if you don’t know the proper nouns or domain-specific terms. People used to memorize URLs and that’s becoming less common. Wikipedia is so great because if you want to learn about an idea, they usually explain it in 5 different ways. They tell the story of who created something and who they interacted with. They describe the motivation, the math, the logic, the developments, and usually include examples. Wikipedia translates domain-specific ideas to multiple general languages of different cognitive aptitudes or interests. It scatters links along the way to help users level-up their domain-specific understanding so that they can contextualize and translate the part that they care about.

Historical translation technology was largely for the audience. More recently, translation technology has empowered the transmitters.


EconTalk Extra on Daisy Christodoulou

I wrote an Extra for the How Better Feedback Can Revolutionize Education (with Daisy Christodoulou) episode.

Can Students Get Better Feedback? is the title of my Extra.

Read the whole thing at the link (ungated), but here are two quotes:

For now, the question is still what kind of feedback teachers can give that really benefits students. Daisy Christodoulou, the guest on this episode, offers a sobering critique of how educators tend to give feedback. One of her points is that much of the written feedback teachers give is vague and doesn’t actually help students improve. She shares an example from Dylan Wiliam: a middle school student was told he needed to “make their scientific inquiries more systematic.” When asked what he would do differently next time, the student replied, “I don’t know. If I’d known how to be more systematic, I would have been so the first time.”

Christodoulou also turns to the question many of us are now grappling with: can AI help scale meaningful feedback?

What is truth? The Bayesian Dawid-Skene Method

I just learned about the Bayesian Dawid-Skene method. This is a summary.

Some things are confidently measurable. Other things are harder to perceive or interpret. An expert researcher might think that they know an answer. But there are two big challenges: 1) The researcher is human and can err & 2) the researcher is finite with limited time and resources. Even artificial intelligence has imperfect perception and reason. What do we do?

A perfectly sensible answer is to ask someone else what they think. They might make a mistake too. But if their answer is formed independently, then we can hopefully get closer to the truth with enough iterations. Of course, nothing is perfectly independent. We all share the same globe, and often the same culture or language. So, we might end up with a biased answer. We can try to correct for bias once we have an answer, so accepting the bias in the first place is a good place to start.
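The intuition that independent, better-than-chance opinions compound toward the truth is the Condorcet jury theorem, and a quick simulation makes the effect visible (an illustrative sketch of my own; the parameter values are arbitrary):

```python
import random

def majority_accuracy(n_voters, p=0.6, trials=2000, seed=1):
    """Estimate how often a simple majority of independent voters,
    each correct with probability p, picks the right answer."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        correct_votes = sum(rng.random() < p for _ in range(n_voters))
        wins += correct_votes > n_voters / 2
    return wins / trials

for n in (1, 5, 25, 101):
    print(n, majority_accuracy(n))  # accuracy climbs toward 1 as n grows
```

With each voter only 60% accurate, a panel of 101 independent voters is right almost every time; the whole game is the word "independent," which is exactly the assumption the paragraph above flags as fragile.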

The Bayesian Dawid-Skene (henceforth DS) method helps to aggregate opinions and find the truth of a matter given very weak assumptions ex ante. Here I’ll provide an example of how the method works.

Let’s start with a very simple question, one that requires very little thought and logic. It may require some context and social awareness, but that’s hard to avoid. Say that we have a list of n=100 images. Each image has one of two words written on it, “pass” or “fail.” If typed, there would be little room for ambiguity; typed text is relatively clear even when the image is substantially corrupted. But these words are handwritten, maybe with a variety of pens, by a variety of hands, and the images were stored under a variety of conditions. Therefore, we might be a little less trusting of what a computer would spit out using optical character recognition (OCR). Given our own potential for errors and limited time, we might lean on some other people to help interpret the scripts.
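To make the aggregation step concrete, here is a minimal sketch of the Dawid-Skene EM procedure for this binary case, assuming the votes sit in a NumPy array with one row per image and one column per annotator (the variable names and the majority-vote initialization are my own choices, not a canonical implementation):

```python
import numpy as np

def dawid_skene(labels, n_iter=50):
    """EM estimate of the true binary labels (0="fail", 1="pass")
    from a (n_items, n_annotators) array of noisy 0/1 votes.
    Returns the posterior probability that each item's true label is 1."""
    n_items, n_annot = labels.shape
    post = labels.mean(axis=1)  # initialize P(true=1) with the majority vote
    for _ in range(n_iter):
        # M-step: re-estimate the class prior and each annotator's
        # confusion matrix conf[j, t, l] = P(annotator j says l | truth is t).
        prior1 = post.mean()
        w = np.stack([1 - post, post], axis=1)  # soft counts per true class
        conf = np.zeros((n_annot, 2, 2))
        for j in range(n_annot):
            for t in range(2):
                denom = w[:, t].sum()
                for l in range(2):
                    conf[j, t, l] = (w[:, t] * (labels[:, j] == l)).sum() / denom
        # E-step: recompute each item's posterior given the votes.
        ll = np.tile(np.log([1 - prior1, prior1]), (n_items, 1))
        for j in range(n_annot):
            ll += np.log(conf[j][:, labels[:, j]].T + 1e-12)
        post = 1 / (1 + np.exp(ll[:, 0] - ll[:, 1]))
    return post
```

The appeal of the method is that annotators who agree with the emerging consensus earn confusion matrices near the identity and get weighted up, while a careless or random annotator is automatically discounted, without anyone telling the model in advance who is reliable.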
