What is truth? The Bayesian Dawid-Skene Method

I just learned about the Bayesian Dawid-Skene method. This is a summary.

Some things are confidently measurable. Other things are harder to perceive or interpret. An expert researcher might think that they know an answer, but there are two big challenges: 1) the researcher is human and can err, and 2) the researcher is finite, with limited time and resources. Even artificial intelligence has imperfect perception and reasoning. What do we do?

A perfectly sensible answer is to ask someone else what they think. They might make a mistake too. But if their answer is formed independently, then we can hopefully get closer to the truth with enough independent answers. Of course, nothing is perfectly independent. We all share the same globe, and often the same culture or language. So we might end up with a biased answer. We can try to correct for bias once we have an answer, so accepting some bias at the outset is a reasonable place to start.

The Bayesian Dawid-Skene (henceforth DS) method helps to aggregate opinions and find the truth of a matter given very weak assumptions ex ante. Here I’ll provide an example of how the method works.

Let’s start with a very simple question, one that requires very little thought and logic. It may require some context and social awareness, but that’s hard to avoid. Say that we have a list of n=100 images. Each image has one of two words written on it: “pass” or “fail”. If the words were typed, there would be little room for ambiguity; typed text is relatively clear even when the image is substantially corrupted. But these words are handwritten, maybe with a variety of pens, by a variety of hands, and were stored under a variety of conditions. Therefore, we might be a little less trusting of what a computer using optical character recognition (OCR) would spit out. Given our own potential for errors and limited time, we might lean on some other people to help interpret the handwriting.
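To make the mechanics concrete, here is a minimal sketch in Python of the Dawid-Skene machinery applied to this pass/fail setup. It uses the classic EM updates with a Dirichlet-style pseudo-count standing in for a fully Bayesian treatment (which would more typically be fit with MCMC, for example in Stan); every name and number below is illustrative rather than taken from any particular dataset.

```python
# Minimal Dawid-Skene sketch for the pass/fail example above.
# EM with a Dirichlet-style pseudo-count; illustrative only.
import numpy as np

def dawid_skene(labels, n_classes=2, n_iter=50, prior=1.0):
    """labels: (n_items, n_raters) array with entries in {0,...,n_classes-1},
    or -1 where a rater did not label an item (every item needs at least one label)."""
    n_items, n_raters = labels.shape

    # Initialize the posterior over true labels with each item's vote shares.
    post = np.zeros((n_items, n_classes))
    for i in range(n_items):
        for j in range(n_raters):
            if labels[i, j] >= 0:
                post[i, labels[i, j]] += 1.0
    post /= post.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # M-step: class prevalence and each rater's confusion matrix,
        # smoothed by a symmetric pseudo-count `prior`.
        pi = post.sum(axis=0) + prior
        pi /= pi.sum()

        theta = np.full((n_raters, n_classes, n_classes), prior)
        for i in range(n_items):
            for j in range(n_raters):
                if labels[i, j] >= 0:
                    theta[j, :, labels[i, j]] += post[i, :]
        theta /= theta.sum(axis=2, keepdims=True)

        # E-step: recompute the posterior over each item's true label.
        log_post = np.tile(np.log(pi), (n_items, 1))
        for i in range(n_items):
            for j in range(n_raters):
                if labels[i, j] >= 0:
                    log_post[i, :] += np.log(theta[j, :, labels[i, j]])
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)

    return post, pi, theta

# Hypothetical usage: 5 raters label 100 images as 0 = "fail", 1 = "pass",
# each rater independently erring about 20% of the time.
rng = np.random.default_rng(0)
truth = rng.integers(0, 2, size=100)
noise = rng.random((100, 5)) < 0.2
ratings = np.where(noise, 1 - truth[:, None], truth[:, None])

posterior, prevalence, confusion = dawid_skene(ratings)
print((posterior.argmax(axis=1) == truth).mean())  # share of true labels recovered
```

The point of the exercise: the model learns each rater’s error rates and the overall share of “pass” images jointly, without ever seeing the ground truth, and the resulting posterior is usually a bit better than a simple majority vote when raters differ in reliability.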


Discuss AI Doom with Joy on May 5

If you like to read and discuss with smart people, then you can make a free account in the Liberty Fund Portal. If you listen to this podcast over the weekend, Eliezer Yudkowsky on the Dangers of AI (2023), you will be up to speed for our asynchronous virtual debate room on Monday, May 5.

Russ Roberts sums up the doomer argument using the following metaphor:

The metaphor is primitive. Zinjanthropus man or some primitive form of pre-Homo sapiens sitting around a campfire and human being shows up and says, ‘Hey, I got a lot of stuff I can teach you.’ ‘Oh, yeah. Come on in,’ and pointing out that it’s probable that we are either destroyed directly by murder or maybe just by out-competing all the previous hominids that came before us, and that in general, you wouldn’t want to invite something smarter than you into the campfire.

What do you think of this metaphor? By incorporating AI agents into society, are we inviting a smarter being to our campfire? Is it likely to eventually kill us out of contempt or neglect? That will be what we are discussing over in the Portal this week.

Is your P(Doom) < 0.05? Great – that means you believe that the probability of AI turning us into paperclips is less than 5%. Come one, come all. You can argue against doomers during the May 5-9 week of Doom and then you will love Week Two. On May 12-16, we will make the optimistic case for AI!

See more details on all readings and the final Zoom meeting in my previous post.

Kaggle Wins for Data Sharing

I like to take existing datasets, clean them up, and share them in easier-to-use formats. When I started doing this back in 2022, my strategy was to host the datasets with the Open Science Framework (OSF) and share the links here and on my personal website.

OSF is great for allowing large uploads and complex projects, but not great for discovery. I saw several of my students struggle to navigate its pages to find the appropriate data files, and OSF pages seem to have poor SEO. OSF’s analytics show that my data files there get few views, and most of the views they do get come from people who were already on the OSF site.

This year I decided to upload my new projects like County Demographics data to Kaggle.com in addition to OSF, and so far Kaggle is the clear winner. My datasets are getting more downloads on Kaggle than views on OSF. I’ve noticed that Kaggle pages tend to rank highly on Google and especially on Google Dataset Search. I think Kaggle also gets more internal referrals, since they host popular machine learning competitions.

Kaggle has its own problems of course, like one of its prominent download buttons only downloading the first 10 columns for CSV or XLSX files by default. But it is the best tool I have found so far for getting datasets in the hands of people who will find them useful. Let me know if you’ve found a better one.
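If that download button ever bites you, the official kaggle Python package (or the equivalent `kaggle datasets download` command) pulls the complete files as a zip. Here is a quick sketch, with a placeholder dataset slug rather than the real path to one of my uploads:

```python
# Sketch of pulling a full dataset with the official kaggle Python package,
# which sidesteps the web download button. The dataset slug is a placeholder.
# Requires an API token saved in ~/.kaggle/kaggle.json.
from kaggle.api.kaggle_api_extended import KaggleApi

api = KaggleApi()
api.authenticate()
api.dataset_download_files(
    "someuser/county-demographics",  # placeholder owner/slug
    path="data",                     # download destination
    unzip=True,                      # unpack the zip that Kaggle serves
)
```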

Old Fashioned Function Keys

Your Function Keys Are Cooler Than You Think
by someone who used to press F1 by mistake

Ever notice the F keys on your keyboard? F1 through F12. Sitting at the top like unused shelf space. If you’re at a computer now, take a glance. I used to think they did nothing, or at least nothing for me. Maybe experts used them. Experts who know what BIOS and DOS are. But for me, they were just little space fillers with no purpose. I frequently pressed F1 by accident instead of Escape. A help window would pop up, wasting half a second of my life until I closed it.

But the F keys (function keys) are sneaky useful. They can save you serious time. No clicking. No dragging. No fumbling with touchpad mis-clicks.

When using a web browser, F5 refreshes the web page. Windows File Explorer does the same for folders: F5 refreshes the view so recently edited files show up. Fast and easy. F11 changes your web browser view to full screen. Great for long reads or historical documents. F12 opens the browser’s developer tools and shows the guts of a webpage. That’s perfect if you web scrape or need to know what things are called behind the scenes. Ctrl + F4 closes a tab. Alt + F4 shuts the whole application instance down. That last one works for almost all applications.

Excel? F4 saves so much of your life. It toggles absolute cell, row, and column references. Have you ever watched someone try to click on the right spot with their touchpad and manually press the ‘$’ sign… twice? I can feel myself slowly creeping toward death as my life wastes away. Whereas pressing F4 lets you get on with your life. F12 in most Microsoft Office applications is ‘Save As’. No need to find the floppy disk image on that small laptop screen. PowerPoint has its own tricks: F5 begins the presentation. Shift + F5 starts it from the current slide. Not bad. And don’t forget F7! That’s the spellcheck hotkey. But now it’s been expanded to include grammar, clarity, concision, and inclusivity.


Join Joy to discuss Artificial Intelligence in May 2025

Podcasts are emerging as one of the key media for getting timely expert opinions and news about artificial intelligence. For example, EconTalk (Russ Roberts) has featured some of the most famous voices in AI discourse:

EconTalk: Eliezer Yudkowsky on the Dangers of AI (2023)

EconTalk: Marc Andreessen on Why AI Will Save the World 

EconTalk: Reid Hoffman on Why AI Is Good for Humans

If you would like to engage in a discussion about these topics in May, please sign up for the session I am leading. It is free, but you do need to sign up for the Liberty Fund Portal.

The event consists of two weeks of asynchronous, discussion-board-style conversation with other interested listeners and readers. Lastly, there is a Zoom meeting to bring everyone together on May 21. You don’t have to participate in all three parts.

Further description for those who are interested:

Timeless: Artificial Intelligence: Doom or Bloom?

with Joy Buchanan

Time: May 5-9, 2025 and May 12-16, 2025

How will humans succeed (or survive) in the Age of AI? 

Russ Roberts brought the world’s leading thinkers about artificial intelligence to the EconTalk audience and was early to the trend. He hosted Nick Bostrom on Superintelligence in 2014, nearly a decade before the world was shocked into thinking harder about AI after meeting ChatGPT.

We will discuss the future of humanity by revisiting or discovering some of Roberts’s best EconTalk podcasts on this topic and reading complementary texts. Participants can join in for part or all of the series.

Week 1: May 5-9, 2025

An asynchronous discussion, with an emphasis on possible negative outcomes from AI, such as unemployment, social disengagement, and existential risk. Participants will be invited to suggest special topics for a separate session that will be held on Zoom on May 21, 2025, 2:00-3:30 pm EDT. 

Required Readings: EconTalk: Eliezer Yudkowsky on the Dangers of AI (2023)

EconTalk: Erik Hoel on the Threat to Humanity from AI (2023) with an EconTalk Extra Who’s Afraid of Artificial Intelligence? by Joy Buchanan

“Trurl’s Electronic Bard” (1965) by Stanisław Lem. 

In this prescient short story, a scientist builds a poetry-writing machine. Sound familiar? (If anyone participated in the Life and Fate reading club with Russ and Tyler, there are parallels between Lem’s work and Vasily Grossman’s “Life and Fate” (1959), as both emerged from Eastern European intellectual traditions during the Cold War.)

Optional Readings: “Technological Singularity” by Vernor Vinge. Field Robotics Center, Carnegie Mellon U., 1993.

“‘I am Bing, and I Am Evil’: Microsoft’s new AI really does herald a global threat” by Erik Hoel. The Intrinsic Perspective Substack, February 16, 2023.

“Situational Awareness” (2024) by Leopold Aschenbrenner

Week 2: May 12-16, 2025

An asynchronous discussion, emphasizing the promise of AI as the next technological breakthrough that will make us richer.

Required Readings: EconTalk: Marc Andreessen on Why AI Will Save the World

EconTalk: Reid Hoffman on Why AI Is Good for Humans

Optional Readings: EconTalk: Tyler Cowen on the Risks and Impact of Artificial Intelligence (2023)

“ChatGPT Hallucinates Nonexistent Citations: Evidence from Economics” (2024)

Joy Buchanan with Stephen Hill and Olga Shapoval. The American Economist, 69(1), 80-87.

What the Superintelligence can do for us (Joy Buchanan, 2024)

Dwarkesh Podcast “Tyler Cowen – Hayek, Keynes, & Smith on AI, Animal Spirits, Anarchy, & Growth”

Week 3: May 21, 2025, 2:00-3:30 pm EDT (Zoom meeting)
Pre-registration is required, and we ask you to register only if you can be present for the entire session. Readings are available online. We will get to talk in the same Zoom room!

Required Readings: Great Antidote podcast with Katherine Mangu-Ward on AI: Reality, Concerns, and Optimism

Additional readings will be added, based partly on suggestions from participants in the earlier sessions.

Optional Readings: Rediscovering David Hume’s Wisdom in the Age of AI (Joy Buchanan, EconLog, 2024)

“Professor tailored AI tutor to physics course. Engagement doubled.” The Harvard Gazette, 2024.

Please email Joy if you have any trouble signing up for the virtual event.

EconTalk Extra on Erik Hoel

Sometimes a Russ Roberts podcast gets an “Extra” post following up on the topic. I wrote an Extra for the Erik Hoel on the Threat to Humanity from AI episode:

Who’s Afraid of Artificial Intelligence? is the title of my Extra

Hoel’s main argument is that if AI becomes more intelligent than humans, it could pose a serious threat. What if the AI agents start to treat humans the way we currently treat wild deer, not necessarily with malice but without much regard for the welfare of every human individual?

Things that are vastly more intelligent than you are really hard to understand and predict; and the wildlife next door, as much as we might like it, we will also build a parking lot over it at a heartbeat and they’ll never know why. 

County Demographic Data: A Clean Panel 1969-2023

Whenever researchers are conducting studies using state- or county-level data, we usually want some standard demographic variables to serve as controls; things like the total population, average age, and gender and race breakdowns. If the dataset for our main variables of interest doesn’t already have this, we go looking for a new dataset of demographic controls to merge in; but it has always been surprisingly hard to find a clean, easy-to-use dataset for this. For states, I’ve found the University of Kentucky’s National Welfare Database to be the best bet. But what about counties?

I had no good answer, and the best suggestion I got from others was the CDC SEER data. As so often, the government collected this impressively comprehensive dataset, but only releases it in an unusable format, in this case as raw fixed-width txt files with no headers.

I cleaned and reformatted the CDC SEER data into a neat, clearly labeled panel of county demographics.

I posted my code and data files (CSV, XLSX, and DTA) on OSF and my data page as usual. I also posted the data files on Kaggle, which seems to be more user-friendly and turns up better on searches; I welcome suggestions for any other data repositories or file formats you would like to see me post.
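As a hypothetical illustration of the intended use, and with placeholder file and column names rather than the exact ones in my files, merging the demographics into your own county-year dataset is a one-liner in pandas:

```python
# Hypothetical example: merge the county demographics panel into another
# county-year dataset. File and column names are placeholders; check the
# actual files on OSF or Kaggle for the real ones.
import pandas as pd

demo = pd.read_csv("county_demographics_1969_2023.csv")  # placeholder filename
main = pd.read_csv("my_outcome_data.csv")                # your own county-year data

# Assumes both files identify observations by county FIPS code and year.
merged = main.merge(demo, on=["county_fips", "year"], how="left", validate="m:1")
print(merged.head())
```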

HT: Kabir Dasgupta

Triumph of the Data Hoarders

Several major datasets produced by the federal government went offline this week. Some, like the Behavioral Risk Factor Surveillance System and the American Community Survey, are now back online; probably most others will soon join them. But some datasets that the current administration considers too DEI-inflected could stay down indefinitely.

This serves as a reminder of the value of redundancy: keeping datasets on multiple sites as well as in local storage, because you never really know when one site will go down, whether due to ideological changes, mistakes, natural disasters, or key personnel moving on.

External hard drives are an affordable option for anyone who wants to build up their own local data hoard going forward. The Open Science Framework (OSF) site allows you to upload datasets of up to 50 GB to share publicly; that’s how I’ve been sharing cleaned-up versions of the BRFSS, state-level NSDUH, National Health Expenditure Accounts, Statistics of US Business, and more. If you have a dataset that isn’t online anywhere, or one that you’ve cleaned or improved to the point it is better than the versions currently online, I encourage you to post it on OSF.
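The mechanical part of local hoarding is simple. Here is a minimal Python sketch, with a placeholder URL, that saves a copy and records a checksum so you can later verify that your mirror still matches what you downloaded:

```python
# Minimal sketch of mirroring a file locally; the URL is a placeholder.
import hashlib
import os
import requests

url = "https://example.org/some_dataset.csv"   # placeholder, not a real dataset
local_path = "backups/some_dataset.csv"

resp = requests.get(url, timeout=60)
resp.raise_for_status()

os.makedirs("backups", exist_ok=True)
with open(local_path, "wb") as f:
    f.write(resp.content)

# Record a SHA-256 hash so you can check the copy's integrity later.
print(local_path, hashlib.sha256(resp.content).hexdigest())
```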

If you are currently looking for a federal dataset that got taken down, some good places to check are IPUMS, NBER, Archive.org, or my data page. PolicyMap has posted some of the federal datasets that seem particularly likely to stay down; if you know of other pages hosting federal datasets that have been taken down, please share them in the comments.

After the Fall: What Next for Nvidia and AI, In the Light of DeepSeek

Anyone not living under a rock the last two weeks has heard of DeepSeek, the cheap Chinese knock-off of ChatGPT that was supposedly trained using far fewer resources than most American Artificial Intelligence efforts have been using. The bearish narrative flowing from this is that AI users will be able to get along with far fewer of Nvidia’s expensive, powerful chips, and so Nvidia sales and profit margins will sag.

The stock market seems to be agreeing with this story. The Nvidia share price crashed with a mighty crash last Monday, and it has continued to trend downward since then, with plenty of zig-zags.

I am not an expert in this area, but have done a bit of reading. There seems to be an emerging consensus that DeepSeek got to where it got to largely by using what was already developed by ChatGPT and similar prior models. For this and other reasons, the claim for fantastic savings in model training has been largely discounted. DeepSeek did do a nice job making use of limited chip resources, but those advances will be incorporated into everyone else’s models now.

Concerns remain regarding built-in bias and censorship to support the Chinese communist government’s point of view, and regarding the safety of user data kept on servers in China. Even apart from nefarious purposes for collecting user data, DeepSeek has apparently been very sloppy in protecting user information:

Wiz Research has identified a publicly accessible ClickHouse database belonging to DeepSeek, which allows full control over database operations, including the ability to access internal data. The exposure includes over a million lines of log streams containing chat history, secret keys, backend details, and other highly sensitive information.

Shifting focus to Nvidia: my take is that DeepSeek will have little impact on its sales. The bullish narrative is that the more efficient algos developed by DeepSeek will enable more players to enter the AI arena.

The big power users like Meta and Amazon and Google have moved beyond limited chatbots like ChatGPT or DeepSeek. They are aiming beyond “AI” to “AGI” (Artificial General Intelligence), which matches or surpasses human capabilities across a wide range of cognitive tasks. Zuck plans to replace mid-level software engineers at Meta with code-bots before the year is out.

For AGI they will still need gobs of high-end chips, and these companies show no signs of throttling back their efforts. Nvidia remains sold out through the end of 2025. I suspect that when the company reports earnings on Feb 26, it will continue to demonstrate high profits and project high earnings growth.

Its price-to-earnings ratio is higher than its peers’, but that appears to be justified by its earnings growth. For a growth stock, a key metric is price/earnings-growth (PEG), and by that standard, Nvidia looks downright cheap:

Source: Marc Gerstein on Seeking Alpha
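For readers who don’t follow growth-stock metrics: PEG is just the price-to-earnings ratio divided by the expected annual earnings growth rate (expressed as a whole number), so

PEG = (P/E) / (expected annual earnings growth, in %).

To take a hypothetical round number, a stock at 40 times earnings with 80% expected earnings growth has a PEG of 0.5, and values below roughly 1 are conventionally read as cheap relative to growth.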

How the fickle market will react to these realities, I have no idea.

The high volatility in the stock makes for high options premiums. I have been selling puts and covered calls to capture roughly 20% yields, at the expense of missing out on any rise in share price from here.

Disclaimer: Nothing here should be considered as advice to buy or sell any security.

DeepSeek vs. ChatGPT: Has China Suddenly Caught or Surpassed the U.S. in AI?

The biggest single-day loss of market value by any company in stock market history occurred yesterday, as Nvidia plunged 17% to shave $589 billion off the AI chipmaker’s market cap. The cause of the panic was the surprisingly good performance of DeepSeek, a new Chinese AI application similar to ChatGPT.

Those who have tested DeepSeek find it to perform about as well as the best American AI models, with lower consumption of computing resources. It is also much cheaper to use. What really stunned the tech world is that the developers claimed to have trained the model for only about six million dollars, which is way, way less than the billions that a large U.S. firm like OpenAI, Google, or Meta would spend on a leading AI model. All this despite the attempts by the U.S. to deny China the most advanced Nvidia chips. The developers of DeepSeek claim they worked with a modest number of chips, models whose capabilities were deliberately curtailed to meet U.S. export allowances.

One conclusion, drawn by the Nvidia bears, is that this shows you *don’t* need ever more of the most powerful and expensive chips to get good development done. The U.S. AI development model has been to build more, huge, power-hungry data centers and fill them up with the latest Nvidia chips. That has allowed Nvidia to charge huge profit premiums, as Google and other big tech companies slurp up all the chips that Nvidia can produce. If that supply/demand paradigm breaks, Nvidia’s profits could easily drop in half, e.g., from 60+% gross margins to a more normal (but still great) 30% margin.

The Nvidia bulls, on the other hand, claim that more efficient models will lead to even more usage of AI, and thus increase the demand for computing hardware, a cyber instance of Jevons’ Paradox (where the increase in the efficiency of steam engines in burning coal led to more, not less, coal consumption, because it made steam engines more ubiquitous).

I read a bunch of articles to try to sort out hype from fact here. Folks who have tested DeepSeek find it to be as good as ChatGPT, and occasionally better. It can explain its reasoning explicitly, which can be helpful. It is open source, which I think means the code or at least the “weights” have been published. It does seem to be unusually efficient. Westerners have downloaded it onto (powerful) PCs and have run it there successfully, if a bit slowly. This means you can embed it in your own specialized code, or do your AI apart from the prying eyes of ChatGPT or other U.S. AI providers. In contrast, ChatGPT I think can only be run on a powerful remote server.
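As a rough sketch of what “running it on your own PC” looks like in practice, here is how one might load one of the smaller distilled variants with the Hugging Face transformers library. The model name is my assumption about one of the distilled releases, so check the actual repository names before relying on it:

```python
# Sketch of local inference with an open-weights model via Hugging Face
# transformers. The repo name below is assumed, not verified; the full-size
# DeepSeek models are far too large for a PC, so use a small distilled one.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",  # assumed model id
    device_map="auto",   # put the model on a GPU if one is available
)

prompt = "Explain Jevons' Paradox in two sentences."
result = generator(prompt, max_new_tokens=200)
print(result[0]["generated_text"])
```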

Unsurprisingly, in the past two weeks DeepSeek has been the most-downloaded free app, surpassing ChatGPT.

It turns out that being starved of computing power led the Chinese team to think their way to several important innovations that make much better use of computing. See here and here for gentle technical discussions of how they did that. Some of it involved hardware-ish things like improved memory management. Another key factor is a mixture-of-experts design: for any given piece of text, only the parts of the model relevant to it are activated, rather than exercising the entire network every time.
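To give a flavor of the mixture-of-experts idea, and only as a toy illustration rather than anything resembling DeepSeek’s actual architecture, here is the routing trick in a few lines of Python: a small gating step picks a couple of “experts” per input, so most of the model’s parameters sit idle on any given token.

```python
# Toy illustration of mixture-of-experts routing (not DeepSeek's architecture):
# a router picks the top-k experts per input, so only a small fraction of the
# model's parameters do any work on a given token.
import numpy as np

rng = np.random.default_rng(0)
d, n_experts, top_k = 16, 8, 2

experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]  # one weight matrix per expert
router = rng.standard_normal((d, n_experts))                       # gating weights

def moe_layer(x):
    scores = x @ router                      # how relevant each expert looks for this input
    chosen = np.argsort(scores)[-top_k:]     # keep only the top-k experts
    gates = np.exp(scores[chosen])
    gates /= gates.sum()                     # softmax over the chosen experts
    # Only the chosen experts run; the rest are skipped entirely for this token.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, chosen))

token = rng.standard_normal(d)
print(moe_layer(token).shape)                # same-size output, ~1/4 of the expert compute
```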

A number of experts scoff at the claimed six million dollar figure for training, noting that if you include all the costs that were surely involved in the development cycle, it can’t be less than hundreds of millions of dollars. That said, it was still appreciably cheaper than the usual American way. Furthermore, it seems quite likely that making use of answers generated by ChatGPT helped DeepSeek to rapidly emulate ChatGPT’s performance. It is one thing to catch up to ChatGPT; it may be tougher to surpass it. Also, presumably the compute-efficient tricks devised by the DeepSeek team will now be applied in the West, as well. And there is speculation that DeepSeek actually has use of thousands of the advanced Nvidia chips, but they hide that fact since it involved end-running U.S. export restrictions. If so, then their accomplishment would be less amazing.

What happens now? I wish I knew. (I sold some Nvidia stock today, only to buy it back when it started to recover in after-hours trading). DeepSeek has Chinese censorship built into it. If you use DeepSeek, your information gets stored on servers in China, the better to serve the purposes of the government there.

Ironically, before this DeepSeek story broke, I was planning to write a post here this week pondering the business case for AI. For all the breathless hype about how AI will transform everything, it seems little money has been made except by Nvidia. Nvidia has been selling picks and shovels to the gold miners, but the gold miners themselves seem to have little to show for the billions and billions of dollars they are pouring into AI. A problem may be that there is not much of a moat here: if lots of different tech groups can readily cobble together decent AI models, who will pay money to use them? Already, AI is being given away for free in many cases. We shall see…