What is truth? The Bayesian Dawid-Skene Method

I just learned about the Bayesian Dawid-Skene method. This is a summary.

Some things are confidently measurable. Other things are harder to perceive or interpret. An expert researcher might think that they know an answer. But there are two big challenges: 1) the researcher is human and can err, and 2) the researcher is finite with limited time and resources. Even artificial intelligence has imperfect perception and reasoning. What do we do?

A perfectly sensible answer is to ask someone else what they think. They might make a mistake too. But if their answer is formed independently, then we can hopefully get closer to the truth with enough iterations. Of course, nothing is perfectly independent. We all share the same globe, and often the same culture or language. So, we might end up with a biased answer. We can try to correct for bias once we have an answer, so accepting the bias in the first place is a good place to start.

The Bayesian Dawid-Skene (henceforth DS) method helps to aggregate opinions and find the truth of a matter given very weak assumptions ex ante. Here I’ll provide an example of how the method works.

Let’s start with a very simple question, one that requires very little thought and logic. It may require some context and social awareness, but that’s hard to avoid. Say that we have a list of n=100 images. Each image has one of two words written on it, “pass” or “fail”. If the words were typed, there would be little room for ambiguity. Typed language is relatively clear even when the image is substantially corrupted. But these words are handwritten, maybe with a variety of pens, by a variety of hands, and were stored under a variety of conditions. Therefore, we might be a little less trusting of what a computer would spit out by using optical character recognition (OCR). Given our own potential for errors and limited time, we might lean on some other people to help interpret the scripts.
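To make the idea concrete, here is a minimal sketch of a Dawid-Skene style aggregation for the pass/fail setup. It is a simplified expectation-maximization version rather than the full Bayesian method: the add-one smoothing only stands in for a weak prior, and the function name, the 0/1 coding, and the tiny vote table are my own assumptions for illustration.

```python
def dawid_skene(labels, n_iter=50):
    """Simplified two-class Dawid-Skene via EM (a sketch, not the full Bayesian model).

    labels[i][j] is reader j's vote on image i: 1 = "pass", 0 = "fail".
    Returns P(image is "pass") per image, plus each reader's estimated
    accuracy on true-pass images (hit1) and on true-fail images (hit0).
    """
    n_items, n_raters = len(labels), len(labels[0])
    # Start from the share of "pass" votes on each image (a soft majority vote).
    post = [sum(row) / n_raters for row in labels]
    for _ in range(n_iter):
        # M-step: estimate the share of "pass" images and each reader's accuracy.
        w1 = sum(post)
        w0 = n_items - w1
        prior = w1 / n_items
        hit1, hit0 = [], []
        for j in range(n_raters):
            s1 = sum(p for p, row in zip(post, labels) if row[j] == 1)
            s0 = sum(1 - p for p, row in zip(post, labels) if row[j] == 0)
            # Add-one smoothing keeps rates strictly between 0 and 1.
            hit1.append((s1 + 1) / (w1 + 2))
            hit0.append((s0 + 1) / (w0 + 2))
        # E-step: update the probability that each image truly says "pass".
        new_post = []
        for row in labels:
            like1, like0 = prior, 1 - prior
            for j, vote in enumerate(row):
                like1 *= hit1[j] if vote == 1 else 1 - hit1[j]
                like0 *= hit0[j] if vote == 0 else 1 - hit0[j]
            new_post.append(like1 / (like1 + like0))
        post = new_post
    return post, hit1, hit0


# Hypothetical votes from three readers on ten images.
votes = [[1, 1, 1]] * 4 + [[1, 1, 0]] + [[0, 0, 0]] * 4 + [[0, 0, 1]]
posterior, hit1, hit0 = dawid_skene(votes)
print([round(p, 2) for p in posterior])
print([round(h, 2) for h in hit1])
```

The point of the design is that readers are not weighted equally. A reader who often disagrees with the emerging consensus gets a lower estimated accuracy, so their votes move the answer less.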


An Egg-cellent Consumer Surplus Calculation?

There was a recent Planet Money Podcast episode that includes a fun exercise. An NPR employee produces a dozen chicken eggs and wants to sell them at cost to another employee for $5. That’s the setup. How does the employee decide who should receive the eggs? Clearly, the price mechanism won’t work since the price is fixed. A lottery is also not allowed. The egg recipient could engage in arbitrage, reselling the eggs for a higher price. But that’s not very likely and would be socially awkward. The egg producer wants to make someone happy. Who would he make the happiest?

That’s the challenge that the Planet Money team tries to solve.

First, they started with a survey. Rather than asking coworkers to rank a long list of things that includes eggs, the survey adopts a more robust method of pairwise comparisons. Do you prefer toast vs eggs? Eggs vs oatmeal? Toast vs oatmeal? And so on. One problem that they encounter, however, is that there is a lot of diversity among preparation methods. My oatmeal is better than my eggs. But my brother’s oatmeal is not. As it turns out, there is not a standard quality of prepared oatmeal and prepared eggs. So the survey is a flop.

Then they consult an economist. They decide to try to measure “willingness to pay”, which is an economic concept that identifies the maximum that a person could pay for something without becoming worse off. They couldn’t really ask the coworkers what their WTP is. People are social creatures and have many reasons to lie, mislead, signal, and to simply not know. Since someone’s WTP reflects preferences and values, we need a way to solicit the true preference while avoiding lies and most mistakes. Here’s how the economist suggested that they reveal the coworker preferences.

  • Step 1: Tell the coworker these rules.
  • Step 2: Coworker reports their WTP for a single egg in dollars.
  • Step 3: A random price will be chosen by a machine. If the price is above the self-reported WTP, the coworker is not allowed to buy the egg. If the price is below the WTP, then the coworker must buy the egg at the random price.
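The three steps above can be sketched in a few lines. As far as I can tell, these rules match what economists call the Becker-DeGroot-Marschak mechanism. The price grid (whole cents from 0 to 100), the true WTP of 50 cents, and the function name are assumptions made up for this illustration.

```python
def expected_surplus(bid, wtp, prices):
    """Average gain from reporting `bid` when the true value is `wtp`.

    Each price in `prices` is equally likely. The coworker must buy when the
    price is below the reported bid, and gains (wtp - price) from doing so.
    """
    gains = [wtp - p for p in prices if p < bid]
    return sum(gains) / len(prices)


prices = list(range(101))  # hypothetical: any whole-cent price from 0 to 100
true_wtp = 50              # hypothetical: the egg is truly worth 50 cents

for bid in (30, 50, 70):
    print(bid, round(expected_surplus(bid, true_wtp, prices), 2))

best_bid = max(range(101), key=lambda b: expected_surplus(b, true_wtp, prices))
print("best report:", best_bid)
```

Under-reporting skips purchases that would have been good deals. Over-reporting forces purchases at prices above what the egg is worth. Neither beats the honest report.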

The idea is as follows.


95 Days of Trump Spending & Cutting

Generally, decisions to spend federal funds fall under the authority of Congress. But the Trump administration has very publicly made clear that it will try to cut the things that are within its authority (or that it thinks should be within that authority). Truly, the fiscal year with the new Republican unified government won’t begin until October of 2025. So, the last quarter is when we’ll see what the Republicans actually want – for better or for worse. In the meantime, we can look past the hyperbole and see what the accounting records say. The most recent data includes 95 days after inauguration. First, for context, total spending is up $134 billion or 5.8% from this time last year to $2.45 trillion.

The Trump administration has been making news about its desire and success in cutting. Which programs have been cut the most? Below is a graph of where the five biggest cuts have happened, measured as a percent of each program’s budget. The cuts to the FCC and CPB reflect long-held partisan stances by Republicans. The cuts to the Federal Financing Bank reflect fewer loans administered by the US government and the current push to cut spending. Cuts in the RRB-Misc refer to some types of railroad payments to employees. In the spirit of whiplash, the cuts to the US International Development Finance Corporation reverse the course set by the first Trump administration. This government corporation exists to facilitate US investment in strategically important foreign countries.

But some programs have *increased* spending since 2024. The five largest increases include the USDA, the US contributions to multilateral assistance, claims and judgments against the US, the Federal Railroad Administration, and the International Monetary Fund. Funding for farmers and railroads reflects the old agricultural and new union Republican constituencies. The multilateral assistance and IMF spending reflects greater international involvement of the administration, despite its autarkic lip service.


It’s the Humidity

Recently, I learned what humidity is. That might sound stupid, so let me clarify. I knew that humidity is the water content of the air. I also knew that the higher the number, the more humid. Finally, I also knew that the dew point is the temperature at which the water falls out of the air. But, now I understand all of this in a way that I hadn’t previously.

First, what does it mean for there to be 70% humidity? As it turns out, it’s a moving target. There are two types of humidity: specific and relative. Specific humidity is the mass of water in, say, a kilogram of air. So, more humidity means more water. This is obvious. There’s a related concept called absolute humidity, which is more like mass of water per volume of air (sometimes used in place of specific humidity). Again, more humidity means more water. Neither of these is the way that humidity is reported on the weather channel.

Relative humidity is the number that you see in your weather app. What’s that? Relative to what? First, we need to know that warm air can hold more water than cool air. Pressure also matters, but atmospheric pressure doesn’t change enough to make its effect on humidity significant on relevant margins. So, all of this discussion, and the number in your phone, is at atmospheric pressure. Below is a graph that illustrates the maximum amount of water that can be in the air at different temperatures (red line). So, at 30 degrees Celsius (86 degrees Fahrenheit), there can be as much as 27 grams (0.95 oz or ~2 tablespoons) of water in the air.
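Here is a rough sketch of how the pieces fit together. It uses the Magnus approximation for saturation vapor pressure, which is a standard textbook formula and not necessarily what produced the graph. I am also assuming that the 27 gram figure is per kilogram of dry air at sea-level pressure. The function names are my own.

```python
import math


def saturation_vapor_pressure_hpa(temp_c):
    """Magnus approximation over liquid water, in hectopascals."""
    return 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))


def saturation_mixing_ratio_g_per_kg(temp_c, pressure_hpa=1013.25):
    """Most water vapor (grams) that a kilogram of dry air can carry."""
    e_s = saturation_vapor_pressure_hpa(temp_c)
    return 1000 * 0.622 * e_s / (pressure_hpa - e_s)


def relative_humidity_pct(water_g_per_kg, temp_c):
    """Approximate relative humidity: actual water as a share of the maximum."""
    return 100 * water_g_per_kg / saturation_mixing_ratio_g_per_kg(temp_c)


print(round(saturation_mixing_ratio_g_per_kg(30), 1))  # roughly 27 g per kg
print(round(relative_humidity_pct(13.5, 30)))          # roughly half of the maximum
print(round(saturation_mixing_ratio_g_per_kg(10), 1))  # cooler air holds far less
```

The same amount of water gives a very different relative humidity at a different temperature, which is why the number in the weather app is a moving target.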

More after the jump.


Old Fashioned Function Keys

Your Function Keys Are Cooler Than You Think
by someone who used to press F1 by mistake

Ever notice the F keys on your keyboard? F1 through F12. Sitting at the top like unused shelf space. If you’re at a computer now, take a glance. I used to think they did nothing, or at least nothing for me. Maybe experts used them. Experts who know what BIOS and DOS are.  But for me, just little space fillers with no purpose. I frequently pressed F1 by accident rather than escape. A help window would pop up, wasting half a second of my life until I closed it.

But the F keys (function keys) are sneaky useful. They can save you serious time. No clicking. No dragging. No fumbling with touchpad mis-clicks.

When using a web browser, F5 refreshes the web page. Windows has added the same functionality for folders too, updating recently edited files. Fast and easy. F11 changes your web browser view to full screen. Great for long reads or historical documents. F12 shows the guts of a webpage. That’s perfect if you web scrape or need to know what things are called behind the scenes. Ctrl + F4 closes a tab. Alt + F4 shuts the whole application instance down. That last one works for almost all applications.

Excel? F4 saves so much of your life. It toggles absolute cell, row, and column references. Have you ever watched someone try to click on the right spot with their touchpad and manually press the ‘$’ sign… twice? I can feel myself slowly creeping toward death as my life wastes away. Whereas pressing F4 lets you get on with your life. F12 in most Microsoft applications is ‘Save As’. No need to find the floppy disk image on that small laptop screen. PowerPoint has its own tricks—F5 begins the presentation. Shift + F5 starts it from the current slide. Not bad. And don’t forget F7! That’s the spellcheck hotkey. But now it’s been expanded to include grammar, clarity, concision, and inclusivity.


Now published: Human capital of the US deaf Population, 1850-1910

A student coauthor and I worked hard on our article that is now published in Social Science History. It’s the first modern statistical analysis of the historical deaf population. We bring an economic lens and statistical treatment to a topic that previously included much anecdotal evidence and case study. We hope that future authors can improve on our work in ways that meet and surpass the quantitative methods that we employed.

Our contributions include:

  • A human capital model of deafness that’s agnostic about its productivity implications and treats deaf individuals as if they made decisions rationally.
  • A better understanding of school attendance rates and the ages at which they attended.
  • Deaf children were much more likely to be neither in school nor employed earlier in US history.
  • The negative impact of state ‘school for the deaf’ availability on subsequent economic outcomes among deaf adults. We speculate that they attended schools due to the social benefits of access to community.
  • Deaf workers did not avoid occupations where their deafness would be incidentally detectable by trade partners, implying that animus discrimination was not systemically important for economic outcomes.

Messy Disability Records in the Historical Censuses

The historical US Census rolls of disability among free persons are a mess. First, for the 1850-1870 censuses, the census bureau was not professionalized and the pay was low (a permanent office wasn’t founded until 1902). So, the enumerators were temporary employees and weren’t experts in their craft. To boot, their handwriting wasn’t always crystal clear. Second, training for disability enumeration was even less complete and enumerators did their best with whom they encountered and how they understood the instructions. Finally, the digitized data in IPUMS doesn’t perfectly match the census reports. What a mess.

Guilty by Association

Disabled people and their families often misreported their status out of embarrassment or shame. Given that enumerators had quotas to fill, they were generally not inclined to investigate claimed statuses strenuously. Furthermore, disabled people were humans and not angels. Sometimes they themselves didn’t want to be associated with other types of disabled people. In particular, the disability question (number 13) on the 1850 census questionnaire asked “Whether deaf and dumb, blind, insane, idiotic, pauper or convict”. Saying “yes” may put you in company that you don’t prefer to keep.

Summer censuses also sometimes missed deaf students who were traveling to or from a residential school.

Enumerator Discretion

The enumerator’s job was to write the disability that applied. What counts as deaf and dumb? That’s largely at the enumerator’s discretion. Some enumerators wrote ‘deaf’ even though that wasn’t an option. Was that shorthand for ‘Deaf and Dumb’? Or were they specifying that the person was deaf only and not dumb? We don’t know. But we do know that they didn’t follow the instructions. What if a person was both insane and blind? Then what should be written? “Blind/Insane” or “Blind and Insane” or “In-B” and any number of combinations were written. Some of them are easier to read than others.

Data Reading Errors

IPUMS is the major resource for using census data. The historical data was entered by foreign data-entry workers who didn’t always speak English. So, the records aren’t perfect. Some of the records are corroborated with Optical Character Recognition (OCR), but the historical script is sometimes hard to read. Finally, the fine folks at familysearch.org and Brigham Young University have used volunteers from The Church of Jesus Christ of Latter-day Saints (LDS) to proofread data entries. Regardless, we know that the IPUMS data isn’t perfect and that the disability data is far from perfect. Usually, reports don’t dwell on it. They simply say that the data is incomplete.

The disability data is incomplete for a lot of reasons related to the respondent, the enumerator, the instructions, and the digital data creation. What a mess.

Optimal Protein Consumption in the 21st Century: A Model

I’ve discussed complete proteins before. I’ve talked about the ubiquity of protein, animal protein prices, vegetable protein prices, and a little bit about protein hedonics. My coblogger Jeremy also recently posted about egg prices over the past century. Charting the cost of eggs is great for identifying egg affordability. But a major attraction of eggs is that they are a ‘complete protein’. So how much of that can we afford?

Here I’ll outline a model of the optimal protein consumption bundle. What does this mean? This means consuming the quantities of protein sources that satisfy the recommended daily intake (RDI) of the essential amino acids and doing so at the lowest possible expenditure. Clearly, this post includes a mix of both nutrition and economics.  Since a comprehensive evaluation that includes all possible foods would be a heavy lift, here I’ll just outline the method with a small application.

Consider a list of prices for 100 grams of Beef, Eggs, and Pork.* We can also consider a list that identifies the quantity that we purchase in terms of hundreds of grams. Therefore, the product of the two yields the total that we spend on our proteins.

Of course, not all proteins are identical. We need some characteristics by which to compare beef, eggs, and pork. Here, I’ll use the grams of essential amino acids in 100 grams of each protein source. Because there are different RDIs for each amino acid, I express each amino acid content as a proportion of the RDI (represented by the standard molecular letter).

Then, we can describe how much of the RDI of each amino acid that a person consumes by multiplying the amino acid contents by the quantities of proteins consumed.

Our goal is to find the minimum expenditure, B, by varying the quantities consumed, Q, such that the minimum of C is equal to one. If the minimum element of C is greater than one, then a person could consume less and spend less while still satisfying their essential amino acid RDI. If the minimum element is less than one, then they aren’t getting the minimum RDI.
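One compact way to write the model described above is sketched here. B, Q, and C follow the text. The price vector p and the matrix A (each amino acid’s content per 100 grams, as a share of its RDI) are my own labels for the two lists of inputs.

```latex
\begin{aligned}
B &= p^{\top} Q, \\
C &= A\,Q, \\
Q^{*} &= \arg\min_{Q \ge 0} \; p^{\top} Q
  \quad \text{subject to} \quad C_i \ge 1 \ \text{for every essential amino acid } i .
\end{aligned}
```

Written this way, the requirement is that no amino acid falls short. At the cheapest feasible bundle at least one of those constraints holds with equality, which is the same as saying that the minimum element of C equals one.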

How do we find such a thing? Well, not algebraically, that’s for sure. I’ll use some linear programming (which is kind of like magic, there’s no process to show here).

The solution results in consuming only 116.28 grams of Pork and spending $1.093 per day. The optimal amino acid consumption is also below. Clearly, prices change. So, if eggs or beef became cheaper relative to pork, then we’d get different answers.

In fact, we have the price of these protein sources going back almost every month to 1998. While pork is exceptionally nutritious, it hasn’t always been most cost effective. Below are the prices for 1998-2025. See how the optimal consumption bundle has changed over time – after the jump.


A Forgotten Data Goldmine: Foreign Commerce and Navigation Reports

Economists rely on trade data. The historical Foreign Commerce and Navigation of the United States reports detailed monthly figures on imports, exports, and re-exports. This dataset spans decades, providing a crucial resource for researchers studying price movements, consumption patterns, and the effects of war on global trade.

The U.S. Department of Commerce compiled these reports to track the nation’s commercial activity. The data cover a vast range of commodities, including coffee, sugar, wheat, cotton, wool, and petroleum. Officials recorded trade flows at a granular level, enabling economists to analyze seasonal fluctuations, wartime distortions, and postwar recoveries. Their inclusion of re-export figures allows for precise estimates of domestic consumption. Researchers who ignore re-exports risk overstating demand by treating imports as goods consumed rather than goods in transit.
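The re-export adjustment is simple arithmetic. Here is a small sketch for a good with no domestic production, such as coffee. The quantities and the function name are hypothetical and are not taken from the reports.

```python
def retained_imports(imports, re_exports):
    """Imports that stayed in the country: a proxy for domestic consumption
    of a good that is not produced at home."""
    return imports - re_exports


# Hypothetical monthly figures, in thousands of pounds.
imports, re_exports = 1000, 150
consumed = retained_imports(imports, re_exports)
overstatement_pct = 100 * (imports - consumed) / consumed

print(consumed)                     # 850
print(round(overstatement_pct, 1))  # 17.6
```

In this made-up case, treating all imports as consumption would overstate demand by about 18 percent.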


Trump Cutting & Spending: Day 45

It’s hard to keep up with all of the Trump administration’s activities. There is such a flurry of activity related to funding, regulations, and executive actions that no one can keep up with everything. Individuals and news outlets have scarce resources and attention. There’s the added typical challenge of filtering out fact from analysis. If only there were a way to summarize the administration’s activities in an objective and meaningful sense.

Luckily, numbers don’t lie – and the federal government publishes a lot of numbers. Specifically, the Treasury publishes the Daily Treasury Statement, which identifies each day’s outlays by category. We can look at the raw spending numbers to get a sense for where and whether Trump is changing spending within the federal government.

Lauren Bauer at The Hamilton Project noticed that the US Treasury has an API for those daily statements.  She created a nice online tool at Brookings that is relatively user friendly. Individuals can visit and see each day’s spending or the cumulative spending throughout the year. Below is the cumulative federal spending for 2024 and 2025. As of March 5th, the US has spent a total of 5.2% more in 2025 than in the year prior (that’s on track with the growth rate of GDP). Importantly, she makes all of the data available for download so that individuals can conduct their own analysis. I lean on her data here.

Where have the cuts been happening? The below graph includes the 5 spending areas that have been most deeply cut relative to the same day in 2024.* The red line denotes inauguration day. The USAID cuts made big news, and it seems like they knew something was happening around the time of inauguration. It looks like they were trying to get spending out the door before the taps were shut off. The FCC and the Library of Congress were also affected by the funding freeze that was announced in late January.

President Trump claims to have made cutting waste a priority. With Elon Musk in tow, the administration has made waves by disrupting USAID, the NSF, and federal payroll. We’re 45 days into the administration. We can use the data provided by the Treasury and made accessible by Bauer to evaluate how the Trump administration has been spending and cutting according to the numbers.

One way to evaluate spending is to compare the cumulative spending over the course of 2024 and 2025. That is, spending on the 45th day of the year should be more or less comparable in 2024 vs 2025. It’s still early in the year and since various payments can be quite irregular, there’s a lot of noise in the data so far. But we should be able to see big changes. Smaller changes will be easier to see as the year goes on.
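The comparison described above can be sketched as follows. The daily outlay figures are invented for illustration and the function name is my own. Real inputs would be the daily outlays for one spending category in each year, lined up by day of the year.

```python
from itertools import accumulate


def cumulative_pct_change(outlays_prev, outlays_curr):
    """Percent change in year-to-date spending on each day, current vs prior year."""
    cum_prev = list(accumulate(outlays_prev))
    cum_curr = list(accumulate(outlays_curr))
    return [100 * (c - p) / p for p, c in zip(cum_prev, cum_curr)]


# Hypothetical daily outlays (in $ millions) for the first four days of each year.
spend_2024 = [10, 20, 30, 40]
spend_2025 = [12, 18, 33, 42]

print(cumulative_pct_change(spend_2024, spend_2025))  # [20.0, 0.0, 5.0, 5.0]
```

Working with cumulative totals smooths out some of the noise from payments that land a day or two apart across years, although it does not remove it this early in the year.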

The 5 areas of greatest cumulative spending growth relative to 2024 are graphed below.* It does look like some funding was trying to get out of the door prior to Trump taking office, but that’s just speculation on my part.   FEMA spending was up, likely due to the fires in California.  Much more US Treasury spending is happening, specifically for Claims, Judgments, & Relief. We might see that remain elevated as the new administration keeps ‘trying’ things and then being stopped by injunctions, being the subject of lawsuits, and owing compensation.

While big percent changes in outlays can have massive implications for individual programs, Musk and Trump will need to cut huge amounts in order to claim any kind of victory over profligate spending. (Just so we’re all on the same page, they will fail if they refuse to touch old-age entitlements.) Where have the biggest spending cuts happened as measured by actual dollars? See below.*** The deepest and most consistent cuts are coming from the reductions in federal employee insurance payments. Similarly, the USAID and FCC cuts amount to a $2 billion cut from this time last year. Department of Education spending and the hospital insurance trust fund are down, but are also more volatile in their expenditures. Those one-time spikes in the data are due to pay dates between 2024 and 2025 being offset by a day or two.
