Michigan Consumer Surveys: Individual-Response Data

I’ve now posted individual-level responses from the 1978-2025 Michigan Consumer Surveys to Kaggle in CSV and Stata formats. The University of Michigan’s Consumer Surveys are a widely followed source for data on consumer confidence and inflation expectations.

Their official site is good if you just want summary tables or charts like this:

But what if you want detailed crosstabs to see how sentiment differs across groups, or microdata so that you can run regressions? With enough clicks you can get this from what UMich calls their “cross-section archive”. But it is pretty well hidden: my student looking into this thought they just didn’t offer individual-level data. And even once you get their data, it is in an unlabelled CSV file with hard-to-understand variable names and codes. So I wanted to make it clear that the full data with all responses for all years is available, and if you use my Stata version it is even reasonably easy to understand (the code I adapted for labelling it is on OSF). Then you can run your regressions, or make charts like this:

The College-Only Covid Recovery
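
If you would rather work from the CSV than the Stata file, a chart like this boils down to a weighted average of a sentiment index by group and year. Here is a minimal Python sketch of that calculation using only the standard library. The sample rows are made up, and the column names (YYYYMM, EDUC_GROUP, ICS, WT) are placeholders assumed for illustration, so check them against the actual file before relying on this.

```python
import csv
import io
from collections import defaultdict

# Tiny made-up sample standing in for the real file. The column names
# (YYYYMM, EDUC_GROUP, ICS, WT) are assumptions for illustration only.
SAMPLE = """YYYYMM,EDUC_GROUP,ICS,WT
202001,college,110.0,1.0
202001,college,90.0,3.0
202001,no_college,80.0,1.0
202101,college,100.0,1.0
202101,no_college,60.0,2.0
202101,no_college,90.0,1.0
"""


def weighted_mean_by_group(rows, group_col, value_col, weight_col):
    """Return {(year, group): weighted mean of value_col}."""
    sums = defaultdict(float)
    weights = defaultdict(float)
    for row in rows:
        if row[value_col] == "" or row[weight_col] == "":
            continue  # skip rows with a missing value or weight
        key = (row["YYYYMM"][:4], row[group_col])  # first 4 digits = year
        w = float(row[weight_col])
        sums[key] += w * float(row[value_col])
        weights[key] += w
    return {k: sums[k] / weights[k] for k in sums if weights[k] > 0}


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
table = weighted_mean_by_group(rows, "EDUC_GROUP", "ICS", "WT")
for (year, group), mean in sorted(table.items()):
    print(year, group, round(mean, 1))
```

To run it on the real data, swap the `io.StringIO(SAMPLE)` reader for an open file handle on the downloaded CSV and substitute the real variable names.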

If you’re new here, a reminder that you can find other cleaned-up versions of popular datasets on my data page.

National Health Expenditure Accounts Historical State Data: Cleaned, Merged, Inflation Adjusted

The government continues to be great at collecting data but not so good at sharing it in easy-to-use ways. That’s why I’ve been on a quest to highlight when independent researchers clean up government datasets and make them easier to use, and to clean up such datasets myself when I see no one else doing it; see previous posts on State Life Expectancy Data and the Behavioral Risk Factor Surveillance System.

Today I want to share an improved version of the National Health Expenditure Accounts Historical State Data.

National Health Expenditure Accounts Historical State Data: The original data from the Centers for Medicare and Medicaid Services on health spending by state and type of provider are actually pretty good as government datasets go: they offer all years (1980-2020) together in a reasonable format (CSV). But they come in separate files for overall spending, Medicare spending, and Medicaid spending. I merge the variables from all three into a single file and transform it from a “wide format” to a “long format” that is easier to analyze in Stata. In the “enhanced” version I also offer inflation-adjusted versions of all spending variables. Excel and Stata versions of these files, together with the code I used to generate them, are here.
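
For readers who do not use Stata, the three steps described here (reshape wide to long, merge the three sources, deflate to constant dollars) can be sketched in plain Python. This is an illustration of the general technique, not the code used to build the posted files. The file layout (one row per state, one `Y<year>` column per year), the spending numbers, and the price index are all made-up assumptions.

```python
import csv
import io

# Made-up miniature stand-ins for the three wide-format source files.
# The layout (one row per state, one "Y<year>" column per year) and all
# numbers, including the price index, are assumptions for illustration.
WIDE_FILES = {
    "total": "State,Y2019,Y2020\nOhio,100,110\nUtah,20,22\n",
    "medicare": "State,Y2019,Y2020\nOhio,30,33\nUtah,5,6\n",
    "medicaid": "State,Y2019,Y2020\nOhio,25,28\nUtah,4,5\n",
}
PRICE_INDEX = {2019: 100.0, 2020: 110.0}  # hypothetical deflator
BASE_YEAR = 2020


def wide_to_long(text, value_name):
    """Turn one wide table into {(state, year): {value_name: value}}."""
    out = {}
    for row in csv.DictReader(io.StringIO(text)):
        for col, cell in row.items():
            if col.startswith("Y") and col[1:].isdigit():
                out[(row["State"], int(col[1:]))] = {value_name: float(cell)}
    return out


def merge_long(files):
    """Merge the long tables on (state, year), one column per source file."""
    merged = {}
    for name, text in files.items():
        for key, values in wide_to_long(text, name).items():
            merged.setdefault(key, {}).update(values)
    return merged


def add_real_values(merged, index, base_year):
    """Add '<name>_real' columns expressed in base-year dollars."""
    for (state, year), values in merged.items():
        factor = index[base_year] / index[year]
        for name in list(values):  # snapshot keys before adding new ones
            values[name + "_real"] = values[name] * factor
    return merged


panel = add_real_values(merge_long(WIDE_FILES), PRICE_INDEX, BASE_YEAR)
for (state, year), values in sorted(panel.items()):
    print(state, year, values)
```

Keying every record on (state, year) is what makes the merge trivial once the data are long: each source file just contributes one more column to the same row.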

A warning to everyone using the data, since it messed me up for a while: in the documentation provided by CMS, Table 3 provides incorrect codes for most variables. I emailed them about this, but who knows when it will get fixed. My version of the data should be correct now, but please let me know if you find otherwise. You can find several other improved datasets, from myself and others, on my data page.