When we run studies using state- or county-level data, we usually want some standard demographic variables to serve as controls: things like total population, average age, and gender and race breakdowns. If the dataset for our main variables of interest doesn’t already include these, we go looking for a separate dataset of demographic controls to merge in, but it has always been surprisingly hard to find a clean, easy-to-use one. For states, I’ve found the University of Kentucky’s National Welfare Database to be the best bet. But what about counties?
I had no good answer, and the best suggestion I got from others was the CDC SEER data. As is so often the case, the government collects an impressively comprehensive dataset but releases it only in an unusable format, in this case as txt files that look like this:

I cleaned and reformatted the CDC SEER data into a neat panel of county demographics that looks like this:

I posted my code and data files (CSV, XLSX, and DTA) on OSF and on my data page as usual. I also posted the data files on Kaggle, which seems to be more user-friendly and turns up better in searches. I welcome suggestions for any other data repositories or file formats you would like to see me post.
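For anyone wondering what merging the panel into their own data might look like, here is a minimal Python sketch using only the standard library. The column names (`fips`, `year`, `population`, `outcome`) and every value in the two small tables are made up for illustration, so check the actual file headers before adapting it. It does a left join on county and year, keeping every row of the main dataset and leaving the control blank where no match exists.

```python
import csv
import io

# Hypothetical stand-in for the county demographics panel. The column
# names and values are invented for illustration, not the real file.
DEMOGRAPHICS_CSV = """fips,year,population
01001,2020,58000
01001,2021,59000
01003,2020,230000
"""

# Hypothetical main dataset containing the outcome of interest.
OUTCOMES_CSV = """fips,year,outcome
01001,2020,3.2
01001,2021,3.5
01005,2020,4.1
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, keeping every field as a string."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_controls(main_rows, control_rows, keys=("fips", "year")):
    """Left-join the control columns onto main_rows using the key columns."""
    # Index the controls by (fips, year) for constant-time lookup.
    lookup = {tuple(r[k] for k in keys): r for r in control_rows}
    control_cols = [c for c in control_rows[0] if c not in keys] if control_rows else []
    merged = []
    for row in main_rows:
        match = lookup.get(tuple(row[k] for k in keys))
        out = dict(row)
        for col in control_cols:
            # Unmatched county-years get None rather than being dropped.
            out[col] = match[col] if match else None
        merged.append(out)
    return merged


merged = merge_controls(read_rows(OUTCOMES_CSV), read_rows(DEMOGRAPHICS_CSV))
for row in merged:
    print(row)
```

Reading the county FIPS code as a string rather than a number preserves leading zeros, which otherwise tend to cause silent merge failures. In pandas or Stata the same idea is a one-line merge on the county and year columns.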
HT: Kabir Dasgupta