The economics of damned lies

Economists have become almost comically skeptical of estimated effects. A researcher estimating the effect of X on Y has always had to consider the bias and efficiency of their estimator. Bias is the result of unconsidered or unobserved forces pulling your estimated effect in one particular direction away from the truth (too positive or too negative); efficiency is the overall noisiness of the estimate, where a less efficient estimator yields too wide a range of possible effect sizes.
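
If bias versus efficiency feels abstract, here's a toy simulation of my own (nothing from any real paper) comparing an estimator that is precise but systematically off with one that is centered on the truth but noisy:

```python
import numpy as np

rng = np.random.default_rng(0)
true_effect = 2.0

biased_tight = []    # systematically shifted away from the truth
unbiased_noisy = []  # centered on the truth, but wasteful with data
for _ in range(10_000):
    sample = rng.normal(true_effect, 1.0, size=100)
    biased_tight.append(sample.mean() + 0.5)   # constant bias of +0.5
    unbiased_noisy.append(sample[:10].mean())  # unbiased, but uses 10 of 100 points

for name, est in [("biased but tight", biased_tight),
                  ("unbiased but noisy", unbiased_noisy)]:
    est = np.array(est)
    print(f"{name}: mean = {est.mean():.2f}, sd = {est.std():.2f} (truth = {true_effect})")
```

The first estimator clusters tightly around the wrong number; the second scatters widely around the right one. Pick your poison.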

Under the umbrella of efficiency were concerns about random measurement error – the basic and unavoidable difficulty of accurately recording the underlying “true” value. Filed under “everywhere and always”, measurement error is often simply the cost of doing business, while nonetheless limiting the precision with which the world can be known and, in turn, the precision with which decision making or policy can be calibrated.
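
To see why this is a precision problem rather than a bias problem, here's a minimal sketch with invented numbers: mean-zero noise in the outcome leaves the OLS slope centered on the true effect, but fattens its sampling distribution.

```python
import numpy as np

rng = np.random.default_rng(1)
beta = 1.5  # the "true" effect of X on Y

clean_slopes, noisy_slopes = [], []
for _ in range(5_000):
    x = rng.normal(0.0, 1.0, size=200)
    y = beta * x + rng.normal(0.0, 1.0, size=200)
    clean_slopes.append(np.polyfit(x, y, 1)[0])
    # mean-zero measurement error in the *outcome* only
    y_noisy = y + rng.normal(0.0, 2.0, size=200)
    noisy_slopes.append(np.polyfit(x, y_noisy, 1)[0])

for name, s in [("clean", clean_slopes), ("noisy outcome", noisy_slopes)]:
    s = np.array(s)
    print(f"{name}: mean slope = {s.mean():.3f}, sd = {s.std():.3f} (truth = {beta})")
```

Both sets of slopes average out to the truth; the noisy ones are just spread over a wider range. Annoying, but honest.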

Coping with bias has been in many ways the story of empirical economics and the “credibility revolution” of the last 25 years. It’s why “identification strategy” is the fourth slide of almost any microeconomics presentation, why the econometrics of every great applied economics working paper is seemingly obsolete before it finds itself in print, and why there is a genuine possibility I will retire with a half dozen ulcers before I finish this blog post. Economists make themselves crazy thinking, strategizing, and internalizing criticism about the potential bias in their estimates. Selection bias, omitted variable bias, reverse causality, and even observer bias lurk in the shadows of our minds. To be an expert in causal inference is to anticipate and guard against myriad sources of bias in your empirical analysis. For many living economists, however, there is a new bogeyman.

Systemic measurement error.

Sounds banal enough. And if you’re a chemist, it is. The gauge consistently reads every temperature too high, every mass too low, every electromagnetic spectrum too red. Something to test for every day. Vigilance and repetition, the solution. For economists, however, the answer is less simple.
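
A sketch of the chemist’s routine, with hypothetical numbers: measure a standard whose true value you already know, and the average residual reveals the gauge’s offset.

```python
import numpy as np

rng = np.random.default_rng(2)
standard = 100.0  # a reference sample whose true value is known
offset = 1.3      # the gauge's systematic error (unknown to the chemist)

# Daily routine: measure the known standard repeatedly.
readings = standard + offset + rng.normal(0.0, 0.5, size=30)

# Random noise averages out; what remains is the systematic offset.
estimated_offset = readings.mean() - standard
print(f"estimated offset: {estimated_offset:.2f} (true offset: {offset})")
```

The chemist has a known standard to measure against. The economist’s problem is that the standard itself may be the thing being rigged.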

What happens when the data is rigged to make the results too good? Unemployment too low. Wages too high. Expenditures too productive. <Redacted> too <redacted>. Economists have hunted for cheaters as a research subject and rooted out fraud within the scientific endeavor itself. But precious few have made it their job to sift through manipulated public data and carefully distill the true underlying numbers. And for good reason: as soon as you declare the data unreliable, you open the door to your own personal bias. Your politics, your career ambitions, or even just your good-hearted desire to observe people being more decent than your own pessimism might otherwise allow for. To allow yourself to adjust the potentially fraudulent data is to risk making a bad situation worse.
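
And here is why rigged data is so much nastier than noisy data: more observations don’t help. In this invented example, a quarter of the unemployed are recorded as employed, and the reported rate converges, ever more confidently, to the wrong answer.

```python
import numpy as np

rng = np.random.default_rng(3)
true_rate = 0.06  # pretend the true unemployment rate is 6%

for n in (100, 10_000, 1_000_000):
    unemployed = rng.random(n) < true_rate
    # rigged reporting: a quarter of the unemployed are recorded as employed
    reported = unemployed & (rng.random(n) > 0.25)
    print(f"n = {n:>9,}: reported rate = {reported.mean():.4f} (truth = {true_rate:.4f})")
```

A bigger sample shrinks the error bars around a number that was never true. That is the difference between random and systemic error in one line of output.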

Replicability and transparency of analysis were important before, but now we’re entering an even more tedious and slow landscape, because critics aren’t just going to want to adjudicate your analysis; they’re going to want to adjudicate every observation in your data set. Or perhaps I am being too negative. There is a genuine upside. As people look to distill and correct for systemic measurement error, they’re going to create greater demand for 1) parallel analyses of similar questions using different techniques on the same data and 2) forensic analysis of data and the institutions that create it. Never forget that Sovietology was a genuine research career. More work to be done, but it can be done.

More work that has to be done. Sigh. My stomach hurts.
