Interpolation Vs Transition

Sometimes you read an academic article and the author fills in the data gaps with interpolation. That is, they assume some functional form of the data and then replace the missing values with the estimated ones. Often, lacking an informed opinion about functional form, authors will just linearly interpolate between the closest known values. Sometimes this method is OK. But sometimes we can do better.
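To make "linearly interpolate between the closest known values" concrete, here is a minimal Python sketch. The function name and the numbers are made up for illustration; they are not census figures.

```python
def linear_interpolate(x0, y0, x1, y1, x):
    """Value at x on the straight line through (x0, y0) and (x1, y1)."""
    weight = (x - x0) / (x1 - x0)  # 0 at x0, 1 at x1
    return y0 + weight * (y1 - y0)


# Hypothetical known values at two census years (illustrative only).
known = {1850: 40.0, 1860: 60.0}

# Fill in every year of the decade with the straight-line estimate.
estimates = {
    year: linear_interpolate(1850, known[1850], 1860, known[1860], year)
    for year in range(1850, 1861)
}

for year, value in estimates.items():
    print(year, value)
```

The estimate changes by the same amount every year, which is exactly the assumption that linear interpolation bakes in.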

Historical census data provides a good example because the census was taken only once every ten years. Say that we want to know more about child migration patterns between 1850 and 1860. What happened in the intervening years? Who knows. Let’s look at the data.

Using data on individuals who have been linked across censuses allows us to fill in the gaps a little bit. For simplicity, let’s just look at whether a child migrant lived in an urban location and whether they lived on a farm. That means that there are four possible ways to describe their residence. Below is a summary of where child migrants lived at the age of zero in 1850 and where the same children lived a decade later at the age of ten in 1860, given that they moved counties.

When in the meantime did these children move from one place to the other? We don’t know exactly. The popular answer is to say that they moved uniformly throughout the decade. That’s ‘fine’. But because the same number of children move each year, it assumes that the rate at which people departed places (relative to those who had not yet left) was rising and the rate at which they arrived at places (relative to those already there) was falling. Maybe that’s true, but we don’t really know. Below-left is a graph that shows the linear interpolation.

The nice thing about linear interpolation is that everyone is accounted for at each point in time. The total number of people doesn’t rise or fall during the interpolation period. But if we were to assume that children departed from or arrived at each type of place at a constant rate (maybe a more reasonable assumption), then suddenly we lose track of people. That is, the sum of people dips below 100% as people depart faster than they arrive.
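Here is a rough Python sketch of that accounting problem. Everything in it is an assumption for illustration: the four residence shares are invented (not the linked census data), and "constant rate" is read as a constant proportional rate of change for each category, which amounts to geometric interpolation between the two endpoints. Other readings of "constant rate" are possible.

```python
# Hypothetical residence shares of child migrants (made up, not census data).
start = {"urban_farm": 0.05, "urban_nonfarm": 0.15,
         "rural_farm": 0.60, "rural_nonfarm": 0.20}
end = {"urban_farm": 0.05, "urban_nonfarm": 0.35,
       "rural_farm": 0.40, "rural_nonfarm": 0.20}


def linear_path(s0, s1, w):
    """Share after a fraction w of the decade, moving the same amount each year."""
    return (1 - w) * s0 + w * s1


def constant_rate_path(s0, s1, w):
    """Share after a fraction w of the decade, changing at a constant
    proportional rate (an assumed reading of 'constant rate')."""
    return s0 * (s1 / s0) ** w


def total(path, w):
    """Sum of the four category shares at fraction w of the decade."""
    return sum(path(start[k], end[k], w) for k in start)


for step in range(11):
    w = step / 10
    print(1850 + step,
          round(total(linear_path, w), 4),
          round(total(constant_rate_path, w), 4))
```

Under the linear path the four shares sum to 1 in every year. Under the constant-rate path they sum to 1 only at the two census years and fall short in between, which is the "losing track of people" problem described above.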

What’s the alternative to linear interpolation?
