IPUMS Data Intensive Workshop & Conference

I just returned from the Full Count IPUMS data workshop at the Data-Intensive Research Conference that was hosted by the Network on Data Intensive Research on Aging and IPUMS. The theme of this conference was “Linking Records”.

It was the best workshop and conference that I’ve ever attended. I’d attended the conference remotely in the past, but attending the workshop in person was exceptional. About 20 other people and I were flown to the Minneapolis Population Center and put up in a hotel during our stay (that made the conference a low-stress affair). The whole workshop was well organized, the speakers built on one another’s content, and there was a hands-on lab for us to complete. I felt my human capital growing by the hour.


Venezuelans Vote Overwhelmingly Against Maduro

Venezuela held an election this week; President Maduro says he won, while the opposition and independent observers say he lost. Disputed elections like this are fairly common across the world, but where Venezuela really stands out is not how people vote at the ballot box; it is how they vote with their feet.

Reuters notes that “A Maduro win could spur more migration from Venezuela, once the continent’s wealthiest country, which in recent years has seen a third of its population leave.”

I don’t think we emphasize enough how crazy the scale of this is. After every US Presidential election, you hear some people who supported the losing side talk about leaving the country, but they almost never do. Leaving your home country behind is a dramatic step, one people only want to take if they think things are much better elsewhere. The US, even with a party you don’t like in power, has generally stayed a good place to live. The total number of Americans who have moved abroad for any reason (I would guess most feel more pulled by the host country rather than pushed by the US) is about 3 million. That is less than 1% of all Americans; by contrast more than 46 million people have immigrated to the US from other countries, and many more would come if we allowed it.

Even in poor countries, seeing anything like one third of the population leave is dramatic, especially when almost all the migration happens in only 10 years as in Venezuela:

Source. Note this only goes through 2020, and emigration has grown since

This makes Venezuela the largest refugee crisis in the history of the Americas, and depending on how you count the partition of India, perhaps the largest refugee crisis in human history that was not triggered by an invasion or civil war.

Instead, it has been triggered by the Maduro regime choosing terrible policies that have needlessly and dramatically impoverished the country:

I hope that the Venezuelan government will soon come to represent the will of its people. I’m not sure how that is likely to happen, though I guess positive change is most likely to come from Venezuelans themselves (perhaps with help from Colombia and Brazil); when the US tries to play a bigger role we often make things worse. But what has happened in Venezuela for the past 10 years is clearly much worse than the “normal” bad economic policies and even democratic backsliding that we see elsewhere. People everywhere complain about election results and economic policy, but nowhere else have I seen such a case of people going past simple cheap talk and taking the very expensive step of voting against the regime with their feet.

Fiscal Illusion: It’s Real (People Underestimate How Much They Pay in Taxes)

The concept of “fiscal illusion” has long existed in public finance, but it is difficult to test. The basic theory is that people will underestimate how much they pay in taxes, as well as underestimate government expenditures. A forthcoming paper in Public Choice by Kaetana Numa uses survey data from the United Kingdom to test the theory, and finds support. From the abstract of “Fiscal illusion at the individual level”:

“providing personalized fiscal information reduces support for higher taxes and spending and increases support for lower taxes and spending. These findings indicate that taxpayers underestimate both their tax liabilities and the costs of public services.”

The paper uses a “novel personalized fiscal calculator” to estimate how much tax an individual would actually owe. It then randomizes which taxpayers get this information, and finds that “the treated respondents… were less supportive of raising taxes and more supportive of cutting taxes than the respondents in the control condition.”

And the results are large. For all taxes, in the treated group that saw their personalized fiscal calculator, 61 percent support cutting taxes, versus just 50 percent in the control group. The differences show up across the major taxes that individuals pay in the UK, including the income tax, national insurance contributions (both employer and employee sides), and the VAT. There is no tax category where the treatment group is more likely to want to increase the tax, though the VAT and the smaller Fuel duty and Council tax are about equal on the percent wanting an increase (but the median response for these last two is to decrease the tax — in both the control and treatment groups).
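To put the 61 versus 50 percent figures above in perspective, here is a minimal sketch of the arithmetic. It separates the gap in percentage points from the relative difference. The function name is hypothetical, and the paper's sample sizes are not given here, so this says nothing about statistical significance.

```python
def point_and_relative_gap(treated_pct: float, control_pct: float) -> tuple[float, float]:
    """Return (gap in percentage points, gap relative to the control share)."""
    if control_pct <= 0:
        raise ValueError("control_pct must be positive")
    gap = treated_pct - control_pct
    return gap, gap / control_pct


# Shares supporting tax cuts, as quoted in the post: 61% treated, 50% control.
gap_points, gap_relative = point_and_relative_gap(61.0, 50.0)
print(f"{gap_points:.0f} percentage points, {gap_relative:.0%} relative to control")
```

An 11 point gap on a base of 50 means support for cutting taxes was roughly a fifth higher among respondents who saw their personalized numbers.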

Do these results from the UK hold up in other developed nations? Possibly. In a 2014 Eurobarometer survey, the percent of EU citizens that could correctly identify their nation’s VAT rate varied widely. The high was 89 percent in Germany correctly identifying the rate, down to 31 percent in Ireland. The average was 65 percent — though the UK was at the low end with only about 47 percent correctly identifying the VAT rate.

Fiscal illusion appears to be a real issue, and probably an important one in the UK.

Sources on AI use of Information

1. Consent in Crisis: The Rapid Decline of the AI Data Commons

Abstract: General-purpose artificial intelligence (AI) systems are built on massive swathes of public web data, assembled into corpora such as C4, Refined Web, and Dolma. To our knowledge, we conduct the first, large-scale, longitudinal audit of the consent protocols for the web domains underlying AI training corpora. Our audit of 14,000 web domains provides an expansive view of crawlable web data and how consent preferences to use it are changing over time. We observe a proliferation of AI specific clauses to limit use, acute differences in restrictions on AI developers, as well as general inconsistencies between websites’ expressed intentions in their Terms of Service and their robots.txt. We diagnose these as symptoms of ineffective web protocols, not designed to cope with the widespread re-purposing of the internet for AI. Our longitudinal analyses show that in a single year (2023-2024) there has been a rapid crescendo of data restrictions from web sources, rendering ~5%+ of all tokens in C4, or 28%+ of the most actively maintained, critical sources in C4, fully restricted from use. For Terms of Service crawling restrictions, a full 45% of C4 is now restricted. If respected or enforced, these restrictions are rapidly biasing the diversity, freshness, and scaling laws for general-purpose AI systems. We hope to illustrate the emerging crisis in data consent, foreclosing much of the open web, not only for commercial AI, but non-commercial AI and academic purposes.

AI is taking out of a commons information that was provisioned under a different set of rules and technology. See discussion on Y Combinator 

2. “ChatGPT-maker braces for fight with New York Times and authors on ‘fair use’ of copyrighted works” (AP, January ’24)

3. Partly handy as a collection of references: “HOW GENERATIVE AI TURNS COPYRIGHT UPSIDE DOWN” by a law professor. “While courts are litigating many copyright issues involving generative AI, from who owns AI-generated works to the fair use of training to infringement by AI outputs, the most fundamental changes generative AI will bring to copyright law don’t fit in any of those categories…” 

4. New gated NBER paper by Josh Gans “examines this issue from an economics perspective”

Joy: AI companies have money. Could we be headed toward a world where OpenAI has some paid writers on staff? Replenishing the commons is relatively cheap if done strategically, in relation to the money being raised for AI companies. Jeff Bezos bought the Washington Post. It cost a fraction of his tech fortune (about $250 million). Elon Musk bought Twitter. Sam Altman is rich enough to help keep the NYT churning out articles. Because there are several competing commercial models, however, the owners of LLM products face a commons problem. If Altman pays the NYT to keep operating, then Anthropic gets the benefit, too. Arguably, good writing is already under-provisioned, even aside from LLMs.

Inflation in the G7 and Russia

Among the former G8 countries, Russia has by far the highest cumulative inflation rate since January 2020, almost double the amount of inflation we’ve seen in the US and in most G7 countries. No doubt the effects of the wartime economy are contributing to this, but even in February 2022, before Russia invaded Ukraine, its inflation had clearly been worse.
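For readers who want to reproduce a comparison like this, cumulative inflation is just the ratio of a price index at the end of the period to its level at the start, minus one. Here is a minimal sketch. The function name and the index levels are hypothetical placeholders chosen for illustration, not actual CPI readings for any country.

```python
def cumulative_inflation(index_start: float, index_end: float) -> float:
    """Cumulative inflation between two price-index readings, as a fraction."""
    if index_start <= 0:
        raise ValueError("index_start must be positive")
    return index_end / index_start - 1.0


# Hypothetical index levels (start of period = 100), for illustration only.
country_a = cumulative_inflation(100.0, 120.0)
country_b = cumulative_inflation(100.0, 140.0)

# "Almost double" is a statement about the ratio of the two cumulative rates.
ratio = country_b / country_a
print(f"A: {country_a:.1%}  B: {country_b:.1%}  B relative to A: {ratio:.2f}x")
```

Swapping in each country's consumer price index for January 2020 and for the latest month would give the comparison described above.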

The US is on the high end for this group, but pretty close to the median. Japan looks really good on inflation, but that’s probably not much comfort to them since their economy is still smaller than before the pandemic. By this measure, the US looks pretty good (chart from Joey Politano):

GDP estimates for Russia are a little tricky because of the war, but according to IMF estimates, Russia’s economy in 2023 was about 5.6% larger than 2019 in real terms.

See also: Food Inflation in the G7 and Russia

One Up on Wall Street in the Meme Stock Era

Peter Lynch was one of the most successful investors of the 1970’s and 1980’s as the head of the Fidelity Magellan Fund. In 1989 he explained how he did it and why he thought retail investors could succeed with the same strategies in the bestselling book “One Up on Wall Street”. Given the meme stock exuberance of retail investors in the past few years, I thought the book might be due for a comeback.

Instead interest seems flat, and when I do hear Peter Lynch mentioned it is by institutional investors more than retail. But the book seems to me like it is still valuable, so I’ll share some highlights here. This one could easily have been written this year:

Where did the Dow close? I’m more interested in how many stocks went up versus how many went down. These so-called advance/decline numbers paint a more realistic picture. Never has this been truer than in the recent exclusive market, where a few stocks advance while the majority languish. Investors who buy “undervalued” small stocks or midsize stocks have been punished for their prudence. People are wondering: How can the S&P 500 be up 20 percent and my stocks are down? The answer is that a few big stocks in the S&P 500 are propping up the averages.

I see why the book hasn’t caught on with meme stock traders:

Nobody believes in long-term investing more passionately than I do… I think of day-trading as at-home casino care.

I’ve never bought a future nor an option in my entire investing career, and I can’t imagine buying one now. It’s hard enough to make money in regular stocks without getting distracted by these side bets, which I’m told are nearly impossible to win unless you’re a professional trader.

So where does he think retail investors have a chance to get “One Up on Wall Street”?


The Crime Wave May Be Over

Crime of all forms certainly spiked in 2020 and 2021 in most of the US, and remained high for a time after that. But recent data, especially homicide data compiled by AH Datalytics, suggest that crime is falling. Homicide is the worst of crimes and the least likely to be underreported, and the homicide rate across 272 major US cities is down 17.6% in 2024 compared with the same period in 2023. And among the 20 cities with the most homicides in 2023, just one (Birmingham, the 20th on the list) saw an increase from 2023 to 2024.

But is this just coming down from a relative high? Are homicide rates still elevated from pre-pandemic? I went through the cities with the most homicides on the AH Datalytics list, and for those where I could find comparable data pre-pandemic, I created the following charts. As you will see, lots of these cities are down to or below pre-pandemic levels (for the period in 2024 that is comparable to prior years). Not every single city, of course, but most are close to 2019 or prior years.

From Cubicles to Code – Evolving Investment Priorities from 1990 to 2022

I’ve written before about how we can afford about 50% more consumption now than we could in 1990. But it’s not all bread and circuses. We can also afford more capital. In fact, adding to our capital stock helps us produce the abundant consumption that we enjoy today. In order to explore this idea I’m using the BEA Saving and Investment accounts. The population data is from FRED.

The tricky thing about investment spending is that we need to differentiate between gross investment and net investment. Gross investment includes spending on the maintenance of current capital. Net investment is the change in the capital stock after depreciation: it’s investment in additional capital, not just new capital. Below are two pie charts that illustrate how the composition of our *gross investment* spending has changed over the past 30 years. Residential investment costs us about the same proportion of our investment budget as it did historically. A smaller proportion of our investment budget is going toward commercial structures and equipment (I’ve omitted the change in inventories). The big mover is the proportion of our investment that goes toward intellectual property, which has almost doubled.
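The gross versus net distinction above comes down to one subtraction: net investment is gross investment minus depreciation, and the capital stock grows only by the net amount. Here is a minimal sketch. The function names and all of the numbers are hypothetical and are not taken from the BEA accounts.

```python
def net_investment(gross_investment: float, depreciation: float) -> float:
    """Net investment: gross investment minus what is needed to replace worn-out capital."""
    return gross_investment - depreciation


def next_capital_stock(capital_stock: float, gross_investment: float,
                       depreciation_rate: float) -> float:
    """Capital stock one period later, assuming a constant depreciation rate."""
    depreciation = capital_stock * depreciation_rate
    return capital_stock + net_investment(gross_investment, depreciation)


# Hypothetical economy: capital stock of 1000, gross investment of 80, 5% depreciation.
depreciation = 1000.0 * 0.05
net = net_investment(80.0, depreciation)
new_stock = next_capital_stock(1000.0, 80.0, 0.05)
print(f"gross=80, depreciation={depreciation:.0f}, net={net:.0f}, new stock={new_stock:.0f}")
```

In this made-up case, more than half of gross investment only keeps the existing capital stock from shrinking.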

It’s easiest for us to think about the quantities of investment that we can afford in 2022 as a proportion of 1990. Below are the inflation-adjusted quantities of investment per capita. On a per-person basis, we invest more in all capital types in 2022 than we did in 1990. Intellectual property investment has risen more than 600% over the past 30 years. The investment that produces the most value has moved toward digital products, including software. We also invest 250% more in equipment per person than we did in 1990. The average worker has far more productive tools at their disposal – both physical and digital. Overall real private investment is 3.5 times higher than it was 30 years ago.
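The comparison above depends on two adjustments: deflating nominal spending by a price index and dividing by population. Here is a minimal sketch of that calculation. The function name and every input are hypothetical placeholders, not actual BEA or FRED values.

```python
def real_per_capita(nominal: float, price_index: float, population: float,
                    base_index: float = 100.0) -> float:
    """Inflation-adjusted spending per person, in base-period prices."""
    if price_index <= 0 or population <= 0:
        raise ValueError("price_index and population must be positive")
    return nominal / (price_index / base_index) / population


# Hypothetical inputs: nominal investment, price index (1990 = 100), population.
per_person_1990 = real_per_capita(nominal=1000.0, price_index=100.0, population=250.0)
per_person_2022 = real_per_capita(nominal=4200.0, price_index=200.0, population=300.0)

# 2022 as a proportion of 1990; a value of 1.75 would mean 75% more per person.
proportion = per_person_2022 / per_person_1990
print(f"1990: {per_person_1990:.2f}  2022: {per_person_2022:.2f}  ratio: {proportion:.2f}")
```

Note how nominal spending rises 4.2-fold in this made-up example, but prices and population absorb most of it, leaving a 1.75-fold rise in real investment per person.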


“Cheapflation”: Inflation Really Does Hit the Bottom Harder

During the peak of the Covid inflation in 2022 I speculated that food inflation was worst for the cheapest products:

typical McDouble now costs well over $2 in most of the US, while a typical Big Mac is still well under $6. You used to be able to get 4-5 McDoubles for the price of a Big Mac; now you typically get less than 3 and sometimes, as in Keene, less than 2.

What’s going on here? First, the McDouble was always absurdly cheap. Second, prices rise most quickly where demand is inelastic, and demand is less elastic for goods that are cheaper and goods that are more like “necessities” than “luxuries”.

That post was just based on a couple of anecdotes from my personal experience, but a new NBER working paper by Alberto Cavallo and Oleksiy Kryvtsov confirms that this really was a general trend:

We use micro price data for food products sold by 91 large multi-channel retailers in ten countries between 2018 and 2024. Measuring unit prices within narrowly defined product categories, we analyze two key sources of variation in prices within a store: temporary price discounts and differences across similar products. Price changes associated with discounts grew at a much lower average rate than regular prices, helping to mitigate the inflation burden. By contrast, cheapflation—a faster rise in prices of cheaper goods relative to prices of more expensive varieties of the same good—exacerbated it. Using Canadian Homescan Panel Data, we estimate that spending on discounts reduced the change in the average unit price by 4.1 percentage points, but expenditure switching to cheaper brands raised it by 2.8 percentage points….

The prices of cheaper brands grew between 1.3 to 1.9 times faster than the prices of more expensive brands—and only when inflation surged, not before or after.

Zoning Matters for Rising Housing Costs, Especially After 1980

From a new working paper “The Price of Housing in the United States, 1890-2006” by Ronan C. Lyons, Allison Shertzer, Rowena Gray & David N. Agorastos (emphasis added):

“Zoning was adopted by almost every city in our sample during the 1920s. We see a slightly steeper gradient over the next two periods (coefficients of .48 and .29, respectively). In these periods it is possible both that the existing zoning regimes were causing higher price growth and that home price appreciation was incentivizing cities to adopt even more restrictive measures, particularly by the 1970s (Fischel, 2015; Molloy et al., 2020). The gradient in the final period (1980-2006) is even steeper, however (coefficient of .67), suggesting a closer relationship between zoning and home price appreciation towards the end of the 20th century.”

The authors acknowledge that they cannot establish causality with their data, but this is consistent with existing research, such as a paper by Gyourko and Krimmel that I previously discussed.