Business Analytics Textbook with R

There have been moments in my career as a data analytics instructor when I have considered writing my own textbook, just so I could have one that works. When I started in 2017, Samford University was one of the first schools to seriously reshape the undergraduate business school curriculum in response to the increase in demand for analytics skills. The pickings for appropriate textbooks were slim. Students in my class have already taken “business statistics”, which is a class I had to take as an undergraduate as well. I was trying to combine business case studies with analytics that was more advanced than basic stats but not beyond the undergrads, all while using a software program for applications.

I am pleased with what I see in my review copy of the new book by Saltz & Stanton, Data Science for Business with R.

Data Landfill

This semester, the textbook I am using to teach data analytics is Business Intelligence by Sharda, Delen, and Turban. In Chapter 3, the authors describe how a data warehouse fits into a business enterprise. A data warehouse (DW) is more than a spreadsheet. It is more than a two-dimensional transactional database. A DW takes expertise to build and maintain. If done correctly, users within the company will be able to quickly access important data that they need to make decisions. Having a good DW is essential for any large enterprise today.

Near the end of the chapter, the authors list problems that are encountered when technologists go in to build a DW for an enterprise.

Emily Oster on Vaccines in February 2021

My third post on Covid data heroes features Dr. Emily Oster. Emily is a mom. Lots of economists are moms, but few have incorporated it quite as much into their careers. Emily has written a book on pregnancy and a new one on what to do with the kids after they are born. She does a great job explaining scientific research in a way that is easy to understand.

Emily made a big push to collect data on schools and covid back when there was crippling uncertainty about how dangerous it is to let children go to school in person.

She has a great email newsletter and substack. Her latest post is called “Vaccines & Transmission Redux Redux”. In this post, she distills the latest research to give practical advice on when kids can see grandparents once the vaccines are out.

For a long time now, some families have been avoiding close contact with elderly relatives. When can we go back to normal?

The Massive SolarWinds Hack: A Work of Art

With all the uproar around the election in December, the news of the SolarWinds data breach did not get the attention it deserved. Some well-resourced foreign organization, almost certainly in Russia, succeeded in infiltrating the data systems of an astounding 18,000 or more U.S. organizations. These included major federal agencies such as the Pentagon, the Department of Homeland Security, the State Department, the Department of Energy, the National Nuclear Security Administration, and the Treasury; other big targets like Microsoft, Cisco, Intel, and Deloitte; and organizations like the California Department of State Hospitals and Kent State University. Security watchdogs ran out of adjectives (“11 out of 10”) in characterizing the magnitude of this hack.

At the same time, security experts cannot help admiring the sheer artistry of this exploit. Hackers themselves often view their code as a work of art. According to one cybersecurity expert, “Programmers and hackers like to sign their work like artists…So they sign that code in various ways. Often, they’ll leave their initials or they’ll try to be cute and put some sort of cryptic message.” So how was this hack accomplished?

Talking about redistribution in the lab

I am grateful to Yang Zhou for inviting me to talk about a working paper (with Gavin Roberts) on Friday. Yang told me that this audience is not familiar with lab experiments, so I’m going to take a few minutes out of my time to set the stage for my research.

There is a new book out, Causal Inference by Scott Cunningham, that is the talk of #EconTwitter (Cunningham, 2021). The book is 500 pages of dense prose and code. Here is a review saying that Cunningham left out many key things that a practitioner would need to know. Causal inference from naturally occurring data is hard!

Lab experiments bring something important to the research community. Lab experiments give the researcher a lot of control, which is why they are particularly useful for causal inference (Samek, 2019).

Going back to the gym?

Who doesn’t want to be stronger? You can get on the floor and do 5 pushups right now. Did you do it? Probably not. (If you did, great work.) For most people, nothing is stopping you from getting strong, except yourself.

I just keep sitting around. Going to a gym and meeting with an instructor in person used to be a way around this problem. This takes our human foibles and makes them work to our advantage. The sunk cost fallacy can work for us.

If you bought a stock and it’s a loser, you should sell! Too many people keep holding and go down with the ship.

However, knowing themselves, many people also go to the gym and sign up for a class. Not wanting to walk away from their investment, they actually do the classes.

The WSJ reports that many gyms are closing after Covid-19 forced the customers out. The article describes the machines people have brought into their homes to replace gyms. The Peloton is a signature of the year 2020. The new trend brings a live human trainer into the process of exercising alone at home.

The new machines can collect data on the user. This data is transmitted to instructors and maybe even friends. Now, from the comfort of your own home, you can “sign up for a class” again.

Had Covid struck in 1980, people might have bought fitness machines for their basements and they might even have bought a VHS tape to pop in and exercise with. But they would have been missing the link to a human who knows where they are supposed to be, which apparently provides more motivation.

The market has loved Peloton and smart money seems to think it will continue to do well, even with a vaccine already rolling out.

Fitbit got 2 billion and all I got was an email

I made a Fitbit account years ago, even though I don’t wear one. As a user, I got an email on Jan 14, 2021 alerting me that they just sold Fitbit to Google. The email assured me that Google will not try to muscle Fitbit users away from iPhones or iOS. Google has said that it will keep Fitbit data “separate from other Google ad data.” TechCrunch had some more details for me, including how many billions of dollars Fitbit was getting out of this deal.

Is it so bad to see ads based on your sleep habits? What if you had a bad night and then saw more coffee ads the next day? Seems fine. Is it more “creepy” than seeing an ad for something you just bought?

I don’t actually know much about Google’s data structure. But I can imagine ways that a large tech company could use Fitbit data in a way that users would not like. What if Google knows that you didn’t sleep well this week? Say someone else is using Google search to find a person to recruit for a desirable job in Public Relations. What if predictive models indicate that people who don’t get at least 6.5 hours of sleep per night are low performers? What if you ended up not getting linked up with your dream job, because you weren’t sleeping well one week? This is all speculative. What if Google starts measuring how your heart rate responds to viewing various websites that you access through Chrome? Have they agreed to not do that as part of the acquisition deal?

In 2018, Tyler sat down with Eric Schmidt, a senior executive of Google. Tyler asked him why Google doesn’t use their massive stores of data to inform investments for a hedge fund. Here was the reply:

SCHMIDT: Well, I’ll give you a more generic answer, which is, from the moment I joined the company, there were many people who said, “Why don’t you take this information and do something that will use it for marketing purposes?”

And the answer is always the same, which is that you need people’s permission to do that, and you can be sure you won’t get that permission, if you follow that reasoning. So we decided that was a pretty bright line. For example, if a tech company that were a consumer company were bundled with a hedge fund, you would have to disclose that it was being used in that context. The people would go crazy.

But the other thing that’s true — and Google was good about this — is we took the position that it was important for us to disclose everything we were doing as well as we could.

I’ll give you a governance argument. In a large company, the employees are independent citizens of humanity, and if they see corruption in your leadership — in other words, if they see you doing things which are inconsistent with the values, you will be criticized.

Schmidt doesn’t deny that Google could take advantage of data in order to become a successful hedge fund. He says that it would look bad, and Google doesn’t want to look bad even to its own employees. Hmmm, right? I don’t bring this up to accuse Google of wrongdoing. It just makes you wonder how things will unfold in the future. One can, at least, see why the acquisition of Fitbit was scrutinized.

I use Google products heavily on my laptop. I don’t have many “smart” devices aside from my smartphone. I wore the Fitbit step tracker for a few days, but I didn’t find the information to be helpful. It’s not like the Fitbit does the dishes for me or drives me to the gym. Get me that smart device and I’ll look at any ads you want.

QE, Stock Prices, and TINA

The U.S. economy as quantified by GDP has been sputtering along in slow growth mode for a number of years. It took a huge hit in 2020 due to covid shutdowns and has not nearly recovered. But stock prices have been rocketing upwards, and this past year is no exception. Markets took a cliff-dive in March, but have since way overshot to the upside.

Here is a plot of the past five decades of U.S. GDP and of the Wilshire 5000 index, which approximates the total stock market capitalization in the U.S.:

Chart Source: St. Louis Fed, as plotted by Lyn Alden Schwartzer

These two curves have crisscrossed each other over the past five decades, but in recent years the stock market has roared to the upside. One of Warren Buffett’s favorite metrics as to whether stocks are overvalued is to consider the ratio of these two quantities, i.e. the market-capitalization-to-GDP (Cap/GDP) ratio:

Source: Lyn Alden Schwartzer

The ratio is much higher than it has ever been. The last time it got this high was in 2000, and that did not end well.
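For readers who want to compute the Cap/GDP ratio themselves, here is a minimal Python sketch. The function name and the two input values are my own placeholders for illustration, not actual Wilshire 5000 or GDP figures; in practice you would substitute total market capitalization and GDP for the same period, in the same units.

```python
def cap_to_gdp_ratio(market_cap: float, gdp: float) -> float:
    """Return total stock market capitalization divided by GDP.

    Both inputs must be in the same units (e.g. trillions of dollars).
    """
    if gdp <= 0:
        raise ValueError("GDP must be positive")
    return market_cap / gdp


# Placeholder inputs for illustration only (NOT real data).
hypothetical_market_cap = 3.0
hypothetical_gdp = 2.0

ratio = cap_to_gdp_ratio(hypothetical_market_cap, hypothetical_gdp)
print(f"Cap/GDP ratio: {ratio:.0%}")
```

A ratio above 100% simply means that the total market value of stocks exceeds one year of economic output; whether that signals overvaluation is a matter of interpretation.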

Excess Mortality in 2020

My last post of 2020 tried to end the year on an optimistic note: the rapid innovation of a new vaccine was truly a marvel. But I also warned you that I would have a post in the new year talking about the deaths of 2020 during the pandemic. And here it is.

Throughout 2020, I have tried to keep up with the most recent data, not only on officially coded COVID-19 deaths, but also on other measures. An important one is known as excess mortality, which is an attempt to measure the number of deaths in a year that are above the normal level. Defining “normal” is sometimes challenging, but looking at deaths for recent years, especially if nothing unusual was happening, is one way to define normal. The team at Our World in Data has a nice essay explaining the concept of excess mortality.

One thing to remember about death data is that it is often reported with a lag. The CDC does a good job of regularly posting death data as it is reported, but these numbers can be unfortunately deceptive. For example, while the CDC has some death data reported through 51 weeks of 2020, they note that death data can be delayed for 1-8 weeks, and some states report more slowly than others (for reasons that are not totally clear to me, North Carolina seems to be way behind in reporting, with very little data reported after August).

So there’s the caution. What can we do with this data? Since 2019 was a pretty “normal” year for deaths, we can compare the deaths in 2020 to the same weeks of data in 2019. In the chart at the right, I use the first 48 weeks of the year (through November), as this seems to be fairly complete data (but not 100% complete!). The red line in the chart shows excess deaths, that is, 2020 deaths minus 2019 deaths for the same weeks. From this, we can see that there were over 357,000 excess deaths in 2020 in the first 11 months of the year, or about a 13.6% increase over the prior year.
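The comparison described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic under simple assumptions: the function name is mine, and the weekly counts are toy numbers, not CDC data. It sums deaths over matching weeks in the baseline year and the comparison year, then reports the difference and the percent increase.

```python
from typing import Sequence, Tuple


def excess_mortality(
    baseline_weekly: Sequence[int], observed_weekly: Sequence[int]
) -> Tuple[int, float]:
    """Return (excess deaths, percent increase) over matching weeks."""
    if len(baseline_weekly) != len(observed_weekly):
        raise ValueError("both years must cover the same weeks")
    baseline_total = sum(baseline_weekly)
    observed_total = sum(observed_weekly)
    excess = observed_total - baseline_total
    pct_increase = 100.0 * excess / baseline_total
    return excess, pct_increase


# Toy weekly death counts for illustration only (NOT CDC figures).
toy_2019 = [1000, 1000, 1000, 1000]
toy_2020 = [1000, 1100, 1200, 1100]

excess, pct = excess_mortality(toy_2019, toy_2020)
print(f"excess deaths: {excess}, increase: {pct:.1f}%")
# prints "excess deaths: 400, increase: 10.0%"
```

As a rough sanity check on the figures quoted above, 357,000 excess deaths at a 13.6% increase implies a 2019 baseline of roughly 2.6 million deaths over the same weeks (357,000 / 0.136).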

Is 13.6% a large increase? In short, yes. It is very large. I’ll explain more below, but essentially this is the largest increase since the 1918 flu pandemic.

2020 Holiday Viewing

Forget “The Christmas Prince” or “The Prince Christmas” or whatever is on Netflix. Why not spend your holiday refreshing this new vaccine dashboard?

Here’s the announcement:

I personally know a few health care workers who got their shots (do not say “jab” to me) this past week. It’s all very exciting! Here at University of Alabama at Birmingham (UAB), the medical community has freezers, fortunately.

Here’s VP Mike Pence getting his vaccine:

Jeremy and Doug have both talked about allocation this week. Economists get really jazzed about allocating scarce resources. It’s been frustrating to watch first tests and now vaccines not be available on a market. Excellent points are also made every week over at Marginal Revolution on how we are missing an opportunity to get the incentives right. Supply. Curves. Slope. Up. (Thousands. Dying. Every. Week.)