This semester, the textbook I am using to teach data analytics is Business Intelligence by Sharda, Delen, and Turban. In Chapter 3, the authors describe how a data warehouse fits into a business enterprise. A data warehouse (DW) is more than a spreadsheet. It is more than a two-dimensional transactional database. A DW takes expertise to build and maintain. If done correctly, users within the company will be able to quickly access important data that they need to make decisions. Having a good DW is essential for any large enterprise today.
Near the end of the chapter, the authors list problems that technologists encounter when they build a DW for an enterprise.
Problem #3 is “Engaging in politically naive behavior.”
Do not simply state that a data warehouse will help managers make better decisions. This may imply that you feel they have been making bad decisions until now.
Wow. What else can go wrong?
Problem #4 is “Loading the warehouse with information just because it is available.”
Do not let the data warehouse become a data landfill. This would unnecessarily slow the use of the system.
Saving data has become very cheap. Here is a blog that documents the decrease in the cost of hardware for data storage. It is also true that “evidence based decision-making” can give an organization a competitive advantage. What does not follow is that all available data should be loaded into your data warehouse.
A data warehouse is useful if users can get to the relevant data quickly for important decisions. When too much data is saved, the irrelevant data becomes clutter.
I relate to this in my own life because of my phone camera. Today, I can take hundreds of pictures on my smartphone and they stay in my phone’s memory. At the end of the year, I’d like fewer than 50 pictures that are high quality. What actually happens is that I’m overwhelmed with the clutter in my photo gallery.
I don’t want to return to the old days of film. However, the fact that film was costly to purchase and develop forced people to make better choices about what to take pictures of. At the end of the year, it seems like it would have been easier to find the handful of images that are worth saving and curating for the future.
Even though the marginal cost of saving more data has gone down, there is still a reason not to save too much data. The cost of the human hours needed to wade through it all is prohibitive. According to Glassdoor, the average annual base pay for a data warehouse administrator is $118,153.
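To put a rough number on those human hours, here is a back-of-the-envelope sketch in Python. The salary is the Glassdoor figure quoted above. Everything else is an assumption made purely for illustration: a 2,080-hour working year (40 hours a week for 52 weeks) and a hypothetical administrator who loses 5 hours a week to wading through clutter.

```python
# Back-of-the-envelope sketch. Only the salary comes from the post;
# the working hours and the hours lost to clutter are assumptions.

ANNUAL_BASE_PAY = 118_153        # Glassdoor figure quoted above (USD)
WORK_HOURS_PER_YEAR = 40 * 52    # assumed: 40-hour weeks, 52 weeks a year


def hourly_rate(annual_pay: float, hours_per_year: float = WORK_HOURS_PER_YEAR) -> float:
    """Convert an annual salary into an approximate hourly rate."""
    return annual_pay / hours_per_year


def clutter_cost(hours_lost_per_week: float, weeks: int = 52) -> float:
    """Estimate the yearly salary cost of time spent sorting through clutter."""
    return hourly_rate(ANNUAL_BASE_PAY) * hours_lost_per_week * weeks


rate = hourly_rate(ANNUAL_BASE_PAY)
yearly = clutter_cost(hours_lost_per_week=5)  # hypothetical 5 hours per week

print(f"Approximate hourly rate: ${rate:,.2f}")
print(f"Approximate yearly cost of 5 lost hours per week: ${yearly:,.2f}")
```

Under these assumptions, five lost hours a week is one eighth of a 40-hour week, so the estimate is simply one eighth of the annual salary. Changing `hours_lost_per_week` shows how quickly the human cost can outgrow the cost of the storage itself.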