You are probably tiring of hearing that there is a data revolution going on around us. But the data we have and the ways in which we are collecting it have changed. Viktor Mayer-Schönberger and Kenneth Cukier, in Big Data, coin the term datafication to refer to taking information about anything (including things we have never considered information or data) and transforming it into data using a format that quantifies it.
You are probably creating data through the process of datafication right now. Do you have a smartphone at hand? It’s beaconing its position every few moments to a cell tower nearby. Through the cellular backhaul, your location is datafied and noted. You’ve done nothing overtly and yet you are leaving data tracks everywhere you go—when you carry a smartphone, make a credit card purchase, log into a website, use your keycard to enter a parking lot, or order a movie on Netflix. It’s all data now. We’re in the middle of a data (information) revolution and the revolution is continuing; in order to extract truth from the huge quantity of data created in the process, new methods are required.
The traditional statistical methods (the ones you and I learned in college) are still valid, but we need new ways of interpreting and sifting through the data. Traditional statisticians tend to have complex models and focus on smaller data sets. They have been very productive in doing so, but the new data scientists are used to working with vast amounts of relatively unstructured data.
Hal Varian is the chief economist at Google. I first became aware of Varian because of his 1978 microeconomics textbook, written while he was on the faculty at the University of Michigan. This past January (2013), Varian gave a few talks and chaired some sessions at the annual American Economic Association (AEA) meetings in San Diego, California. He’s still an economist, but now he advises Google on almost everything that is a part of its business model. There is no single set price for ads on Google; they are sold by auction. He recognizes that businesses that grasp the instantaneity of commerce today will flourish, while those that ignore it will wither away.
When the submarine cable came to Australia and allowed almost instantaneous information to be passed from London to Sydney, it changed international trade forever. Firms that embraced the new speed of information grew and prospered, and others were little heard from again. The same thing is taking place today. The Internet and everything associated with it (including your smartphone) provide an almost instantaneous path for worldwide communication and datafication.
Varian notes that a great deal of the new data created is free or very nearly so. Google provides some of this new information almost without charge. At the AEA meetings, he spoke of Google’s flu data (http://www.google.org/flutrends/us/#US). This is an example of Google noticing that as people use the search engine with particular phrases, they provide information about the likely severity of flu cases. That information seems to be available even more quickly than the Centers for Disease Control and Prevention (CDC) statistics that are collected and published in an entirely different and more traditional manner.
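To make the idea concrete for forecasters, here is a minimal sketch of how a search signal could be turned into a current-week estimate. It is not Google's actual method, which I am not describing here; it simply fits a straight line by ordinary least squares. The function names (fit_line, nowcast) and every number in it are hypothetical and invented for illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def nowcast(query_share, slope, intercept):
    """Estimate the current flu rate from this week's search share."""
    return intercept + slope * query_share


# Made-up numbers for illustration only; NOT real Google or CDC figures.
# Past weeks: share of searches containing flu-related phrases (%) ...
query_share = [0.8, 1.1, 1.9, 2.6, 3.0]
# ... and the officially reported flu-visit rate for the same weeks (%).
reported_rate = [1.2, 1.5, 2.4, 3.3, 3.6]

slope, intercept = fit_line(query_share, reported_rate)

# The search share for the current week is available immediately,
# while the official figure arrives later.
estimate = nowcast(2.2, slope, intercept)
print(f"Estimated flu-visit rate this week: {estimate:.2f}%")
```

The point of the sketch is the timing: the relationship is learned from past weeks where both series exist, and then applied to the search data alone, which arrives first.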
All of this brings us to the upcoming Exploring Big Data: Implications for Forecasting and Demand Planning webinar scheduled for June 25/26. We will use that session to introduce you to some of the uses and concepts associated with big data and to explain at least one technique that you will see used for analyzing such data. The webinar will be conducted in an Ask the Expert format, and I invite you to submit your questions about big data so that I can address them during the webinar. I look forward to hearing from you. Ask your question today, and don’t forget to register for the webinar!
Mayer-Schönberger, Viktor, and Kenneth Cukier. Big Data. New York: Houghton Mifflin Harcourt, 2013.