According to a research report from International Data Group (IDG) published earlier in the year, companies are getting serious about Big Data. Nearly half of all respondents (49%) stated that they have either implemented or begun implementing a Big Data solution. Among the companies investing, respondents cited the CEO as the “Largest Supporter of Big Data Efforts.” Manufacturers, distributors and retailers stand to gain a great deal from Big Data, provided they understand what it is, how it differs from traditional reporting and how cloud computing drives the ROI, even for small companies.
Are you currently partnering with your CEO to purchase, manage and execute on a Big Data strategy? Or are you struggling to understand what Big Data is, let alone understand what it can do for your business?
Traditionally, a business intelligence initiative has most often started with the idea of a data warehouse. A data warehouse is now a well-understood concept: a central repository of data from many sources, broad in scope (usually containing several years of data) and organized for business reporting, aggregations and trend analysis. A data warehouse typically consists of data directly related to your business. It is easy to grasp why we need to add, average or summarize this type of data in order to ask “how is our business doing?”
Big Data is a different type of data, designed to answer a different set of questions. Big Data is usually data which we would not traditionally consider valuable. However, patterns in that data may help us optimize decision making. For example, what are our customers (as a population) doing when they are not in our retail store? This is a more outward-facing question, and most retailers would like to know what other shops their customers patronize. These insights can lead to joint marketing opportunities.
Another example: through which regions of the world have the raw materials we need to manufacture our products been flowing? Could the cost of these materials be impacted by global weather patterns? Supply chains are impacted by external variables. To optimize them, analysts may look outward.
One more: a distributor of home healthcare products decides to divert inventory from the northeastern U.S. to the southeastern U.S. because instances of the words “cough,” “cold” or “fever” spike 15% in tweets coming from Atlanta. In this case, a revenue opportunity was seized.
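To make the distributor example concrete, here is a minimal sketch of how such a spike might be detected. The function names, the simple word matching and the comparison of two batches of tweets are all assumptions made for illustration; a real system would draw on a streaming data source and more careful text analysis.

```python
import re

# Keywords taken from the example above; the matching rule is an assumption.
KEYWORDS = {"cough", "cold", "fever"}


def count_keyword_instances(tweets):
    """Count every occurrence of a keyword across a batch of tweet texts."""
    total = 0
    for text in tweets:
        # Lowercase and split into simple word tokens (exact matches only,
        # so "coughing" is not counted in this sketch).
        tokens = re.findall(r"[a-z']+", text.lower())
        total += sum(1 for token in tokens if token in KEYWORDS)
    return total


def relative_change(baseline, current):
    """Return the fractional change from baseline to current (0.15 = +15%)."""
    if baseline <= 0:
        raise ValueError("baseline count must be positive")
    return (current - baseline) / baseline


def is_spike(baseline_tweets, current_tweets, threshold=0.15):
    """Report whether keyword instances rose by at least the threshold."""
    baseline = count_keyword_instances(baseline_tweets)
    current = count_keyword_instances(current_tweets)
    return relative_change(baseline, current) >= threshold
```

Comparing last week's tweets from a region against this week's with `is_spike(last_week, this_week)` would flag the kind of increase that prompted the inventory decision described above.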
Where does this data come from? The universe of publicly available data is rapidly expanding. Companies offering consumer subscription data, online reviews, location data and digital media are launched every day. Mobile devices and social media have driven this expansion, but no less important are technologies which permit machines to transmit data without human intervention. For example, “Smart Grid” technologies use machine-generated data to help energy companies optimize capacity as consumer demand shifts.
Big Data is statistical analysis applied to massive volumes of data, and, for this reason, Big Data comes with a new set of challenges. Without an extremely high volume, useful statistical patterns cannot be inferred. Big Data can include both transactional and unstructured data, such as images, video, GPS coordinates, documents and tweets. Therefore, Big Data tools must be able to infer structure from data, rather than transforming the data to comply with a pre-designed structure. Finally, Big Data is generated faster than traditional tools can process it. Volume, Variety and Velocity (the 3 Vs) are all characteristics and challenges of Big Data analysis.
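The idea of inferring structure from data, rather than forcing data into a pre-designed structure, can be illustrated with a small sketch. It assumes the incoming records arrive as JSON text, one record per string; the `infer_schema` function and the sample records are hypothetical and exist only to show the principle.

```python
import json
from collections import defaultdict


def infer_schema(raw_records):
    """Map each field name to the sorted list of value types seen for it."""
    observed = defaultdict(set)
    for raw in raw_records:
        record = json.loads(raw)
        for field, value in record.items():
            # Record the type of each value instead of enforcing one up front.
            observed[field].add(type(value).__name__)
    return {field: sorted(types) for field, types in observed.items()}


# Hypothetical records with differing fields and types.
sample_records = [
    '{"user": "a1", "text": "feeling a cold coming on", "lat": 33.75}',
    '{"user": "b2", "image_url": "photo.jpg"}',
    '{"user": 42, "text": "hello"}',
]

schema = infer_schema(sample_records)
```

Here no record is rejected for lacking a field or for using an unexpected type. The structure is discovered after the data has been collected, which is the reverse of how a traditional data warehouse is loaded.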
To perform this statistical analysis, we need tools which can process the 3 Vs of Big Data. We need inexpensive storage for our high volumes of data. We need software which infers structure from the data it analyzes and can be distributed to thousands of servers all working together. And we need “hyper-scale” ability, so that we can scale up or down based upon the velocity of events being analyzed. These challenges would be insurmountable for small to mid-size organizations without the benefits of cloud computing.
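As a rough illustration of how work can be split across many machines, the sketch below counts words in separate partitions of data and then merges the partial results. It runs in a single process and only mimics the divide-and-combine pattern; the partitions stand in for data held on different servers, and all names are assumptions made for this example rather than any particular product's interface.

```python
from collections import Counter
from functools import reduce


def count_partition(lines):
    """Count words in one partition; each server could run this independently."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Combine two partial results into one."""
    return left + right


def word_count(partitions):
    """Count each partition separately, then merge the partial counts."""
    partials = [count_partition(partition) for partition in partitions]
    return reduce(merge_counts, partials, Counter())
```

Because each partition is counted on its own, adding more servers means adding more partitions, which is what allows this style of processing to scale up or down with the volume of data.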
Big Data does not require a massive capital investment. Cloud computing makes computing power highly scalable, and the costs continue to go down. Internet software giants have been working on (and benefiting from) Big Data for over a decade. Today, open source tools such as Hadoop are freely available to companies that wish to complement their traditional reporting environments.