Every day, three quintillion bytes of data are generated. This data comes from diverse sources such as digital pictures, videos, social media posts, e-businesses, intelligent sensors, and log storage in the IT industry.
According to McKinsey, “big data” refers to datasets whose size is far beyond the ability of typical database software tools to capture, store, manage and analyse.
One of the major sources of data is log storage: the IT industry records vast amounts of information in the form of logs. This data is so vast that traditional systems cannot handle it, because the logs are semi-structured in nature and growing at great velocity.
Sensor data refers to the data produced by sensors, and its enormous volume is another challenge for big data. A prominent source of sensor data is the Large Hadron Collider (LHC).
The LHC is the world’s largest and highest-energy particle accelerator. The data flow in its experiments amounts to twenty-five to two hundred petabytes of information that must be processed and stored.