Big Data

Updated: Thu, 30 Apr 2015 by Rad

Big Data is a broad term for data sets so large or complex that traditional data processing applications are inadequate. Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, and information privacy.

The term often refers simply to the use of predictive analytics or certain other advanced methods to extract value from data, and seldom to a particular size of data set.

Big Data is a large and growing collection of data, gathered by cheap and numerous information-sensing mobile devices, aerial (remote sensing) systems, software logs, cameras, microphones, radio-frequency identification (RFID) readers, and wireless sensor networks.

Different meanings to different people

"Big Data" means different things to different people, and there isn't, and probably never will be, a commonly agreed-upon definition. But the phenomenon is real, and it is producing benefits in many different areas.

The basic idea behind the phrase 'Big Data' is that everything we do is increasingly leaving a digital trace, which we can use and analyse.

"Big Data therefore refers to that data being collected and our ability to make use of it."

Bernard Marr, http://www.datasciencecentral.com
Big Data
Credit: Camelia.boban (Own work), via Wikimedia Commons. Licensed under Creative Commons CC BY-SA.

Challenges: One of the challenges with big data is to properly estimate your uncertainty. Often Big Data means a huge amount of data that isn't exactly what you want.

Industry experts agree that big data involves one or more of the following aspects:
  • velocity - it moves extremely fast through various sources such as online systems, sensors, social media, web clickstream capture, and other channels
  • variety - Big Data is made of many types of data from many sources - structured and semi-structured, as well as unstructured - emails, text messages, documents, video, images, sensors, traffic data and the like
  • volume - it may involve terabytes to petabytes of data
  • complexity - it must be able to traverse multiple data centers, the cloud and geographical zones
  • scale and performance - with big data you want to be able to scale very rapidly and elastically
  • data security - Big Data carries some big risks when it contains credit card data, personal ID information and other sensitive assets

Data visualization

Data visualization is becoming an increasingly important component of analytics in the age of big data. From the beginning of recorded time until 2003, humans had created 5 exabytes (5 billion gigabytes) of data. In 2011, the same amount was created every two days.
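
The figures above can be checked with a little arithmetic. The sketch below assumes decimal (SI) units, where 1 gigabyte is 10^9 bytes and 1 exabyte is 10^18 bytes; the function and constant names are illustrative and do not come from the original text.

```python
# Decimal (SI) byte units. This is an assumption: binary units
# (gibibytes, exbibytes) would give slightly different numbers.
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_EXABYTE = 10**18


def exabytes_to_gigabytes(exabytes):
    """Convert a whole number of exabytes to gigabytes."""
    return exabytes * BYTES_PER_EXABYTE // BYTES_PER_GIGABYTE


def daily_rate_exabytes(total_exabytes, days):
    """Average amount of data created per day, in exabytes."""
    return total_exabytes / days


gigabytes = exabytes_to_gigabytes(5)
rate_2011 = daily_rate_exabytes(5, 2)

print(f"5 exabytes = {gigabytes:,} gigabytes")
print(f"2011 rate: {rate_2011} exabytes per day")
```

Under these assumptions, 5 exabytes works out to 5 billion gigabytes, and creating that amount every two days corresponds to about 2.5 exabytes per day.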


Big data is set to offer companies tremendous insight. But with terabytes and petabytes of data pouring into organizations today, traditional architectures and infrastructures are not up to the challenge.

A frequently cited claim is that the brain processes images 60,000 times faster than text. Visualization is the final step in a big data analytics workflow: a visual representation of the insights gained from your analysis. Any time your data changes, the visualization should automatically update with the newest results.

Big Data Visualization
Visualization example
Credit: Jer Thorp, Tokyo,Cairo: Comparing Obama's Foreign Policy Speeches on Flickr. Licensed under Creative Commons.

Data visualization includes:

  • graphs
  • maps
  • tables
  • shapes
  • business infographics

Big vendors in Big Data

For organizations of all sizes, data management has shifted from an important competency to a critical differentiator that can determine market winners and has-beens.

Fortune 1000 companies and government bodies are starting to benefit from the innovations and are defining new initiatives and reevaluating existing strategies to examine how they can transform their businesses using Big Data.

In the process, they are learning that Big Data is not a single technology, technique or initiative. Rather, it is a trend across many areas of business and technology.

The Big Data landscape is dominated by two classes of technology: systems that provide operational capabilities for real-time, interactive workloads where data is primarily captured and stored; and systems that provide analytical capabilities for retrospective, complex analysis that may touch most or all of the data. These classes of technology are complementary and frequently deployed together.
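
As a rough illustration of how the two classes complement each other, the Python sketch below uses a hypothetical in-memory `OperationalStore` that captures and serves individual records, plus an `analyze` function that makes a retrospective pass over all of them. Both names and the sample click data are assumptions made for illustration; real deployments would use dedicated database and analytics systems.

```python
from collections import Counter


class OperationalStore:
    """Hypothetical stand-in for an operational system:
    captures records and serves single-record lookups."""

    def __init__(self):
        self._events = {}

    def capture(self, event_id, event):
        # Store one incoming record under its identifier.
        self._events[event_id] = event

    def lookup(self, event_id):
        # Interactive, real-time style read of a single record.
        return self._events.get(event_id)

    def all_events(self):
        # Hand the full data set to an analytical job.
        return list(self._events.values())


def analyze(events):
    """Hypothetical stand-in for an analytical system:
    a retrospective pass over all events, counting views per page."""
    return Counter(event["page"] for event in events)


store = OperationalStore()
store.capture(1, {"user": "a", "page": "/home"})
store.capture(2, {"user": "b", "page": "/pricing"})
store.capture(3, {"user": "a", "page": "/home"})

print(store.lookup(2))            # operational: one record
page_views = analyze(store.all_events())
print(page_views.most_common(1))  # analytical: touches all records
```

The operational side answers questions about one record at a time, while the analytical side reads everything that was captured, which is why the two are often deployed together.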

Some of the top Big Data analytics platforms
