UDP-Understanding Big Data to Gain Insight and Knowledge Lesson

Understanding Big Data to Gain Insight and Knowledge

What did you eat for breakfast? What did you wear today? What songs did you download from iTunes or listen to today? What TV shows did you record or watch? What videos did you watch on YouTube? What social media sites and web pages did you visit?

You create data every day.  📅

In 2012, 2.5 quintillion bytes of data were created every day. According to an updated Digital Universe Study, the amount of digital data produced over the following eight years will exceed 40 zettabytes, which is the equivalent of 5,200 GB of data for every man, woman, and child on Earth.
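The per-person figure can be checked with a little arithmetic. The sketch below assumes decimal units (1 GB = 10^9 bytes and 1 ZB = 10^21 bytes), which the lesson does not state, and the variable names are only illustrative.

```python
# Assumption: decimal (SI) units; the study summary above does not say which it uses.
BYTES_PER_GB = 10**9
BYTES_PER_ZB = 10**21

total_bytes = 40 * BYTES_PER_ZB          # 40 zettabytes
per_person_bytes = 5_200 * BYTES_PER_GB  # 5,200 GB per person

# Number of people implied by dividing the total by the per-person share
implied_people = total_bytes / per_person_bytes
print(f"Implied population: {implied_people / 1e9:.2f} billion")
```

Under these assumptions the two figures imply a population of roughly 7.7 billion people, so they are consistent with each other.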

As a society, we are producing and capturing more data each day than ever before.

Look at the infographic below to get an idea of how much data was created through the Internet every minute in 2013.

Big Data Infographic

  • Twitter users tweet nearly 300,000 times.
  • Facebook users share nearly 2.5 million pieces of content.
  • Instagram users post nearly 220,000 new photos.
  • YouTube users upload 72 hours of new video content.
  • Apple users download nearly 50,000 apps.
  • Email users send over 200 million messages.
  • Amazon generates over $80,000 in online sales.
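To get a feel for these per-minute numbers, they can be scaled up to a full day. This is a rough sketch that assumes the 2013 rates stay constant around the clock, which the infographic does not claim; the dictionary keys are illustrative labels.

```python
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes

# Per-minute figures from the list above. The source says "nearly" or "over",
# so these are rough values, not exact counts.
per_minute = {
    "tweets": 300_000,
    "Facebook items shared": 2_500_000,
    "Instagram photos": 220_000,
    "emails sent": 200_000_000,
}

# Assumption: the same rate holds for every minute of the day.
per_day = {name: count * MINUTES_PER_DAY for name, count in per_minute.items()}

for name, count in per_day.items():
    print(f"{name}: about {count:,} per day")
```

At these rates, 300,000 tweets per minute works out to about 432 million tweets per day.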

Data is streamed from our phones, credit cards, televisions, and computers. That data is collected through the Internet, sensors, and cameras. The datasets are growing too large to manage with common software tools.    

When data becomes this large and complex, it is referred to as "Big Data". Big data is a collection of data so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. As storage has become cheaper and computers more powerful, the need for advanced computing solutions to address large-scale data analysis problems has grown.

At the lowest level, data is a binary sequence of 1s and 0s. The Data Management chart below shows how much data each unit of measurement represents.

Data Management Chart
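The relationship between the units of measurement can also be sketched in code. The example below assumes decimal units, where each unit is 1,000 times the one before it; the function name `human_readable` is hypothetical and chosen only for illustration.

```python
# Assumption: decimal (SI) units, where each step is 1,000 times the previous one.
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]

def human_readable(num_bytes):
    """Express a byte count in the largest unit that keeps the value at 1 or more."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop once the value fits in this unit, or when we run out of units.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000

print(human_readable(2.5e18))  # 2.5 quintillion bytes -> 2.5 EB
print(human_readable(40e21))   # 40 zettabytes         -> 40.0 ZB
```

Seen this way, the 2.5 quintillion bytes created each day in 2012 is 2.5 exabytes.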

What is the difference between data and information?

Data is raw, unorganized facts that need to be processed. A single student's test score is an example of a piece of data.

Information is the processed data that has been organized and structured to provide answers to questions. The average test score of all students is information on how well the class did as a whole and how well students did compared to the average.
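The test-score example can be sketched in a few lines. The student names and scores below are made up for illustration.

```python
from statistics import mean

# Hypothetical raw data: individual test scores (invented for this example).
scores = {"Ana": 88, "Ben": 72, "Cara": 95, "Dev": 65}

# Processing the data produces information about the class as a whole...
class_average = mean(scores.values())
print(f"Class average: {class_average:.1f}")

# ...and about how each student did compared to the average.
for student, score in scores.items():
    difference = score - class_average
    print(f"{student}: {score} ({difference:+.1f} vs. average)")
```

Each number in `scores` is data; the average and the comparisons computed from it are information.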

Unstructured Data

Most data resides as unstructured data. Unstructured data refers to data that is not organized using a computational tool such as a spreadsheet or database. Unstructured data can be textual or numeric. It resides on servers, on networks, and in the cloud.

Structured Data

For data to be useful, it needs to be processed and structured. This process takes the raw unstructured data and uses computational tools to turn it into useful information.  
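As a minimal sketch of turning unstructured text into structured records, the example below pulls names, scores, and subjects out of free-form notes. The notes and the matching pattern are hypothetical; real unstructured data is usually far messier and needs more robust tools.

```python
import re

# Hypothetical unstructured text, e.g. lines typed into a notes file.
raw_notes = [
    "Ana scored 88 on the math test",
    "Ben got a 72 in science",
    "Reminder: bring calculators on Friday",
]

# Assumed pattern: a capitalized name, some words, a 1-3 digit score,
# then "on the" or "in" followed by a subject.
pattern = re.compile(
    r"^(?P<name>[A-Z][a-z]+) .*?(?P<score>\d{1,3}) (?:on the|in) (?P<subject>\w+)"
)

records = []
for line in raw_notes:
    match = pattern.search(line)
    if match:  # lines that do not fit the pattern are skipped
        records.append({
            "name": match["name"],
            "score": int(match["score"]),
            "subject": match["subject"],
        })

for record in records:
    print(record)
```

The two lines that mention a score become rows with the same three fields, much like rows in a spreadsheet, while the reminder line is left out.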

A key part of turning data into useful information is called Data Mining. Data Mining involves using computational tools to find patterns in large amounts of data. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use.
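As a very small illustration of finding a pattern in data, the sketch below counts which pairs of items are bought together most often. The purchase records are made up, and real data mining works on far larger data sets with more sophisticated techniques.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase records (invented for this example).
baskets = [
    {"bread", "milk", "eggs"},
    {"bread", "milk"},
    {"milk", "cereal"},
    {"bread", "milk", "cereal"},
]

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in baskets:
    # Sorting makes ("bread", "milk") and ("milk", "bread") count as one pair.
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# The most frequent pair is a simple "pattern" found in the data.
most_common_pair, count = pair_counts.most_common(1)[0]
print(most_common_pair, count)
```

Here bread and milk appear together in three of the four baskets, the kind of pattern a store might use when deciding how to arrange or promote its products.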

How is Big Data Useful?

When data has been properly captured and analyzed, it can provide important insights and knowledge to help make decisions and predictions. Businesses may use this data to make informed decisions, improve customer service, and create personalized marketing campaigns. Properly analyzed data may help organizations understand their inefficiencies and identify opportunities that may lead to growth. Lastly, analyzed data may help organizations understand their customers, observe their competitors, and tailor their products or services. Big data has applications in:

  • Retail
  • Sports
  • Finance
  • Government
  • Environment
  • People
  • Manufacturing
  • Education
  • Medicine
  • Disease control
  • And many more

[CC BY 4.0] UNLESS OTHERWISE NOTED | IMAGES: LICENSED AND USED ACCORDING TO TERMS OF SUBSCRIPTION