Types of Data Under Big Data
Summary
TLDR: In this video, we explore the three main types of data in Big Data: structured, unstructured, and semi-structured. Structured data, making up about 20% of total data, is organized in rows and columns, as in databases. Unstructured data, which makes up around 80%, lacks a clear format and includes files like PDFs, images, and videos. Semi-structured data falls in between: it has some organizational properties, as seen in XML and JSON files. The discussion emphasizes the characteristics of these data types and the challenges of handling them in Big Data.
Takeaways
- Big Data is concerned with handling massive volumes of data and is a growing subject of study.
- Data falls into three main types: structured, unstructured, and semi-structured.
- Structured data is organized in rows and columns, which makes it easy to search and analyze; databases are the primary example.
- Unstructured data lacks a clear format and makes up about 80% of all data; it includes text documents, images, videos, and satellite images.
- Semi-structured data has some organizational properties but does not fit the row-and-column layout of traditional databases; XML and JSON files are examples.
- Structured data accounts for roughly 20% of existing data and comes from sources such as sensors and surveys.
- Human-generated structured data includes names, addresses, and demographics.
- Unstructured data cannot be stored in traditional database formats, which complicates its management and analysis.
- Human-generated unstructured data includes social media content, videos, and images.
- NoSQL databases are well suited to storing semi-structured data because they offer a flexible structure for varied data types.
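As a rough illustration of the three categories, the sketch below sorts file names into them by extension. The mapping and the function name `classify_by_extension` are hypothetical and use only the examples named in the takeaways; in practice the category depends on a file's content, not its name.

```python
from pathlib import PurePath

# Hypothetical extension-to-category mapping, built only from the examples
# mentioned in the takeaways. This is a heuristic, not a standard.
CATEGORY_BY_EXTENSION = {
    ".db": "structured",
    ".sqlite": "structured",
    ".xml": "semi-structured",
    ".json": "semi-structured",
    ".pdf": "unstructured",
    ".jpg": "unstructured",
    ".png": "unstructured",
    ".mp4": "unstructured",
}


def classify_by_extension(filename: str) -> str:
    """Return a rough data category for a file name, or 'unknown'."""
    suffix = PurePath(filename).suffix.lower()  # ".PDF" becomes ".pdf"
    return CATEGORY_BY_EXTENSION.get(suffix, "unknown")
```

Calling `classify_by_extension("orders.json")` would fall into the semi-structured bucket, while a name with no recognized extension is reported as unknown rather than guessed.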
Q & A
What are the three main categories of data under Big Data?
-The three main categories of data under Big Data are structured data, unstructured data, and semi-structured data.
How is structured data defined?
-Structured data is defined as data that can be represented in the form of rows and columns, making it easy to store in databases. Examples include data from web logs, sensor data, and survey results.
What percentage of total existing data is structured data?
-Structured data accounts for nearly 20% of the total existing data.
Can you give examples of unstructured data?
-Examples of unstructured data include text documents, PDFs, images, and videos, as well as machine-generated data such as satellite images.
What is the significance of unstructured data in Big Data?
-Unstructured data is significant in Big Data because it makes up approximately 80% of all existing data yet lacks a clear format, which makes it harder to store and analyze.
What characterizes semi-structured data?
-Semi-structured data is characterized by containing some organizational properties but cannot be stored using traditional database formats. It may appear structured in some instances and unstructured in others.
What are some examples of semi-structured data?
-Examples of semi-structured data include XML files, JSON documents, and spreadsheet files.
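To make "some organizational properties" concrete, here is a minimal sketch that parses a small JSON document and a small XML document with Python's standard library. The sample records and the helper names are invented for illustration. Each record carries named fields, but the set of fields differs from record to record, which is what keeps this data from fitting a fixed row-and-column layout.

```python
import json
import xml.etree.ElementTree as ET

# Invented sample data: two records whose fields do not match.
JSON_TEXT = (
    '[{"name": "Asha", "email": "asha@example.com"},'
    ' {"name": "Ben", "phones": ["555-0100", "555-0101"]}]'
)

XML_TEXT = """<people>
  <person><name>Asha</name><email>asha@example.com</email></person>
  <person><name>Ben</name><phone>555-0100</phone><phone>555-0101</phone></person>
</people>"""


def fields_per_json_record(text: str) -> list[list[str]]:
    """List the field names present in each JSON record."""
    return [sorted(record.keys()) for record in json.loads(text)]


def fields_per_xml_record(text: str) -> list[list[str]]:
    """List the distinct child tag names present in each <person> element."""
    root = ET.fromstring(text)
    return [sorted({child.tag for child in person}) for person in root.findall("person")]
```

Both helpers report a different field list for each record, so the tags and keys give structure without enforcing a single schema.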
How is structured data typically stored?
-Structured data is typically stored in traditional databases using a row and column format.
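A minimal sketch of the row-and-column format, using Python's built-in `sqlite3` module with an in-memory database. The table name, column names, and sample rows are assumptions chosen to echo the human-generated examples (names and addresses) mentioned earlier.

```python
import sqlite3


def build_people_table() -> sqlite3.Connection:
    """Create an in-memory table in which every row has the same columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE people (name TEXT NOT NULL, city TEXT NOT NULL, age INTEGER NOT NULL)"
    )
    # Invented sample rows; each one must supply a value for every column.
    conn.executemany(
        "INSERT INTO people (name, city, age) VALUES (?, ?, ?)",
        [("Asha", "Pune", 34), ("Ben", "Leeds", 41), ("Chen", "Pune", 29)],
    )
    conn.commit()
    return conn


def names_in_city(conn: sqlite3.Connection, city: str) -> list[str]:
    """Because the schema is fixed, a search is a simple column filter."""
    rows = conn.execute(
        "SELECT name FROM people WHERE city = ? ORDER BY name", (city,)
    )
    return [name for (name,) in rows]
```

The fixed schema is what makes structured data easy to search: a query such as `names_in_city(conn, "Pune")` needs no knowledge of individual rows, only of the column names.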
What is the role of NoSQL databases in handling semi-structured data?
-NoSQL databases are designed to efficiently store semi-structured data, accommodating its unique organizational properties.
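The flexibility described here can be sketched with a tiny in-memory stand-in for a document-oriented store. `TinyDocumentStore` is a hypothetical class written for this illustration, not the API of any real NoSQL product; it only shows that documents with different fields can sit in the same collection and still be queried by field.

```python
from typing import Any


class TinyDocumentStore:
    """In-memory stand-in for a document store (illustration only)."""

    def __init__(self) -> None:
        self._documents: list[dict[str, Any]] = []

    def insert(self, document: dict[str, Any]) -> None:
        # No schema check: any set of fields is accepted.
        self._documents.append(dict(document))

    def find(self, **criteria: Any) -> list[dict[str, Any]]:
        """Return documents that contain every given field with the given value."""
        return [
            doc
            for doc in self._documents
            if all(key in doc and doc[key] == value for key, value in criteria.items())
        ]
```

For example, a sensor reading with a `temp_c` field and a social media post with a `tags` field could both be inserted, and `find(type="sensor")` would return only the documents that carry a matching `type` field.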
Why is it important to understand the different types of data in Big Data?
-Understanding the different types of data in Big Data is crucial for effectively managing, analyzing, and extracting valuable insights from vast amounts of data.