SAQs
1. Define Data and Relations.
Data
- Data is raw, unprocessed facts and figures that are collected and stored.
- It exists in various forms like numbers, text, images, audio, and video.
- Data can be structured (organized in rows and columns), semi-structured (has some organization but not rigid), or unstructured (no defined structure).
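As an illustration of the three forms, here is a minimal Python sketch. The sample records, names, and variable names are made up for this example and do not come from any real dataset.

```python
import csv
import io
import json

# Structured: fixed columns, and every row has the same fields.
structured_csv = "id,name,age\n1,Asha,21\n2,Ravi,23\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: keys describe the data, but records may differ in shape.
semi_structured = json.loads(
    '[{"id": 1, "name": "Asha"}, {"id": 2, "name": "Ravi", "tags": ["new"]}]'
)

# Unstructured: free text with no predefined fields.
unstructured = "Asha emailed Ravi about the project deadline."

print(rows[0]["name"])                # a column lookup works on every row
print("tags" in semi_structured[0])   # optional key, present only on some records
print(len(unstructured.split()))      # only generic operations such as word counts apply
```

Note that the CSV reader returns every value as a string; the column names give the structure, not the types.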
Relations
- Relations are mathematical constructs used in relational databases to represent data.
- A relation is a set of tuples (rows), each containing a fixed number of attributes (columns).
- Each attribute represents a specific characteristic or property of the data.
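- Relations are grounded in set theory: a relation is a subset of the Cartesian product of its attribute domains, so its tuples are unordered and no tuple appears twice.
- Relational algebra operations such as selection, projection, and join take relations as input and produce relations as output, which makes relations the basis for querying and manipulating data in relational databases.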
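The idea of a relation as a set of tuples can be sketched in Python. This is an illustrative model only, not how a database stores data; the Student relation and the helper functions select and project are hypothetical names chosen for this example.

```python
from typing import NamedTuple


class Student(NamedTuple):
    # Each field is an attribute (column) of the relation.
    student_id: int
    name: str
    dept: str


# A relation is a set of tuples, so the repeated tuple is stored only once.
students = {
    Student(1, "Asha", "CS"),
    Student(2, "Ravi", "EE"),
    Student(3, "Meena", "CS"),
    Student(1, "Asha", "CS"),
}


def select(relation, predicate):
    """Selection: keep only the tuples that satisfy the predicate."""
    return {t for t in relation if predicate(t)}


def project(relation, *attrs):
    """Projection: keep only the named attributes; duplicates collapse."""
    return {tuple(getattr(t, a) for a in attrs) for t in relation}


cs_students = select(students, lambda t: t.dept == "CS")
depts = project(students, "dept")

print(len(students))
print(sorted(cs_students))
print(sorted(depts))
```

Both helpers return a set of tuples, so the result of one operation can be fed into another, which mirrors how relational operations compose.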
2. Define visualization.
Visualization
- Visualization is the process of creating visual representations of data to gain insights and understanding.
- It involves transforming data into charts, graphs, maps, and other visual forms.
- Visualizations help to:
  - Identify patterns and trends
  - Highlight anomalies and outliers
  - Communicate complex information effectively
  - Engage audiences and facilitate decision-making
- Common visualization techniques include bar charts, line graphs, scatter plots, heatmaps, and dashboards.
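To show how a visual form makes an outlier stand out, here is a minimal text-only bar chart in Python. In practice a plotting library would normally be used; this sketch sticks to the standard library so it stays self-contained. The function bar_chart and the monthly_sales figures are invented for illustration.

```python
def bar_chart(values, width=40):
    """Render a dict of label -> positive number as horizontal text bars."""
    peak = max(values.values())
    lines = []
    for label, value in values.items():
        # Scale each bar relative to the largest value.
        bar = "#" * round(value / peak * width)
        lines.append(f"{label:<4}{bar} {value}")
    return "\n".join(lines)


# Made-up figures: April is deliberately far above the other months.
monthly_sales = {"Jan": 120, "Feb": 135, "Mar": 128, "Apr": 400, "May": 131}
chart = bar_chart(monthly_sales)
print(chart)
```

In the printed chart the April bar is roughly three times longer than the others, so the anomaly is visible at a glance without reading the numbers.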
3. Define Hadoop ecosystem.
Hadoop Ecosystem