The Hadoop ecosystem consists of multiple components, but the two core components that define Hadoop are:
HDFS (Hadoop Distributed File System): Provides fault-tolerant, scalable storage by distributing and replicating data across the nodes of a cluster. It is the backbone for storing massive datasets.
MapReduce Framework: Provides the parallel computing and data processing layer in Hadoop, enabling batch analysis over distributed datasets.
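The MapReduce model described above can be sketched locally in plain Python, with no Hadoop cluster required. The functions below are illustrative (not part of the Hadoop API); they mirror the map, shuffle/sort, and reduce phases of a classic word-count job.

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in every input line
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle/sort: group values by key, as the framework does between phases
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts emitted for each word
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["hadoop stores data in hdfs", "mapreduce processes data in hadoop"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts)  # "hadoop" and "data" each appear twice
```

In real Hadoop, the framework runs the map and reduce functions in parallel across the cluster and handles the shuffle, fault tolerance, and I/O against HDFS automatically.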
Option A: Correct, HDFS is essential.
Option B: Correct, MapReduce is essential.
Option C: Incorrect; Spark is a newer processing framework that integrates with Hadoop, but it was never part of the original Hadoop core.
Option D: Correct answer, since both HDFS and MapReduce are the fundamental components of Hadoop.
Option E: Incorrect, because Spark is not a core Hadoop component (though it integrates with Hadoop).
Reference: DASCA Data Scientist Knowledge Framework (DSKF) – Big Data Ecosystems: Hadoop Architecture & Components.