Hadoop is an open source big data framework developed by Doug Cutting in the year 2006. It is managed by the Apache Software Foundation. The project was named after Hadoop, a yellow stuff toy which Cutting’s son had.
Hadoop is designed to store and process huge volumes of data efficiently.
Hadoop framework comprises of two main components:
1) A Hadoop cluster is made up of two nodes. A node is a technical term described to denote a machine or a computer present within a cluster. The two nodes present in the Hadoop cluster are:
2) Demon is a technical tern used to describe a background process that is running on a Linux machine.
3) Name node and Data node are responsible for storing and managing the data. They are commonly referred to as Storage Node
4) Job Tracker and Task Tracker are responsible for processing the data and are commonly referred to as Compute Node.
5) Generally, Name node and Job Tracker are configured on the same machine whereas Data node and Task Tracker are configured on multiple machines, but can have instances of running in more than one machine at the same time.
6) There is also a secondary name node as part of the Hadoop cluster.
Read Next Key Features of Hadoop