Hadoop Interview Questions Part-3
1) What are the most common input formats in Hadoop?
There are three commonly used input formats in Hadoop:
TextInputFormat: the default input format in Hadoop; each line of the file is a record.
KeyValueTextInputFormat: used for plain text files where each line is split into a key and a value.
SequenceFileInputFormat: used for reading files stored in Hadoop's binary SequenceFile format.
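The key/value split performed by KeyValueTextInputFormat can be sketched in plain Java (no Hadoop dependency): each line is divided at the first separator character, a tab by default. The class and method names below are illustrative, not Hadoop's own.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

public class KeyValueSplit {
    // Split a line at the first tab, as KeyValueTextInputFormat does by default.
    // If no separator is present, the whole line becomes the key and the value is empty.
    static Map.Entry<String, String> parse(String line) {
        int pos = line.indexOf('\t');
        if (pos < 0) {
            return new SimpleEntry<>(line, "");
        }
        return new SimpleEntry<>(line.substring(0, pos), line.substring(pos + 1));
    }

    public static void main(String[] args) {
        Map.Entry<String, String> kv = parse("line1\tWelcome to Hadoop");
        System.out.println(kv.getKey() + " -> " + kv.getValue());
    }
}
```

In the real input format, the separator is configurable via a job property rather than hard-coded.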
2) What is a DataNode, and how does the NameNode handle DataNode failures?
A DataNode stores data in HDFS; it is the node where the actual data of the file system resides. Each DataNode sends a periodic heartbeat message to the NameNode to signal that it is alive. If the NameNode does not receive a heartbeat from a DataNode for a configured timeout (about 10 minutes by default), it considers the DataNode dead or out of service and starts re-replicating the blocks that were hosted on that DataNode onto other DataNodes. A BlockReport contains the list of all blocks on a DataNode; using it, the NameNode schedules copies of the blocks that were held on the dead DataNode.
The NameNode manages the replication of data blocks from one DataNode to another. During replication, the data is transferred directly between DataNodes and never passes through the NameNode.
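The heartbeat timeout logic can be pictured with a simplified plain-Java model; the class name, timeout constant, and methods here are hypothetical, not the actual NameNode implementation.

```java
import java.util.HashMap;
import java.util.Map;

public class HeartbeatMonitor {
    // Illustrative timeout; HDFS derives its default (roughly 10 minutes)
    // from the heartbeat interval and the recheck interval settings.
    static final long TIMEOUT_MS = 10 * 60 * 1000;

    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    void recordHeartbeat(String dataNodeId, long nowMs) {
        lastHeartbeat.put(dataNodeId, nowMs);
    }

    // A DataNode is considered dead once no heartbeat has arrived within the
    // timeout; at that point the real NameNode schedules re-replication of its blocks.
    boolean isDead(String dataNodeId, long nowMs) {
        Long last = lastHeartbeat.get(dataNodeId);
        return last == null || nowMs - last > TIMEOUT_MS;
    }

    public static void main(String[] args) {
        HeartbeatMonitor monitor = new HeartbeatMonitor();
        monitor.recordHeartbeat("dn1", 0L);
        System.out.println(monitor.isDead("dn1", 5 * 60 * 1000L));   // within timeout
        System.out.println(monitor.isDead("dn1", 11 * 60 * 1000L));  // timed out
    }
}
```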
3) What are the primary methods of a Reducer?
The three primary methods of a Reducer are:
setup(): this method is used to configure various parameters such as the input data size and the distributed cache.
public void setup(Context context)
reduce(): the heart of the Reducer, called once per key with the associated list of values.
public void reduce(Key key, Iterable<Value> values, Context context)
cleanup(): this method is called to clean up temporary files and state, only once at the end of the task.
public void cleanup(Context context)
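The lifecycle above can be illustrated with a plain-Java sketch that mimics a summing Reducer without the Hadoop classes; the class and field names are illustrative, and the real Reducer methods receive a Context and Writable types rather than plain Strings and ints.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class SumReducerSketch {
    private long recordsSeen;                       // state initialized in setup()
    final Map<String, Long> output = new LinkedHashMap<>();

    void setup() {
        recordsSeen = 0;                            // one-time initialization per task
    }

    // Called once per key with all of that key's values, like Reducer.reduce().
    void reduce(String key, Iterable<Integer> values) {
        long sum = 0;
        for (int v : values) {
            sum += v;
            recordsSeen++;
        }
        output.put(key, sum);
    }

    void cleanup() {
        // one-time teardown per task, e.g. flushing temporary state
        System.out.println("records processed: " + recordsSeen);
    }

    public static void main(String[] args) {
        SumReducerSketch r = new SumReducerSketch();
        r.setup();
        r.reduce("hadoop", Arrays.asList(1, 1, 1));
        r.reduce("hdfs", Arrays.asList(1, 1));
        r.cleanup();
        System.out.println(r.output);               // {hadoop=3, hdfs=2}
    }
}
```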
4) What is SequenceFile in Hadoop?
Extensively used in MapReduce I/O, a SequenceFile is a flat file containing binary key/value pairs. Intermediate map outputs are stored internally as SequenceFiles. The class provides Reader, Writer and Sorter classes. The three SequenceFile formats are:
Uncompressed key/value records.
Record-compressed key/value records – only the ‘values’ are compressed here.
Block-compressed key/value records – both keys and values are collected in ‘blocks’ separately and compressed. The size of the ‘block’ is configurable.
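A simplified sketch of the uncompressed record layout, using plain Java streams: each record stores its total length, the key length, then the key and value bytes. The real SequenceFile also writes a header, sync markers, and serialized class names, so this illustrates the idea, not the actual on-disk format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class BinaryKeyValueSketch {
    // Write one length-prefixed key/value record, loosely modelled on an
    // uncompressed SequenceFile record.
    static void writeRecord(DataOutputStream out, String key, String value) throws IOException {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        out.writeInt(k.length + v.length);   // record length
        out.writeInt(k.length);              // key length
        out.write(k);
        out.write(v);
    }

    static String[] readRecord(DataInputStream in) throws IOException {
        int recordLen = in.readInt();
        int keyLen = in.readInt();
        byte[] k = new byte[keyLen];
        in.readFully(k);
        byte[] v = new byte[recordLen - keyLen];
        in.readFully(v);
        return new String[] { new String(k, StandardCharsets.UTF_8),
                              new String(v, StandardCharsets.UTF_8) };
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            writeRecord(out, "key1", "value1");
        }
        String[] kv = readRecord(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        System.out.println(kv[0] + " -> " + kv[1]);   // key1 -> value1
    }
}
```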
5) What is the JobTracker's role in Hadoop?
The JobTracker's primary functions are resource management (managing the TaskTrackers), tracking resource availability, and task life-cycle management (tracking task progress and providing fault tolerance).
It is a process that runs on a separate node, usually not on a DataNode.
The JobTracker communicates with the NameNode to identify data locations.
It finds the best TaskTracker nodes to execute tasks on the given nodes.
It monitors the individual TaskTrackers and submits the overall job status back to the client.
It tracks the execution of MapReduce workloads local to the slave nodes.
6) What is the use of RecordReader in Hadoop?
Since Hadoop splits data into blocks, a RecordReader is needed to parse the split data into individual records. For example, if our input data is split like:
Row1: Welcome to
Row2: Intellipaat
it will be read as “Welcome to Intellipaat” using the RecordReader.
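A line-oriented RecordReader's job can be sketched in plain Java; unlike the real LineRecordReader, this illustration ignores byte offsets and records that straddle split boundaries, and its names are not Hadoop's.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class LineRecordReaderSketch {
    // Turn raw split text into individual line records, as a line-oriented
    // RecordReader does for the mapper.
    static List<String> readRecords(String splitText) throws IOException {
        List<String> records = new ArrayList<>();
        BufferedReader reader = new BufferedReader(new StringReader(splitText));
        String line;
        while ((line = reader.readLine()) != null) {
            records.add(line);
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        List<String> records = readRecords("Welcome to\nIntellipaat");
        System.out.println(String.join(" ", records));   // Welcome to Intellipaat
    }
}
```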
7) What is Speculative Execution in Hadoop?
One limitation of Hadoop is that by distributing tasks over several nodes, a few slow nodes can throttle the rest of the program. There are various reasons why tasks run slowly, and the causes are sometimes hard to detect. Instead of identifying and fixing the slow-running tasks, Hadoop tries to detect when a task is running slower than expected and then launches an equivalent task as a backup. This backup mechanism in Hadoop is called Speculative Execution.
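One way to picture straggler detection is a sketch like the following: compare each task's progress against the average and flag tasks that fall far behind as candidates for a speculative backup. The threshold, class, and method names are hypothetical; Hadoop's actual heuristic also weighs progress rates and available slots.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class StragglerDetector {
    // Hypothetical threshold: flag a task for a speculative backup when its
    // progress falls well below the average progress of its peers.
    static final double SLOWNESS_FACTOR = 0.5;

    static Map<String, Boolean> findStragglers(Map<String, Double> progressByTask) {
        double avg = progressByTask.values().stream()
                .mapToDouble(Double::doubleValue).average().orElse(0.0);
        Map<String, Boolean> needsBackup = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : progressByTask.entrySet()) {
            needsBackup.put(e.getKey(), e.getValue() < avg * SLOWNESS_FACTOR);
        }
        return needsBackup;
    }

    public static void main(String[] args) {
        Map<String, Double> progress = new LinkedHashMap<>();
        progress.put("task1", 0.9);
        progress.put("task2", 0.85);
        progress.put("task3", 0.1);   // straggler: a backup copy would be launched
        System.out.println(findStragglers(progress));
    }
}
```

Whichever copy of a speculated task finishes first wins, and the other attempt is killed.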