How to Setup Hadoop 2.8.0 (Single Node Cluster) on CentOS


Apache Hadoop 2.8.0 is a minor release in the 2.x.y release line, building upon the previous stable release 2.7.3.

The following features and improvements are available in Apache Hadoop 2.8.0:

  • Common
    • Support async call retry and failover which can be used in async DFS implementation with retry effort.
    • Cross Frame Scripting (XFS) prevention for UIs can be provided through a common servlet filter.
    • S3A improvements: add ability to plug in any AWSCredentialsProvider, support read s3a credentials from Hadoop credential provider API in addition to XML configuration files, support Amazon STS temporary credentials
    • WASB improvements: adding append API support
    • Build enhancements: replace dev-support with wrappers to Yetus, provide a docker based solution to setup a build environment, remove CHANGES.txt and rework the change log and release notes.
    • Add posixGroups support for LDAP groups mapping service.
    • Support integration with Azure Data Lake (ADL) as an alternative Hadoop-compatible file system.
  • HDFS
    • WebHDFS enhancements: integrate CSRF prevention filter in WebHDFS, support OAuth2 in WebHDFS, disallow/allow snapshots via WebHDFS
    • Allow long-running Balancer to log in with keytab
    • Add ReverseXML processor which reconstructs an fsimage from an XML file. This will make it easy to create fsimages for testing, and manually edit fsimages when there is corruption
    • Support nested encryption zones
    • DataNode Lifeline Protocol: an alternative protocol for reporting DataNode liveness. This can prevent the NameNode from incorrectly marking DataNodes as stale or dead in highly overloaded clusters where heartbeat processing is suffering delays.
    • Logging HDFS operation’s caller context into audit logs
    • A new datanode command for evicting writers which is useful when data node decommissioning is blocked by slow writers.
  • YARN
    • NodeManager CPU resource monitoring in Windows.
    • More graceful NodeManager shutdown: the NM unregisters with the RM immediately rather than waiting for the timeout to be marked LOST (if NM work-preserving restart is not enabled).
    • Add the ability to fail a specific AM attempt when that attempt gets stuck.
    • CallerContext support in YARN audit log.
    • ATS versioning support: a new configuration to indicate timeline service version.
    • Allow node labels to be specified when submitting MR jobs
    • Add a new tool to combine aggregated logs into HAR file


This blog will help you install Hadoop 2.8.0 on the CentOS operating system, including the basic configuration required to start working with Hadoop. The entire process is explained in simple, easy steps.


Step 1 – Installing Java

Java is required to run Hadoop on any system, so before installing Hadoop make sure Java is installed on your system.


If Java is not installed on the system, install it using the following commands to install Java OpenJDK 8.
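On CentOS, the OpenJDK 8 packages can be installed with yum (the package names below are the standard CentOS ones; run as root or via sudo):

```shell
# Install the OpenJDK 8 runtime and development packages
sudo yum install -y java-1.8.0-openjdk java-1.8.0-openjdk-devel

# Verify the installation
java -version
```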


After installing Java, configure the Java environment variables in a script under /etc/profile.d/ (for example java.sh):

export JAVA_HOME=/usr/lib/jvm/java-openjdk


export PATH=$PATH:$JAVA_HOME/bin

Step 2 – Setup Hadoop user account

It is recommended to create a non-root user account for the Hadoop environment.
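A dedicated user can be created as follows (the username hadoop is just a convention; pick any name you prefer):

```shell
# Create a dedicated hadoop user and set its password (run as root)
useradd hadoop
passwd hadoop

# Switch to the new account
su - hadoop
```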


Set up key-based SSH to its own account.
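As the hadoop user, a passwordless key pair can be generated and authorized like this (an RSA key with an empty passphrase is assumed):

```shell
# Generate an RSA key pair with an empty passphrase
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa

# Authorize the key for login to this same account
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
```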


Let’s check that key-based login works, then exit back out of the session.
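A quick check that SSH now logs in without prompting for a password:

```shell
ssh localhost
exit
```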


Step 3 – Download Hadoop source file

Download the Hadoop 2.8.0 release archive. For a different version, refer to the Apache Hadoop releases page.
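One way to fetch and unpack the release (the Apache archive URL below is the standard location for older releases; the install path /usr/local/hadoop matches the HADOOP_HOME used later in this guide):

```shell
cd ~
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.8.0/hadoop-2.8.0.tar.gz
tar xzf hadoop-2.8.0.tar.gz
sudo mv hadoop-2.8.0 /usr/local/hadoop
```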


Step 4 – Configure Hadoop Pseudo-Distributed Mode

  1. Setup Environment Variables

Edit the ~/.bashrc file and append the following values at the end of the file (these are the standard Hadoop 2.x environment variables; adjust HADOOP_HOME if you installed Hadoop elsewhere):

export HADOOP_HOME=/usr/local/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
Now apply the changes to the current running environment.
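Reloading the shell profile picks up the new variables:

```shell
source ~/.bashrc
```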


Edit $HADOOP_HOME/etc/hadoop/hadoop-env.sh and set JAVA_HOME:

# Change Java home path as per java installed on your system

export JAVA_HOME=/usr/lib/jvm/java-openjdk

  2. Edit Configuration Files

Hadoop has many configuration files, which need to be set up according to the requirements of your Hadoop environment.


  i) Edit core-site.xml
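A minimal pseudo-distributed core-site.xml sets the default file system (localhost and port 9000 are the conventional choices; adjust as needed):

```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```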

  ii) Edit hdfs-site.xml
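A minimal hdfs-site.xml for a single node sets the replication factor to 1; the NameNode and DataNode directories below are example paths under the hadoop user's home:

```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/hadoop/hadoopdata/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/hadoop/hadoopdata/hdfs/datanode</value>
  </property>
</configuration>
```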

  iii) Edit mapred-site.xml
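mapred-site.xml typically just points MapReduce at YARN (if the file does not exist, copy mapred-site.xml.template to mapred-site.xml first):

```xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```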


  iv) Edit yarn-site.xml
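yarn-site.xml enables the MapReduce shuffle auxiliary service on the NodeManager:

```xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```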

  3. Format Hadoop Namenode

Once the Hadoop single node cluster setup is done, it's time to initialize the HDFS file system by formatting the NameNode.
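Formatting is done with the hdfs command as the hadoop user (run this only once on a fresh install, since it wipes NameNode metadata):

```shell
hdfs namenode -format
```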




Step 5 – Start Hadoop Cluster

Let’s start your Hadoop cluster using the scripts provided by Hadoop. Just navigate to your Hadoop sbin directory and execute the scripts one by one.


Run the HDFS start script to bring up the NameNode, DataNode, and Secondary NameNode, then the YARN start script for the ResourceManager and NodeManager.
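On Hadoop 2.x these are start-dfs.sh and start-yarn.sh (run as the hadoop user):

```shell
cd $HADOOP_HOME/sbin

# Start the HDFS daemons: NameNode, DataNode, SecondaryNameNode
./start-dfs.sh

# Start the YARN daemons: ResourceManager, NodeManager
./start-yarn.sh

# Verify that the Java daemons are running
jps
```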




Step 6 – Check Hadoop Services

Access port 50070 in your browser for information about the NameNode:

http://your-server-ip:50070/

Access port 8088 for information about the cluster and all applications:

http://your-server-ip:8088/

Access port 50090 for information about the Secondary NameNode:

http://your-server-ip:50090/

Access port 50075 for information about the DataNode:

http://your-server-ip:50075/


Step 7 – Test Hadoop Setup

  i) Make the HDFS directories

$ bin/hdfs dfs -mkdir /user

$ bin/hdfs dfs -mkdir /user/hadoop
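To verify the setup end to end, you can copy some files into HDFS and list them back (the input directory name below is just an example; paths are relative to $HADOOP_HOME, as above):

```shell
# Copy Hadoop's own config files into HDFS as sample input
bin/hdfs dfs -put etc/hadoop /user/hadoop/input

# List the files to confirm the copy worked
bin/hdfs dfs -ls /user/hadoop/input
```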

Manage Hadoop Services

To start all Hadoop instances, run the commands below.
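With the sbin directory on your PATH, HDFS and YARN can be started together (start-all.sh also works but is deprecated in Hadoop 2.x):

```shell
start-dfs.sh
start-yarn.sh
```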


To stop all Hadoop instances, run the commands below.
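The corresponding stop scripts shut the daemons down (stopping YARN before HDFS is conventional):

```shell
stop-yarn.sh
stop-dfs.sh
```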


Hope this article helped you easily set up Hadoop 2.8.0 (Single Node Cluster) on CentOS. If you have any doubts or queries, please comment below. For updates, follow agiratechnologies.
