About Trainer

  • 10+ years of IT experience
  • 2+ years of onsite experience
  • Certified
  • Working as an Architect
  • Excellent communication skills

About Hadoop Administration


Online Training in Hadoop Administration

Course Pre-requisites:
    Basic Java and Unix commands
Lab setup: Unix/Linux/Solaris/CentOS Servers
Training Material:
  • Class PPTs (soft copy)
  • Session Recordings
  • Manuals
  • Project and task recordings
Hadoop Administration Online Training Course Outline
  • Basic understanding of Big Data
  • Introduction to Hadoop
  • Principles and Operations of HDFS
  • Analysis using MapReduce and YARN
  • Hadoop Cluster Installation and Configuration
  • Working with Hive, Impala and Pig
  • Hadoop Security
  • Maintenance and Monitoring
Course Contents:
Unit 1: The Case for Apache Hadoop
  • Why Hadoop?
  • Core Hadoop Components
  • Fundamental Concepts
Unit 2: HDFS
  • HDFS Features
  • Writing and Reading Files
  • NameNode Memory Considerations
  • Overview of HDFS Security
  • Using the NameNode Web UI
  • Using the Hadoop File Shell
Unit 3: Getting Data into HDFS
  • Ingesting Data from External Sources with Flume
  • Ingesting Data from Relational Databases with Sqoop
  • REST Interfaces
  • Best Practices for Importing Data
Unit 4: YARN and MapReduce
  • What Is MapReduce?
  • Basic MapReduce Concepts
  • YARN Cluster Architecture
  • Resource Allocation
  • Failure Recovery
  • Using the YARN Web UI
  • MapReduce Version 1
Unit 5: Planning Your Hadoop Cluster
  • General Planning Considerations
  • Choosing the Right Hardware
  • Network Considerations
  • Configuring Nodes
  • Planning for Cluster Management
Unit 6: Hadoop Installation and Initial Configuration
  • Deployment Types
  • Installing Hadoop
  • Specifying the Hadoop Configuration
  • Performing Initial HDFS Configuration
  • Performing Initial YARN and MapReduce Configuration
Unit 7: Configuration
  • Hadoop Logging
Unit 8: Installing and Configuring Hive, Impala, and Pig
  • Hive
  • Impala
  • Pig
Unit 9: Hadoop Clients
  • What is a Hadoop Client?
  • Installing and Configuring Hadoop Clients
  • Installing and Configuring Hue
  • Hue Authentication and Authorization
Unit 10: Cloudera Manager
  • The Motivation for Cloudera Manager
  • Cloudera Manager Features
  • Express and Enterprise Versions
  • Cloudera Manager Topology
  • Installing Cloudera Manager
  • Installing Hadoop Using Cloudera Manager
  • Performing Basic Administration Tasks
Unit 11: Using Cloudera Manager Advanced Cluster Configuration
  • Advanced Configuration Parameters
  • Configuring Hadoop Ports
  • Explicitly Including and Excluding Hosts
  • Configuring HDFS for Rack Awareness
  • Configuring HDFS High Availability
Unit 12: Hadoop Security
  • Why Hadoop Security Is Important
  • Hadoop’s Security System Concepts
  • What Kerberos Is and How It Works
  • Securing a Hadoop Cluster with Kerberos
Unit 13: Managing and Scheduling Jobs
  • Managing Running Jobs
  • Scheduling Hadoop Jobs
  • Configuring the FairScheduler
  • Impala Query Scheduling
Unit 14: Cluster Maintenance
  • Checking HDFS Status
  • Copying Data Between Clusters
  • Adding and Removing Cluster Nodes
  • Rebalancing the Cluster
  • Cluster Upgrading
Unit 15: Cluster Monitoring and Troubleshooting
  • General System Monitoring
  • Monitoring Hadoop Clusters
  • Troubleshooting Hadoop Clusters
  • Common Misconfigurations

Testimonials

  • They have a set teaching format, which gives the student a schedule and a step-by-step process that covers the entire Netezza syllabus. I can assure you that eTraining is a great help with the Netezza curriculum.
    - Shannon D.

  • What I liked BEST about the tutors at eTraining is the amount of practical knowledge they have and the way they bring it to the student's perspective while teaching. Glad I took the DB2 course from eTraining.
    - Rubeena Khan

  • I found the classes very helpful. What you teach is AWESOME.
    - Gopal