      Course Details

      HADOOP-ADMIN
      4 Days
      $3,195.00

      Cloudera Administrator Training for Apache Hadoop

      Cloudera University’s four-day administrator training course for Apache Hadoop provides participants with a comprehensive understanding of all the steps necessary to operate and maintain a Hadoop cluster using Cloudera Manager. From installation and configuration through load balancing and tuning, Cloudera’s training course is the best preparation for the real-world challenges faced by Hadoop administrators.

      Through instructor-led discussion and interactive, hands-on exercises, participants will navigate the Hadoop ecosystem, learning topics such as:

      • Cloudera Manager features that make managing your clusters easier, such as aggregated logging, configuration management, resource management, reports, alerts, and service management

      • The internals of YARN, MapReduce, Spark, and HDFS

      • Determining the correct hardware and infrastructure for your cluster

      • Proper cluster configuration and deployment to integrate with the data center

      • How to load data into the cluster from dynamically generated files using Flume and from an RDBMS using Sqoop

      • Configuring the FairScheduler to provide service-level agreements for multiple users of a cluster

      • Best practices for preparing and maintaining Apache Hadoop in production

      • Troubleshooting, diagnosing, tuning, and solving Hadoop issues

      Course Outline

      The Case for Apache Hadoop

      • Why Hadoop?
      • Fundamental Concepts
      • Core Hadoop Components

      Hadoop Cluster Installation

      • Rationale for a Cluster Management Solution
      • Cloudera Manager Features
      • Cloudera Manager Installation
      • Hadoop (CDH) Installation

      The Hadoop Distributed File System (HDFS)

      • HDFS Features
      • Writing and Reading Files
      • NameNode Memory Considerations
      • Overview of HDFS Security
      • Web UIs for HDFS
      • Using the Hadoop File Shell

      MapReduce and Spark on YARN

      • The Role of Computational Frameworks
      • YARN: The Cluster Resource Manager
      • MapReduce Concepts
      • Apache Spark Concepts
      • Running Computational Frameworks on YARN
      • Exploring YARN Applications Through the Web UIs and the Shell
      • YARN Application Logs

      Hadoop Configuration and Daemon Logs

      • Cloudera Manager Constructs for Managing Configurations
      • Locating Configurations and Applying Configuration Changes
      • Managing Role Instances and Adding Services
      • Configuring the HDFS Service
      • Configuring Hadoop Daemon Logs
      • Configuring the YARN Service

      Getting Data Into HDFS

      • Ingesting Data From External Sources With Flume
      • Ingesting Data From Relational Databases With Sqoop
      • REST Interfaces
      • Best Practices for Importing Data

      Planning Your Hadoop Cluster

      • General Planning Considerations
      • Choosing the Right Hardware
      • Virtualization Options
      • Network Considerations
      • Configuring Nodes

      Installing and Configuring Hive, Impala, and Pig

      • Hive
      • Impala
      • Pig

      Hadoop Clients Including Hue

      • What Are Hadoop Clients?
      • Installing and Configuring Hadoop Clients
      • Installing and Configuring Hue
      • Hue Authentication and Authorization

      Advanced Cluster Configuration

      • Advanced Configuration Parameters
      • Configuring Hadoop Ports
      • Configuring HDFS for Rack Awareness
      • Configuring HDFS High Availability

      Hadoop Security

      • Why Hadoop Security Is Important
      • Hadoop’s Security System Concepts
      • What Kerberos Is and How It Works
      • Securing a Hadoop Cluster With Kerberos
      • Other Security Concepts

      Managing Resources

      • Configuring cgroups with Static Service Pools
      • The Fair Scheduler
      • Configuring Dynamic Resource Pools
      • YARN Memory and CPU Settings
      • Impala Query Scheduling

      Cluster Maintenance

      • Checking HDFS Status
      • Copying Data Between Clusters
      • Adding and Removing Cluster Nodes
      • Rebalancing the Cluster
      • Directory Snapshots
      • Cluster Upgrading

      Cluster Monitoring and Troubleshooting

      • Cloudera Manager Monitoring Features
      • Monitoring Hadoop Clusters
      • Troubleshooting Hadoop Clusters
      • Common Misconfigurations

      Audience

      This course is best suited to systems administrators and IT managers who have basic Linux experience. Prior knowledge of Apache Hadoop is not required.

      College Credit, CEUs, PDUs and CDUs
      When you take courses with Babbage Simmel, be sure you get the credit you deserve. Curriculum offered by Babbage Simmel can earn you college credit, CEUs, PDUs, or CDUs.

      College Credit
      Select curriculum offered by Babbage Simmel is part of the accredited Ashland University undergraduate course catalogs. For questions, please email info@babsim.com or call 614-481-4345.

      Continuing Education Units (CEUs)
      Continuing Education Units (CEUs) are nationally recognized standard units of measurement earned for satisfactory completion of qualified continuing education programs. If you need more information about CEUs, please email info@babsim.com or call 614-481-4345.

      Professional Development Units (PDUs)
      Professional Development Units (PDUs) can be issued by PMI® for formal learning activities related to project management. Project Management Professionals (PMPs®) are required to earn a minimum of 60 PDUs every 3 years to maintain certification. For more information about this program, go to the PMI® web site or call 1-855-746-4849.

      Continuing Development Units (CDUs)
      CDUs may be earned by attending professional development activities (e.g., courses, seminars) offered by organizations endorsed by IIBA® and designated as EEP vendors. As an IIBA® Endorsed Education Provider (EEP), Babbage Simmel offers IIBA®-endorsed courses that qualify for CDU credit. For more information about CDUs, go to the IIBA® web site or call 1-647-426-3735.

      Our babsimLIVE distance learning brings the classroom experience to you by seating you virtually in a real, instructor-led classroom taught by award-winning, world-class instructors alongside other IT professionals. From the comfort of your home, your workplace, or the Babbage Simmel Columbus Campus, you get the training you need, when you want it, in the environment where you are most comfortable.

      © Copyright 2019 • Babbage Simmel. All Rights Reserved.