Cloudera University’s four-day course for designing and building Big Data applications prepares you to analyze and solve real-world problems using Apache Hadoop and associated tools in the enterprise data hub (EDH). You will work through the entire process of designing and building solutions, including ingesting data, determining the appropriate file format for storage, processing the stored data, and presenting the results to the end-user in an easy-to-digest form. Go beyond MapReduce to use additional elements of the EDH and develop converged applications that are highly relevant to the business.
Through instructor-led discussion and interactive, hands-on exercises, participants will navigate the Hadoop ecosystem, learning topics such as:
Creating a data set with Kite SDK
Developing custom Flume components for data ingestion
Managing a multi-stage workflow with Oozie
Analyzing data with Crunch
Writing user-defined functions for Hive and Impala
Writing Avro Objects with a Custom Flume Interceptor
Managing Workflows with Apache Oozie
The Need for Workflow Management
What is Apache Oozie?
Defining an Oozie Workflow
Validation, Packaging, and Deployment
Running and Tracking Workflows Using the CLI
Hue UI for Oozie
Processing Data Pipelines with Apache Crunch
What is Apache Crunch?
Understanding the Crunch Pipeline
Comparing Crunch to Java MapReduce
Working with Crunch Projects
Reading and Writing Data in Crunch
Data Collection API Functions
Utility Classes in the Crunch API
Working with Tables in Apache Hive
What is Apache Hive?
Basic Query Syntax
Creating and Populating Hive Tables
How Hive Reads Data
Using the RegexSerDe in Hive
Developing User-Defined Functions
What are User-Defined Functions?
Implementing a User-Defined Function
Deploying Custom Libraries in Hive
Registering a User-Defined Function in Hive
Executing Interactive Queries with Impala
What is Impala?
Comparing Hive to Impala
Running Queries in Impala
Support for User-Defined Functions
Data and Metadata Management
Understanding Cloudera Search
What is Cloudera Search?
Supported Document Formats
Indexing Data with Cloudera Search
Collection and Schema Management
Indexing Data in Batch Mode
Indexing Data in Near Real Time
Presenting Results to Users
Solr Query Syntax
Building a Search UI with Hue
Accessing Impala through JDBC
Powering a Custom Web Application with Impala and Search
This course is best suited to developers, engineers, and architects who want to use Hadoop and related tools to solve real-world problems. Participants should have already attended Cloudera Developer Training for Apache Hadoop or have equivalent practical experience. Good knowledge of Java and basic familiarity with Linux are required. Experience with SQL is helpful.
College Credit, CEUs, PDUs and CDUs
When you take courses with Babbage Simmel, be sure you get the credit you deserve. Curriculum offered by Babbage Simmel can earn you college credit, CEUs, PDUs or CDUs.
College Credit
Select curriculum offered by Babbage Simmel is part of the accredited University of Findlay's undergraduate course catalogs. For questions, please email firstname.lastname@example.org or call 614-481-4345.
Continuing Education Units (CEUs)
Continuing Education Units (CEUs) are nationally recognized standard units of measurement earned for satisfactory completion of qualified programs of continuing education. If you need more information about CEUs, please email email@example.com or call 614-481-4345.
Professional Development Units (PDUs)
Professional Development Units (PDUs) can be issued by PMI® for formal learning activities related to project management. Project Management Professionals (PMPs®) are required to earn a minimum of 60 PDUs every 3 years to maintain certification. For more information about this program, go to the PMI® web site or call 1-855-746-4849.
Continuing Development Units (CDUs)
CDUs may be earned by attending professional development (e.g. courses, seminars) offered by organizations endorsed by IIBA® and designated as an EEP vendor. Because Babbage Simmel is an IIBA® Endorsed Education Provider (EEP), its IIBA®-endorsed courses qualify for CDU credit. For more information about CDUs, go to the IIBA® web site or call 1-647-426-3735.
Our babsimLIVE distance learning brings the classroom experience to you by seating you virtually in a real, instructor-led classroom alongside other IT professionals, taught by award-winning instructors. Whether you attend from home, from your workplace, or at the Babbage Simmel Columbus Campus, you get the training you need, when you want it, in the environment where you learn best.