This course is designed for developers who need to create applications that analyze Big Data stored in Apache Hadoop using Pig and Hive. Topics include Hadoop, YARN, HDFS, MapReduce, data ingestion, workflow definition, and using Pig and Hive to perform data analytics on Big Data. Labs are executed on a 7-node HDP cluster.
At the completion of the course students will be able to:
Describe Hadoop ecosystem tools and frameworks
Describe the HDFS architecture
Use the Hadoop client to input data into HDFS
Transfer data between Hadoop and a relational database
Explain YARN and MapReduce architectures
Run a MapReduce job on YARN
Use Pig to explore and transform data in HDFS
Use Hive to explore and analyze data sets
Understand how Hive tables are defined and implemented
Use the new Hive windowing functions
Explain and use the various Hive file formats
Create and populate a Hive table that uses the ORC file format
Use Hive to run SQL-like queries to perform data analysis
Use Hive to join datasets using a variety of techniques, including Map-side joins and Sort-Merge-Bucket joins
Write efficient Hive queries
Create ngrams and context ngrams using Hive
Perform data analytics such as quantiles and PageRank on Big Data using the DataFu Pig library
Explain the uses and purpose of HCatalog
Use HCatalog with Pig and Hive
Define a workflow using Oozie
Schedule a recurring workflow using the Oozie Coordinator
College Credit, CEUs, PDUs and CDUs
When you take courses with Babbage Simmel, be sure you get the credit you deserve. Curriculum offered by Babbage Simmel can earn you college credit, CEUs, PDUs or CDUs.
College Credit
Select curriculum offered by Babbage Simmel is part of the accredited University of Findlay's undergraduate course catalogs. For questions, please email email@example.com or call 614-481-4345.
Continuing Education Units (CEUs)
CEUs are nationally recognized standard units of measurement earned for satisfactory completion of qualified programs of continuing education. If you need more information about CEUs, please email firstname.lastname@example.org or call 614-481-4345.
Professional Development Units (PDUs)
PDUs can be issued by PMI® for formal learning activities related to project management. Project Management Professionals (PMPs®) are required to earn a minimum of 60 PDUs every 3 years to maintain certification. For more information about this program, go to the PMI® web site or call 1-855-746-4849.
Continuing Development Units (CDUs)
CDUs may be earned by attending professional development activities (e.g. courses, seminars) offered by organizations endorsed by IIBA® and designated as EEP vendors. Because Babbage Simmel is an IIBA® Endorsed Education Provider (EEP), its IIBA®-endorsed courses qualify for CDU credit. For more information about CDUs, go to the IIBA® web site or call 1-647-426-3735.
Our babsimLIVE distance learning brings the classroom experience to you by seating you virtually in a real-life, instructor-led classroom taught by award-winning, world-class instructors, alongside other IT professionals like yourself. From the comfort of your home, your workplace, or the Babbage Simmel Columbus Campus, you get the training you need, when you want it, in the environment where you are most comfortable and most likely to succeed.