Hadoop for Business Analysts (HBA)

Course Content

Apache Hadoop is one of the most popular frameworks for processing Big Data. Hadoop provides rich and deep analytics capabilities, and it is making inroads into the traditional BI analytics world. This course introduces analysts to the core components of the Hadoop ecosystem and its analytics capabilities.

Who should attend

Business Analysts

Prerequisites

Learners should meet the following prerequisites before attending:

  • Ability to navigate the Linux command line
  • Basic knowledge of Linux editors (vi / nano) for modifying code

Course Objectives

By the end of this course, you should be able to:

  • Understand the core components of the Hadoop ecosystem
  • Understand how to perform analytics using the core components of the Hadoop ecosystem
  • Understand HDFS
  • Understand MapReduce
  • Use Pig and Hive to support analytics
  • Leverage Business Intelligence (BI) tools for big data analytics on Hadoop

Detailed Course Outline

Module 1: Introduction to Hadoop
  • Hadoop history, concepts
  • Ecosystem
  • Distributions
  • High-level architecture
  • Hadoop myths
  • Hadoop challenges
  • Hardware / software
Module 2: HDFS Overview
  • Concepts (horizontal scaling, replication, data locality, rack awareness)
  • Architecture (NameNode, Secondary NameNode, DataNode)
  • Data integrity
  • Future of HDFS: NameNode HA, Federation
  • Lab exercises
Module 3: MapReduce Overview
  • MapReduce concepts
  • Daemons: JobTracker / TaskTracker
  • Phases: driver, mapper, shuffle/sort, reducer
  • Thinking in MapReduce
  • Future of MapReduce (YARN)
  • Lab exercises
Module 4: Pig
  • Pig vs. Java MapReduce
  • Pig Latin language
  • User-defined functions
  • Understanding Pig job flow
  • Basic data analysis with Pig
  • Complex data analysis with Pig
  • Working with multiple datasets in Pig
  • Advanced concepts
  • Lab exercises
Module 5: Hive
  • Hive concepts
  • Architecture
  • Data types
  • Hive data management
  • Hive vs. SQL
  • Lab exercises
Module 6: BI Tools for Hadoop
  • BI tools and Hadoop
  • Overview of current BI tools landscape
Conclusion
  • Choosing the best tool for the job
Classroom Training

Duration 2 days

Price
  • United States: US$ 1,750
Online Training

Duration 2 days

Price
  • United States: US$ 1,750