Hadoop for Business Analysts (HBA)


About this Course

Apache Hadoop is one of the most popular frameworks for processing Big Data. Hadoop provides rich and deep analytics capabilities, and it is making inroads into the traditional BI analytics world. This course introduces analysts to the core components of the Hadoop ecosystem and its analytics tools.

Who should attend

Business Analysts

Class Prerequisites

Learners should meet the following prerequisites before class:

  • Be able to navigate the Linux command line
  • Have basic knowledge of a Linux text editor (vi or nano) for modifying code

What You Will Learn

By the end of this course, you should be able to:

  • Understand the core components of the Hadoop ecosystem
  • Perform analytics using the core components of the Hadoop ecosystem
  • Understand HDFS
  • Understand MapReduce
  • Use Pig and Hive to support analytics
  • Leverage Business Intelligence (BI) tools for big data analytics on Hadoop

Outline: Hadoop for Business Analysts (HBA)

Module 1: Introduction to Hadoop
  • Hadoop history, concepts
  • Ecosystem
  • Distributions
  • High-level architecture
  • Hadoop myths
  • Hadoop challenges
  • Hardware / software
Module 2: HDFS Overview
  • Concepts (horizontal scaling, replication, data locality, rack awareness)
  • Architecture (NameNode, Secondary NameNode, DataNode)
  • Data integrity
  • Future of HDFS: NameNode HA, Federation
  • Lab exercises
Module 3: MapReduce Overview
  • MapReduce concepts
  • Daemons: JobTracker / TaskTracker
  • Phases: driver, mapper, shuffle/sort, reducer
  • Thinking in MapReduce
  • Future of MapReduce (YARN)
  • Lab exercises
Module 4: Pig
  • Pig vs. Java MapReduce
  • Pig Latin language
  • User-defined functions
  • Understanding Pig job flow
  • Basic data analysis with Pig
  • Complex data analysis with Pig
  • Working with multiple datasets in Pig
  • Advanced concepts
  • Lab exercises
Module 5: Hive
  • Hive concepts
  • Architecture
  • Data types
  • Hive data management
  • Hive vs. SQL
  • Lab exercises
Module 6: BI Tools for Hadoop
  • BI tools and Hadoop
  • Overview of current BI tools landscape
  • Choosing the best tool for the job
Classroom Training

Duration 2 days

  • United States: US$ 1,750
Online Training

Duration 2 days

  • United States: US$ 1,750