HADOOP Training by Learning Hub, Magarpatta City Pune (www.learninghub.co.in) +91-9325793756: October 2013

Course Content of HADOOP

Introduction

The Motivation for Hadoop

· Problems with Traditional Large-Scale Systems

· Requirements for a New Approach

· Introducing Hadoop

Hadoop: Basic Concepts

· The Hadoop Project and Hadoop Components

· The Hadoop Distributed File System

· Hands-On Exercise: Using HDFS

· How MapReduce Works

· Hands-On Exercise: Running a MapReduce Job

· How a Hadoop Cluster Operates

· Other Hadoop Ecosystem Projects

Writing a MapReduce Program

· The MapReduce Flow

· Basic MapReduce API Concepts

· Writing MapReduce Drivers, Mappers and Reducers in Java

· Writing Mappers and Reducers in Other Languages Using the Streaming API

· Speeding Up Hadoop Development by Using Eclipse

· Hands-On Exercise: Writing a MapReduce Program

· Differences Between the Old and New MapReduce APIs

Unit Testing MapReduce Programs

· Unit Testing

· The JUnit and MRUnit Testing Frameworks

· Writing Unit Tests with MRUnit

· Hands-On Exercise: Writing Unit Tests with the MRUnit Framework

Delving Deeper into the Hadoop API

· Using the ToolRunner Class

· Decreasing the Amount of Intermediate Data with Combiners

· Hands-On Exercise: Writing and Implementing a Combiner

· Setting Up and Tearing Down Mappers and Reducers by Using the Configure and Close Methods

· Writing Custom Partitioners for Better Load Balancing

· Hands-On Exercise: Writing a Partitioner

· Accessing HDFS Programmatically

· Using the Distributed Cache

· Using the Hadoop API’s Library of Mappers, Reducers and Partitioners

Practical Development Tips and Techniques

· Strategies for Debugging MapReduce Code

· Testing MapReduce Code Locally by Using LocalJobRunner

· Writing and Viewing Log Files

· Retrieving Job Information with Counters

· Determining the Optimal Number of Reducers for a Job

· Creating Map-Only MapReduce Jobs

· Hands-On Exercise: Using Counters and a Map-Only Job

Data Input and Output

· Creating Custom Writable and WritableComparable Implementations

· Saving Binary Data Using SequenceFile and Avro Data Files

· Implementing Custom Input Formats and Output Formats

· Issues to Consider When Using File Compression

· Hands-On Exercise: Using SequenceFiles and File Compression

Common MapReduce Algorithms

· Sorting and Searching Large Data Sets

· Performing a Secondary Sort

· Indexing Data

· Hands-On Exercise: Creating an Inverted Index

· Computing Term Frequency-Inverse Document Frequency (TF-IDF)

· Calculating Word Co-Occurrence

· Hands-On Exercise: Calculating Word Co-Occurrence

· Hands-On Exercise: Implementing Word Co-Occurrence with a Custom WritableComparable

Joining Data Sets in MapReduce Jobs

· Writing a Map-Side Join

· Writing a Reduce-Side Join

Integrating Hadoop into the Enterprise Workflow

· Integrating Hadoop into an Existing Enterprise

· Loading Data from an RDBMS into HDFS by Using Sqoop

· Hands-On Exercise: Importing Data with Sqoop

· Managing Real-Time Data Using Flume

· Accessing HDFS from Legacy Systems with FuseDFS and HttpFS

Machine Learning and Mahout

· Introduction to Machine Learning

· Using Mahout

· Hands-On Exercise: Using a Mahout Recommender

An Introduction to Hive and Pig

· The Motivation for Hive and Pig

· Hive Basics

· Hands-On Exercise: Manipulating Data with Hive

· Pig Basics

· Hands-On Exercise: Using Pig to Retrieve Movie Names from Our Recommender

· Choosing Between Hive and Pig

An Introduction to Oozie

· Introduction to Oozie

· Creating Oozie Workflows

· Hands-On Exercise: Running an Oozie Workflow

Thanks

Naresh

Learning Hub, 2nd Floor, Above HDFC Bank

Next to Noble Polyclinic

MAGARPATTA CITY
PUNE - 411013
PH: 9325793756

Skype ID: learning.hub01

Email: learninghub01@gmail.com

www.learninghub.co.in


Tuesday, October 22, 2013
