Big Data/Hadoop (2017-02-19)

Hadoop/Big Data


Hadoop is an open-source software framework, built on the Java platform, for storing and processing Big Data. It focuses on improving data-processing performance on clusters of commodity hardware.

Hadoop/Big Data Course Duration: 40 - 45 hours

Introduction to Big Data

  • What is Big Data

  • 3Vs of Big Data

  • Sources of the Big Data flood

  • Exploring the data problem

  • Solutions for Big Data

Module 1

  • Introduction to the Hadoop Ecosystem

  • Breaking data into chunks

  • Why Hadoop cluster?

  • Why Hadoop 2 came after Hadoop 1

  • How Hadoop works

  • Core components of Hadoop

  • NameNode backup in Hadoop 1.x

  • HDFS

Module 2

  • Introduction to HDFS

  • Design of HDFS

  • HDFS data flow

  • Blocks in HDFS

  • HDFS high level architecture

  • Processing on Input Split

  • Relation between Hadoop block and split

  • HDFS file write

  • Hadoop installation, Hadoop ecosystem

  • HDFS file read

  • Hadoop configuration files

  • Demo of HDFS commands

  • Key components

Module 3

  • MapReduce using Java and Python

  • MapReduce Definition

  • Real life examples

  • Building principles

  • Mapper-reducer functions

  • MapReduce example, demo

  • Demo to build a MR application – Word count

  • More real-world use cases for MapReduce

Module 4

  • Apache Pig

  • Introduction

  • Architecture

  • Installation

  • Apache Pig environments

  • Data types

  • ETL commands

Module 5

  • Apache Pig ETL commands

  • UDF introduction

  • Why UDF?

  • UDF example

  • Advanced Pig

  • Demo of real-world use cases

  • Advanced joins in Pig

Module 6

  • Introduction to Hive, Hive data model and type system, hands-on Hive operations

  • Introduction

  • Function

  • Hive architecture

  • Data storage

  • Introduction to HQL

  • Hive query lifecycle on Hadoop

  • Basic operations in Hive

  • Create table and load data

  • Altering & dropping tables

Module 7

  • Advanced Hive

  • Joins and union

  • Partitioning and bucketing

Module 8

  • What is UDF?

  • Why UDF?

  • UDF demo, Thrift server

  • UDF demo using Hive

  • Thrift server demo

Module 9

  • HBase introduction, HBase architecture

  • NoSQL databases

  • HBase vs. RDBMS

  • CAP theorem

  • HBase: column family and HBase data model

  • HMaster and slave

  • HBase components

  • ZooKeeper

  • HBase workflow

Module 10

  • Hands-on HBase CRUD operations

  • Put and Get

  • Scan

  • Filters in HBase

  • Delete

  • Data loading techniques

Module 11

  • HBase Thrift server and REST server

  • What is the HBase Thrift server

  • Integrating HBase with your application

  • Example of sending a request to the Thrift server and receiving its response

  • HBase REST server

Module 12

  • Apache Sqoop, Apache Flume, Apache Oozie

  • Import and export of structured data on Hadoop

  • Introduction to Apache Sqoop

  • Sqoop architecture

  • Import and export in Sqoop

  • Sqoop commands

  • Sqoop installation

  • Ingesting unstructured data into Hadoop

Module 13

  • Apache Flume introduction

  • Flume architecture

  • Flume components

  • Introduction to Oozie

  • Oozie coordinator

  • Oozie workflows

  • Oozie scheduler hands-on

Module 14

  • Introduction to Spark

  • Introduction to Scala

  • Basic features of Spark and Scala available in Hue

  • Why demand for Spark is increasing in the market

  • How to use Spark with the Hadoop ecosystem

  • Datasets for practice purpose

Module 15

  • Spark use cases with real-time scenarios

  • Spark practicals with advanced concepts

  • Scala platform with complex use cases

  • Real-time project use-case examples based on Spark and Scala

  • How we can reduce

Additional Benefits:

  • We provide examples of real-time scenarios and show how to work on real-time projects

  • We guide you through resume preparation and provide a sample resume

  • We give you 2 POCs (proofs of concept) with datasets so that you can practice before going for interviews

  • We provide hands-on practice in the classroom itself so that you can fully understand the concepts

  • We give assignments for weekday practice
  • Freshers who wish to start a great career in Hadoop and Big Data.

  • Experienced professionals who are not satisfied with their current job profile and wish to switch to Hadoop and Big Data.

  • Managers who handle teams.

  • Business analysts who want technical exposure.
  • Work on live projects

  • 20% theory and 80% practicals

  • Free Wi-Fi

  • Syllabus recommended by Hadoop/Big Data

  • Live industry-oriented scenarios

  • Learn from industry experts.

Please contact us for the schedule.
Mob: +91 9765152405
Email: info@digidatalogic.com
