Big Data


Big data can be analyzed for insights that help organizations make better decisions and strategic moves.


    • Data Growth
    • Data Challenges (the 4 Vs: Volume, Velocity, Variety, Veracity)
    • Why Big Data and What is Big Data
    • Overview of Big Data tools, the different vendors providing Hadoop, and where it fits in the industry
    • Setting up the development environment and installing Hadoop on the user's laptop
      • Hadoop daemons
      • Starting and stopping daemons using the command line and Cloudera Manager
    • Linux Primer
    • Ubuntu Primer
    • Downloading and installing Apache Hadoop in an all-in-one configuration
    • HDFS commands
    • MapReduce Program Execution
    • Exploring HDFS Blocks & Metadata
  1. Significance of HDFS in Hadoop
    • Features of HDFS
    • 5 daemons of Hadoop
      • NameNode and its functionality
      • DataNode and its functionality
      • Secondary NameNode and its functionality
      • JobTracker and its functionality
      • TaskTracker and its functionality
    • Data Storage in HDFS
      • Introduction to blocks
      • Data replication
    • The MapReduce Story
    • MapReduce Architecture
    • How MapReduce Works
    • Developing MapReduce
    • The MapReduce Programming Model
      • Different phases of the MapReduce algorithm
      • Different data types in MapReduce
      • How to write a basic MapReduce program
    • Resource Manager
    • Node Manager
    • Job Flow Sequence Revisited
    • Classical version of Apache Hadoop (MRv1)
    • Limitations of classical MapReduce
    • Addressing the scalability and resource-utilization issues, and the need to support different programming paradigms
    • YARN: The next generation of Hadoop's compute platform (MRv2)
    • Architecture of YARN
    • Application submission in YARN
    • Types of YARN schedulers (FIFO, Capacity, and Fair)
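The map, shuffle, and reduce phases listed above can be illustrated in plain Python. This is a conceptual word-count sketch only, not Hadoop API code (real Hadoop MapReduce jobs are typically written in Java against the `org.apache.hadoop.mapreduce` API); the function names here are ours, chosen to mirror the phases:

```python
from collections import defaultdict

def map_phase(line):
    # Mapper: emit a (word, 1) pair for every word in the input line
    return [(word.lower(), 1) for word in line.split()]

def shuffle_phase(mapped):
    # Shuffle: group all values by key, as the framework does between map and reduce
    grouped = defaultdict(list)
    for key, value in mapped:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    # Reducer: sum the counts for each word
    return (key, sum(values))

def word_count(lines):
    mapped = [kv for line in lines for kv in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())

counts = word_count(["big data big insights", "big decisions"])
print(counts)  # {'big': 3, 'data': 1, 'insights': 1, 'decisions': 1}
```

In a real cluster, the mappers and reducers run in parallel on different nodes and the shuffle moves data across the network; the data flow, however, is the same.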
  2. PIG

    • Introduction to Apache Pig
      • Pig architecture
    • Map Reduce Vs. Apache Pig
    • SQL vs. Apache Pig
    • Different data types in Pig
    • Modes of Execution in Pig
    • Grunt shell
    • Loading data
    • Exploring Pig Latin commands
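Pig's defining data-model idea is the bag: the GROUP operator produces (group key, bag of whole tuples) pairs rather than flat rows. Since Pig Latin itself cannot run here, this is a Python sketch of what GROUP produces, with an illustrative (hypothetical) Pig Latin equivalent in the comments:

```python
from collections import defaultdict

# Illustrative Pig Latin this sketch mimics (hypothetical relation names):
#   A = LOAD 'orders' AS (customer:chararray, amount:int);
#   B = GROUP A BY customer;
# GROUP yields (group_key, bag_of_tuples) -- each bag keeps the whole tuples.

def pig_group(relation, key_index):
    bags = defaultdict(list)
    for record in relation:
        bags[record[key_index]].append(record)  # the entire tuple goes into the bag
    return dict(bags)

orders = [("alice", 30), ("bob", 20), ("alice", 50)]
grouped = pig_group(orders, 0)
print(grouped["alice"])  # [('alice', 30), ('alice', 50)]
```

This nested (key, bag) shape is why Pig scripts typically follow a GROUP with FOREACH ... GENERATE to aggregate over each bag.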
  3. Hive

    • Hive introduction and architecture
    • Hive vs RDBMS
    • HiveQL and the shell
    • Managing tables (external vs managed)
    • Data types and schemas
    • Partitions and buckets
    • Installation
    • Hive Services, Hive Server and Hive Web Interface (HWI)
    • Meta store
    • Derby Database
    • Working with Tables
    • Primitive data types and complex data types
    • Working with Partitions
    • Hive Bucketed Tables and Sampling
    • External partitioned tables
    • Differences between ORDER BY, DISTRIBUTE BY and SORT BY
    • Log Analysis on Hive
    • Hands on Exercises
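Hive's bucketed tables (declared with CLUSTERED BY ... INTO N BUCKETS) assign each row to a bucket by hashing the bucketing column modulo the bucket count. The sketch below mimics that assignment in Python for integer keys, where Hive's hash of an int is the value itself; for other column types Hive uses its own hash functions, which this sketch does not reproduce:

```python
def bucket_for(key, num_buckets):
    # Hive assigns a row to bucket hash(col) % num_buckets.
    # For integer columns Hive's hash is the value itself, mimicked here.
    return key % num_buckets

def bucket_table(rows, key_index, num_buckets):
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        buckets[bucket_for(row[key_index], num_buckets)].append(row)
    return buckets

rows = [(1, "a"), (2, "b"), (5, "c"), (8, "d")]
buckets = bucket_table(rows, 0, 4)
print(buckets[1])  # [(1, 'a'), (5, 'c')]  -- 1 % 4 == 5 % 4 == 1
```

Because rows with the same key always land in the same bucket, Hive can sample a table by reading only some buckets (TABLESAMPLE) and can join two tables bucketed the same way bucket-by-bucket.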
  4. HBase

    • Architecture
    • HBase vs. RDBMS
    • Column Families and Regions
    • Write pipeline
    • Read pipeline
    • HBase commands
    • HBase Installation
    • HBase concepts
    • HBase Data Model and Comparison between RDBMS and NoSQL
    • Master & Region Servers
    • HBase Operations (DDL and DML) through the Shell and Programming
    • HBase Architecture
    • Catalog Tables
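The HBase data model above can be pictured as a sparse, sorted map: row key → column family → qualifier → value. The class below is a conceptual in-memory sketch of that shape only (a hypothetical helper, not the HBase client API); real HBase also versions every cell with a timestamp and splits the sorted row space into regions:

```python
# Conceptual sketch of HBase's data model as nested maps.
class HBaseTableSketch:
    def __init__(self, column_families):
        # Column families are fixed at table-creation time, as in HBase DDL
        self.families = set(column_families)
        self.rows = {}  # row key -> {family -> {qualifier -> value}}

    def put(self, row_key, family, qualifier, value):
        if family not in self.families:
            raise KeyError("unknown column family: " + family)
        self.rows.setdefault(row_key, {}).setdefault(family, {})[qualifier] = value

    def get(self, row_key, family, qualifier):
        return self.rows[row_key][family][qualifier]

    def scan(self):
        # HBase stores rows sorted by row key, so scans return ordered keys
        return sorted(self.rows)

table = HBaseTableSketch(["info"])
table.put("row2", "info", "city", "Pune")
table.put("row1", "info", "city", "Mumbai")
print(table.scan())  # ['row1', 'row2']
```

The sorted row-key order is what makes range scans cheap in HBase, and it is why row-key design is the central schema decision, unlike in an RDBMS.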
  5. ZooKeeper

    • The ZooKeeper Service: Data Model
    • Operations
    • Implementation
    • Consistency
    • Sessions
    • States
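ZooKeeper's data model is a hierarchy of znodes addressed by slash-separated paths, each holding a small data blob. The class below is a minimal in-memory sketch of that namespace only (hypothetical names, not the ZooKeeper client API); it deliberately omits watches, sessions, ephemeral nodes, and replication:

```python
class ZNodeTree:
    # Conceptual model of ZooKeeper's hierarchical namespace.
    def __init__(self):
        self.nodes = {"/": b""}  # path -> data blob

    def create(self, path, data=b""):
        # Like ZooKeeper, a znode can only be created under an existing parent
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise KeyError("parent znode does not exist: " + parent)
        self.nodes[path] = data

    def get(self, path):
        return self.nodes[path]

    def children(self, path):
        # Direct children only, as ZooKeeper's getChildren returns
        prefix = path.rstrip("/") + "/"
        return sorted(p[len(prefix):] for p in self.nodes
                      if p.startswith(prefix) and "/" not in p[len(prefix):])

zk = ZNodeTree()
zk.create("/services")
zk.create("/services/hbase", b"master:16000")
print(zk.children("/services"))  # ['hbase']
```

Hadoop components such as HBase use exactly this pattern: small coordination records (master address, region assignments) stored under well-known paths.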
  6. Flume

    • Understanding the Flume architecture and how it differs from Sqoop
    • Flume Agent Setup
    • Setting up data
    • Types of sources, channels, and sinks
    • Multi-Agent Flow
    • Different Flume implementations
    • Hands-on exercises (configuring and running a Flume agent to load streaming data from a web server)
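A Flume agent wires a source to a sink through a channel that buffers events between them. The sketch below models that pipeline in plain Python, with the channel as a bounded in-memory queue in the spirit of Flume's memory channel; the names and the list standing in for HDFS are ours, not the Flume API (a real agent is declared in a flume.conf properties file):

```python
from collections import deque

class MemoryChannel:
    # Bounded buffer between source and sink, like Flume's memory channel
    def __init__(self, capacity):
        self.queue = deque()
        self.capacity = capacity

    def put(self, event):
        if len(self.queue) >= self.capacity:
            raise OverflowError("channel full; source must back off")
        self.queue.append(event)

    def take(self):
        return self.queue.popleft() if self.queue else None

def source(channel, lines):
    # Source: ingest raw lines (e.g. tailed from a web-server log) as events
    for line in lines:
        channel.put({"body": line})

def sink(channel, destination):
    # Sink: drain events from the channel into the destination (e.g. HDFS)
    while (event := channel.take()) is not None:
        destination.append(event["body"])

channel = MemoryChannel(capacity=100)
hdfs = []  # stand-in for an HDFS sink's target directory
source(channel, ["GET /index.html 200", "GET /about 404"])
sink(channel, hdfs)
print(hdfs)  # ['GET /index.html 200', 'GET /about 404']
```

The channel is what decouples the two ends: the source can keep accepting events during short sink outages, which is why channel capacity and type (memory vs. file) are key Flume tuning decisions.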
  7. Sqoop

    • What is Sqoop?
    • How it works
    • Architecture
    • Data Imports
    • Data Exports
    • Integration with Hadoop Ecosystem
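Sqoop parallelizes an import by splitting the value range of the --split-by column into one roughly equal range per mapper, each mapper then issuing its own bounded SELECT. The function below is a simplified sketch of that split arithmetic only (real Sqoop also handles non-integer types, NULLs, and skewed distributions):

```python
def split_ranges(min_id, max_id, num_mappers):
    # Divide the [min, max] range of the --split-by column into
    # num_mappers roughly equal (lo, hi) ranges, one per map task.
    total = max_id - min_id + 1
    size, extra = divmod(total, num_mappers)
    splits, lo = [], min_id
    for i in range(num_mappers):
        # The first `extra` mappers absorb the remainder rows
        hi = lo + size - 1 + (1 if i < extra else 0)
        splits.append((lo, hi))
        lo = hi + 1
    return splits

print(split_ranges(1, 100, 4))  # [(1, 25), (26, 50), (51, 75), (76, 100)]
```

Each (lo, hi) pair becomes one mapper's WHERE clause, which is why choosing an evenly distributed split column matters: a skewed column leaves some mappers with far more rows than others.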
  8. Spark

    • Understanding the Spark architecture and why it improves on MapReduce
    • Working with RDDs
    • Hands on examples with various transformations on RDD
    • Perform Spark actions on RDD
    • Spark SQL concepts: DataFrames & Datasets
    • Hands-on examples with Spark SQL to create and work with DataFrames and Datasets
    • Create Spark DataFrames from an existing RDD
    • Create Spark DataFrames from external files
    • Create Spark DataFrames from hive tables
    • Perform operations on a DataFrame
    • Using Hive tables in Spark
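The key Spark idea behind the topics above is lazy evaluation: transformations like map and filter only build a lineage of work, and nothing executes until an action like collect or count forces it. The sketch below mimics that distinction with Python generators; it is an analogy only, not the PySpark API (the real rdd.map/rdd.filter also partition and distribute the data across executors):

```python
# Conceptual sketch of Spark's lazy transformation/eager action split,
# using generators: building the pipeline does no work until collect().

def rdd_map(data, fn):          # "transformation": lazy, returns a generator
    return (fn(x) for x in data)

def rdd_filter(data, pred):     # "transformation": lazy, returns a generator
    return (x for x in data if pred(x))

def collect(data):              # "action": triggers evaluation of the pipeline
    return list(data)

lines = ["error: disk full", "info: ok", "error: timeout"]
errors = rdd_filter(lines, lambda s: s.startswith("error"))
upper = rdd_map(errors, str.upper)   # still nothing has executed
print(collect(upper))  # ['ERROR: DISK FULL', 'ERROR: TIMEOUT']
```

Laziness is what lets Spark plan a whole chain of transformations at once and keep intermediate data in memory, instead of writing each step to disk the way a chain of MapReduce jobs must.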

