Hadoop Training in Delhi
Introduction to Hadoop Training in Delhi
Module 1
Understanding Hadoop
- The Three Vs of Big Data, Six Key Hadoop Data Types, Sentiment Use Case
- Getting Twitter Feeds into Hadoop, Use HCatalog to Define a Schema, Use Hive to Determine Sentiment, View Spikes in Tweet Volume, View Sentiment by Country, Geolocation Use Case
- The Geolocation Data, Getting the Raw Data into Hadoop, The Truck Data, Getting the Truck Data into Hadoop, HCatalog Stores a Shared Schema
- Data Analysis, Use Hive to Compute Truck Mileage, About Hadoop, Relational Databases vs. Hadoop, About Hadoop 2.x
- New in Hadoop 2.x, The Hadoop Ecosystem, The Hortonworks Data Platform (HDP), The Path to ROI
- Lab: Start an HDP 2.1 Cluster
Module 2
The Hadoop Distributed File System (HDFS)
- About HDFS, Hadoop vs. RDBMS, HDFS Components, The NameNode, The DataNodes, DataNode Failure, HDFS Commands
Module 3
Inputting Data into HDFS
- Examples of HDFS Commands, HDFS File Permissions, Options for Data Input, The Hadoop Client, WebHDFS, A Flume Example
- Overview of Sqoop, The Sqoop Import Tool, Importing a Table, Importing Specific Columns, Importing from a Query, The Sqoop Export Tool, Exporting to a Table
- Lab: Importing RDBMS Data into HDFS
- Lab: Exporting HDFS Data to an RDBMS
Module 4
The MapReduce Framework
- Understanding MapReduce, The Key/Value Pairs of MapReduce, WordCount in MapReduce
- Demo: Understanding MapReduce
- Lab: Running a MapReduce Job
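As a rough illustration of the WordCount topic in this module, the sketch below imitates the map, shuffle, and reduce steps in plain Python on a single machine. It does not use Hadoop itself, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical names chosen for this example, not part of any Hadoop API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Mapper: emit a (word, 1) key/value pair for every word in one line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all values by key, as happens between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three steps in sequence over an in-memory list of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["big data big cluster", "data node"]  # made-up sample input
    print(word_count(sample))
```

In a real cluster the mapper and reducer run as separate tasks on different nodes and the framework performs the grouping; here everything runs in one process purely to show the flow of key/value pairs.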
Module 5
Introduction to Pig
- About Pig, Pig Latin
- The Grunt Shell
- Demo: Understanding Pig
- Pig Latin Relation Names
- Pig Latin Field Names & Data Types
- Pig Complex Types
- Defining a Schema
- Lab: Getting Started with Pig
- The GROUP Operator, GROUP ALL, Relations without a Schema, The FOREACH…GENERATE Operator, Specifying Ranges in FOREACH, Field Names in FOREACH, FOREACH with Groups, The FILTER Operator, The LIMIT Operator
- Lab: Exploring Data with Pig
Module 6
Advanced Pig Programming
- The ORDER BY Operator, The CASE Operator, Parameter Substitution, DISTINCT, PARALLEL, The FLATTEN Operator, Performing Inner and Outer Joins, Invoking a UDF, Tips for Optimizing Pig Scripts
- Lab: Joining Datasets
- Preparing Data for Hive
Module 7
Hive Programming
- About Hive, Comparing Hive to SQL, Hive Architecture, Submitting Hive Queries, Defining a Hive-Managed Table, Defining an External Table, Defining a Table LOCATION, Loading Data into Hive, Performing Queries
- Understanding Hive Tables, Hive Partitions, Hive Buckets, Skewed Tables, Demo: Understanding Partitions and Skew, Using Distribute By, Storing Results to a File, Specifying MapReduce Properties
- Lab: Analyzing Big Data with Hive
- Lab: Understanding MapReduce in Hive
- Hive Join Strategies, Shuffle Joins, Map (Broadcast) Joins, Sort-Merge-Bucket Joins, Invoking a Hive UDF, Computing ngrams in Hive
- Demo: Computing ngrams
Module 8
Using HCatalog
- About HCatalog, HCatalog in the Ecosystem
- Defining a New Schema
- Using HCatLoader with Pig
- Using HCatStorer with Pig, The Pig SQL Command
- Lab: Using HCatalog with Pig
Module 9
Advanced Hive Programming
- Performing a Multi-Table/File Insert
- Understanding Views, Defining Views, Using Views, The TRANSFORM Clause, The OVER Clause, Using Windows, Hive Analytics Functions
- Lab: Advanced Hive Programming
- Hive File Formats, Hive SerDes, Hive ORC Files, Computing Table Statistics, Hive Cost-Based Optimization (CBO), Using Hive CBO, Vectorization, Using HiveServer2, Understanding Hive on Tez, Using Tez for Hive Queries
- Demo: Hive Optimizations
- Hive Optimization Tips, Hive Query Tuning
- Lab: Streaming Data with Hive and Python
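To give a feel for the Hive and Python streaming lab, here is a minimal sketch of the kind of script the TRANSFORM clause can call. By default Hive passes each row to the script's standard input as tab-separated text and reads result rows from its standard output. The two-column layout (`driver_id`, `miles`) and the miles-to-kilometres conversion are assumptions made up for this example, not taken from the course material.

```python
import io
import sys
from typing import TextIO

KM_PER_MILE = 1.609344  # conversion factor used by this made-up example


def stream_rows(source: TextIO, sink: TextIO) -> None:
    """Read tab-separated rows from source and write converted rows to sink."""
    for line in source:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 2:
            continue  # this sketch silently skips malformed rows
        driver_id, miles = fields  # hypothetical column names
        sink.write(f"{driver_id}\t{float(miles) * KM_PER_MILE:.1f}\n")


if __name__ == "__main__":
    # Demonstration with in-memory text. A script invoked by Hive would
    # instead call stream_rows(sys.stdin, sys.stdout).
    demo_out = io.StringIO()
    stream_rows(io.StringIO("A1\t100\nB2\t12.5\n"), demo_out)
    print(demo_out.getvalue(), end="")
```

Keeping the row logic in a function that takes file-like objects makes it easy to try out locally before wiring the script to a Hive query.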
Module 10
Hadoop 2 and YARN
- About HDFS Federation, Multiple Federated NameNodes, Multiple Namespaces
- Overview of HDFS HA, Quorum Journal Manager, Configuring Automatic Failover
- About YARN, Open-source YARN Use Cases
- The Components of YARN
- The Lifecycle of a YARN Application
- A Cluster View Example
Module 11
Defining Workflow with Oozie
- Submitting a Workflow Job, Fork and Join Nodes
- Defining an Oozie Coordinator Job
- Schedule a Job Based on Time
- Schedule Based on Data Availability
- Lab: Defining an Oozie Workflow