2.500,00 EGP
3.000,00 EGP
- Level: All Levels
- Total Enrolled: 11
- Duration: 5 hours 17 minutes
- Last Updated: December 31, 2024
- Certificate: Certificate of completion
Course content:
Module 1: Introduction to Big Data Engineering
- Data keeps on growing (01:34)
- Define Big Data (03:19)
- Big Data characteristics (02:04)
- Big Data applications and challenges (01:32)
- How to deal with Big Data (02:24)
- Real-Time Analytics (01:25)
- How Big Data works (00:31)
- Data Warehouse vs Data Lake vs Data Lakehouse (02:18)
- ETL vs ELT (01:28)
- Solutions for Big Data Analytics (02:07)
- The Network (Internet) (00:44)
- Big Data Engineering (03:49)
Module 2: Big Data Solutions Using Hadoop Technology
- Introduction to Hadoop (00:57)
- Hadoop History (03:02)
- Components of Hadoop (01:12)
- RDBMS vs Hadoop (03:19)
- Hadoop Ecosystem (02:21)
- Understanding Hadoop Distributed File System (HDFS) (03:52)
- MapReduce – Programming model (02:00)
- YARN (Yet Another Resource Negotiator) (01:39)
- Resource Files (00:09)
- Start practicing on Hadoop (04:22)
- Virtualization Technology (01:31)
- Install VirtualBox on Windows (01:59)
- Install PuTTY software on Windows (01:09)
- Install WinSCP on Windows (01:21)
- Installing Cloudera and Setting Up Hadoop (09:20)
- HDFS commands, part 1 (21:04)
- HDFS commands, part 2 (04:34)
- HDFS commands, part 3 (06:03)
- HDFS commands, part 4 (04:27)
- HDFS commands, part 5 (03:33)
- HDFS commands, part 6 (09:57)
- HDFS commands, part 7 (08:55)
- HDFS commands, part 8 (02:53)
- Introduction to Hive (01:53)
- Hive Architecture (06:11)
- Hive Data Model (01:06)
- Hive Query Language (HQL) (00:32)
- Data Types in Hive (01:03)
- DAG (Directed Acyclic Graph) in Hive (04:02)
- Hive Installation (01:50)
- Create Database (02:30)
- Creating and Managing Tables in Hive (07:23)
- Loading Data into Hive Tables (12:55)
- Managed Table in Hive (02:07)
- External Table in Hive (07:28)
- How to run a Hive query (09:49)
- Partitioning in Hive (00:50)
- Static Partitioning (04:08)
- Dynamic Partitioning (05:45)
- Hive Bucketing (09:52)
- Hive Join Operations (07:29)
- Hive Optimization Techniques (01:55)
- Introduction to Sqoop (02:21)
- Sqoop Architecture (00:53)
- Key Features of Sqoop (02:20)
- Sqoop Connectors (01:22)
- Sqoop Commands Overview (00:56)
- Sqoop Installation (00:36)
- MySQL Database (02:32)
- Importing Data from RDBMS to HDFS (06:22)
- Exporting Data from HDFS to RDBMS (09:23)
- Adding more mappers in Sqoop (04:21)
- Handling portions of data with Sqoop (06:29)
- Incremental Data Import in Sqoop (15:59)
- Data Compression with Sqoop (04:20)
- Avro format (03:53)
- SequenceFile format (03:16)
- Parquet format (02:46)
- Create Sqoop job (04:14)
- Sqoop Performance Optimization (01:00)
- Common Sqoop Errors and Troubleshooting (01:15)
- Project: Order Data ETL Pipeline with Hadoop, Hive, and Sqoop (31:36)
- Hadoop quiz
What you will learn:
- Learn the fundamentals of Big Data, Hadoop, and real-time analytics.
- Gain hands-on experience with HDFS, with Hive for data management, and with Sqoop for importing and exporting data between an RDBMS and HDFS.
- Master tools for processing and optimizing large datasets in the Hadoop ecosystem.
Course requirements:
- Basic knowledge of programming (preferably in Java or Python)
- Familiarity with databases and SQL
- A computer with at least 8 GB of RAM and 50 GB of free storage
- Virtual Machine software (e.g., VirtualBox or VMware)
- Cloudera installation for Hadoop setup
- Internet connection for downloading resources and updates
- Willingness to follow the step-by-step instructions for setting up the environment
- Active participation in hands-on exercises and quizzes
This course includes:
- Step-by-step Cloudera installation guide
- Detailed instructions for setting up a VM for the Hadoop environment
- Video lectures and hands-on tutorials
- Downloadable resources
- Code examples and project files