Course curriculum

    Module 1: Overview

    • Segment - 01 - Course Structure and Approach
    • Segment - 02 - Course Pre-requisites
    • Segment - 03 - Course Outcomes

    Module 2: Environment setup

    • Segment - 04 - Google Cloud Account Setup
    • Segment - 05 - Creating a Dataproc Cluster
    • Segment - 06 - GCP Account Best Practices
    • Segment - 07 - Twitter Developer Account Setup

    Module 3: Getting Started with Big Data Journey

    • Segment - 08 - Definition of Big Data
    • Segment - 09 - Data Lake Overview
    • Segment - 10 - Key Roles in a Big Data Science Project
    • Segment - 11 - Big Data Logical Architecture
    • Segment - 12 - Typical Big Data Pipeline
    • Segment - 13 - Hadoop Overview
    • Segment - 14 - Bonus: Demystifying JVM vs JDK vs JRE

    Module 4: Hadoop Filesystem

    • Segment - 15 - HDFS Overview
    • Segment - 16 - Small FS vs HDFS
    • Segment - 17 - HDFS Architecture
    • Segment - 18 - Hands-on HDFS

    Module 5: Distributed Processing using MapReduce and Beyond

    • Segment - 19 - Introduction to MR
    • Segment - 20 - Logical & Physical Architecture of MR
    • Segment - 21 - YARN (Distributed OS)
    • Segment - 22 - YARN Architecture
    • Segment - 23 - Hands-on Spark Job on YARN

    Module 6: Data Persistence in Big Database

    • Segment - 24 - RDBMS USPs & its Limitations
    • Segment - 25 - Polyglot Persistence
    • Segment - 26 - Why HBase and Limitations
    • Segment - 27 - HBase Terms
    • Segment - 28 - HBase Physical Storage
    • Segment - 29 - HBase Architecture
    • Segment - 30 - Installing HBase on a Dataproc Cluster
    • Segment - 31 - Installing Confluent Kafka
    • Segment - 32 - ksqlDB Troubleshooting
    • Segment - 33 - Hands-on HBase

    Module 7: Data Ingestion using Sqoop

    • Segment - 34 - Sqoop Overview
    • Segment - 35 - Sqoop Architecture
    • Segment - 36 - Sqoop Installation
    • Segment - 37 - Hands-on Sqoop

    Module 8: Data Analysis using Hive and Impala

    • Segment - 38 - Hive Overview
    • Segment - 39 - Hive Architecture
    • Segment - 40 - Impala Overview
    • Segment - 41 - Impala Architecture
    • Segment - 42 - Text vs Binary Data Formats
    • Segment - 43 - Avro Format
    • Segment - 44 - Hive Hands-on
    • Segment - 45 - Hands-on Sqoop + Hive Integration
    • Segment - 46 - Hands-on Schema Evolution

    Module 9: Data Processing using Spark

    • Segment - 47 - Spark Overview
    • Segment - 48 - Spark Logical Architecture
    • Segment - 49 - Spark Physical Architecture
    • Segment - 50 - Spark Core vs Spark SQL
    • Segment - 51 - Spark Execution Modes
    • Segment - 52 - Hands-on Spark on Jupyter

    Module 10: Streaming Events through Kafka

    • Segment - 53 - Introduction to Apache Kafka
    • Segment - 54 - Evolution of Kafka
    • Segment - 55 - Why Kafka?
    • Segment - 56 - Apache Kafka vs Confluent Kafka
    • Segment - 57 - Kafka Architecture
    • Segment - 58 - Kafka Demo - Producer Consumer

    Module 11: Building Dataflows using NiFi

    • Segment - 59 - NiFi Overview
    • Segment - 60 - NiFi Use Cases
    • Segment - 61 - NiFi Limitations
    • Segment - 62 - NiFi Components and its Architecture
    • Segment - 63 - Hands-on NiFi Installation on GCP
    • Segment - 64 - Hands-on Twitter Data Ingestion Using NiFi Part 1
    • Segment - 65 - Hands-on Twitter Data Ingestion Using NiFi Part 2
    • Segment - 66 - Hands-on Twitter Data Ingestion Using NiFi Part 3

    Module 12: Epilogue

    • Segment - 67 - Conclusion