Course curriculum

  • 2

    Module 01: Course Introduction

    • Segment - 01 - Installing R and RStudio
  • 3

    Module 02: Read in Data From Different Sources in R

    • Segment - 02 - Read in CSV & Excel Data
    • Segment - 03 - Read in Unzipped Folder
    • Segment - 04 - Read in Online CSV
    • Segment - 05 - Read in Google Sheets
    • Segment - 06 - Read in Data from Online HTML Tables-Part 1
    • Segment - 07 - Read in Data from Online HTML Tables-Part 2
    • Segment - 08 - Read Data from a Database
  • 4

    Module 03: Data Pre-processing and Visualization

    • Segment - 09 - Remove Missing Values
    • Segment - 10 - Introduction to dplyr for Data Summarizing-Part 1
    • Segment - 11 - Introduction to dplyr for Data Summarizing-Part 2
    • Segment - 12 - Exploratory Data Analysis (EDA): Basic Visualizations with R
    • Segment - 13 - More Exploratory Data Analysis with xda
    • Segment - 14 - Data Exploration & Visualization With dplyr & ggplot2
    • Segment - 15 - Testing for Correlation
    • Segment - 16 - Chi-Square Test
  • 5

    Module 04: Machine Learning for Data Science

    • Segment - 17 - How is Machine Learning Different from Statistical Data Analysis?
    • Segment - 18 - What is Machine Learning (ML) About?
  • 6

    Module 05: Unsupervised Learning in R

    • Segment - 19 - K-Means Theory
    • Segment - 20 - Other Ways of Selecting Cluster Numbers
    • Segment - 21 - Fuzzy K-Means Clustering
    • Segment - 22 - Weighted K-Means
    • Segment - 23 - Hierarchical Clustering in R
    • Segment - 24 - Expectation-Maximization (EM) in R
    • Segment - 25 - DBSCAN Clustering in R
    • Segment - 26 - Cluster a Mixed Dataset
    • Segment - 27 - Should We Even Do Clustering?
  • 7

    Module 06: Feature/Dimension Reduction

    • Segment - 28 - Introduction
    • Segment - 29 - Principal Component Analysis (PCA)
    • Segment - 30 - More on PCA
    • Segment - 31 - Multidimensional Scaling
    • Segment - 32 - Singular Value Decomposition (SVD)
  • 8

    Module 07: Feature Selection to Select the Most Relevant Predictors

    • Segment - 33 - Removing Highly Correlated Predictor Variables
    • Segment - 34 - Variable Selection Using LASSO Regression
    • Segment - 35 - Variable Selection With FSelector
    • Segment - 36 - Boruta Analysis for Feature Selection
  • 9

    Module 08: Supervised Learning Theory

    • Segment - 37 - Some Basic Supervised Learning Concepts
    • Segment - 38 - Pre-processing for Supervised Learning
  • 10

    Module 09: Supervised Learning: Classification

    • Segment - 39 - What are GLMs?
    • Segment - 40 - Logistic Regression Models as Binary Classifiers
    • Segment - 41 - Binary Classifier with PCA
    • Segment - 42 - Some Pointers on Evaluating Accuracy
    • Segment - 43 - Obtain Binary Classification Accuracy Metrics
    • Segment - 44 - More on Binary Accuracy Measures
    • Segment - 45 - Our Multi-class Classification Problem
    • Segment - 46 - Classification Trees
    • Segment - 47 - More on Classification Tree Visualization
    • Segment - 48 - Examine Individual Variable Importance for Random Forests
    • Segment - 49 - Random Forest (RF) Classification
    • Segment - 50 - GBM Classification
    • Segment - 51 - Support Vector Machines (SVM) for Classification
    • Segment - 52 - More SVM for Classification
    • Segment - 53 - Variable Importance in SVM Modelling with rminer