# Course curriculum

### Module 01: Course Introduction

• Segment - 01 - Installing R and RStudio

### Module 02: Read in Data From Different Sources in R

• Segment - 02 - Read in CSV & Excel Data
• Segment - 03 - Read in Unzipped Folder
• Segment - 04 - Read in Online CSV
• Segment - 06 - Read in Data from Online HTML Tables-Part 1
• Segment - 07 - Read in Data from Online HTML Tables-Part 2
• Segment - 08 - Read Data from a Database

### Module 03: Data Pre-processing and Visualization

• Segment - 09 - Remove Missing Values
• Segment - 10 - Introduction to dplyr for Data Summarizing-Part 1
• Segment - 11 - Introduction to dplyr for Data Summarizing-Part 2
• Segment - 12 - Exploratory Data Analysis (EDA): Basic Visualizations with R
• Segment - 13 - More Exploratory Data Analysis with xda
• Segment - 14 - Data Exploration & Visualization With dplyr & ggplot2
• Segment - 15 - Testing for Correlation
• Segment - 16 - Chi Square Test

### Module 04: Machine Learning for Data Science

• Segment - 17 - How is Machine Learning Different from Statistical Data Analysis?
• Segment - 18 - What is Machine Learning (ML) About?

### Module 05: Unsupervised Learning in R

• Segment - 19 - K-Means Theory
• Segment - 20 - Other Ways of Selecting Cluster Numbers
• Segment - 21 - Fuzzy K-Means Clustering
• Segment - 22 - Weighted K-Means
• Segment - 23 - Hierarchical Clustering in R
• Segment - 24 - Expectation-Maximization (EM) in R
• Segment - 25 - DBSCAN Clustering in R
• Segment - 26 - Cluster a Mixed Dataset
• Segment - 27 - Should We Even Do Clustering?

### Module 06: Feature/Dimension Reduction

• Segment - 28 - Introduction
• Segment - 29 - Principal Component Analysis (PCA)
• Segment - 30 - More on PCA
• Segment - 31 - Multidimensional Scaling
• Segment - 32 - Singular Value Decomposition (SVD)

### Module 07: Feature Selection to Select the Most Relevant Predictors

• Segment - 33 - Removing Highly Correlated Predictor Variables
• Segment - 34 - Variable Selection Using LASSO Regression
• Segment - 35 - Variable Selection With FSelector
• Segment - 36 - Boruta Analysis for Feature Selection

### Module 08: Supervised Learning Theory

• Segment - 37 - Some Basic Supervised Learning Concepts
• Segment - 38 - Pre-processing for Supervised Learning

### Module 09: Supervised Learning: Classification

• Segment - 39 - What are GLMs?
• Segment - 40 - Logistic Regression Models as Binary Classifiers
• Segment - 41 - Binary Classifier with PCA
• Segment - 42 - Some Pointers on Evaluating Accuracy
• Segment - 43 - Obtain Binary Classification Accuracy Metrics
• Segment - 44 - More on Binary Accuracy Measures
• Segment - 45 - Our Multi-class Classification Problem
• Segment - 46 - Classification Trees
• Segment - 47 - More on Classification Tree Visualization
• Segment - 48 - Random Forest (RF) Classification
• Segment - 49 - Examine Individual Variable Importance for Random Forests
• Segment - 50 - GBM Classification
• Segment - 51 - Support Vector Machines (SVM) for Classification
• Segment - 52 - More SVM for Classification
• Segment - 53 - Variable Importance in SVM Modelling with rminer