Master Data Analysis with Python
In this course, you will be introduced to the DataFrame and the Series, the two primary containers of data within pandas. You will learn the components of these objects and a few basic operations and also know what subset selection methods you should
1
Resources
Module 1.1 - What is Pandas?
Segment 1 - What is Pandas
Segment 2 - Which Version of Pandas to Use
Segment 3 - Pandas Examples
Module 1.2 - The DataFrame and Series
Segment 4 - Introduction to the DataFrame and Series
Segment 5 - DataFrame Components
Segment 6 - Selecting a Series
Segment 7 - Components of a Series
Segment 8 - Getting Help in a Jupyter Notebook
Segment 9 - Exercises
Module 1.3 - Data Types and Missing Values
Segment 10 - Introduction to Data Types and Missing Values
Segment 11 - Finding the Data Type of Each Column
Segment 12 - Getting More Metadata
Segment 13 - Exercises
Module 1.4 - Setting a Meaningful Index
Segment 14 - Setting an Index of a DataFrame
Segment 15 - Accessing the Index, Columns, and Data
Segment 16 - Accessing the Components of a Series
Segment 17 - The Default Index
Segment 18 - Setting an Index on Read
Segment 19 - Choosing a Good Index
Segment 20 - Exercises
Module 1.5 - Five-Step Process for Data Exploration
Segment 21 - Five-Step Process for Data Exploration
2
Module 2.1 - Selecting Subsets of Data from DataFrames with Just Brackets
Segment 22 - Introduction to Subset Selection
Segment 23 - Selecting with Just the Brackets
Segment 24 - Exercises
Module 2.2 - Selecting Subsets of Data from DataFrames with loc
Segment 25 - Simultaneous Row and Column Subset Selection
Segment 26 - Slice Notation with loc
Segment 27 - Other Subset Selections with loc
Segment 28 - Exercises
Module 2.3 - Selecting Subsets of Data with iloc
Segment 29 - Simultaneous Row and Column Subset Selection
Segment 30 - Exercises
Module 2.4 - Selecting Subsets of Data from a Series
Segment 31 - Selecting Subsets of Data from a Series
Segment 32 - Exercises
Module 2.5 - Boolean Selection Single Condition
Segment 33 - Boolean Selection Single Conditions
Segment 34 - Practical Boolean Selection
Segment 35 - Exercises
Module 2.6 - Boolean Selection Multiple Conditions
Segment 36 - Different Logical Operators for Boolean Series
Segment 37 - Inverting a Condition with the Not Operator
Segment 38 - Many Equality Conditions in a Single Column
Segment 39 - Exercises - Boolean Selection Multiple Conditions
Module 2.7 - Boolean Selection More
Segment 40 - Boolean Selection on a Series
Segment 41 - Simultaneous Boolean Selection of Rows and Column Labels with loc
Segment 42 - Column to Column Comparison
Segment 43 - Filter for Missing Values
Segment 44 - Exercises - Boolean Selection More
Module 2.8 - Filtering with the Query Method
Segment 45 - Introduction to the Query Method
Segment 46 - Column to Column Comparison with Query
Segment 48 - Arithmetic Operations within Query
Segment 49 - Reference Variable Names
Segment 50 - Selecting Columns with Query
Segment 51 - Summary of the Query Method
Segment 52 - Exercises
Module 2.9 - Miscellaneous Subset Selection
Segment 53 - Selecting a Column with Dot Notation
Segment 54 - Selecting Rows with Just the Brackets Using Slice Notation
Segment 55 - Selecting a Single Cell with at and iat
Module 2.10 - Taking Certification Exam
Segment 56 - Going to Exam Website
Segment 57 - Completing the Exam
Segment 58 - Submitting the Exam
3
Module 3.1 - Numeric Series Methods
Segment 59 - Numeric Series Methods
Segment 60 - Core Series Attributes
Segment 61 - Arithmetic Operators
Segment 62 - Comparison Operators
Segment 63 - Boolean and Bitwise Operators
Segment 64 - Aggregation Methods
Segment 65 - Non-Aggregation Methods
Segment 66 - Series Methods with a Non-Default Index
Segment 67 - Operations on a Boolean Series
Segment 68 - Exercises
Module 3.2 - Series Missing Value Methods
Segment 69 - The isna and notna Methods
Segment 70 - Dropping Missing Values with dropna
Segment 71 - Filling Missing Values with the fillna Method
Segment 72 - Filling Missing Values with interpolate
Segment 73 - Exercises
Module 3.3 - Series Sorting, Ranking and Uniqueness
Segment 74 - Sorting the Value and the Index
Segment 75 - Ranking
Segment 76 - Uniqueness
Segment 77 - Exercises
Module 3.4 - More Series Methods
Segment 78 - The agg, idxmin, idxmax, nsmallest, and nlargest Methods
Segment 79 - Differencing Methods diff and pct_change
Segment 80 - Randomly Sample a Series
Segment 81 - The replace Method
Segment 82 - Exercises
Module 3.5 - String Series Methods
Segment 83 - String Series Methods
Segment 84 - The value_counts Method
Segment 85 - The split String Method
Segment 86 - Special Methods Just for Object Columns
Segment 87 - More String-Only Methods
Segment 88 - The replace String Method
Segment 89 - Selecting Subsets with the Brackets
Segment 90 - Exercises
Module 3.6 - Datetime Series Methods
Segment 91 - Datetime Attributes
Segment 92 - Datetime Methods
Segment 93 - Format Time as a String with strftime
Segment 94 - Convert to Period
Segment 95 - Timedeltas
Segment 96 - Datetime Series Methods
Module 3.7 - Project - Testing Normality of Stock Market Returns
Segment 97 - Project - Testing Normality of Stock Market Returns
Segment 98 - Exercises
4
Module 4.1 - Introduction to DataFrames
Segment 99 - Introduction to DataFrames
Segment 100 - Arithmetic DataFrame Operations
Segment 102 - DataFrame Comparison Operators
Segment 103 - Overlap of DataFrame and Series Methods
Segment 104 - Data Dictionaries
Segment 105 - Exercises
Module 4.2 - Numeric DataFrame Methods
Segment 106 - Aggregation Methods
Segment 107 - Changing the Direction of the Operation
Segment 108 - Non-Aggregation Methods
Segment 109 - Summary Statistics for All Columns with the Describe Method
Segment 110 - Nuisance Columns
Segment 111 - Exercises
Module 4.3 - DataFrame Missing Value Methods
Segment 112 - The agg, idxmin, and idxmax Methods
Segment 113 - Dropping Rows and Columns with the dropna Method
Segment 114 - Filling missing values with the fillna Method
Segment 115 - The interpolate Method
Segment 116 - Exercises
Module 4.4 - DataFrame Sorting, Ranking and Uniqueness
Segment 117 - Sorting
Segment 118 - Ranking
Segment 119 - Uniqueness
Segment 120 - Finding the Maximum or Minimum of a Group
Segment 121 - The value_counts Method
Segment 122 - Exercises
Module 4.5 - DataFrame Structure Methods
Segment 123 - Adding a New Column to the DataFrame
Segment 124 - Copying the DataFrame
Segment 125 - Column and Row Dropping and Renaming
Segment 126 - Inserting Columns in the Middle of a DataFrame
Segment 127 - Getting the Integer Location with the Index get_loc Method
Segment 128 - The pop Method
Segment 129 - Exercises
Module 4.6 - More DataFrame Methods
Segment 130 - The isna and notna Methods
Segment 131 - Differencing methods diff and pct_change
Segment 132 - The Sample Method
Segment 133 - The nsmallest and nlargest methods
Segment 134 - The corr Method
Segment 135 - The replace Method
Segment 136 - Methods available only to Series and not DataFrames
Segment 137 - Exercises
Module 4.7 - Assigning Subsets of Data
Segment 138 - Setting New Data with loc
Segment 139 - Setting New Data with iloc
Segment 140 - Boolean Selection Assignment
Segment 141 - Improper Assignment
Segment 142 - Exercises
5
Module 5.1 - Integer, Float and Boolean Data Types
Segment 143 - Integer Data Type
Segment 144 - Changing Data Types with astype
Segment 145 - Unsigned Integers
Segment 146 - Nullable Integer Data Type
Segment 147 - Boolean Selection with Nullable Booleans
Segment 148 - Float Data Types
Segment 149 - Changing from Float to Int
Segment 150 - Pandas Nullable Float Data Type
Segment 151 - Boolean Data Type
Segment 152 - Nullable Boolean Data Type
Segment 153 - Different Syntax for Data Types
Segment 154 - Data Type Summary
Segment 155 - Exercises
Module 5.2 - Object, Categorical, and String Data Types
Segment 156 - Object Data Types
Segment 157 - Categorical Data Type
Segment 158 - Internal Storage of Categorical Data
Segment 159 - The cat Accessor
Segment 160 - Modifying Categories
Segment 161 - Massive Reduction in Memory Used
Segment 162 - Speeding Up Operations
Segment 163 - The str Accessor is Still Available
Segment 164 - Ordered Categories
Segment 165 - Integers can be Categories
Segment 166 - The New String Data Type
Segment 167 - Converting Strings to Numeric
Segment 168 - Exercises
Module 5.3 - Datetime, Timedelta, and Period Data Types
Segment 169 - The pandas datetime64 data type
Segment 170 - The pandas timedelta64 data type
Segment 171 - The pandas period data type
Segment 172 - Summary Table
Segment 173 - Exercises
Module 5.4 - DataFrame Data Type Conversion
Segment 174 - Discovering Strings in Numeric Columns
Segment 175 - Converting non-numeric values to missing
Segment 176 - The astype method for DataFrames
Segment 177 - Reading in data with known missing values
Segment 178 - More Data type Conversion with the Housing Dataset
Segment 179 - Exercises
6
Module 6.1 - Grouping Aggregation Basics
Segment 180 - Grouping Aggregation Basics
Segment 181 - Grouping with the groupby Method
Segment 182 - Use String Names for Aggregation Functions
Segment 183 - Aligning the Dots when Method Chaining
Segment 184 - The Index When Grouping
Segment 185 - The GroupBy Object
Segment 186 - Exercises
Module 6.2 - Grouping and Aggregating Multiple Columns
Segment 187 - Grouping with Multiple Columns
Segment 188 - Aggregating Multiple Columns
Segment 189 - Getting the size of each group
Segment 190 - Exercises
Module 6.3 - Grouping with Pivot Tables
Segment 191 - Creating Pivot Tables with Pandas
Segment 192 - Where is the Pivoting
Segment 193 - Styling Pivot Tables
Segment 194 - Getting the Size of each Group
Segment 195 - Add Margins to Get Row and Column Totals
Segment 196 - Non-Standard Pivot Tables
Segment 197 - Exercises
Module 6.4 - Counting with Crosstabs
Segment 198 - Counting the Frequency with the crosstab Function
Segment 199 - Normalizing Other Aggregations
Segment 200 - crosstab is almost unnecessary in pandas
Segment 201 - Exercises
Module 6.5 - Alternative Groupby Syntax
Segment 202 - Alternative Groupby Syntax
Segment 203 - Exercises
Module 6.6 - Custom Aggregation
Segment 204 - Using a Custom Aggregation Function
Segment 205 - Custom aggregation functions must return a single value
Segment 206 - Find the mean salary for the five highest paid employees per department
Segment 207 - What percent of total salary do these five employees represent
Segment 208 - Using a custom aggregation function in a pivot table
Segment 209 - Percentage of employees by department with salaries greater than 100,000
Segment 210 - Optimizing a custom aggregation function
Segment 211 - Complete operations that are independent of the group outside of the custom function
Segment 212 - Exercises
Module 6.7 - Filter and Transform with Groupby
Segment 213 - The filter Method
Segment 214 - Viewing each Sub-DataFrame
Segment 215 - Summary of the GroupBy filter Method
Segment 216 - Finding actors that appear in at least 25 movies
Segment 217 - The groupby transform Method
Segment 218 - transform second use case - return a new value for each row in the group
Segment 219 - Find Difference from the Mean
Segment 220 - Transforming multiple columns
Segment 221 - Summary of the groupby transform method
Segment 222 - Exercises
Module 6.8 - More Groupby Methods
Segment 223 - Kinds of groupby attributes and methods
Segment 224 - head, tail, and nth groupby methods
Segment 225 - Groupby Methods Unique to Series
Segment 226 - Non-aggregating Methods
Segment 227 - Exercises
Module 6.9 - Binning Numeric Columns
Segment 228 - Binning with pd.cut
Segment 229 - Cut into a specific number of bins
Segment 230 - Quantile binning with pd.qcut
Segment 231 - Grouping with Bins
Segment 232 - Exercises
Module 6.10 - Miscellaneous Grouping Functionality
Segment 233 - Grouping by Columns not in the DataFrame
Segment 234 - Grouping Series and aggregating other columns
Segment 235 - Change the Direction of Grouping
Segment 236 - Exercises