We live in an era in which the amount of data around us is exploding. Every second, new data is generated. To deal with such volumes, we need to know how to analyze data and extract valuable insights. This is where statistical data analysis comes into play. Scientists, governments, businesses, and other organizations use it to investigate trends, patterns, and relationships in data, which makes it a core skill for many in-demand data analytics roles. Statistical data analysis is an indispensable part of any data analysis task that takes a rigorous, statistical approach to data. It is used whenever we need to decide on a research design, sample size, and sampling technique, or to specify hypotheses for testing relationships between variables.
The TechClass Statistical Data Analysis with Python online course aims to provide a rich data analysis skill set that helps you draw knowledge and insights from data. It is designed to give you the resources you need to build the Python skills required to succeed as a Data Analyst. By the end of this course, you will understand how to use Python's scientific computing libraries to import, clean, manipulate, and visualize data, and how to apply a wide range of statistical techniques to extract meaningful insights.
Learning outcomes
- Learn the basic concepts of data analysis, statistical analysis, and types of data analysis
- Get familiar with the primary steps involved in data analysis
- Learn the different methods of data ingestion to extract data from flat files, JSON files, Excel spreadsheets, SQL databases, and cloud data storage
- Learn how to extract and visualize descriptive statistics of data
- Gain hands-on experience cleaning data using popular Python libraries
- Learn the main concepts of probabilistic and statistical analysis
- Learn different methods of statistical data modeling such as distribution fitting and conditional probabilistic analysis
- Understand the concepts of data relationships, correlation, and causation
- Gain hands-on experience using different methods to shed light on possible relationships between variables
- Learn how to plot heatmaps for data relationship analysis
- Gain hands-on experience in hypothesis testing for different kinds of analysis
- Learn about the concept of A/B testing and how to implement it in Python
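To give a flavor of the outcomes above, here is a minimal sketch of descriptive statistics followed by a simple A/B comparison. It is not taken from the course material: the data values, the `describe` and `welch_t` helper names, and the choice of Welch's t statistic are all assumptions made for illustration, and the sketch uses only the Python standard library to stay self-contained, whereas the course itself works with Python's scientific computing libraries.

```python
import math
import statistics

# Hypothetical A/B data (made up for illustration):
# session lengths in minutes for two page variants.
group_a = [12.1, 11.4, 13.0, 12.7, 11.9, 12.3]
group_b = [13.2, 12.8, 13.9, 13.5, 12.6, 13.4]


def describe(sample):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(sample),
        "mean": statistics.mean(sample),
        "median": statistics.median(sample),
        "stdev": statistics.stdev(sample),  # sample standard deviation (n - 1)
    }


def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    # Variance of each sample mean (sample variance divided by sample size).
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


summary_a = describe(group_a)
summary_b = describe(group_b)
t_stat, dof = welch_t(group_a, group_b)

print(summary_a)
print(summary_b)
print(f"Welch t = {t_stat:.3f}, df = {dof:.1f}")
```

A negative t statistic here means variant A has the lower sample mean. Turning the statistic into a p-value requires the t distribution, which is typically taken from a statistics library rather than computed by hand.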
Table of contents
Chapter 1: Intro to Course
- 1.1. Welcome!
- 1.2. About TechClass Data Science Department
- 1.3. Learning Outcomes
- 1.4. Your Expectations, Goals, and Knowledge
- 1.5. Abbreviations
- 1.6. Copyright Notice
Chapter 2: Introduction to Data Analysis
- 2.1. What is Data Analysis?
- 2.2. Different Types of Data Analysis
- 2.3. What is Statistical Analysis?
- 2.4. Descriptive vs. Inferential Statistics
- 2.5. Methods of Sampling
- 2.6. Steps Involved in Data Analysis
- 2.7. Quiz
Chapter 3: Data Ingestion
- 3.1. Introduction
- 3.2. Importing Flat Files
- 3.3. Parsing Date and Time
- 3.4. Importing Excel Spreadsheets
- 3.5. Connecting to a Database
- 3.6. Retrieving Tables from MySQL Database
- 3.7. Retrieving Tables from PostgreSQL Database
- 3.8. Retrieving Data from Azure Blob Storage
- 3.9. Retrieving Data from AWS S3 Buckets
- 3.10. Importing JSON Files
- 3.11. Combining Multiple Datasets
- 3.12. Quiz
Chapter 4: Descriptive Statistics
- 4.1. Introduction
- 4.2. Histogram and Bar Chart
- 4.3. Central Tendency Measures
- 4.4. Data Variability Measures
- 4.5. Extracting Descriptive Statistics
- 4.6. Skewness
- 4.7. Kurtosis
Chapter 5: Data Cleaning
- 5.1. Introduction
- 5.2. Handling Incorrect Values
- 5.3. Handling Incorrect Data Types
- 5.4. Removing Missing Values
- 5.5. Handling Missing Values: Simple Imputation
- 5.6. Handling Missing Values: K-NN Imputation
- 5.7. Handling Missing Values: MICE
- 5.8. Binning
- 5.9. Outlier Detection: IQR Method
- 5.10. Outlier Detection: Isolation Forest
- 5.11. Data Sanitization
- 5.12. Quiz
Chapter 6: Probability
- 6.1. Introduction
- 6.2. Probabilistic Experiment
- 6.3. Probability of an Event
- 6.4. Random Variable
- 6.5. Discrete and Continuous Random Variables
- 6.6. Probability Mass Function
- 6.7. Probability Density Function
- 6.8. Cumulative Distribution Function
- 6.9. Empirical Cumulative Distribution Function
- 6.10. Expected Values
Chapter 7: Statistical Data Modeling
- 7.1. Introduction
- 7.2. Normal Distribution
- 7.3. Other Types of Distribution Functions
- 7.4. Kernel Density Estimation
- 7.5. Fitting Data to the Probability Distribution
- 7.6. Conditional Probabilistic Analysis
Chapter 8: Relationship Analysis
- 8.1. Introduction
- 8.2. Correlation vs. Causation
- 8.3. Covariance Matrix
- 8.4. Pearson Correlation
- 8.5. Kendall Rank Correlation
- 8.6. Spearman's Rank Correlation
- 8.7. Heatmap of Correlation Matrix
- 8.8. Quiz
Chapter 9: Hypothesis Testing
- 9.1. Introduction
- 9.2. Essential Concepts
- 9.3. Chi-square Test of Independence
- 9.4. Chi-square Test of Independence: Implementation
- 9.5. Two-Sample t-Test
- 9.6. Paired t-Test
- 9.7. One-Way ANOVA
- 9.8. Post-Hoc Tests
- 9.9. Non-Parametric Tests
Chapter 10: A/B Testing
- 10.1. Introduction
- 10.2. Designing the Experiment
- 10.3. Collecting and Preparing the Data
- 10.4. Testing the Hypothesis
Chapter 11: Final Tasks
- 11.1. Final Project
- 11.2. Self-study Essay
Chapter 12: Finishing the Course
- 12.1. What We Have Learned
- 12.2. Where to Go Next?
- 12.3. Your Opinion Matters
- 12.4. Congrats! You did it!