ParichartP

Parichart Pattarapanitchai

Recently Published

Statistics for Data Science (229711) - Chapter 6: Data Preprocessing
This chapter dives into the "engine room" of Data Science: Preprocessing. Students will learn that the quality of a model is determined long before it is trained, focusing on the critical steps required to turn messy, real-world data into a "model-ready" format. Core topics covered: Why Preprocessing Matters; Handling Missing Data; Outlier Detection and Treatment; Data Transformation; Encoding Categorical Variables; Feature Scaling; Data Integration and Reshaping. Chapter Lab Activity: Full Preprocessing Pipeline with msleep.
Statistics for Data Science (229711) - Chapter 5: Data Sampling Techniques
This chapter addresses the foundational question of data science: "How do we ensure our data truly represents the world?" It explores the mechanics of selection, the math of sample size, and the power of computational resampling. Core topics covered: Why Sampling Matters; Probability Sampling Methods; Non-Probability Sampling Methods; Sample Size Determination; Sampling Bias and Common Pitfalls; Bootstrap Resampling; Evaluating Sample Quality. Chapter Lab Activity: Exploring Sampling with nhanes-Style Data.
Statistics for Data Science (229711) - Chapter 4: Test of Independence of Variables
This chapter explores the statistical frameworks used to detect and quantify relationships between variables. It moves from testing the independence of categorical factors to measuring the strength and direction of associations in both discrete and continuous data. Core topics covered: The Concept of Independence; Chi-Square Test of Independence; Fisher’s Exact Test; Cramér’s V and Effect Size for Categorical Association; Correlation Tests; Point-Biserial and Phi Coefficients; Partial Correlation. Chapter Lab Activity: Exploring Independence with the titanic and mtcars Datasets.
Statistics for Data Science (229711) - Chapter 3: Hypothesis Testing
This chapter introduces the core engine of statistical decision-making: Hypothesis Testing. It provides a rigorous framework for making inferences about populations based on sample evidence, a critical skill for any Data Scientist. Core topics covered: The Logic of Hypothesis Testing; One-Sample Tests; Two-Sample Tests; Paired Sample Test; One-Way ANOVA; Non-Parametric Alternatives; Effect Size and Statistical Power. Chapter Lab Activity: Exploring Hypothesis Testing with the ToothGrowth Dataset.
Statistics for Data Science (229711) - Chapter 2: Data Distribution and Probability
This chapter serves as the theoretical bridge between descriptive analysis and statistical inference. It introduces the fundamental concepts of probability and explores the mathematical distributions that model real-world data behavior. Core topics covered: Types of Data and Measurement Scales; Probability Fundamentals; Conditional Probability and Bayes’ Theorem; Discrete Probability Distributions; Continuous Probability Distributions; Sampling Distributions and the Central Limit Theorem; Assessing Normality. Chapter Lab Activity: Exploring Distributions with the airquality Dataset.
Statistics for Data Science (229711) - Chapter 1: Descriptive Statistics
This document serves as the introductory chapter for the graduate-level Statistics for Data Science course. It covers the fundamental principles of Exploratory Data Analysis (EDA), shifting the emphasis from simple computation to critical statistical interpretation. Topics covered: Measures of Central Tendency; Measures of Dispersion; Measures of Shape: Skewness and Kurtosis; Data Visualization for Descriptive Statistics; Multivariate Descriptive Statistics. Chapter Lab Activity: Exploring the mtcars Dataset.
208251_LAB5_Nonparametric Statistics
Students are able to 1) perform descriptive statistics and 2) apply appropriate non-parametric statistical tests to answer research questions of interest.
208251_LAB4_Nonparametric Statistics
Students are able to 1) perform descriptive statistics and 2) apply appropriate non-parametric statistical tests to answer research questions of interest.
208251_LAB3_Model diagnostics
Students are able to use the R language to analyse data using multiple linear regression: 1. perform linear regression analysis 2. check the normality assumption 3. check the constant variance assumption 4. check the independence (autocorrelation) assumption 5. deal with invalid model assumptions
208251_LAB1_SimpleLinearRegression
Students are able to use the R language to 1. perform descriptive statistics 2. construct a scatterplot of two quantitative variables 3. perform correlation analysis 4. perform linear regression analysis and inference on regression parameters 5. interpret the results
208251_LAB2_MultipleLinearRegression
Students are able to use the R language to analyse data using multiple linear regression: 1. perform descriptive statistics 2. transform qualitative independent variables into dummy variables 3. select independent variables 4. perform linear regression analysis and inference on regression parameters 5. interpret the results