AndresGD18

Andres Garcia Damasco

Recently Published

Moneyball Predictor
This project applies multiple linear regression to predict the number of wins for professional baseball teams using a dataset of approximately 2,276 records spanning the years 1871 to 2006. Each observation represents a team's seasonal performance, adjusted to a 162-game schedule. The analysis follows a four-stage workflow: data exploration, data preparation, model building, and model selection.
Omitted Variable Bias
A practical walkthrough of Omitted Variable Bias (OVB) using the bwght dataset (n = 1,388). Covers the two conditions for OVB, the direction of bias using the 2×2 matrix, and a side-by-side regression comparison via stargazer, showing how omitting family income overstates the negative effect of smoking on infant birth weight. Built with R and the wooldridge package.
Gauss Markov Assumptions and Residual Analysis
A practical walkthrough of the Gauss-Markov assumptions using the wage1 dataset (n = 526). Covers plain-English and technical explanations of all seven assumptions, a simple OLS regression of wages on education, diagnostic plot interpretation, and the case for log-transforming the dependent variable. Built with R and the wooldridge package.
OLS Point Estimates
This assignment replicates OLS results using both R’s lm() function and matrix algebra. I run a multiple regression on the CPS1985 dataset and manually compute the coefficients and standard errors. The goal is to show that both methods give the same results and to understand how OLS works behind the scenes.
HW 5
This assignment uses multiple linear regression to analyze factors influencing Mario Kart auction prices. The model evaluates variables such as auction duration, item condition, stock photo usage, and number of wheels. Results highlight the number of wheels as a key driver of price, while other variables show limited significance. The analysis includes model evaluation, coefficient interpretation, and a prediction for a new scenario, demonstrating the practical applications of regression with real-world data.
HW 4
This project applies classification techniques, including logistic regression, LDA, and Bayesian methods, to analyze and predict outcomes from the given data. The analysis compares model performance, interprets key metrics such as accuracy and error rates, and evaluates how different approaches handle classification problems. The assignment highlights the strengths and limitations of each method and demonstrates practical applications of statistical learning techniques in real-world scenarios.
Discussion 1: Types of Data
This assignment explores different types of data using the Gapminder dataset, including variables such as life expectancy, population, and GDP per capita. The analysis focuses on distinguishing between quantitative and categorical data and understanding how data types influence interpretation and analysis.