Mfrimps

Marina Frimpong

Recently Published

EPI 553: Lab 08 Hypothesis – Frimpong
This RPubs report presents an analysis of hypothesis testing in multiple linear regression using the BRFSS 2020 dataset. The study examines factors associated with the number of mentally unhealthy days reported by U.S. adults. The analysis fits regression models including predictors such as physically unhealthy days, sleep hours, age, income, sex, and exercise. Different hypothesis testing methods are used to determine whether these variables significantly contribute to the model. The report demonstrates the use of the overall F-test, Type I and Type III sums of squares, partial F-tests, t-tests for regression coefficients, and chunk tests to evaluate the importance of individual variables and groups of variables in explaining mental health outcomes.
Household Language Environment and Impairing Mental and Behavioral Conditions Among U.S. Children: Evidence from the 2020–2022 NSCH
This project analyzes pooled 2020–2022 National Survey of Children’s Health (NSCH) data to examine the association between household language environment and currently impairing mental and behavioral conditions among U.S. children ages 2–17 years. Using survey-weighted logistic regression models, the analysis evaluates whether bilingual and non-English households differ from English-only households after adjusting for sociodemographic factors and adverse childhood experiences (ACE burden). The study aims to clarify whether observed language-related differences persist independent of early life adversity and structural determinants of health.
EPI 553: Lab 07 - MLR - Frimpong
This analysis examines the individual, behavioral, and socioeconomic factors associated with mentally unhealthy days among 5,000 U.S. adults drawn from the Behavioral Risk Factor Surveillance System (BRFSS) 2020 survey. Using multiple linear regression, we model self-reported mentally unhealthy days as a function of sleep hours, physical health days, age, sex, household income, and exercise status, following a sequential model-building approach to illustrate how confounding adjustment influences coefficient estimates and model fit. Results indicate that poor physical health, shorter sleep duration, female sex, and lower income were each independently associated with more mentally unhealthy days, while older age was associated with fewer. Residual diagnostics, ANOVA decomposition, and marginal effects plots are presented throughout. This report was produced as part of EPI 553: Statistical Inference at the University at Albany, School of Public Health, Spring 2026.
EPI 553: ASSIGNMENT 02 - SLR - FRIMPONG
This analysis examines the association between dietary calcium intake and total femur bone mineral density (BMD) using data from the NHANES 2017–2018 cycle. Using simple linear regression, we estimate the direction and magnitude of the calcium–BMD relationship, test whether the association is statistically distinguishable from chance, and generate predicted BMD values at a clinically meaningful calcium intake level. The dataset includes 2,129 adults after excluding observations with missing values on BMD or calcium. Results are interpreted in the context of bone health research, with attention to model fit, prediction uncertainty, and the limitations of drawing causal conclusions from a cross-sectional survey.
EPI 553: LAB 06 – SIMPLE_LINEAR_REGRESSION – FRIMPONG
This lab applies Simple Linear Regression (SLR) to the Behavioral Risk Factor Surveillance System (BRFSS) 2020 data, a nationally representative telephone survey conducted by the CDC among U.S. adults. The analysis examines whether BMI is associated with the number of poor physical health days reported in the past 30 days. Tasks include exploratory data analysis, model fitting and interpretation, ANOVA decomposition, R² computation, confidence and prediction intervals, residual diagnostics, and comparison of BMI versus age as predictors. Results are interpreted in the context of public health, with attention to model assumptions and limitations of simple linear regression for this outcome.
epi553_Lab05_Modeling_Frimpong
This document presents an applied statistical modeling lab using data from the 2023 Behavioral Risk Factor Surveillance System (BRFSS), a nationally representative U.S. health survey. The analysis demonstrates key regression modeling concepts in epidemiology including simple and multiple logistic regression, dummy variable coding, interaction testing, model diagnostics, and model comparison, all applied to predicting hypertension among U.S. adults. Key topics covered include building and interpreting logistic regression models, controlling for confounding, testing for effect modification using likelihood ratio tests, checking model assumptions with variance inflation factors and Cook's Distance, and selecting the best fitting model using AIC and BIC. Results show that age and obesity are the strongest independent predictors of hypertension in this sample, with no significant Age × BMI interaction detected. This lab was completed as part of EPI 553 Statistical Inference and is intended as a practical introduction to regression modeling for public health and epidemiology students.
epi553_hw01_Frimpong_Marina
This analysis uses NHANES data to examine bone mineral density (BMD) through one-way ANOVA and correlation methods. Part 1 tests whether BMD differs across five ethnic groups, including assumption checks, effect size calculation, and post-hoc comparisons. Results show significant ethnic differences (p < 0.001), with Non-Hispanic Black individuals having higher BMD. Part 2 explores correlations between BMD and potential predictors (age, BMI, calcium, vitamin D, physical activity) using transformed variables. BMI shows the strongest association with BMD (r = 0.425), while age shows a negative relationship (r = -0.232). Part 3 reflects on methodological choices, assumption challenges, and R programming skills developed during the analysis. Methods: R with tidyverse, ggplot2, broom, and car packages.
EPI 553: Lab 03 CORRELATION – Frimpong
This analysis examined the linear association between height and weight among U.S. adults using Pearson correlation. Results indicated a moderate positive relationship (r = 0.451), suggesting that taller individuals tend to weigh more. The association was statistically significant (t = 42.618, p < 0.001), and the 95% confidence interval (0.432, 0.469) excluded zero, providing strong evidence against the null hypothesis of no correlation. The coefficient of determination (r² = 0.203) indicates that approximately 20.3% of the variability in one measure is explained by the other, reflecting a meaningful but not complete linear relationship.
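The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is written in Python rather than the R used in the report. It relies on two assumptions that the summary does not state: the sample size is back-calculated from the reported r and t (the name `n_implied` is hypothetical), and the confidence interval is taken to come from the Fisher z-transform, which is a common default.

```python
import math

r = 0.451    # Pearson correlation reported in the summary
t = 42.618   # t-statistic reported in the summary

# Coefficient of determination: the square of the correlation.
r_squared = r ** 2

# The summary does not state n. Solving t = r * sqrt(n - 2) / sqrt(1 - r^2)
# for n gives an implied sample size (an assumption, not a reported value).
n_implied = round((t / r) ** 2 * (1 - r ** 2) + 2)

# Approximate 95% CI via the Fisher z-transform (assumed method).
z = math.atanh(r)
se = 1 / math.sqrt(n_implied - 3)
ci_low = math.tanh(z - 1.96 * se)
ci_high = math.tanh(z + 1.96 * se)

print(f"r^2 = {r_squared:.3f}")
print(f"implied n = {n_implied}")
print(f"95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

Under these assumptions the recomputed r² and interval agree with the reported values to three decimals, which suggests the reported statistics are internally consistent.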
EPI 553: Lab 02 ANOVA – Frimpong
A one-way analysis of variance (ANOVA) was used to compare mean days of poor mental health across three physical-activity groups defined from survey responses: individuals reporting no regular activity, those engaging in moderate-intensity activity, and those engaging in vigorous-intensity activity. The analysis tested whether between-group differences in means exceeded within-group variability using the F-statistic, under assumptions of independent observations, approximately normally distributed residuals, and homogeneity of variances.
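The comparison of between-group and within-group variability described above can be illustrated with a minimal sketch. It is written in Python rather than the R used in the lab, and the three groups of values are hypothetical numbers chosen only to show the calculation; they are not data or results from the report.

```python
from statistics import mean

# Hypothetical days of poor mental health per activity group
# (illustrative values only, not taken from the report).
groups = {
    "none": [6, 8, 7, 9],
    "moderate": [4, 5, 6, 5],
    "vigorous": [2, 3, 4, 3],
}

all_values = [v for g in groups.values() for v in g]
grand_mean = mean(all_values)
k = len(groups)        # number of groups
n = len(all_values)    # total number of observations

# Between-group sum of squares: spread of group means around the grand mean.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())

# Within-group sum of squares: spread of observations around their own group mean.
ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)

# F is the ratio of the two mean squares.
f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
print(f"F({k - 1}, {n - k}) = {f_stat:.2f}")
```

A large F indicates that the group means differ by more than the within-group scatter alone would suggest. The p-value would then come from the F distribution with k − 1 and n − k degrees of freedom, which this sketch does not compute.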
EPI 553: Lab 01 NHANES – Frimpong
I conducted an exploratory data analysis using NHANES data to examine how the prevalence of hypertension varies across education levels. I cleaned and grouped the data in R, generated summary statistics for systolic blood pressure and hypertension prevalence, and interpreted the observed social gradient in cardiovascular risk. The report highlights how education, a key social determinant of health, is associated with meaningful differences in population health outcomes and discusses implications for public health practice and policy.
EPI 553: Week 1 Setup Checklist - Frimpong
This document presents introductory R programming exercises for EPI 553 (Statistical Inference), including data manipulation, basic visualization, and reproducible reporting using R Markdown. It demonstrates reproducible analysis practices, including data import, summary statistics, graphical exploration, and rendered HTML output for coursework submission.