corey_sparks

Corey Sparks

Recently Published

Generalized Linear Models for Spatial Count data
This example shows how to fit generalized linear models to aggregate count data using an example from US counties on mortality rates.
DEM 7263 - Generalized Linear Models for Spatial Count data
This example shows how to fit generalized linear models to aggregate count data using an example from US counties on mortality rates.
DEM 7263 Fall 2017 - Spatially Autoregressive Models 2
This lecture builds off the previous lecture on the Spatially Autoregressive Model (SAR) with either a lag or error specification. Specifically, we examine more exotic forms of spatial autoregression.
DEM 7283 - Example 5 - Ordinal & Multinomial Logit Models
This example will cover the use of R functions for fitting Ordinal and Multinomial logit models to complex survey data. For this example I am using 2014 CDC Behavioral Risk Factor Surveillance System (BRFSS) SMART county data.
DEM 7283 - Example 4 - Logit and Probit Models Part 2
This example will further explore the logistic regression model, including discussing model stratification, the Chow test and comparison of regression effects across models. For this example I am using 2014 CDC Behavioral Risk Factor Surveillance System (BRFSS) SMART county data.
DEM 7263 Spring 2017 - Spatially Autoregressive Models 1
Introduction to Spatial Regression Models. Up until now, we have been concerned with describing the structure of spatial data through correlational methods and the methods of exploratory spatial data analysis (ESDA). Through ESDA, we examined data for patterns and, using the Moran I and Local Moran I statistics, examined clustering of variables. Now we consider regression models for continuous outcomes. We begin with a review of the Ordinary Least Squares model for a continuous outcome. We consider data on mortality in San Antonio, TX, and replicate the analysis of Sparks and Sparks (2010) from Population, Space and Place.
DEM 7283 - Example 3 - Logit and Probit Models
In the vast majority of situations in your work as demographers, your outcome will either be of a qualitative nature or non-normally distributed, especially if you work with individual level survey data. This example uses data from the 2014 BRFSS to illustrate the logit and probit models being fit to complex survey data.
DEM 7283 - Example 2 - Survey Statistics
This example will cover the use of R functions for analyzing complex survey data. This example uses the 2014 BRFSS SMART MMSA sample.
DEM 7263 - Exploratory Spatial Data Analysis
This example reviews methods of exploratory spatial data analysis, including Moran's I and local Moran's I, using a data set from San Antonio, TX, which is accessible here: https://github.com/coreysparks/data/blob/master/SA_classdata.zip
DEM 7283 - Example 1 - Introduction to R and review of Stat 1
This is a basic intro to using R and a review of principles from the first semester of statistics in our PhD program in applied demography
DEM 7223 - Event History Analysis - Example of Multi-state event history analysis
This example will illustrate how to fit a multistate hazard model using the multinomial logit model. The outcome for the example is “type of non-parental child care” and whether a family changes their particular type of childcare between waves 1 and 5 of the data. The data come from the ECLS-K.
DEM 7223 - Example 9 - Competing Risks in the Cox Model
This example uses data from the National Health Interview Survey (NHIS) linked mortality data obtained from the Minnesota Population Center’s IHIS program, which links the NHIS survey files from 1986 to 2009 to mortality data from the National Death Index (NDI). The death follow-up in the data file used in the current example ends in 2006. Below, I code a competing risk outcome, using four different causes of death as competing events, and age at death as the outcome variable.
DEM 7223 - Example 8 - Discrete Time hazard model with frailty
This example will illustrate how to fit the discrete time hazard model with group-level frailty to continuous duration data (i.e. person-level data) and a discrete-time (longitudinal) data set. In this example, I will use two data sets. The first, longitudinal example uses data from the ECLS-K. Specifically, we will examine the transition into poverty between kindergarten and 8th grade. The second example uses the event of a child dying before age 5 in the DHS model data file. The data for this example come from the model.data Demographic and Health Survey for 2012 birth history recode file. This file contains information for all births to women in the survey.
DEM 7223 - Event History Analysis - Example 7 Frailty in the Cox model
This example will illustrate how to fit the extended Cox Proportional hazards model with Gaussian frailty to continuous duration data (i.e. person-level data) and a discrete-time (longitudinal) data set. In this example, I will use the event of a child dying before age 5. The data for this example come from the model.data Demographic and Health Survey for 2012 birth history recode file. This file contains information for all births to women in the survey. The longitudinal data example uses data from the ECLS-K. Specifically, we will examine the transition into poverty between kindergarten and third grade.
DEM 7223 Event History Analysis - Discrete Time Hazard Model - Alternative Time Specifications
This example will illustrate how to fit the discrete time hazard model to person-period data. Specifically, this example illustrates various parameterizations of time in the discrete time model. In this example, I will use the event of a child dying before age 5. The data for this example come from the model.data Demographic and Health Survey for 2012 children’s recode file. This file contains information for all births in the last 5 years prior to the survey.
DEM 7223 Event History Analysis - Example 6 - Discrete Time Hazard Model
This example will illustrate how to fit the discrete time hazard model to longitudinal and continuous duration data (i.e. person-level data). The first example will use as its outcome variable the event of a child dying before age 5. The data for this example come from the model.data [Demographic and Health Survey for 2012](http://www.dhsprogram.com/data/model-datasets.cfm) children's recode file. This file contains information for all births in the last 5 years prior to the survey. The longitudinal data example uses data from the [ECLS-K](http://nces.ed.gov/ecls/kinderdatainformation.asp). Specifically, we will examine the transition into poverty between kindergarten and 8th grade.
DEM 7223 Event History Analysis - Example 5 Cox Proportional Hazards Model Part 2 - Model Checking
This example will illustrate how to fit the Cox Proportional hazards model to a discrete-time (longitudinal) data set and examine various model diagnostics to evaluate the overall model fit. The data example uses data from the ECLS-K. Specifically, we will examine the transition into poverty between kindergarten and third grade.
DEM 7223 Event History Analysis - Example 4 Cox Proportional Hazards Model Part 1
This example will illustrate how to fit the Cox Proportional hazards model to continuous duration data (i.e. person-level data) and a discrete-time (longitudinal) data set. The first example uses longitudinal data from the [ECLS-K](http://nces.ed.gov/ecls/kinderdatainformation.asp). Specifically, we will examine the transition into poverty between kindergarten and third grade. In the second example, I use the *time between the first and second birth* for women in the data as the _outcome variable_. The data for this example come from the DHS Model data file individual recode file. This file contains information for all women sampled in the survey between the ages of 15 and 49.
Event History Analysis - DEM 7223 - Example 3 Parametric Hazard Models
This example will illustrate how to fit parametric hazard models to continuous duration data (i.e. person-level data). In this example, I use the time between the first and second birth for women in the data as the outcome variable. The data for this example come from the DHS Model data file Demographic and Health Survey for 2012 individual recode file. This file contains information for all women sampled in the survey between the ages of 15 and 49.
DEM 7223 Example 2 Comparing Survival Times Between Groups
This example will illustrate how to test for differences between survival functions estimated by the Kaplan-Meier product limit estimator. The tests all follow the methods described by Harrington and Fleming (1982). The first example will use as its outcome variable the event of a child dying before age 1. The data for this example come from the model.data Demographic and Health Survey for 2012 children’s recode file. This file contains information for all births in the last 5 years prior to the survey. In the second example, we will examine how to calculate the survival function for a longitudinally collected data set. Here I use data from the ECLS-K. Specifically, we will examine the transition into poverty between kindergarten and fifth grade.
Event History Analysis - Example 1 Functions of Survival Time
This example will illustrate how to construct a basic survival function from individual-level data. The example will use as its outcome variable, the event of a child dying before age 1. The data for this example come from the Demographic and Health Survey Model Data Files children’s recode file.
Using INLA for the Age Period Cohort Model
In this example, we use data from the Integrated Health Interview Series to fit an Age-Period-Cohort model. Discussion of these models can be found in the recent text on the subject. In addition to the APC model, hierarchical models could consider county or city of residence, or some other geography.
DEM 7903 Bayesian Regression using the INLA Approximation
This example uses the INLA approach to fit structured Bayesian regression models to aggregate (county) and individual survey data. The county-level data analysis focuses on infant mortality in US counties. The survey data analysis focuses on an age-period-cohort model of BMI.
Bayesian Data Analysis 2 - Bayesian Hierarchical Models
This example will go through the basics of using Stan by way of the brms library, for estimation of linear and generalized linear mixed models. We will use the ECLS-K 2011 data for our example, and use height for age/sex z-score as a continuous outcome, and short for age status outcome (height for age z < -1) as a dichotomous outcome.
DEM 7903 Bayesian Data Analysis 1
This example will go through the basics of using Stan by way of the brms library, for estimation of simple linear and generalized linear models. You must install brms first, using install.packages("brms"). We will use the ECLS-K 2011 data for our example, and use height for age/sex z-score as a continuous outcome, and short for age status outcome (height for age z < -1) as a dichotomous outcome. There is an overview on using brms for fitting various models. You can find these here. The package rstanarm is very similar and better documented, and can be found here, with numerous tutorials on that page. Both packages serve as front-ends to the Stan library for MCMC. People new to Stan can often be put off by its syntax and model construction. Both of these packages allow us to use R syntax that we are accustomed to from functions like glm() and lmer() to fit models. We can also see the Stan code that is generated from the model, so we can learn how Stan works inside.
Using survey design weights in Bayesian regression models
This is an example of trying to use survey design elements (weights and stratum information) in a Bayesian model using brm() from the brms package, which calls Stan.
DEM 7903 Week 6: Longitudinal Models for Change using Generalized Estimating Equations
In this example, we will use Generalized Estimating Equations to do some longitudinal modeling of data from the ECLS-K 2011. Specifically, we will model changes in a student’s standardized math score as a continuous outcome and self-rated health as a binomial outcome, from fall kindergarten to spring, 1st grade. A presentation discussing GEEs can be found on RPubs under my page.
DEM 7903 Week 6: Longitudinal models for change using GEE's
This presentation discusses Generalized Estimating Equations for modeling longitudinal data. There is an accompanying empirical example on RPubs as well.
DEM 7903 Week 5: Longitudinal Models for Change Document
In this example, we will use hierarchical models to do some longitudinal modeling of data from the ECLS-K 2011. Specifically, we will model changes in a student’s standardized math score from fall kindergarten to spring, 1st grade. This follows the presentation in Singer and Willett (2003) Chapters 3-6
DEM 7903 Week 5: Basic Hierarchical Models - Cross level interactions & Contextual Effects
In this example, I will show how to fit a multi-level model that includes a predictor at the macro level. I will also consider the cross-level interaction effect, where we are interested in contextualizing the effect of an individual level predictor within the context of the macro level predictor. The data we use this time merges data from the 2011 CDC Behavioral Risk Factor Surveillance System (BRFSS) SMART county data and the 2010 American Community Survey 5-year estimates at the county level. Our outcome of interest is a person’s obesity status, measured using the BRFSS’s BMI variable, with obesity defined as a BMI greater than 30.
DEM 7903 Week 4: Basic Hierarchical Models - GLMMs
In this example, I will use the ECLS-K 2011 data to illustrate how to fit Generalized Linear Mixed models to outcomes that are not continuous. I will illustrate two different methods of estimation: Penalized Quasi Likelihood using the glmmPQL() function in the MASS library and the Laplace approximation using the glmer() function in the lme4 library.
DEM 7903 Hierarchical Models 2 - Random Intercepts and Slopes
In this example, I will fit a hierarchical linear model to the ECLS-K 2011 data. The outcome of interest here is the child's kindergarten math score. I will illustrate how to fit the basic multilevel model with random intercepts, a model with random intercepts *and* random slopes, and compare models with a likelihood ratio test. I also show how to extract the variance components of the models and form the intra class correlation coefficient. Lastly, I will visualize the implied regression lines for the random intercepts and slopes model, highlighting the variation in the effect of household poverty status on children's math scores between schools in the data.
Bayesian Multi-level Regression Models Using INLA
This example uses data from the 2011 BRFSS and the American Community Survey to fit Bayesian multi-level regression models using the INLA approach
Spatial Modeling with R-INLA
This example shows how to fit some basic Bayesian spatial regression models using R-INLA. I apply the method to US infant mortality data at the county level.
Spatial Regimes and Geographically Weighted Regression in R
This is an example, with notes, on using Geographically weighted regression and other forms of analysis of spatial regimes.
Measuring residential segregation in R
This document illustrates how to calculate several commonly used segregation indices used in social science
GLM’s for Spatial Data
These notes review fitting GLMs to aggregate data. Binomial, Poisson and Negative Binomial models are shown, with a few others. I also cover how to implement Moran Eigenvector filtering in a GLM. All data are for mortality rates for the state of Texas from the CDC Wonder.
DEM 7263 Fall 2015 - Spatially Autoregressive Models 2
This lecture describes alternative spatially autoregressive model specifications, and the use of specification testing
DEM 7263 Fall 2015 - Spatially Autoregressive Models 1
These are notes for my Spatial Demography course. This lecture deals with the spatially autoregressive model. The model is reviewed and several applications are shown using real data for San Antonio, TX and US counties.
Lecture 1 Exploratory Spatial Data Analysis, August 26th
This lecture reviews the principles of exploratory spatial data analysis, highlighting the use of the Moran I statistic for assessing spatial autocorrelation using data from San Antonio, TX.
Bayesian Spatio-temporal analysis of mortality differentials in the US using the INLA approach
These are slides from a talk I gave on 4/24/15 in the statistics department at UTSA. http://business.utsa.edu/mss/mss_seminar_series.aspx
DEM 7223 Multistate Model Example
This example will illustrate how to fit a multistate hazard model using the multinomial logit model. The outcome for the example is “type of non-parental child care” and whether a family changes their particular type of childcare between waves 1 and 5 of the data. The data come from the ECLS-K.
DEM 7283 - Multi-level models 3 - Small Area Estimates
This example will illustrate a way to combine individual survey data with aggregate data on counties to produce a county level estimate of basically any health indicator measured using the BRFSS. The framework I use below takes observed individual level survey responses from the BRFSS and merges these to county level variables from the ACS. This allows me to estimate the overall regression model for county-level prevalence, controlling for higher level variables. Then, I can use this equation for prediction for counties where I have not observed survey respondents, but I have observed the county level characteristics. In this example, I estimate county level obesity rates and compare my estimates to those from the CDC.
Example 9 - Competing risks hazard models
This example uses data from the National Health Interview Survey (NHIS) linked mortality data obtained from the Minnesota Population Center’s IHIS program, which links the NHIS survey files from 1986 to 2004 to mortality data from the National Death Index (NDI). The death follow-up currently ends in 2006. Below, I code a competing risk outcome, using four different causes of death as competing events, and age at death as the outcome variable.
Example 8 Multilevel Models 2 - Cross level interactions and GLMM's
In this example, I introduce how to fit the multi-level model using the `lme4` [package](http://cran.r-project.org/web/packages/lme4/index.html). This example continues from the first example and considers the linear case of the model with random slopes and the use of a higher-level predictor. The example is extended to a binary outcome to illustrate the logistic Generalized Linear Mixed Model (GLMM).
Discrete time hazard model with group level frailty
This example will illustrate how to fit the discrete time hazard model with group-level frailty to continuous duration data (i.e. person-level data) and a discrete-time (longitudinal) data set. In this example, I will use the event of a child dying before age 5 in Haiti. The data for this example come from the Haitian [Demographic and Health Survey for 2012](http://dhsprogram.com/data/dataset/Haiti_Standard-DHS_2012.cfm?flag=0) birth recode file. This file contains information for all live births to women sampled in the survey. The longitudinal data example uses data from the [ECLS-K](http://nces.ed.gov/ecls/kinderdatainformation.asp). Specifically, we will examine the transition into poverty between kindergarten and 8th grade.
DEM 7283 Example 8 - Multilevel Models 1
In this example, I introduce how to fit the multi-level model using the lme4 package. This example considers the linear case of the model, where the outcome is assumed to be continuous, and the model error term is assumed to be Gaussian.
Event History Analysis - Example 7 Frailty in the Cox model
This example will illustrate how to fit the extended Cox Proportional hazards model with Gaussian frailty to continuous duration data (i.e. person-level data) and a discrete-time (longitudinal) data set. In this example, I will use the event of a child dying before age 5 in Haiti. The data for this example come from the Haitian Demographic and Health Survey for 2012 birth recode file. This file contains information for all live births to women sampled in the survey. The longitudinal data example uses data from the ECLS-K. Specifically, we will examine the transition into poverty between kindergarten and third grade.
Example 7 Principal components analysis
This example illustrates the use of the method of Principal Components to form an index of overall health using data from the 2011 CDC Behavioral Risk Factor Surveillance System (BRFSS) SMART county data.
Event History Analysis - Discrete time hazard model time specifications
This example will illustrate how to fit the discrete time hazard model to person-period data. Specifically, this example illustrates various parameterizations of time in the discrete time model. In this example, I will use the event of a child dying before age 5 in Haiti. The data for this example come from the Haitian Demographic and Health Survey for 2012 birth recode file. This file contains information for all live births to women sampled in the survey.
Example 6 Multiple Imputation & Missing Data
This example will illustrate typical aspects of dealing with missing data. Topics will include: Mean imputation, modal imputation for categorical data, and multiple imputation of complex patterns of missing data. For this example I am using 2011 CDC Behavioral Risk Factor Surveillance System (BRFSS) SMART county data
Example 6 Discrete Time Hazard Model Part 1
This example will illustrate how to fit the discrete time hazard model to continuous duration data (i.e. person-level data) and a discrete-time (longitudinal) data set. In this example, I will use the event of a child dying before age 5 in Haiti. The data for this example come from the Haitian Demographic and Health Survey for 2012 birth recode file. This file contains information for all live births to women sampled in the survey. The longitudinal data example uses data from the ECLS-K. Specifically, we will examine the transition into poverty between kindergarten and 8th grade.
DEM 7283 Example 5.2 Count data models for aggregate outcomes
This example continues the coverage of the use of count data models. Instead of using individual survey data, in this example, I use truly aggregate counts. The data consist of county-level counts of deaths and population totals for US counties between the years 1999-2010. These data come from the CDC Wonder Compressed Mortality File Public Use Data
Event History Analysis - Example 4 Cox Proportional Hazard Model Part 2
This example will illustrate how to fit the Cox Proportional hazards model to a discrete-time (longitudinal) data set and examine various model diagnostics to evaluate the overall model fit. The data example uses data from the ECLS-K. Specifically, we will examine the transition into poverty between kindergarten and third grade.
Example 5 Count Data Models
This example will cover the use of R functions for fitting count data models to complex survey data. For this example I am using 2011 CDC Behavioral Risk Factor Surveillance System (BRFSS) SMART county data.
Event History Analysis - Example 4 Cox Proportional Hazard Model Part 1
This example will illustrate how to fit the Cox Proportional hazards model to continuous duration data (i.e. person-level data) and a discrete-time (longitudinal) data set.
Example 4 Ordinal & Multinomial Logit models
This example fits ordinal and multinomial logit models to complex survey data from the BRFSS.
Event History Analysis - Example 3 Parametric Hazard Models
This example will illustrate how to fit parametric hazard models to continuous duration data (i.e. person-level data). In this example, I use the time between the first and second birth for women in Haiti. The data for this example come from the Haitian [Demographic and Health Survey for 2012](http://dhsprogram.com/data/dataset/Haiti_Standard-DHS_2012.cfm?flag=0) individual recode file. This file contains information for all women sampled in the survey. I also illustrate the use of these models for person-period data from the ECLS-K
DEM 7283 Logit and Probit Model Example
This example will cover the use of R functions for fitting binary logit and probit models to complex survey data. For this example I am using 2011 CDC Behavioral Risk Factor Surveillance System (BRFSS) SMART county data.
Event History Analysis - Example 2 Comparing Survival Times Between Groups
This example covers the 2 and k-sample tests for comparing survival times. It uses data from the Haitian DHS and the Early Childhood Longitudinal Study - Kindergarten cohort.
Example 1 - Estimating functions of survival time from survey data
This example will illustrate how to construct a basic survival function from individual-level data. The example will use as its outcome variable, the event of a child dying before age 1. The data for this example come from the Haitian children's recode file. This file contains information for all births in the last 5 years prior to the survey.
Example 2 - Analysis of complex survey data
This example shows how to conduct an analysis of complex survey data, using data from the 2011 Behavioral Risk Factor Surveillance System. It illustrates the differences between methods that assume random sampling and complex survey designs.
DEM 7283 Intro to R
This is a short introduction to using R for descriptive statistics and linear models.
Missing Data Imputation using a structured random effect model
This example shows how to impute county-level poverty rates using a spatially structured random effect model for Texas counties.
Bayesian Measurement Error Models
This example illustrates the use of Bayesian measurement error models. Specifically, the Berkson and Classical measurement error models are illustrated within a hierarchical modeling framework. Data from the Behavioral Risk Factor Surveillance System and Census Bureau's American Community Survey are used.
Bayesian Longitudinal Models
In this example, we will use Bayesian hierarchical models to do some longitudinal modeling of data from the ECLS-K using JAGS. Specifically, we will model changes in a student’s standardized math score from kindergarten to 8th grade.
Longitudinal Models in R - 1
In this example, we will use hierarchical models to do some longitudinal modeling of data from the ECLS-K using functions in the lme4 and nlme libraries. Specifically, we will model changes in a student’s standardized math score from kindergarten to 8th grade.
Bayesian Data Analysis 2 - Hierarchical Models
In this example, I use JAGS and rjags to fit hierarchical linear and logistic regression models with both random intercepts and random slopes. I then show how to use standard diagnostic tools to examine model convergence. The example uses BRFSS data from the state of Texas
Bayesian data analysis 1 - simple regression models
In this example, I use JAGS and rjags to fit simple linear and logistic regression models. I then show how to use standard diagnostic tools to examine model convergence. The example uses BRFSS data from the state of Texas
Other topics in Hierarchical Modeling
In this example I use the Behavioral Risk Factor Surveillance System data from 2011. I illustrate fitting binomial, Poisson, negative binomial and ordered logit multilevel models.
Hierarchical models 2: Example of random slopes and intercepts
This example uses the ECLS-K data to examine models of random slopes and intercepts by school.
Introduction to Hierarchical Models
These slides give a short lecture on random slope/intercept models. The empirical example supplementing these slides can be found at: http://rpubs.com/corey_sparks/28060
Introduction to Hierarchical Models
This is the first lecture in my applied hierarchical modeling course
Using survey design weights in Linear Mixed Models
This follows the logic of Carle, 2009 for applying survey design weights to complex survey data.
Example of comparing the mean of multiple groups using the linear model
In this example, we will compare the means of more than two groups using the Analysis Of Variance model, or ANOVA. This will be done in just the same way as the two-group comparison, by using the linear model framework. This example uses data from the Population Reference Bureau's world population data sheet for 2013.
Example of comparing the mean of two groups using the linear model
This example uses data from the World Population Data sheet from the Population Reference Bureau. Instead of the regular t-test for the difference between two means, I use the linear model.
Example of doing some basic recoding of variables
This example shows a basic recoding of variables using ifelse(), then using the new variables to visualize differences in central tendency. The data are from the Population Reference Bureau's World Population Data Sheet for 2013: http://www.prb.org/Publications/Datasheets/2013/2013-world-population-data-sheet.aspx