Investigating the incidence of missing persons in the United States. The preliminary analysis looks at missing-person rates by state, the dominant missing sex by state, and the odds of being missing by sex and state. We will conduct an ecological analysis to determine whether females and males go missing at different rates, and will possibly do the same to compare whites and minorities. For the non-ecological analysis, we will evaluate the number of days a person has been missing, by sex, using complete pooling, no pooling, and partial pooling methods.
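To make the three pooling approaches concrete, here is a minimal Python sketch (the analysis itself is planned in R). The day counts are made-up placeholders, not missing-persons data, and the shrinkage constant `k` is a hypothetical stand-in for the ratio of within-group to between-group variance that a fitted multilevel model would estimate.

```python
from statistics import mean

# Hypothetical days-missing values by sex; placeholders, not real data.
days_missing = {
    "female": [3, 10, 45, 7],
    "male": [1, 200, 30, 14, 60, 5],
}

# Complete pooling: ignore sex and use one grand mean for everyone.
all_days = [d for values in days_missing.values() for d in values]
complete_pooling = mean(all_days)

# No pooling: estimate each group's mean from that group alone.
no_pooling = {sex: mean(values) for sex, values in days_missing.items()}

def partial_pooling(values, grand_mean, k):
    """Shrink a group mean toward the grand mean.

    k is an assumed constant standing in for (within-group variance /
    between-group variance); larger k means more shrinkage.
    """
    n = len(values)
    weight = n / (n + k)
    return weight * mean(values) + (1 - weight) * grand_mean

# k = 5.0 is an arbitrary illustrative choice.
partial = {
    sex: partial_pooling(values, complete_pooling, k=5.0)
    for sex, values in days_missing.items()
}
```

Each partially pooled estimate falls between the group's own mean and the grand mean, and smaller groups are pulled further toward the grand mean.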
Using logistic and multinomial regression, I will evaluate the impact of a film's production budget on IMDb movie ratings, restricted to drama and comedy films.
Using GLM analysis, we will determine whether boys or girls are more likely to perform above average on the New York State Math and ELA tests.
Using the sample "turnout" dataset from R, we will identify whether income dispersion increases or decreases as a function of age and/or education level. The assessment uses a maximum likelihood estimation function.
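One way such a likelihood could be set up is sketched below in Python (the analysis itself is done in R). It assumes a normal model in which both the mean and the log standard deviation of income are linear in age and education. That model form, the parameter ordering, and the variable names `income`, `age`, and `educate` are assumptions for illustration, not a description of the actual analysis.

```python
import math

def neg_log_likelihood(params, income, age, educate):
    """Negative log-likelihood of a heteroskedastic normal model.

    Assumed form (illustrative only):
      mean_i      = b0 + b1 * age_i + b2 * educate_i
      log(sd_i)   = g0 + g1 * age_i + g2 * educate_i
    A positive g1 or g2 would indicate that income dispersion grows
    with age or education, respectively.
    """
    b0, b1, b2, g0, g1, g2 = params
    total = 0.0
    for y, a, e in zip(income, age, educate):
        mu = b0 + b1 * a + b2 * e
        log_sd = g0 + g1 * a + g2 * e
        variance = math.exp(2.0 * log_sd)
        total += 0.5 * math.log(2.0 * math.pi * variance)
        total += (y - mu) ** 2 / (2.0 * variance)
    return total
```

An optimizer would minimize this function over `params`; the signs of the fitted `g1` and `g2` then answer the dispersion question.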
What percentage of each county's total arrests are drug-related? A simple visualization using 2012 US crime data from the Uniform Crime Reporting Program, obtained via Social Explorer.
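The underlying calculation is a per-county share, sketched here in Python. The county names and counts are hypothetical placeholders, not values from the 2012 data, and returning `None` for counties with zero arrests is one assumed way of handling that case.

```python
# Hypothetical arrest counts; not taken from the Uniform Crime Reporting data.
arrests = {
    "County A": {"total": 1200, "drug": 180},
    "County B": {"total": 450, "drug": 90},
    "County C": {"total": 0, "drug": 0},
}

def drug_arrest_share(total, drug):
    """Return drug arrests as a percentage of total arrests.

    Returns None when a county reports no arrests, since the share
    is undefined in that case.
    """
    if total == 0:
        return None
    return 100.0 * drug / total

shares = {
    county: drug_arrest_share(counts["total"], counts["drug"])
    for county, counts in arrests.items()
}
```

Counties with an undefined share can then be dropped or shown as missing in the visualization.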
Using Zelig and a logit model, we will evaluate whether an increase in price also increased the likelihood that daily revenue exceeded the 3-year average.
Using Zelig and a logistic regression (logit) model, we estimate the probability of surviving the Titanic based on sex, age, fare price, and cabin class.
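For reference, a logit model turns a linear combination of predictors into a probability through the inverse-logit function. The Python sketch below shows only that step. The coefficients are invented placeholders, not estimates from the Titanic data or output from Zelig, and the predictor coding (`male` as 0/1, `pclass` as 1 to 3) is assumed.

```python
import math

def inverse_logit(x):
    """Map a value on the log-odds scale to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Placeholder coefficients for illustration only; NOT fitted values.
coef = {"intercept": 1.0, "male": -2.5, "age": -0.03, "fare": 0.002, "pclass": -1.0}

def survival_probability(male, age, fare, pclass):
    """Predicted survival probability under the placeholder coefficients."""
    log_odds = (
        coef["intercept"]
        + coef["male"] * male
        + coef["age"] * age
        + coef["fare"] * fare
        + coef["pclass"] * pclass
    )
    return inverse_logit(log_odds)

# Compare two hypothetical passengers who differ only by sex.
p_female = survival_probability(male=0, age=30, fare=50.0, pclass=2)
p_male = survival_probability(male=1, age=30, fare=50.0, pclass=2)
```

With a fitted model, the same function applied to the estimated coefficients gives the predicted probability for any passenger profile.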
This is a draft of potential research questions, to be refined as the questions are further considered and developed.
Notes as I follow along with "R for Data Science" from O'Reilly, chapter 3, on basic visualizations using ggplot2.