Recently Published
DATA 622: Homework 4
SVM and oversampling to boost logistic regression performance
Homework 3
SVM model of absenteeism
DATA622 - Homework 2
Decision trees versus random forest models for modeling absenteeism and disciplinary action
DATA622: Homework 1
Modeling with kNN and decision trees on a randomly generated dataset
DATA609 - Homework 8, Exercise 1
Neural nets
New York State Maternity Information Statistics
Data visualization of maternity statistics relating to number of births, birth defects, and perinatal centers
DATA609 - Homework 7
Support vector machines using PlantGrowth and iris datasets
DATA609 - Homework 6
Data mining techniques, distance metrics, kNN, k-means
DATA609 - Homework 5
Logistic regression, PCA, SVD
DATA608 Final Proposal
NY State Maternity Statistics
DATA609 - Homework 4
Data fitting and regression
DATA609 - Homework 3
Optimization
DATA608 - Module 1
5000 Fastest Growing Businesses, Inc.com
Blog 4: Poisson versus Negative Binomial model
Using performance and countreg packages to visualize overdispersion and zero-inflation in a count model
DATA621 - Week 15 discussion
Nonparametric modeling of prostate data, Exercise 11.3 in ELMR (p253)
Blog 5: DATA 621 content and its future applications
A look back on DATA 621 curriculum
Blog 3: Binning numerical variables
A description of the benefits and risks of binning a numerical variable, with an example using the insurance dataset for logistic regression.
DATA621 - Week 13 discussion
Using Poisson and inverse Gaussian distributions to model the effect of poison and treatment on survival in rats. Exercise 7.2 in ELMR
DATA621 - Week 12 discussion
Exercise 5.4 in ELMR (p124)
Pneumo dataset for multinomial prediction of case severity
DATA621 - Week 11 discussion
Discussion Week 11 - Exercise 8.2 in LMR (p131)
Divorce rates and autocorrelation
DATA621 - Week 10 discussion
Exercise 8.1 in LMR: weighted regression using pipeline dataset
DATA621 - Week 9 discussion
Exercise 3.3 in ELMR, modeling dose-response to quinoline in Salmonella spp. using a Poisson GLM
DATA621 - Week 8 discussion
Exercise 8.2 in ELMR, Miss America regression
Blog 1: Missing values
Using Little's test to assess whether data are missing completely at random (MCAR) and whether missingness is independent of the values
Blog 2: Bimodal distributions
DATA 621 Blog
Discussion 16
APEX Calculus 12.4 Exercise 16
Discussion 15 - Taylor series
Solution to 8.8.14 in APEX Calculus
Discussion 14
Questions 17/25 in section 7.4 in APEX Calculus
Arc length of circle function by integration and Simpson's rule
Discussion 13: multivariate regression
Medication use in diabetic inpatients
Discussion 11: Mammalian sleep patterns
Checking assumptions of linear modelling using built-in R data set
DATA 605 - Discussion 7
Question 44, proof of binomial approximation of hypergeometric distribution
Document
Discussion 4 for DATA 605
DATA 605 - Discussion 3
Question C20
DATA607 - Final Project
Drug Reviews Sentiment Analysis and Safety Profiles
A look at reviews from drugs.com and openFDA API
Assignment - Recommender systems
Discussion submission for 4/19/20 assignment
DATA607 - Assignment Sentiment Analysis
April 5, 2020 assignment - sentiment analysis. Looked at sentiments expressed in Jane Austen's and Victor Hugo's works
DATA607 - Assignment Web APIs
I took a brief look at a NY Times Bestseller list web API.
DATA607 - Homework 3/15/20
Assignment – Working with XML and JSON in R
Please see my GitHub for xml.txt:
https://github.com/hillt5/DATA607_assignment_3_15_20
DATA607 - Project 2
A brief exploration and analysis of three data sets:
HCAHPS hospital survey data, August 2018 - March 2019
Under 5 mortality rates per 1,000 live births by country, 1950 - 2015
Drug use survey results for ages 12+, 2016 - 2018
Document
Final draft of Project 1