randyhowk

Randy Howk

Recently Published

DATA 624 Project 1: Forecasting Project - Parts A, B, and Bonus C
This report covers: Part A: daily ATM cash forecasting for May 2010. Part B: monthly residential power forecasting for 2014. Bonus Part C: hourly waterflow aggregation, stationarity assessment, and one-week-ahead forecasting. Across all three parts, the modeling work depended on fixing data-format and structure problems before any forecasts could be trusted. The ATM file required Excel date conversion and removal of trailing non-series rows, the power file required interpolation of a missing monthly usage value, and the waterflow files required conversion from Excel serial timestamps into regular hourly series. Those cleanup steps were not cosmetic. Each one addressed a specific failure mode that would otherwise distort the time index, break the tsibble structure, or cause the forecasting functions to evaluate the wrong objects.
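The Excel serial timestamp conversion mentioned above can be sketched as follows. This is a minimal illustration, not the report's own code: the function name `excel_serial_to_datetime` is hypothetical, and it assumes the files use Excel's default 1900 date system, where serial values count days from 1899-12-30 and the fractional part encodes the time of day.

```python
from datetime import datetime, timedelta

# Assumption: Excel's default 1900 date system. Day 0 is treated as 1899-12-30,
# which gives correct results for serials from 1900-03-01 onward (Excel counts
# a nonexistent 1900-02-29, so earlier serials are off by one day).
EXCEL_EPOCH = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial: float) -> datetime:
    """Convert an Excel serial date (days, with fractional time) to a datetime.

    Rounds to the nearest second so floating-point noise in the fractional
    part does not produce timestamps such as 11:59:59.999.
    """
    total_seconds = round(serial * 86400)
    return EXCEL_EPOCH + timedelta(seconds=total_seconds)


if __name__ == "__main__":
    # A whole-number serial is midnight on that day.
    print(excel_serial_to_datetime(40299))
    # A fractional serial carries the time of day (0.5 is noon).
    print(excel_serial_to_datetime(40299.5))
```

Rounding to whole seconds before building the timestamp matters for hourly data, because a regular hourly index needs timestamps that land exactly on the hour.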
CUNY DATA 624 Homework 6
FPP3 Chapter 9 ARIMA Exercises
FPP3 Chapter 8 Exercises
— Complete Worked R Markdown
Data Pre-processing
Homework for DATA 624: problems 3.1, 3.2, and 3.3 from Kuhn and Johnson's Applied Predictive Modeling.
Homework 3: FPP3 Toolbox Exercises
Homework assignment for FPP3 chapter 5
Data 624 Homework 2
Chapter 3 homework
FPP3 Chapter 2 Graphics Exercise
Exploring tsibble, facet_grid, ACF, etc.
Knowledge Graph Hate Incident Classification
This analysis evaluates a Knowledge Graph-powered machine learning system for classifying hate incident terminology against human expert classifications.
Bias in Nobel Prize Awards: Gender, Racial, and Institutional Analysis
This analysis examines systematic biases in Nobel Prize awards across gender, race, ethnicity, and geography using data from the Nobel Prize API. Our investigation reveals: Gender Bias: Only a small percentage of science Nobel laureates are women, with statistical evidence suggesting bias contributes to this gap. Racial Bias: There are no Black scientists who have won science Nobel Prizes in the historical record examined here. The Rosalind Franklin Case: A prominent example of uncredited contributions in the discovery of DNA’s double helix.
Assignment10a-Sentiment Analysis with Tidy Data
This document demonstrates sentiment analysis using tidy data principles. The primary example code is based on Chapter 2 of "Text Mining with R: A Tidy Approach" by Julia Silge and David Robinson.
School Shootings Database Project
Our project will analyze trends and contributing factors of school shootings in the United States by modeling the data in a normalized PostgreSQL database. We’ll explore how variables like location, school type, incident characteristics, and demographics relate to shooting frequency and severity.
Three books I like
A comparison of data formats using books I like
Wrexham AFC: A Three-Season Journey Through English Football
This analysis examines Wrexham AFC’s performance across three seasons spanning multiple divisions of English Football, from League Two (E3) through League One (E2) to the Championship (E1). The analysis focuses on identifying Wrexham’s closest competitors in each season and evaluating their competitive positioning.
Analysis of Trump Administration LGBTQ+ Policy Actions
An example of taking untidy data from https://glaad.org/trump-accountability-tracker/ and processing it into a tidy form that is easier to analyze.
Chess ELO Performance Analysis
This analysis calculates expected scores for chess players based on ELO ratings and compares them to actual tournament performance. The analysis uses the ELO expected score formula from Glickman’s “A Comprehensive Guide To Chess Ratings” paper.
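As a rough illustration of the comparison described above, the sketch below uses the standard logistic Elo expected-score formula, E = 1 / (1 + 10^((R_opp - R_player) / 400)). It is assumed here that this is the form the analysis takes from Glickman's guide. The function names and the sample ratings are hypothetical and are not taken from the analysis itself.

```python
def expected_score(player_rating: float, opponent_rating: float) -> float:
    """Standard Elo expected score for one game, a value between 0 and 1."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - player_rating) / 400.0))


def tournament_expected_score(player_rating: float, opponent_ratings: list[float]) -> float:
    """Sum of per-game expected scores across all opponents faced."""
    return sum(expected_score(player_rating, r) for r in opponent_ratings)


if __name__ == "__main__":
    # Hypothetical player and opponents, for illustration only.
    rating = 1600.0
    opponents = [1500.0, 1600.0, 1700.0]
    actual_points = 2.0  # hypothetical result: win = 1, draw = 0.5, loss = 0

    expected = tournament_expected_score(rating, opponents)
    # A positive difference means the player scored above expectation.
    print(f"expected={expected:.3f} actual={actual_points:.1f} "
          f"difference={actual_points - expected:+.3f}")
```

Summing per-game expectations gives a tournament-level benchmark that can be subtracted from the points a player actually earned.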
Trump Fakes and Fox: Political Content Detection
This analysis examines a large dataset of political news articles to develop automated classification methods for identifying Trump-related and Fox News-related content. The dataset spans multiple years of political coverage during a period of significant media activity and political events. This work builds upon research in automated news categorization and applies scalable text processing techniques to handle large-scale news corpora. The methodology demonstrates efficient processing of substantial datasets while maintaining classification accuracy for media bias detection and political sentiment tracking.