Recently Published
Mathematical Exploration of Neural Networks
This is a mathematical study of neural networks, with model examples that illustrate and compare the key ideas of the topic using the nnet and mxnet packages with the caret package. Multi-core parallel processing is implemented with the doParallel package. This was part of a predictive analytics course in my master's program.
Random Forest Classifier for Spam Detection
Discussion of decision trees and their mathematical underpinnings, with a coded model and parameter tuning using grid search. This was part of an assignment for a predictive analytics course in my master's degree program.
Spam Detection with Gradient Boosting using Grid Search
Using the spam7 dataset from the DAAG package, this project explores the mathematics of the gradient boosting regression and classification algorithm using the caret package, with parameter tuning using grid search. This was part of an assignment for a predictive analytics course.
Exploration of Support Vector Machines
Exploration of the mathematical underpinnings of margin classifiers, up to support vector machines, and the use of R to build the models. This was part of an assignment for a predictive analytics course.
Evaluation of Black-Scholes and Cox-Ross-Rubinstein Option Pricing Models
Evaluation of the mathematical underpinnings of the Black-Scholes and Cox-Ross-Rubinstein models using input-based models and web-scraped options pricing data. This was part of an assignment for a financial engineering course in my master's program.
Exploration of CART, Naive Bayes Classifier, Random Forests, Linear Discriminant Analysis with Cross Validation
Exploratory data analysis of a mushroom data set and evaluation of several models to find the best predictor of edibility. Models used include classification and regression trees, a naive Bayes classifier, random forests, and linear discriminant analysis. Cross-validation was performed to optimize performance.
Parametric and Non-Parametric Probability Model Fitting
This is a mathematical exploration of the process of fitting both parametric and non-parametric probability distributions to a data set. What is the "right" distribution and how do we know? This project was part of a loss model course in my master's degree program.
Using the Auto Data Set to Explore LDA, QDA, LR, and KNN
This was part of an assignment from a predictive analytics course that details the mathematics underlying Linear Discriminant Analysis, Quadratic Discriminant Analysis, Logistic Regression, and K-Nearest Neighbors approaches to modeling data.
Bayesian Analysis of Comorbidity Impact on Korean Covid-19 Outcomes
This uses a synthetic data set to create age- and comorbidity-oriented patient profiles, which are combined with reported patient hospitalization data in a Bayesian analysis framework to build a generalized linear model. The mathematics is built out and the process explained in detail. The project includes the use of RShiny for illustrative purposes. This was a project for my Bayesian statistics course as part of my master's program.
Non-Linear Modeling Approaches - Polynomial Regression, Step Functions, Splines, and GAMs
Analysis of non-linear modeling approaches following Chapter 7 of An Introduction to Statistical Learning to flesh out the lectures from a predictive analytics course.
LDA and QDA Models with PCA
We discuss and apply principal component analysis to linear discriminant analysis and quadratic discriminant analysis models using the MNIST data set and examine their effectiveness on a holdout test set.
Analysis of LR, LDA, QDA, GAM models with K-CV Validation
This is an assignment from a predictive analytics course that followed the book An Introduction to Statistical Learning with Applications in R. It introduces K-fold cross-validation and applies it to several models on the same data set.