RPubs

by RStudio

christianthieme

Christian Thieme

Recently Published

Final Project - DATA622

Use of Decision Tree, Random Forest, KNN, XGBoost, and SVM for modeling which data scientists are looking to leave their current jobs

over 3 years ago

Homework 4 - KNN, PCA, XGBOOST, and SVM

over 3 years ago

Homework 3 - Modeling with LDA, KNN, Decision Trees and Random Forest

over 3 years ago

Project 2 - Non-linear Regression Models

almost 4 years ago

Nonlinear Regression Models, Regression Trees and Rules-Based Models

almost 4 years ago

Project 1 - Time Series Analysis and Forecasting Using ARIMA and ETS

almost 4 years ago

Time Series Analysis Stock Data

almost 4 years ago

Data Pre-processing & Exponential Smoothing

almost 4 years ago

Time Series and Decomposition

Group 4 - Subhalaxmi Rout, Kenan Sooklall, Devin Teran, Christian Thieme, Leo Yi

almost 4 years ago

Influenza and Pneumonia Mortality during the Global COVID-19 Pandemic and the Impact of Local Government Restrictions

Final project can be viewed here: https://github.com/christianthieme/Business-Analytics-and-Data-Mining-with-Regression/blob/main/Final%20Project%20-%20COVID-19%20Effect%20on%20Pneumonia%20%26%20Influenza/final_project_report.pdf

almost 4 years ago

CODE APPENDIX - COVID-19 Effect on Pneumonia & Influenza

Code Appendix for 621 final project

almost 4 years ago

Identifying Outliers in Linear Regression - Cook's Distance

about 4 years ago

Poisson, quasi-Poisson, Negative Binomial, and Zero Inflated Regression

Our objective is to build a count regression model to predict the number of cases of wine that will be sold given certain properties of the wine.

about 4 years ago

Car Insurance Analysis and Logistic Regression

about 4 years ago

Logistic Regression - Identifying High Risk Neighborhoods

about 4 years ago

Understanding Common Classification Metrics - Titanic Style

Discussion on Confusion Matrices, Accuracy, Classification Error Rate, Precision, Sensitivity, and Specificity

about 4 years ago

Knowledge and Visual Analytics Final Project Proposal

about 4 years ago

Understanding Classification Metrics

Review of accuracy, precision, sensitivity, specificity, F1 score, and ROC curve

about 4 years ago

Understanding Linear Regression Output in R

about 4 years ago

Moneyball Multiple Regression Analysis

Full EDA, imputation of nulls, model building, and model diagnostics

about 4 years ago

Inference, Prediction, and Explanation with Linear Regression

about 4 years ago

Ink-Data Ratio ggplot2 EDA

Using principles from Edward Tufte, use ggplot2 to come up with visuals for the EDA questions provided.

over 4 years ago

Inc 5,000 Fastest Growing Companies Web Scrape

The website Inc. 5000 has a list of the top 5,000 fastest growing private companies from 2020. I will use xml2 and rvest to scrape the data from the website.

over 4 years ago

Computational Mathematics Final Exam

DATA605 Final Exam

over 4 years ago

Predicting House Prices: Regression Techniques

Final Project DATA605

over 4 years ago

Multivariable Functions

over 4 years ago

Taylor Series Approximations

over 4 years ago

Episode III - NBA Player Salary with Multiple Regression Using Transformations

over 4 years ago

Univariate and Multivariate Calculus

over 4 years ago

Part II - NBA Player Salary Analysis with Multiple Regression

Can We Predict an NBA Player’s Salary Using His Statistics from the Prior Year?

over 4 years ago

WHO dataset Regression Analysis

Forecasting Life Expectancy

over 4 years ago

NBA Player Salary Analysis with Linear Regression

over 4 years ago

Simple Linear Regression

over 4 years ago

Markov Chains and Random Walks

Prisoner's dilemma questions

over 4 years ago

Week 9 Discussion - Central Limit Theorem

over 4 years ago

Central Limit Theorem & Generating Functions

over 4 years ago

Sum of Random Variables and Law of Large Numbers

over 4 years ago

Distributions, Expected Value, and Standard Deviation

over 4 years ago

Combinatorics, Bayes' Theorem, and Conditional Probability

over 4 years ago

Week 5 - Discussion Discrete Probabilities

over 4 years ago

DATA605 - Week 5 - Probability Distributions

over 4 years ago

DATA 605 Week 4 - Singular Value Decomposition & Matrix Inverses

over 4 years ago

DATA605 - Week 3: Eigenvalues and Eigenvectors

over 4 years ago

DATA605 - Week 2 - Transpose Proof, Matrix Decomposition function

over 4 years ago

DATA605 Week 1 Assignment - Vectors, Matrices, and Systems of Equations

over 4 years ago

DATA606 - Final Project - What factors are most predictive of stress in college students?

about 5 years ago

Project 4: Document Classification - Using Machine Learning to Build a SPAM Predictor

The purpose of this project is to build a classification model that can accurately distinguish spam email messages from ham email messages. We will do this by using pre-classified email messages to build a training set and then build a predictive model to classify unseen email messages as either spam or ham.

about 5 years ago

Thieme-Proposal DATA606

DATA606 research proposal

about 5 years ago

Recommender Systems Analysis - Udemy’s Recommender Engine

about 5 years ago

Tidyverse Extend Assignment - Lubridate

Extension of Ken Popkin's Lubridate Tidyverse Create Assignment

about 5 years ago

Week 10 Assignment DATA607 - Sentiment Analysis

The purpose of this project is two-fold: First, to take a deep dive into the mechanics and application of Sentiment Analysis by following an example provided by Julia Silge and David Robinson from their book “Text Mining with R - A Tidy Approach”. Second, to choose another corpus and incorporate another lexicon, not used in the example below, to perform sentiment analysis.

about 5 years ago

Week 9 Assignment - Working with Web APIs

For this project, I will work with The New York Times website API.

about 5 years ago

Using purrr::map() Instead of For Loops in R

about 5 years ago

Week 7 Assignment - Working with HTML, XML, and JSON in R

The purpose of this project is to demonstrate knowledge of HTML, XML, and JSON, as well as how to parse and extract information from each.

about 5 years ago

Project 2 - Data Transformation

The purpose of this project is to demonstrate the ability to transform data from various wide formats into a more digestible format for analysis. As part of the project, I will also clean/tidy the data and perform analysis. Below you will see three different data sets that were provided by fellow classmates. In addition to providing the data set, each classmate was asked to suggest analysis that could be completed using the data set. I will show the loading, tidying, and analysis of each data set below.

about 5 years ago

Week 5 Assignment DATA607 - Tidying and Transforming Data with tidyr

The purpose of this assignment is to: 1. Demonstrate how to transform data between wide and long formats with tidyr 2. Demonstrate how to tidy messy/untidy data using tidyr - single entries on multiple lines and missing data 3. Perform data analysis using ggplot

about 5 years ago

Week 4 Project 1- Chess Tournament Data - Regular Expressions

In this project we will take a raw text file containing the results of a chess tournament, extract key information from the file, and perform some calculations. What makes this project particularly challenging is that each entry in the file (a single chess player) has data points spanning two rows. Our task will be to find a way to extract the information that we need so that we can create a CSV file where all of the data that we want for an entry, including data we will calculate, is on one row.

about 5 years ago

Week 3 Assignment DATA607 - R Character Manipulation

This assignment is geared toward jumping into character extraction/manipulation with R using Regular Expressions. The examples show how to use regex in a variety of ways, such as identifying rows from a dataframe containing certain words, extracting key data from messy datasets, and using capture groups, lookbacks, and more to solve tricky word-extraction scenarios.

about 5 years ago
