RPubs

by RStudio

JosieGallop

Josie Gallop

Recently Published

STA 490 Final Project

My final group project for STA 490. This project analyzes a survey data set on the college student experience, focusing on loyalty and satisfaction. We identify the most significant factors in a student's overall satisfaction and loyalty.

20 days ago

Survey Analysis: Principal Component Analysis

For this assignment, we will look at a survey data set on college students' experience and satisfaction with their time in college. We will perform an internal reliability analysis of the data along with principal component analysis.

about 1 month ago

Survey Analysis: Internal Reliability

In this project, I will look at a survey data set covering college students' experience and their overall satisfaction. I will perform some exploratory data analysis steps and assess the internal reliability of these two subsets of the survey data set.

about 1 month ago

STA 490 Sampling Presentation of Combined Bank Loan Data

A group presentation applying various sampling methods to a combined U.S. bank loan data set. We calculate the default rates for the population and for each random sampling method in order to determine which sampling method would be best for our data.

2 months ago

Sampling Design with the Combined Bank Loan Data

In this assignment, we will look at a bank loan data set. We will carry out sampling preparation steps to ready this data for further analysis, addressing concerns such as potential missing values and creating and redefining some of the variables in the original data set to make them more meaningful and relevant for future analysis. We will then conduct the random sampling process, taking a simple random sample, a systematic sample, a stratified sample, and a cluster sample.

3 months ago

EDA with the Combined Bank Loan Data Set

In this assignment, we will look at a bank loan data set and conduct various exploratory data analysis steps to address concerns such as missing values, and to create and redefine variables that add more meaning and relevance to our data set.

3 months ago

STA 321 Logistic Regression Project Presentation: Predicting a Patient's Odds of CHD

The finalized group presentation of my STA 321 logistic regression project. This project uses logistic regression to create models which predict a patient's odds of being at risk for developing CHD based upon various personal and medical factors. This presentation was created as a group, and the contributions of each group member are listed at the end.

3 months ago

STA 321 Logistic Regression Project Presentation: Predicting a Patient's Odds of CHD

3 months ago

STA 321 Logistic Regression Project: Predicting a Patient's Odds of CHD

An HTML presentation on a logistic regression project which uses binary predictive modeling to predict a patient's odds of being at risk for developing CHD over a 10-year period.

3 months ago

Practice Ninja Presentation

A practice ninja presentation using my logistic regression project from STA 321.

4 months ago

Kepler Exoplanet Search: Detecting and Confirming Planets Beyond our Solar System

In this project, I will use the Kepler Space Observatory data to analyze observations of detected and confirmed exoplanets along with potential candidate exoplanets. This project looks at both unsupervised learning models, such as PCA, and supervised learning models, such as linear regression and classification, to learn more about the factors that are significant in whether a sighting is confirmed as a true exoplanet or is merely a false positive.

4 months ago

Monthly Average Air Travel Passengers: Time Series with Exponential Smoothing

In this project, I looked at a time series recording the monthly average number of air travel passengers on all U.S. flights from 2003 to 2023. I used several different exponential smoothing methods to create various models and determine which one provided the best performance.

6 months ago

House Sale Prices from 2007 to 2019: Time Series Forecasting with Decomposing

In this project, I analyzed home sale prices from 2007 to 2019. I created a monthly time series of the average home sale price. I used classical and STL decomposition to observe the patterns and trends in this time series, and forecasting to estimate the average home sale prices for the next twelve months. I also looked at the ideal number of observations for a training data set made from this time series.

6 months ago

Quasi-Poisson Regression Model of the Cyclists on the Williamsburg Bridge

In this project, I created standard Poisson regression models on frequency counts and rates of the cyclists on the Williamsburg Bridge, along with a quasi-Poisson regression model to look at the dispersion.

7 months ago

Poisson Regression of the Counts and Rates of Cyclists on the Williamsburg Bridge

A Poisson regression model project of the counts and rates of cyclists on the Williamsburg Bridge.

7 months ago

Predicting a Patient's Odds of Being at Risk for Developing CHD: Binary Predictive Modeling

In this project, we will create several candidate multiple logistic regression models to predict the odds of an individual being at risk for developing coronary heart disease (CHD) over a 10-year period. We will use cross-validation to determine which candidate model has the greatest predictive power. We will also use ROC analysis to determine which candidate model has the greatest global goodness.

7 months ago

Predicting a Patient's Odds of Being at Risk for Developing CHD: Multiple Logistic Regression

This project utilizes multiple logistic regression to build a model which can be used to predict a patient's odds of being at risk for developing coronary heart disease (CHD) based upon various medical and personal risk factors.

7 months ago

Using Diastolic Blood Pressure to Predict a Patient’s Odds of Being at Risk for Developing CHD: Simple Logistic Regression

In this project, I created and analyzed a simple logistic regression model to predict the odds of a patient being at risk for developing coronary heart disease (CHD) over a 10-year period based on their diastolic blood pressure level.

8 months ago

Factors Affecting Forest Fires MLR Project Report

An analysis of the factors that influence the area of land affected by forest fires, using multiple regression models and bootstrap confidence intervals.

8 months ago

Factors Affecting Forest Fires: Multiple Linear Regression

A statistical analysis of the various factors affecting the area burned by a forest fire. This project tests several multiple regression models for this data to see which one provides the best utility for the prediction and estimation of the area affected by a forest fire.

8 months ago
