Recently Published
My Final Data Science Capstone Presentation
The goal of this exercise is to create a product to highlight the prediction algorithm that you have built and to provide an interface that can be accessed by others. For this project you must submit:
A Shiny app that takes a phrase (multiple words) as input in a text box and outputs a prediction of the next word.
A slide deck of no more than 5 slides, created with RStudio Presenter, pitching your algorithm and app as if you were presenting to your boss or an investor.
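As a rough illustration of what "outputs a prediction of the next word" can mean, here is a minimal sketch of a bigram frequency predictor. It is not the algorithm used in the capstone app, which this listing does not describe. The function names (`build_bigram_model`, `predict_next_word`) and the toy corpus are hypothetical, and Python stands in for the R code of the actual project.

```python
from collections import Counter, defaultdict


def build_bigram_model(sentences):
    """Count, for each word, which words follow it in the training text."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            followers[current_word][next_word] += 1
    return followers


def predict_next_word(model, phrase):
    """Return the most frequent follower of the phrase's last word, or None."""
    words = phrase.lower().split()
    if not words or words[-1] not in model:
        return None
    return model[words[-1]].most_common(1)[0][0]


# Hypothetical toy corpus, for illustration only (not the SwiftKey data).
toy_corpus = [
    "thank you very much",
    "thank you for the help",
    "see you very soon",
]
model = build_bigram_model(toy_corpus)
print(predict_next_word(model, "I want to thank"))
```

A predictor like this only looks at the last word of the phrase. Longer n-grams with a fallback to shorter ones are a common refinement, but whether the app uses them is not stated here.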
Santana's Data Science Capstone Project Week 2
The Capstone Project of the Johns Hopkins University Coursera Data Science Specialization is on Natural Language Processing (NLP). The goal of this project is simply to show that I’ve gotten used to working with the data and that I’m on track to create my prediction algorithm by the end of the project. I downloaded the data for the project from the Coursera site link (https://www.coursera.org/learn/data-science-project/supplement/idhGA/syllabus), which led me to the SwiftKey zip file (https://d396qusza40orc.cloudfront.net/dsscapstone/dataset/Coursera-SwiftKey.zip). I then manually accessed the data from my hard drive.
My Final Developing Data Products Project
This is my pitch for my final developing data products project.
My Week 3 Assignment--Developing Data Products
This is my week 3 assignment for the Developing Data Products class. I created a short slide show. Slide #3 shows an interactive plotly 3D plot of the iris data.
My First Interactive Map
This is an interactive map of the US territory of Puerto Rico created in RStudio.
Santana’s Practical Machine Learning Final Project
This report is the final project for the Practical Machine Learning class offered by Johns Hopkins University via Coursera.
The objective of the project is to predict the manner in which 6 participants performed a variety of different exercises using an existing data set. The exercise type is captured in the “classe” variable in the training set. To complete the class requirements, the machine learning algorithm described here is ultimately applied to the 20 test cases in the test data, and the predictions are submitted in the appropriate format to the Course Project Prediction Quiz for automated grading.
Santana’s Regression Models Final Course Project
The following analysis of the built-in mtcars data set in R was created as the final project for the Coursera Regression Models Course. A description of the mtcars data set can be found at the following link: https://www.rdocumentation.org/packages/datasets/versions/3.6.2/topics/mtcars
The objective of the project is to answer the following two questions using the built-in dataset:
“Is an automatic or manual transmission better for MPG?”
“Quantify the MPG difference between automatic and manual transmissions”
This report includes all the exploratory data analysis conducted in order to reach a final conclusion on both of these questions.
M Santana's Statistical Inference Final Course Project
This is my final project for the statistical inference course. The project has two parts:
a simulation exercise
basic inferential data analysis
Final Project Coursera Reproducibility Class
The National Oceanic and Atmospheric Administration (NOAA) keeps track of the amount of damage, both physical and financial, caused by natural events. Events are categorized into 16 discrete categories (Tornado, SNOW/ICE, STORMS, etc.). The physical damage is tracked for properties and crops separately. This report looks at the overall damage caused by these events by adding up the number of fatalities and injuries per category. In the report, I also take a look at the top 20 events recorded nationally. For financial harm, I computed the top 20 events that caused property damage and crop damage. This summary includes all the data processing and computations needed to reproduce the analysis. The results section includes the key data tables and figures all in one place.
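The aggregation described above (adding fatalities and injuries per category, then ranking) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the report: the record layout, the function name `top_harm_by_category`, and the sample numbers are all hypothetical, and Python stands in for the R code of the original analysis.

```python
from collections import defaultdict


def top_harm_by_category(records, top_n=20):
    """Sum fatalities + injuries per event category and return the top_n totals.

    Each record is assumed to be a (category, fatalities, injuries) tuple.
    """
    totals = defaultdict(int)
    for category, fatalities, injuries in records:
        totals[category] += fatalities + injuries
    # Rank categories from most to least harmful and keep the first top_n.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:top_n]


# Hypothetical records, for illustration only (not actual NOAA figures).
sample_records = [
    ("Tornado", 2, 15),
    ("STORMS", 0, 3),
    ("Tornado", 1, 4),
    ("SNOW/ICE", 1, 0),
]
for category, total in top_harm_by_category(sample_records, top_n=2):
    print(category, total)
```

The same pattern would apply to the financial side by summing property and crop damage amounts instead of casualty counts.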