# njcooper137

## Recently Published

##### Eng Phys 2 Week 2
End of Ch 17, beginning of 18
##### Eng Phys 2 Week 1
Lecture Slides for Week 1
##### Data 643 Discussion 3
Stepping through a hypothetical scenario of someone getting fired via an employee evaluation machine learning algorithm.
##### Data Science In Context - Eigenvalues and Eigenvectors
Here I try to step through a couple of examples of context for eigenvalue problems in data science.
##### 621 Final Presentation Slides - Draft
Slides for analysis of the Final Project of DATA 621
##### Correlation: Pearson v. Spearman v. Kendall
An example I made for a DATA 621 discussion.
##### Regression Analysis of Crime Statistics
In this report, I analyze crime statistics for Boston in 1978 using logistic regression to classify neighborhoods as high crime or low crime. I find that the clearest indicator that a neighborhood is high crime is high air pollution, measured by the concentration of nitrogen oxides, which are produced by fossil fuel combustion.
##### DATA 621 Question
A write-up for a discussion question in DATA 621. The question asks how to explain to others the way outliers can influence regression.
##### DATA 606 Final Project
In this paper, I analyze salary and unemployment data pertaining to college majors. These data were obtained from fivethirtyeight.com's GitHub page. I find that STEM majors earn more and have lower unemployment than humanities majors, and that gender inequality in these two categories could play a role in the difference in pay.
##### DATA 607 Final Project
This is the final version of the analysis that Chunhui Zhu and I did of world electrical energy production.
##### DATA 607 Final Project Draft
This is a draft of Chunhui Zhu's and my Final Project where we examine the production of electricity and the flow of energy resources.
##### DATA 605 HW15
Our final assignment is about multivariate calculus.
##### DATA 607 Final Project Draft
Draft of the final project for DATA 607 where we analyze energy production and usage for the top 10 economies.
##### DATA 605 Week 15
This week is multivariate calculus.
##### Calculating pi
I use Euler's formula for pi to calculate pi and compare the result to R's built-in value.
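The entry does not say which of Euler's results it uses. As a rough sketch only, assuming it means the Basel series (the sum of 1/k^2 equals pi^2/6), the comparison could look like the following in Python; the function name `basel_pi` and the term counts are illustrative choices, not taken from the original document.

```python
import math


def basel_pi(n_terms: int) -> float:
    """Approximate pi from a partial sum of the Basel series.

    Assumption: "Euler's formula" refers to sum(1/k^2) = pi^2 / 6.
    """
    partial = sum(1.0 / (k * k) for k in range(1, n_terms + 1))
    return math.sqrt(6.0 * partial)


if __name__ == "__main__":
    # Compare each approximation against the library constant.
    for n in (10, 1_000, 100_000):
        approx = basel_pi(n)
        print(f"n={n:>7}  approx={approx:.8f}  error={math.pi - approx:.2e}")
```

The series converges slowly (the error shrinks roughly in proportion to 1/n), which is part of what makes the comparison with a built-in constant instructive.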
##### DATA 605 Week 14
This week's assignment covers series and sequences.
##### DATA 607 Database Migration
I migrate a MySQL database to a Neo4J database.
##### DATA 606 Lab8
This week's lab is about multiple regression.
##### DATA 605 Week 13 part 2
Checking another student's work by request.
##### DATA 606 HW8
This Assignment covers Logistic and Multiple Regression models.
##### DATA 605 Week 13
This week's discussion is solving a Surface Area integral using trigonometric substitution and substitution methods.
##### DATA 605 Week 13 HW
This week's homework is a primer in basic calculus.
##### Data 605 HW12
This week's assignment covers multiple linear regression models and transformations to make data fit a linear regression.
##### Kinematics Presentation
A sample lesson on the derivation and use of kinematic equations.
##### DATA 607 Final Project Proposal
Our proposal for the Final Project
##### DATA 605 Week 12 part 2
I explain a couple methods to make non-linear data usable for linear regression.
##### DATA 605 Week 12
This week I perform a multiple regression analysis on Human Resources data from Kaggle.com: https://www.kaggle.com/ludobenistant/hr-analytics/data to see whether the measured factors predict job satisfaction.
##### DATA 606 HW7
This assignment looks at the interpretation of linear models and at calculating parameters such as the slope from the correlation coefficient R and the standard deviations.
##### DATA 606 Lab7
We use linear regression to reproduce the analysis done in the movie Moneyball. I find that team Batting Average is the best traditional predictor of runs and On-Base%+Slugging is the best modern predictor of runs. All modern statistics outperformed traditional statistics in predicting runs.
##### DATA 605 HW11
This week we perform linear regression on the braking distance of a car vs. speed, and see that a low p-value alone does not mean the model is valid.
##### DATA 607 Discussion Week 11 - Draft
This week we are tasked with qualitatively reverse engineering a recommender system. Our group selected Grubhub. Essentially, we test what recommendations are made based on our input and conjecture which models Grubhub uses.
##### DATA 605 Week11
We begin linear regression by examining the relationship between video duration and views for TED talks.
##### DATA 607 Project 4
We were tasked with using R's text mining package, 'tm', and supervised learning techniques to classify emails as 'spam' or not. I was able to get >96% accuracy.
##### DATA 607 Context Presentation
This draft of my "Data Science in Context" presentation provides a basic formula for creating a word cloud from a data frame.
##### DATA 605 HW10
This assignment explores the Gambler's Ruin problem using two different strategies: first a constant bet, and then an increasing bet. Markov chains, the binomial distribution, and simulations are used.
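For readers unfamiliar with the problem, here is a minimal Python sketch of the constant-bet case only, comparing the standard closed-form success probability with a simulation. The function names, the starting stake, the target, and the win probability are hypothetical values chosen for illustration; they are not taken from the assignment, and the increasing-bet strategy is not covered.

```python
import random


def ruin_success_probability(start: int, target: int, p: float) -> float:
    """Closed-form chance of reaching `target` before 0 with unit bets."""
    if p == 0.5:
        return start / target
    ratio = (1.0 - p) / p
    return (1.0 - ratio ** start) / (1.0 - ratio ** target)


def simulate(start: int, target: int, p: float, trials: int, seed: int = 0) -> float:
    """Estimate the same probability by playing `trials` independent games."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        stake = start
        # Bet one unit per round until the gambler is broke or hits the target.
        while 0 < stake < target:
            stake += 1 if rng.random() < p else -1
        wins += stake == target
    return wins / trials


if __name__ == "__main__":
    # Hypothetical setup: start with 1 unit, aim for 8, win each bet with p = 0.4.
    print("closed form:", ruin_success_probability(1, 8, 0.4))
    print("simulation: ", simulate(1, 8, 0.4, trials=50_000))
```

With enough trials the simulated frequency should settle near the closed-form value, which is the kind of cross-check the entry describes.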
##### DATA 605 Week10
This week's discussion question involves the Gambler's Ruin problem.
##### DATA 607 HW9
The goal of this assignment is to extract data from the NYT's API in the form of a JSON file and to format it as an R data frame.
##### DATA 606 Lab6
This lab covers inference of proportions.
##### DATA 605 HW9
This week's assignment covers the CLT for independent random variables and Moment Generating Functions.
##### DATA 606 HW6
This chapter covers proportion tests, calculating confidence intervals and Chi-sq tests.
##### DATA 606 HW Chapter 5
This assignment covers hypothesis testing using confidence intervals, t-tests, and ANOVA.
##### DATA 606 Lab5
In this week's lab we calculate confidence intervals and perform hypothesis testing on data about pregnancies.
##### DATA 605 Week9
This week's discussion in DATA 605 tests the Central Limit Theorem for proportions.
##### DATA 607 Project 3 Presentation
Slides for DATA 607 presentation.
##### Data 606 Project Proposal
Minor correction made on the original.
##### Data 606 Project Proposal
This is my proposal for the final project for DATA 606. I will take an in-depth look at income and employment statistics for 173 college majors using data obtained from the fivethirtyeight.com GitHub page.
##### Project 3 Presentation Draft
This is a rough draft of the presentation slides for DATA 607 Project 3.
##### DATA 607 Project 3 Salaries
Extended Silverio's work to include Confidence Intervals, t-tests, and KS tests for salaries. I also adjusted salaries for Cost of Living Index.
##### DATA 605 HW8
This week's assignment covers convolution of discrete and continuous random variables.
##### DATA 605 Week8
In time for the World Series, I calculate the probabilities of at-bat outcomes given a probability distribution for 4 at-bats.
##### DATA 607 HW7
This week's assignment covers loading data from web-based formats (HTML, XML, and JSON) into R as data frames.
##### DATA 605 HW7
This week's assignment covers important probability densities and distributions, such as the Beta, Geometric, Exponential, Binomial, and Poisson.
##### DATA 606 HW4
This assignment covers calculating confidence intervals, p-values, and hypothesis testing.
##### DATA 606 Lab4b
This lab examines how to define confidence intervals. Please note that the data in this file will differ from the data I had in RStudio while writing, so the answers may not match the graphs and summary statistics.
##### DATA 605 Week7
This week's discussion covers common continuous probability densities and discrete distributions.
##### DATA 607 Project 2
The goal of this project is to take data from three different sources (in this case, two .csv files and one data set scraped from a web page) and use tidyr and dplyr to clean and reorganize the data for further analysis.
##### DATA 605 HW6
This assignment covers combinatorics and probability.
##### DATA 605 Week6
This week's discussion covers combinatorics and conditional probability.
##### DATA_606_Lab4a
This lab explores behaviors of sampling and populations needed to introduce the Central Limit Theorem and Confidence Intervals.
##### DATA 605 HW5
This assignment covers defining probability distributions and calculating probabilities from probability distributions.
##### DATA 607 HW5
We were tasked with tidying and analyzing a data set using R's tidyr and dplyr. I opted to use a SQL database for the starting data instead of a .csv file.
##### DATA 606 HW3
This homework assignment covers probability distributions, namely the Normal, Geometric and Binomial distributions.
##### DATA 605 Week5
This week's topic in DATA 605 is probability distributions. Here I use a basic simulation to examine a popular urban legend about a professor who tricks his students after they try to trick him.
##### DATA_606_Lab3
This Lab covers the properties of the Normal Distribution.
##### DATA 605 HW4
This week's homework covers the singular value decomposition (SVD) of a matrix and finding the inverse of a matrix from its cofactors.
##### DATA 607 Project 1
In this project we were tasked with taking a semi-structured .txt file and creating an R Markdown file that outputs a .csv that can be used to populate a SQL database.
##### DATA 605 Week4
This week's topic is Linear Transformations. In this discussion question I show that a transformation is linear.
##### DATA 607 Extra Credit Question
This was an optional question for the Week 3 HW.
##### DATA_605_HW3
This week's DATA 605 homework covers matrix ranks, eigenvalues, and eigenvectors.
##### DATA_607_HW3
This assignment covers using regular expressions to extract data from files.
##### DATA_606_HW2
This homework set covers probability and discrete random variables.
##### DATA_605_HW2
This homework covers matrix operations such as trace, transpose, matrix multiplication, and factorization.
##### DATA_606_Lab2
This lab analyzes Kobe Bryant's shooting performance during the 2009 NBA finals to test the "hot hands" hypothesis. We find that Kobe performed no better than a simulation in which the simulated shooter's hit percentage was set at Kobe's hit percentage.
##### Presentation Question for DATA 606
We have to present a practice problem from the text. I am presenting problem 1.23, which evaluates the methodology used in a survey.
Something I wanted to add to my discussion for DATA 605
##### N Cooper 605 Discussion
This is my week 2 discussion question for DATA 605.
##### DATA_606_Lab1
This lab demonstrates techniques for initial data summaries and visualizations, and shows how to subset data.
##### DATA_606_HW1
My solutions to the first homework set for CUNY DATA 606, Statistics and Probability for Data Analytics.
##### DATA_605_HW1
This is my first homework for DATA 605, Fundamentals of Computational Mathematics. This assignment mostly covers vector and matrix operations and solving systems of equations.
##### DATA_606_Lab0
This is a lab for CUNY DATA 606 to familiarize the student with the functionality of R and RStudio.
##### DATA_607_HW1
This is my submission for Homework 1 of DATA 607 in CUNY's MS in Data Science program. The objectives were to load the data on mushrooms from a website into an R data frame, then create a subset of that data frame with 3 or 4 columns from the original data. Finally, we were tasked with relabeling the data headers and categories into a more readable format. I also added a few visualizations.
##### Pittsburgh Bridges
This is the test lab for DATA 607 for the CUNY MS in Data Science program.
##### Crime Data
This is an analysis of arrest records.
##### Week1 R Bridge
This is the homework for week 1 of the MSDA Bridge program in R.