genmid13

Genesis Middleton

Recently Published

Prevalence of Diabetes
This project looks into the prevalence of diabetes across the world by identifying the countries with the lowest and highest prevalence on each continent, broken down by gender. (IN PROGRESS)
Analysis on NY's Market Research job market
This project looks at the current state of NY's job market for Market Researchers by web scraping 120 job postings from Glassdoor.com.
Music Analysis
Project 4 - Document Classification
This project uses spam and "ham" (not spam) emails from https://spamassassin.apache.org/old/publiccorpus/ to train a model to predict how future emails will be classified.
Deconstructing Spotify's Recommendation System
The task is to analyze an existing recommender system that you find interesting. Perform a Scenario Design analysis as described below. Consider whether it makes sense for your selected recommender system to perform scenario design twice, once for the organization (e.g. Amazon.com) and once for the organization's customers. Attempt to reverse engineer what you can about the site, from the site interface and any available information that you can find on the Internet or elsewhere. Include specific recommendations about how to improve the site's recommendation capabilities going forward. Create your report using an R Markdown file, and create a discussion thread with a link to the GitHub repo where your Markdown file notebook resides. You are not expected to need to write code for this discussion assignment.
TidyVerse Assignment
The dataset used for this assignment is the Kaggle dataset titled “Diabetes Dataset By Age Standardized Countries”. It includes information from 200 countries and territories and covers the period from 1980 to 2014. The data is presented for both males and females, and estimates are given for different age groups ranging from 20 to 79 years old. The data is standardized to account for differences in age distributions across countries and over time. This analysis, however, looks only at Latin American countries and how the estimated prevalence of diabetes differs across those countries by gender, using the dplyr and ggplot2 packages from the tidyverse.
Sentiment Analysis - Assignment #10
This assignment focuses on reproducing the sentiment analysis from Chapter 2 of Text Mining with R by Julia Silge and David Robinson. The project is extended further by also analyzing the novel Les Miserables.
Critics Movie Reviews - Assignment 9
The data analyzed in this assignment were critics' movie reviews, pulled from the New York Times API. The task involved accessing the chosen information from the API, reading in the JSON data, and transforming it into an R data frame.
Project #2
This project entailed retrieving three wide datasets previously posted by classmates on a discussion board and conducting the analysis each poster intended for their dataset.
Assignment #5
The assignment involves creating a dataset in a wide format (reflecting a given chart) and then conducting a comparison analysis.
Chess Tournament - Project 1
For this project, a text file of chess tournament results was provided, in which the information has some structure. The data was reshaped into a workable table that can serve as the basis for future analysis and was exported as a .CSV file.
Assignment #3
The code in this document builds familiarity with string manipulation: first, majors are retrieved from a fivethirtyeight dataset using specific keywords; then regular expressions are used to identify patterns within strings as a form of filtering.
DATA 607- Assignment #2
Connecting a SQL database to RStudio.
The Lasting Legacy of Redlining in NY - Assignment #1 - Genesis Middleton
This data analysis looks into segregation in the cities of upstate New York relative to their HOLC-graded residential areas.