Priya_Shaji

Priya Shaji

Recently Published

Data 624 Homework 9
Homework 9
Data 624 Week 11
Homework 8
Data 624 Homework 7
Week 10 hw 7
Data 624 Project 1
Project 1
Data 624 Homework 6
Forecasting Principles and Practice
Data 624 Week 6 Homework 5
Data 624 Week 6 Homework 5
Week5_Exercises
Applied Predictive Modeling
NYC 311 complaint analysis
Analysis of NYC 311 complaints and Twitter sentiment analysis of NYC 311
Datahub
Generalized metadata search and discovery tool.
Data 621_Homework_1
Regression analysis
Data 612 | Final Project
Donor Matching Recommender System
Data 612 | Final Project Proposal
Donor Matching Recommender System: Project Proposal
Data 612 Assignment 4
Project 4
DATA 612 Project 4 | Accuracy and Beyond
Assignment 3
Project 2 | Content-Based and Collaborative Filtering
Evaluating and comparing different approaches (algorithms, normalization techniques, similarity methods, neighborhood sizes, etc.) and building a basic movie recommender system.
Research Discussion Assignment 1
Choose one commercial recommender and describe how you think it works (content-based, collaborative filtering, etc). Does the technique deliver a good experience or are the recommendations off-target?
DATA 612 Project 1 | Global Baseline Predictors and RMSE
In this first assignment, we’ll attempt to predict ratings with very little information. We’ll first look at just raw averages across all (training dataset) users. We’ll then account for “bias” by normalizing across users and across items.
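A minimal sketch of the idea described above, not the assignment's actual code: the ratings are made-up, and "bias" is assumed here to mean each user's and each item's mean rating minus the global mean. It compares the RMSE of the raw-average prediction with the RMSE of the bias-adjusted baseline.

```python
import math
from collections import defaultdict

# Hypothetical (user, item, rating) triples; not data from the assignment.
train = [("u1", "i1", 5.0), ("u1", "i2", 3.0), ("u2", "i1", 4.0),
         ("u2", "i3", 2.0), ("u3", "i2", 4.0)]
test = [("u3", "i1", 5.0), ("u1", "i3", 2.0)]


def mean(values):
    return sum(values) / len(values)


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(mean([(p - a) ** 2 for p, a in zip(predicted, actual)]))


# Raw average across all training ratings.
global_mean = mean([r for _, _, r in train])

# Group training ratings by user and by item.
by_user, by_item = defaultdict(list), defaultdict(list)
for user, item, rating in train:
    by_user[user].append(rating)
    by_item[item].append(rating)

# Assumed definition of bias: group mean minus global mean.
user_bias = {u: mean(rs) - global_mean for u, rs in by_user.items()}
item_bias = {i: mean(rs) - global_mean for i, rs in by_item.items()}


def baseline(user, item):
    """Global mean plus user and item bias (0 for unseen users or items)."""
    return global_mean + user_bias.get(user, 0.0) + item_bias.get(item, 0.0)


actual = [r for _, _, r in test]
raw_rmse = rmse([global_mean] * len(test), actual)
baseline_rmse = rmse([baseline(u, i) for u, i, _ in test], actual)
print(f"raw RMSE: {raw_rmse:.3f}, baseline RMSE: {baseline_rmse:.3f}")
```

On this toy data the bias-adjusted prediction has the lower RMSE; on real data the size of the improvement depends on how consistent user and item tendencies are.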
DATA 606 Final Project
Adolescent Pregnancy Analysis
DATA 607 Final Project
Medical Recommender System: establishing a medical recommender system that gives efficient and accurate recommendations based on diagnoses and symptoms.
Tidyverse Assignment Part 2
Extend an Existing Example. Using one of your classmate’s examples, extend his or her example with additional annotated code.
NoSql Migration
For this assignment, you should take information from a relational database and migrate it to a NoSQL database of your own choosing. For the relational database, you might use the flights database, the tb database, the "data skills" database your team created for Project 3, or another database of your own choosing or creation. For the NoSQL database, you may use MongoDB (which we introduced in week 7), Neo4j, or another NoSQL database of your choosing. Your migration process needs to be reproducible. R code is encouraged, but not required. You should also briefly describe the advantages and disadvantages of storing the data in a relational database vs. your NoSQL database.
Lab_8
Homework_8
Tidyverse Assignment
"dplyr", one of the tidyverse packages, is used to explore the dataset "drug_use_by_age", and a programming sample “vignette” demonstrates how to use one or more of the capabilities of the selected tidyverse package. Dataset source: fivethirtyeight.com
Recommender Systems
Introduction to Linear Regression
Homework Chapter 7
Introduction to linear regression
In this lab, we’ll be looking at data from all 30 Major League Baseball teams and examining the linear relationship between runs scored in a season and a number of other player statistics. Our aim will be to summarize these relationships both graphically and numerically in order to find which variable, if any, helps us best predict a team’s runs scored in a season.
DATA_606 Project Proposal
Data 606 Project Proposal
Inference for Categorical Data
Inference for Categorical Data
Assignment 9
Our task is to choose one of the New York Times APIs, construct an interface in R to read in the JSON data and transform it into an R data frame.
Presentation Data 606
Inference for Numerical Data
Inference for numerical data
Data Science Thought Leadership
Who are today’s “thought leaders” in data science? What are the topics that data scientists care most about? How do these change over time, and across geographical location?
Document
Pick three of your favorite books on one of your favorite subjects. At least one of the books should have more than one author. For each book, include the title, authors, and two or three other attributes that you find interesting. Take the information that you’ve selected about these three books, and separately create three files which store the book’s information in HTML (using an html table), XML, and JSON formats (e.g. “books.html”, “books.xml”, and “books.json”). To help you better understand the different file structures, I’d prefer that you create each of these files “by hand” unless you’re already very comfortable with the file formats. Write R code, using your packages of choice, to load the information from each of the three sources into separate R data frames. Are the three data frames identical?
Data 607 Project 2
Preparing different datasets for downstream analysis work
homework_4
Tidying and Transforming Data
The Normal Distribution
Chess Data Extraction
In this project, you’re given a text file with chess tournament results where the information has some structure. The task is to create an R Markdown file that generates a .CSV file (that could, for example, be imported into a SQL database) with the following information for all of the players: Player’s Name, Player’s State, Total Number of Points, Player’s Pre-Rating, and Average Pre Chess Rating of Opponents. For the first player, the information would be: Gary Hua, ON, 6.0, 1794, 1605. The value 1605 was calculated by summing the pre-tournament opponents’ ratings of 1436, 1563, 1600, 1610, 1649, 1663, 1716, and dividing by the total number of games played.
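The arithmetic for the first player can be checked with a short sketch. The function name is hypothetical, and rounding to the nearest integer is an assumption; the description only gives the result, 1605.

```python
# Pre-tournament ratings of Gary Hua's seven opponents, as listed above.
opponent_ratings = [1436, 1563, 1600, 1610, 1649, 1663, 1716]


def average_opponent_rating(ratings):
    """Mean pre-rating of the opponents, rounded to the nearest integer."""
    return round(sum(ratings) / len(ratings))


# Sum is 11237 over 7 games, about 1605.29, which rounds to 1605.
print(average_opponent_rating(opponent_ratings))
```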
HotHands Analysis
Probability Homework
Movie survey
Introduction to Data
Introduction to data
Exercises_Data607
Arbuthnot_Present Analization
Analyzing two similar datasets that differ in the scale of their counts, using basic R commands.
Data Analysis on Mushroom Family
In this publication, we learn about different transformation tasks performed on the given mushroom dataset. Transformation tasks help us analyze the factors affecting the data and rename abbreviations and column names to make the dataset more user-friendly.