ksb357

Kavya Beheraj

Recently Published

DATA 605, Final Exam
DATA 605, Week 13: Calculus
DATA 605, HW 11: Linear Modeling
Using the “cars” dataset in R, build a linear model for stopping distance as a function of speed and replicate the analysis in chapter 3 of your textbook (visualization, quality evaluation of the model, and residual analysis).
DATA 605, Week 11 Discussion: Linear Modeling
How well does the percentage of students in a school who are eligible for free or reduced-price lunch explain the school's average critical reading SAT score?
DATA 605, HW 10
DATA 605, HW 7: Probability Distributions
Found the general probability distribution of the minimum value among a set of random variables, and calculated several types of distributions for a machine failure problem.
DATA 605, HW 4: Matrix Inverse and SVD
Verified that SVD and eigenvalues are related, and created a function to compute the inverse of a square matrix.
DATA 605, Discussion 3
DATA 605 HW 3: Eigenvalues and Eigenvectors
Determined matrix rank, found eigenvalues and eigenvectors for a matrix, and tested the results against R's eigen() function.
Document
DATA 605 HW 2: Basic Matrix Operations and Properties
Wrote an R function to factorize a square matrix A into LU.
DATA 605, Discussion 2
DATA 605 HW 1: Vectors, Matrices, Systems of Equations
Calculated dot products and angles, and created a Gaussian elimination algorithm.
DATA 605, Discussion 1: Systems of Linear Equations
Solved a simple system of linear equations.
DATA 607 - Tidyverse Recipe: unnest()
In this project, I put together this short tutorial on the unnest() function for an upcoming book on the tidyverse. This book was co-created by the spring 2018 class of DATA 607 (Data Acquisition & Management) students at the CUNY School for Professional Studies.
DATA 607, Final Project – Music Recommender with Neo4j
In this project, we created a song recommender system based on nearest-neighbor collaborative filtering using the Million Song Dataset. We trained and evaluated our data source in Neo4j, a graph database platform, and (as a stretch goal) built out a proof-of-concept user interface in RShiny.
DATA 607, Week 12 - Recommender System: StumbleUpon
In this assignment, I performed a scenario design analysis on StumbleUpon and its users, researched how the platform works, and provided suggestions for improving its recommender system.
DATA 607, Project 4: Document Classification
We ran Support Vector Machine and Maximum Entropy classifiers on a spam/ham email corpus to evaluate and compare predictive power.
DATA 607, Project 3 - The Most Valued Data Science Skills
In this project, we used supervised and unsupervised data mining techniques on a scraped dataset of 1,303 Indeed job listings to answer the following question: What are the most valued data science skills?
Data 606 - Lab 4b
DATA 606 - Lab 4a
DATA 607, HW 07 – XML and JSON in R
In this assignment, I recorded information for three different books, stored the information in an HTML file, XML file, and JSON file, and read the files into R to see how they compare.
DATA 607 - Project 02
In this project, I imported three CSV datasets, tidied them, and answered questions about them.
DATA 607, Week 5 – Tidying and Transforming Data
In this assignment, I practiced tidying and transforming data for downstream analysis.
DATA 607 – Project 01
In this project, I was given a text file with chess tournament results where the information had some structure. My goal was to create an R Markdown file that generates a CSV file with requested information for all players.
DATA 607, Week 3 – Regular Expressions
The purpose of this assignment was to practice using regular expressions in R. These questions are from the end of Chapter 8 in the textbook Automated Data Collection with R.
DATA 607 Week 1 - Basic Loading and Transformation
In this assignment, I studied the Mushrooms dataset, renamed the columns and variables to make it easier to understand, and created subsets of the data to answer a few questions about it.