
PursuitOfDS

Y. Yu

Recently Published

Water Potability Data Analysis and Classification
Data visualization, followed by logistic regression, SVM, and random forest models to classify whether water is drinkable.
Amazon Bestsellers Data Exploration and Category Classification
Data visualization and machine learning (logistic regression and random forest) on the Amazon Bestsellers data set.
dplyr::summarize() vs. dplyr::summarise()
Besides the spelling, what other differences are there between summarise() and summarize() in the dplyr package? This blog post sheds some light on the question.
Data Visualization and Sentiment Analysis on Trending Youtube Statistics
Using the tidyverse and tidytext libraries to dive into a YouTube dataset, with data visualization and sentiment analysis.