Recently Published
KNN_Iris_Data
In this project I use the KNN (k-nearest neighbors) method to identify iris species. The iris dataset is built into R. The goal is to find which value of k (the number of nearest neighbors) gives the best species identification.
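As a rough illustration of what "finding the best k" can mean, the sketch below scores several candidate k values with leave-one-out accuracy and keeps the best one. It is not the project's own R code: the data are a hypothetical toy set (not the iris measurements), and the names `POINTS`, `knn_predict`, `loo_accuracy`, and `best_k` are made up for this example.

```python
from collections import Counter
import math

# Hypothetical toy data: (feature vector, label). These are NOT iris measurements.
POINTS = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.8, 1.1), "a"), ((1.1, 1.3), "a"),
    ((3.0, 3.0), "b"), ((3.2, 2.9), "b"), ((2.8, 3.1), "b"), ((3.1, 3.3), "b"),
]

def knn_predict(train, query, k):
    """Predict a label by majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def loo_accuracy(data, k):
    """Leave-one-out accuracy: classify each point using all the other points."""
    hits = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, features, k) == label
    return hits / len(data)

def best_k(data, candidates):
    """Return (best k, all scores); ties go to the smallest k."""
    scores = {k: loo_accuracy(data, k) for k in candidates}
    winner = max(sorted(scores), key=lambda k: scores[k])
    return winner, scores

if __name__ == "__main__":
    k, scores = best_k(POINTS, [1, 3, 5, 7])
    print("best k:", k)
    print("accuracy by k:", scores)
```

On a real dataset the same loop would typically run over a held-out test split or cross-validation folds rather than leave-one-out on a handful of points.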
Random_Forest_Heart_Disease
This project uses the heart disease data set from the UCI Machine Learning Repository. I use its variables to build a random forest model that predicts whether a patient has heart disease.
Logistic_Regression_Heart_Disease
This project uses the heart disease data set from the UCI Machine Learning Repository.
I use its variables to build a logistic regression model that predicts whether a patient has heart disease.
Logistic Regression Model - Determine the Student’s Success Admission Rate
The Top Billionaires
The 10 richest people on the planet according to Bloomberg/Forbes, with data for the last five years.
The data show interesting dynamics in the capitalization of companies and how people make money in technology companies.
Sources: https://www.kaggle.com/datasets/alexandrparkhomenko/the-top-billionaires
Shiny-IO Project
Trend in Demographics and Income: explore the difference between people who earn less than 50K and those who earn more than 50K. You can filter the data by country, then explore various demographic information.
Shiny-IO + Flexdashboard
A Shiny-IO app with an interactive flexdashboard about global country data from 2000 to 2015.
The link inside leads to the final Shiny-IO product.
Text Mining of Year 2020 ESG Report from a Power Generator Company
Bayesian Statistics Project: Movie Audience Score Prediction
Develops a Bayesian regression model to predict a movie's audience score from a series of explanatory variables about the movie.
Testing of R Flex Dashboard
Testing of Flex Dashboard