RPubs

by RStudio

ngbolin

Ng Bo Lin

Recently Published

Word Prediction Model Presentation

Presentation slides explaining the use, methodology and accuracy of my word prediction model.

over 8 years ago

Word Prediction Model

As part of the Data Science Specialisation on Coursera, we are tasked with building a predictive text product. In this R Markdown file, we develop a modified Kneser-Ney prediction algorithm to predict the most likely next words given a user's text input.

over 8 years ago

Coursera - Data Science Capstone: Milestone - Report

As part of the Data Science Specialisation on Coursera, we are tasked with building a predictive text product. Before building our model, we explore and clean the data. We then conduct preliminary data analysis to identify the most frequent unigrams, bigrams and trigrams.

almost 9 years ago

Titanic: Machine Learning from Disaster (Data Cleaning)

Conduct simple exploratory data analysis, data cleaning and data imputation.

almost 9 years ago

Reproducible Pitch: Distribution Visualisation Application

In this app, the student first chooses a distribution to learn more about, then sets its parameters to visualize how the distribution looks.

almost 9 years ago

Fuel Efficiency of Various Car Makes

In this data visualisation exercise, we attempt to track the city-driving fuel efficiency of specific car makes from 1985 to 2000.

almost 9 years ago

Visualizing Taxi Availability in Singapore

In this report, we call the taxi-availability real-time API from data.gov.sg and visualize the current locations of all available taxis on a Leaflet map. The API returns the latitude and longitude of all available taxis at a given time, which you can specify by setting the date_time parameter. To call the API, you will have to apply for an API key by creating an account at data.gov.sg. Lastly, to obtain the most current data, we leave the date_time parameter blank.

almost 9 years ago

Peer Graded Assignment - Prediction Assignment

In this prediction exercise, our goal is to predict the manner in which the participants carried out the various exercises, using data from accelerometers on the belt, forearm, arm and dumbbell of 6 participants.

about 9 years ago

Titanic: Machine Learning from Disaster

The sinking of the RMS Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, killing 1502 out of 2224 passengers and crew. This sensational tragedy shocked the international community and led to better safety regulations for ships. One of the reasons that the shipwreck led to such loss of life was that there were not enough lifeboats for the passengers and crew. Although there was some element of luck involved in surviving the sinking, some groups of people were more likely to survive than others, such as women, children, and the upper class. In this challenge, we are tasked with completing the analysis of what sorts of people were likely to survive.

about 9 years ago

Statistical Inference Course Project

about 9 years ago

Taxi Temporal Analysis

Illustrating the impact of the entry of private-hire car companies on the taxi industry through simple data visualisations.

about 9 years ago

Reproducible Research - Course Project 2

In this report, we identify the severe weather events that are most harmful with respect to population health, as well as the events that have the greatest economic consequences. To identify these events, we draw on data from the NOAA Storm Database, which tracks characteristics of major storms and weather events in the United States. From our analysis, we find that tornadoes are the most harmful with respect to population health, having killed over 5,000 people and injured more than 90,000, while floods have the greatest economic impact, costing the US more than $100 billion in damage.

about 9 years ago
