Recently Published
Next Word Prediction Slides
Slide deck promoting the Shiny app. Five slides explaining the product, the data exploration, and the methodology.
Swiftkey Next Word Prediction - Analysis Report Part 2
This is part 2 of the Swiftkey next word prediction NLP project.
I built several models for predicting the next word using the data and variables created during the initial Exploratory Data Analysis report. I realised that some numbers and symbols still remain after the text is cleaned with the `tokens` function from the Quanteda package. This is because the numbers and symbols are attached to words, as in "9AM", "@My" and lots of hashtags.
1. We will remove these unwanted numbers and symbols by performing a low-level clean-up using the `Regex` function in the stringi/stringr package.
2. Improve the speed and efficiency of the model. I noticed that a small fraction (less than 50%) of the unique words accounts for the majority of the text, so the model could be restricted to that most frequent subset of the vocabulary.
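The two steps above can be sketched as follows. This is a minimal illustration in Python rather than the R code used in the report; the helper names `clean_token` and `words_for_coverage`, the sample tokens, and the choice to strip digits and symbols from a token (rather than drop the whole token) are all assumptions made for the example.

```python
import re
from collections import Counter


def clean_token(token: str) -> str:
    """Lowercase a token and strip everything except letters and apostrophes.

    Hypothetical helper: "9AM" becomes "am", "@My" becomes "my".
    """
    return re.sub(r"[^a-z']", "", token.lower())


def words_for_coverage(tokens, coverage=0.5):
    """Return the most frequent words whose counts together reach
    `coverage` (a share between 0 and 1) of all token occurrences."""
    counts = Counter(tokens)
    total = sum(counts.values())
    kept, running = [], 0
    for word, n in counts.most_common():
        if running >= coverage * total:
            break
        kept.append(word)
        running += n
    return kept


# Made-up sample tokens, for illustration only.
raw = ["9AM", "@My", "#golf", "the", "the", "the", "cat", "the", "sat", "100"]

# Clean each token and drop the ones that end up empty (pure numbers/symbols).
cleaned = [t for t in (clean_token(r) for r in raw) if t]

vocabulary = words_for_coverage(cleaned, coverage=0.5)
print(cleaned)
print(vocabulary)
```

Restricting the model to `vocabulary` trades a little accuracy on rare words for a smaller, faster lookup table; the coverage threshold is a tuning choice.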
Swiftkey Next Word Prediction - Milestone Report
JHU Data Science Capstone Project
Aoc website - Our cause visual improvement proposal
Proposed changes to the visuals of the Our Cause section on the website:
https://www.australiaoikoscare.org/
GolfDay
This is a five-page web presentation built with R Markdown that features a Shiny web app on "Golf Weather".
There absolutely is such a thing as golf weather, although the specific range of perfectly playable weather conditions depends on where you live. The vast majority of dedicated golfers love cooler weather with low humidity and no breeze (or just a little bit of a breeze).
The aim of this web app is to predict whether to play a game of golf in certain weather conditions. To achieve this objective, we will deploy a random forest machine learning model on the Golf|Weather dataset.
mtcars Plotly
This is a five-page web presentation built with R Markdown that features a plot of the mtcars data created with Plotly. It contains an interactive table (page 3) where you can move the columns around, and an interactive plot with a fitted loess line (mpg~disp).
Interactive Map of iconic locations in Melbourne - Australia
This is a web page created using R Markdown that features an interactive map built with the Leaflet package. It shows the creation date at the top and today's date at the bottom. The map marks iconic locations in Melbourne, Australia; click on the icons to visit their respective websites.
Human Activity Recognition - Machine Learning
Using devices such as Jawbone Up, Nike FuelBand, and Fitbit, it is now possible to collect a large amount of data about personal activity relatively inexpensively. These types of devices are part of the quantified self movement.
This project uses data from accelerometers on the belt, forearm, arm, and dumbbell of 6 participants to predict the activity that they carried out:
Class A - exactly according to the specification,
Class B - throwing the elbows to the front,
Class C - lifting the dumbbell only halfway,
Class D - lowering the dumbbell only halfway,
Class E - throwing the hips to the front.
Motor Trend Data Analysis
Regression Models and Exploratory Data Analysis on the mtcars dataset
Statistical Inference Project 2 - ToothGrowth Hypothesis Testing
The response is the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods: orange juice (coded as OJ) or ascorbic acid (a form of vitamin C, coded as VC).
Statistical Inference Project 1 - Sample Mean Simulation
Investigate the exponential distribution in R and compare it with the Central Limit Theorem. The exponential distribution can be simulated in R with rexp(n, lambda), where lambda is the rate parameter. The mean of the exponential distribution is 1/lambda and the standard deviation is also 1/lambda. Set lambda = 0.2 for all of the simulations. Investigate the distribution of averages of 40 exponentials and perform a thousand simulations.
Illustrate via simulation and associated explanatory text the properties of the distribution of the mean of 40 exponentials using the following:
- Show the sample mean and compare it to the theoretical mean of the distribution.
- Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution.
- Show that the distribution is approximately normal.
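The simulation described above can be sketched as follows. This is a minimal illustration in Python rather than R: `random.expovariate` stands in for `rexp(n, lambda)`, and the seed, the variable names, and the "share within one standard deviation" check for normality are assumptions made for the example, not details taken from the report.

```python
import random
import statistics

LAMBDA = 0.2   # rate parameter, as stated in the project description
N = 40         # exponentials averaged per simulation
SIMS = 1000    # number of simulations

rng = random.Random(2024)  # arbitrary fixed seed so the run is repeatable


def mean_of_exponentials(n, lam):
    """Average of n draws from an exponential distribution with rate lam."""
    return statistics.fmean([rng.expovariate(lam) for _ in range(n)])


sample_means = [mean_of_exponentials(N, LAMBDA) for _ in range(SIMS)]

# Theory: the mean of the averages is 1/lambda, and the variance of an
# average of N independent draws is (1/lambda)^2 / N.
theoretical_mean = 1 / LAMBDA
theoretical_var = (1 / LAMBDA) ** 2 / N

observed_mean = statistics.fmean(sample_means)
observed_var = statistics.variance(sample_means)

# Rough normality check: a normal distribution puts about 68% of its
# mass within one standard deviation of the mean.
sd = theoretical_var ** 0.5
within_one_sd = sum(abs(m - theoretical_mean) <= sd for m in sample_means) / SIMS

print(f"mean:     observed {observed_mean:.3f}, theoretical {theoretical_mean:.3f}")
print(f"variance: observed {observed_var:.3f}, theoretical {theoretical_var:.3f}")
print(f"share within one sd: {within_one_sd:.3f}")
```

With lambda = 0.2 the theoretical mean is 5 and the theoretical variance of the average is 25/40 = 0.625; the observed values should land close to these, and a histogram of `sample_means` would be the usual visual check for normality.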
JHU PA2 - StormData
Johns Hopkins University Peer Assessment 2 - StormData
JHU Reproducible Research: Peer Assessment 1
Johns Hopkins University Reproducible Research Module 5 Week 2 Peer Assessment 1.