This is a slide deck for a Shiny app that predicts the next word using n-grams. It is the capstone project for the Coursera Data Science Specialization, in partnership with SwiftKey.
Language modeling with n-gram models posed several challenges, such as handling numbers, punctuation, and non-English words, and selecting the most suitable probabilistic model for predicting the next word given a sequence of words. This report documents the process of splitting the corpus into subsets for training and cross-validation, cleaning the data, performing the exploratory analysis, and finally selecting the most suitable prediction model for the n-grams.
This project was completed for Coursera's Developing Data Products course, part of the Data Science Specialization. The Shiny app shows the locations of significant earthquake events that have occurred over a little more than the past two centuries.
An assignment on the use of Plotly in an R Markdown presentation for the Coursera Data Science course.
This is an assignment for Developing Data Products Week 2 focused on the use of R Markdown and Leaflet.
A project from Practical Machine Learning, part of Coursera's Data Science series. The project recognizes human activity from accelerometer data.
The goal of this study is to determine which weather events across the United States are most harmful to public health. The study identifies the top 10 weather events that have caused the highest number of fatalities. It also identifies the top 10 weather events that have had the greatest economic consequences. The dataset used in this study was downloaded from the NOAA Storm Database, which contains events from 1950 until November 2011.