Utopian

Austin Routt

Recently Published

Publish Presentation
Meet Wordsley
Milestone: Progress on the JHU Data Science Capstone
The following is an exploratory analysis of the JHU SwiftKey data set, carried out with a view to building a model for next-word text prediction. This report fulfills the milestone requirement of the 2015 Summer Capstone project in Data Science, hosted by Coursera. The document has five sections: an Introduction that poses the problem, followed by four sections that give abridged information on Downloading and Opening the Data, Summarizing the Data, Cleaning the Data, and Future Plans for the Data. For the programming savvy, the lion's share of the code that makes this report reproducible can be found in the Appendix at the end. Ultimately, this analysis finds the SwiftKey data conducive to deriving an n-gram based statistical model for next-word text prediction, though some additional processing will be required.
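To make the idea of an n-gram based next-word model concrete, here is a minimal sketch in Python of the simplest case, a bigram model that predicts the words most often seen after a given word. It is not the report's own code: the function names `build_bigram_model` and `predict_next`, the whitespace tokenization, and the three-line toy corpus are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def build_bigram_model(sentences):
    """Count, for each word, how often each following word occurs.

    Tokenization here is a plain lowercase whitespace split, which is an
    illustrative simplification, not the cleaning used in the report.
    """
    model = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        # Pair every token with its immediate successor.
        for current_word, next_word in zip(tokens, tokens[1:]):
            model[current_word][next_word] += 1
    return model


def predict_next(model, word, k=3):
    """Return up to k of the most frequent followers of `word`."""
    followers = model.get(word.lower())
    if not followers:
        # Unseen word: a fuller model would back off to a lower order.
        return []
    return [candidate for candidate, _ in followers.most_common(k)]


# Hypothetical toy corpus standing in for the real data set.
corpus = [
    "thanks for the follow",
    "thanks for the help",
    "thanks for coming",
]
model = build_bigram_model(corpus)
print(predict_next(model, "for", k=1))
```

With this toy corpus, "the" follows "for" twice and "coming" once, so "the" is the top suggestion. A practical model would typically use higher-order n-grams and some form of smoothing or back-off for unseen contexts.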