

Giancarlo Callaoapaza

Recently Published

Word Prediction App
My application, developed for the Data Science Capstone (April 2015)
Data Science Capstone Project - Milestone Report
This report presents an initial study of the data used in the capstone project of the Data Science Specialization, which focuses on text mining and natural language processing. We proceed as follows. First, we load the datasets and sample them to train our model. Next, a two-step cleaning procedure and a tokenization pass identify 1,975 word types and a total word count of 774,005 in our sample. We then develop a bigram language model using the R package “tm” and the word types found during tokenization, identifying 139,112 bigrams. This model will let us begin the predictive process and evaluate the trade-offs between accuracy and memory consumption.
U.S. Most Harmful Weather Events: A Perspective on Population Health and Economic Loss
Peer Assessment 2 for the course Reproducible Research provided by The Johns Hopkins University through Coursera.
People who did not board the RMS Titanic at Queenstown, what if?
Project for the Developing Data Products Course
Reproducible Research: Peer Assessment 1
My first paper for the course Reproducible Research offered by Johns Hopkins University.
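
The bigram approach described in the Milestone Report above can be illustrated with a minimal sketch. This is not the report's code: the report uses the R package “tm”, whereas this sketch uses plain Python, and the function names (`tokenize`, `build_bigram_model`, `predict_next`), the regular expression used for cleaning, and the toy sample are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes as tokens.

    Assumption: this simple regex stands in for the report's two-step cleaning.
    """
    return re.findall(r"[a-z']+", text.lower())


def build_bigram_model(lines):
    """For each word, count how often each following word appears."""
    model = defaultdict(Counter)
    for line in lines:
        tokens = tokenize(line)
        # Pair every token with its successor within the same line.
        for first, second in zip(tokens, tokens[1:]):
            model[first][second] += 1
    return model


def predict_next(model, word, k=3):
    """Return up to k of the most frequent continuations of `word`."""
    followers = model.get(word.lower(), Counter())
    return [w for w, _ in followers.most_common(k)]


# Toy sample (hypothetical data, not taken from the capstone datasets).
sample = ["I love data science", "I love text mining", "I like data"]
model = build_bigram_model(sample)
distinct_bigrams = sum(len(followers) for followers in model.values())
```

With this toy sample, `predict_next(model, "I")` ranks "love" ahead of "like" because "i love" occurs twice and "i like" once. Storing one counter per leading word keeps lookup fast, but memory grows with the number of distinct bigrams, which is the accuracy versus memory trade-off the report mentions.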