kaloatanasov

Kaloyan Atanasov

Recently Published

Data Science Specialization Capstone Project
The Data Science Specialization Capstone Project analyzes a large corpus of text documents, provided by SwiftKey, to discover the structure of the data and how words are put together. It covers cleaning and analyzing the text data. As a continuation of the present analysis, a next-word-prediction model will be built and presented interactively through a Shiny app.
JHU - Coursera - Practical Machine Learning Course Project
The following analysis is done on the Weight Lifting Exercise Dataset (Velloso, E.; Bulling, A.; Gellersen, H.; Ugulino, W.; Fuks, H. Qualitative Activity Recognition of Weight Lifting Exercises. Proceedings of 4th International Conference in Cooperation with SIGCHI (Augmented Human ’13). Stuttgart, Germany: ACM SIGCHI, 2013.). The data were gathered from accelerometers on the belt, forearm, arm, and dumbbell of 6 participants. This Course Project predicts the manner in which the exercise was performed, i.e. the “classe” variable, which has five categories. The report describes how the prediction model was built, how cross-validation was used, and the in-sample and out-of-sample error estimates.
JHU - Coursera - Regression Models Course Project
The following analysis is done on the “mtcars” data, which is part of the R “datasets” package. This Regression Models Course Project compares linear regression against a generalized linear model. It finds that a Logistic Regression model best describes the relationship between the type of transmission, automatic or manual, and the fuel efficiency, measured in miles per gallon (MPG). The model shows how the probability of a transmission type changes as MPG increases or decreases, i.e. the change in the probability of a transmission type for every additional mile per gallon that a vehicle achieves.
Influence of Natural Events, Across the US, Over Population Health and Economic Well-Being
The purpose of the current analysis is to describe which natural events across the US were most harmful to population health, and which had the greatest economic impact, between 1950 and the end of November 2011. The overall hypothesis is that the event types most harmful to US population health are heat waves, cold and snow, as well as tornadoes and hail. In addition, the event types believed to have the greatest economic impact are drought and floods. To investigate this hypothesis, the analysis is based on data obtained from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. The analysis of the data confirms that the most harmful events for US population health are heat waves, cold and snow, as well as tornadoes and hail. It also confirms that the events with the greatest economic impact are drought and floods.