Recently Published
Word Prediction App (Data Science Capstone)
This presentation describes an app that predicts the next word.
Word Prediction App (Data Science Capstone)
This presentation was created as part of the requirements for the Coursera Data Science Capstone course.
The goal of the project is to build a predictive text model with a Shiny app UI that predicts the next word as the user types a sentence, similar to the way most smartphone keyboards work today using SwiftKey's technology.
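As a rough illustration of what "predicting the next word" can mean, here is a minimal sketch of a bigram frequency model in Python. It is not the model used in the app: the function names (`train_bigrams`, `predict_next`) and the toy sentences are hypothetical, and a real predictor would presumably use longer n-grams and some form of smoothing or backoff.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for each word, which words follow it in the training text."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            followers[prev][nxt] += 1
    return followers


def predict_next(followers, text, k=3):
    """Return up to k candidate next words, most frequent first."""
    tokens = text.lower().split()
    if not tokens:
        return []
    counts = followers.get(tokens[-1])
    if not counts:
        return []  # last word never seen in training: no suggestion
    return [word for word, _ in counts.most_common(k)]


# Hypothetical toy corpus, for illustration only.
toy_corpus = [
    "i love data science",
    "i love shiny apps",
    "i like data",
]
model = train_bigrams(toy_corpus)
suggestions = predict_next(model, "I", k=2)
```

The lookup uses only the last typed word, which keeps the sketch short; it returns an empty list for unseen words rather than guessing.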
Coursera Data Science Capstone: Milestone Report
The goal of this project is to analyse the corpus data provided from various sources (Twitter, news sites, and blog sites) and to come up with a strategy for predicting the next word from partial sentences or words. This strategy and modeling technique will be used to build a Shiny app. This report covers fetching the data, generating summary statistics, and then cleaning the data using principles of natural language processing. On the cleaned data, n-grams up to quadgrams are first generated, and then various exploratory techniques are used to provide insight into the data and its underlying patterns. A histogram (quantitative technique) of frequently occurring words shows the top words and the nature of their distribution. Unique word coverage at 50% and 90% (quantitative technique) shows how many words are needed to cover 50% and 90% of the corpus. It also helps in understanding the underlying distribution of words and n-grams. A word cloud (visual technique) is used to represent the most frequently occurring terms, which helps in unearthing dependent patterns. The Twitter data stands out in several of the analyses and hence needs to be treated differently in the final modeling.
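The n-gram generation and the unique word coverage measure described above can be sketched in a few lines of Python. This is an illustrative sketch, not the report's own code: the helper names (`ngrams`, `words_needed_for_coverage`) and the toy token list are assumptions made for the example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def words_needed_for_coverage(counts, target):
    """Smallest number of top-ranked terms whose counts reach `target`
    (a fraction between 0 and 1) of all term occurrences."""
    total = sum(counts.values())
    covered = 0
    for rank, (_, count) in enumerate(counts.most_common(), start=1):
        covered += count
        if covered >= target * total:
            return rank
    return len(counts)


# Hypothetical toy tokens; a real corpus would be cleaned and tokenized first.
tokens = "a a a a b b c d".split()
word_counts = Counter(tokens)
bigram_counts = Counter(ngrams(tokens, 2))

half = words_needed_for_coverage(word_counts, 0.5)
ninety = words_needed_for_coverage(word_counts, 0.9)
```

On a skewed frequency distribution, the count needed for 50% coverage is typically far smaller than the count needed for 90%, which is what makes the measure useful for deciding how much of the vocabulary a model has to keep.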
Shiny App :: Will you default on your Credit Card Soon?
This is a presentation of the Shiny app created for the Coursera course Developing Data Products.
Developing Data Products : Assignment 2 - Mid-Atlantic Wage Visualization | Presentation
Wage and other data for a group of 3000 workers in the Mid-Atlantic region between 2003 and 2009.
Developing Data Products : Assignment 2 - Mid-Atlantic Wage Visualization
Wage and other data for a group of 3000 workers in the Mid-Atlantic region between 2003 and 2009.
Developing Data Products : Assignment 1 - New 7 Wonders
New7Wonders of the World (2000–2007) was an initiative started in 2000 to choose Wonders of the World from a selection of 200 existing monuments. The popularity poll was led by Canadian-Swiss Bernard Weber and organized by the New7Wonders Foundation based in Zurich, Switzerland, with winners announced on 7 July 2007 in Lisbon.
This map provides an interactive view of all the wonders.
Reproducible Research Assignment : Storm Analysis
The National Weather Service (NWS) tracks and records severe weather events across the United States in a Storm Events Database. Based on the available data, this report provides insight into the fatalities and injuries caused by severe weather. It also provides insight into the economic losses suffered, through analysis of the property damage and crop damage caused by severe weather conditions. The report provides graphical analysis of fatalities, injuries, property damage, and crop damage. Based on this analysis, the report concludes that the leading causes of fatalities, injuries, property damage, and crop damage are tornadoes, tornadoes, flooding, and drought, respectively.
Practical Machine Learning: Project
The goal of your project is to predict the manner in which the participants did the exercise. This is the “classe” variable in the training set. You may use any of the other variables to predict with. You should create a report describing how you built your model, how you used cross validation, what you think the expected out of sample error is, and why you made the choices you did. You will also use your prediction model to predict 20 different test cases.
Regression Models Course Project: which transmission is better AT or MT?
From analysis of the mtcars data set, it is determined that, in general, manual transmissions are better in terms of miles per gallon than automatic transmissions. In a linear regression model with only transmission type as an explanatory variable, a change from automatic to manual transmission increased the mpg by 7.245. However, transmission type explained only 36% of the variation in mpg. A linear regression model of all significant variables (determined by ANOVA) explained 84% of the variation in mpg. It included only the variables weight and number of cylinders. Transmission type was determined to be an insignificant contributory variable to the model. It is recommended that the editors of Motor Trend consider the variables weight, number of cylinders, and possibly horsepower as the most significant explanatory variables of miles per gallon.
Statistical Inference Project : ToothGrowth Data Analysis
This report investigates the ToothGrowth data set in R. Various hypotheses are derived and tested with respect to dosage size and supplement type.
Statistical Inference Project : Simulation Exercise
This report investigates the exponential distribution in R and compares it with the Central Limit Theorem. The following aspects are investigated:
Sample Mean versus Theoretical Mean
Sample Variance versus Theoretical Variance
Distribution comparison with normal distribution
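The three comparisons listed above can be sketched as a small simulation. The report itself was done in R; this Python version is only an illustration, and the rate, sample size, number of simulations, and seed below are assumed values, not figures taken from the report.

```python
import random
import statistics


def simulate_sample_means(rate, sample_size, n_sims, seed=0):
    """Draw n_sims samples of exponential variates and return each sample's mean."""
    rng = random.Random(seed)
    return [
        sum(rng.expovariate(rate) for _ in range(sample_size)) / sample_size
        for _ in range(n_sims)
    ]


# Assumed parameters, chosen for illustration only.
RATE, SAMPLE_SIZE, N_SIMS = 0.2, 40, 1000

means = simulate_sample_means(RATE, SAMPLE_SIZE, N_SIMS)

# 1. Sample mean versus theoretical mean (1 / rate).
observed_mean = statistics.mean(means)
theoretical_mean = 1 / RATE

# 2. Sample variance versus theoretical variance of the mean ((1 / rate)^2 / n).
observed_var = statistics.variance(means)
theoretical_var = (1 / RATE) ** 2 / SAMPLE_SIZE

# 3. Rough normality check: a normal distribution puts about 68% of its
#    mass within one standard deviation of the mean.
sd = theoretical_var ** 0.5
within_one_sd = sum(abs(m - theoretical_mean) <= sd for m in means) / N_SIMS
```

If the Central Limit Theorem approximation is adequate at this sample size, `observed_mean` and `observed_var` should land close to their theoretical counterparts and `within_one_sd` should be near 0.68. A histogram or Q-Q plot would give a fuller picture than this single summary number.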