Recently Published
Claim risk analytics for an insurance company
Identifying risky customers based on their demographics, previous claims, and driving history, then segmenting them
Customer Segmentation for a retail supermarket using K-means clustering
Customer segmentation using a customer value model and K-means clustering
Classification of the type of exercise using Random Forest, Gradient Boosting and Linear Discriminant Model
As part of the Practical Machine Learning course requirement (Data Science Specialization track on Coursera), various kinds of device data were used to predict whether a given exercise was performed correctly.
Data Science Specialization Capstone Project: Predict the next word
Back-off model to predict the next word
Milestone Report: Coursera Data Science Capstone Project
This is one step towards building a basic predictive text model. The dataset, provided by SwiftKey in partnership with Coursera, was compiled from users’ tweets, blogs, and news articles.
Prediction of a car's mileage
Part of the submission required to pass the Data Science Specialization track on Coursera. The presentation includes a link to the Shiny app.
Global population concentration
Interactive map showing population concentration around the world's major cities. Coastal China and the Indian subcontinent are the most densely populated regions, though Tokyo, Japan has the largest population. The North American population is concentrated mainly in the north-eastern United States. Australians, true to their reputation as beach lovers, live mostly on the coast (the interior is largely uninhabitable). Regions such as the Sahara, the Gobi Desert, Greenland, the Amazon rainforest, and Siberia are sparsely inhabited.
World coal consumption 1980-2009
Fossil fuels are one of the major sources of energy all over the world. For a long time, thermal power stations, which use steam produced by burning coal, have been at the centre of power generation. More importantly, while Western countries have only recently started focusing on clean sources (like nuclear and hydroelectric), developing countries, especially China, still rely heavily on coal, either for power generation or for steel production (which requires tonnes of coal to extract pig iron from iron ore in a blast furnace).
Analysis of historical storm data of USA
Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.
This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
Analysis of activity pattern of an individual
Part of the submission for the Reproducible Research peer assessment project. The individual was found to become active later in the day and to stay active until late hours on weekends (probably partying).