Recently Published
Coursera Data Science Capstone Project
This presentation is a part of the capstone project for the Coursera Data Science specialization designed by Johns Hopkins University and sponsored by SwiftKey.
Coursera Data Science Milestone Project
The main purpose of this milestone report is to demonstrate familiarity with the datasets and the subject matter expertise required to complete the Coursera Data Science capstone project, designed by Johns Hopkins University and sponsored by SwiftKey. The goal of the capstone project is to build a predictive text model trained on a large corpus of text documents. Natural Language Processing (NLP) techniques are used to analyze the corpus and build the model, which will be packaged as a Shiny application in R.
Solidification Range in Al-Si-Mg Alloys with Shiny App
Equilibrium computational results from the open-source OpenCalphad software, coupled with the COST507 thermodynamic database for light alloys, were used to fit three multiple polynomial regression models that predict the liquidus, solidus, and solidification range of Al-Si-Mg alloys. The models were implemented as a Shiny app and uploaded to RStudio server.
Equilibrium Solidification Range in Al-Si-Mg Alloys
The liquidus and solidus data for the Al-Si-Mg ternary system were calculated with the open-source OpenCalphad software (version 4.0), coupled with the COST507 thermodynamic database for light alloys. An interactive heatmap showing the equilibrium values for liquidus, solidus, and solidification range (delta_T) in the Al-Si-Mg system was created with R Markdown and the plotly package in R.
Reported Crime Incidents in Pittsburgh, PA
An interactive map of reported crime incidents in the city of Pittsburgh, Pennsylvania, coded in the R programming language. The dataset used to create the map was released by the City of Pittsburgh Police and contains only the most recent data (2016 and 2017).
Practical Machine Learning
This is the final report of the Peer Assessment project from the Practical Machine Learning course, which is a part of the Data Science Specialization (Johns Hopkins University).
Statistical Inference Course Project - Part 2
In this report, the ToothGrowth data in R is used to perform a basic inferential analysis in four parts: (1) Load the ToothGrowth data and perform some basic exploratory data analyses. (2) Provide a basic summary of the data. (3) Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose. (4) State the conclusions and assumptions.
Statistical Inference Course Project - Part 1
In this project, the exponential distribution in R is investigated and compared with the Central Limit Theorem. The exponential distribution can be simulated with the rexp(n, lambda) command, where lambda is the rate parameter. Both the mean and the standard deviation of the exponential distribution are 1/lambda. The simulation confirms that the distribution of sample means tends towards the theoretical distribution, as proposed by the Central Limit Theorem.
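The kind of simulation described above can be sketched in a few lines of R. This is a minimal illustration, not the code from the report: the rate (lambda = 0.2), the sample size (n = 40), the number of simulations (1000), and all variable names are assumptions chosen for the example.

```r
# Illustrative values only; none of them are taken from the report itself.
set.seed(1)
lambda <- 0.2    # rate parameter of the exponential distribution
n      <- 40     # observations per simulated sample
n_sims <- 1000   # number of simulated samples

# Each row holds one sample of n exponential draws; average each row.
samples      <- matrix(rexp(n * n_sims, rate = lambda), nrow = n_sims)
sample_means <- rowMeans(samples)

# Theoretical mean and standard error of the sample mean.
theoretical_mean <- 1 / lambda
theoretical_se   <- (1 / lambda) / sqrt(n)

cat("mean of sample means:", mean(sample_means), "theory:", theoretical_mean, "\n")
cat("sd of sample means:  ", sd(sample_means), "theory:", theoretical_se, "\n")
```

Plotting hist(sample_means) should give a roughly bell-shaped histogram centered near 1/lambda, even though the individual exponential draws are strongly skewed.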
Impact of Severe Weather Events in the USA
Exploring the NOAA Storm Database for Health and Economic Impacts