Recently Published
Impact of Beauty on Instructors' Teaching Ratings
TeachingRatings (AER package) - Data Analysis & Decision-Making Analysis
US Wages Data Analysis - Part 1
Data cleaning, predicting wage, partial F-statistic of the big model vs. the small model, ANOVA for model optimization
US Wages Data Analysis - Part 2
Analysis for predicting wage covering nonconstant variance, residuals vs. fitted plots, heteroskedasticity, non-normal errors, the Shapiro-Wilk test of normality, and the Box-Cox power transform.
Predicting Car Prices
Toyota Prices Data - Linear Regression
Random Data Generation
Analyzing the data using the Gaussian distribution
Transformation, Diagnostics, Generalized Regression and ANOVA
Exploring the relationship between US new housing construction starts and macroeconomic indicators: GDP, CPI, and population. A description of this data can be found at https://fred.stlouisfed.org/.
Clustering Basics in R
Hierarchical and K-means Clustering
Online News Popularity Prediction
This dataset contains 39,797 articles published by Mashable (www.mashable.com). The features were extracted from the original news articles and are captured by 61 attributes. The goal of the regression task is to predict the number of shares for an article given its attributes.
Boston Crime Prediction
Analyzing the Boston dataset using multivariate logistic regression models to predict whether a given suburb has a crime rate above or below the median.
Gene Expression Data Analysis using Clustering and PCA
PCA, hierarchical clustering, and K-means clustering implementation
US Jobless Claims - Forecasting
The Federal Reserve releases weekly data on initial jobless claims. See http://www.investopedia.com/university/releases/joblessclaims.asp for more information. Since weekly numbers can be volatile, the series used here is a monthly average of this data without any adjustment for seasonality. We will work with this dataset to understand the strength of the economy in the short term.
US Department of Tourism - Forecasting
Travel and tourism is the largest services export industry for the United States. Estimates of future demand at the destination level are very important for managing and planning tourism development and the necessary investment. This requires predicting travel demand for every major global market and for particular traveler segments. These forecasting models will give insights into the economic fundamentals of origin markets, along with the changing dynamics of the tourism industry and traveler preferences.