maulikpatel

maulik patel

Recently Published

Predict MPG of your car
Instructions on how to use the Shiny app to predict the MPG of your car
plot_ly
Plotly package
Support Vector Machine
Book: ISL Ch-9 SVM
Support Vector Classifier
Book: ISL Ch-9 SVM
Project : fitbit data Classification
Comparing multiple classification methods: decision tree, LDA, SVM, bagging, boosting, random forests
Project: Customer segmentation, recommendations and Sales Prediction
IE575 Final Project: k-means, scatterplot3d, plot3d visualization
Project: Clustering web users
- hierarchical clustering for categorical variables
- "gower" dissimilarity distance for categorical variables
- MCA to visualize categorical variables (as PCA does for numeric variables)
Principal Component Analysis (PCA)
An unsupervised learning method
Random Forests for tree
- an extension to bagging for classification and regression trees.
Boosting for tree
an ensemble method
LDA using caret package
LDA for classification
Bagging for tree
Bagging is bootstrapping with aggregation. Method: regression tree; package: caret; dataset: ozone (ElemStatLearn)
classification tree using caret package
Dataset: iris; package: caret; method: decision tree
Mahalanobis Distance : Full process
Using Mahalanobis Distance to Find Outliers
k-means clustering for Outlier detection
Clustering Based Outlier Detection Technique
LOF Outlier Detection Technique
LOF (Local Outlier Factor): Proximity (density) Based Outlier Detection Technique
Using Mahalanobis Distance to Find Outliers
R’s mahalanobis() function provides a simple means of detecting outliers in multidimensional data.
MLR - comparing stepwise regression models
forward selection vs. backward selection vs. both
Bootstrap method
- to estimate parameters and their variance for a statistical learning model
- to quantify the uncertainty (variability) associated with a given estimator or statistical learning method
K-fold Cross-validation
A resampling technique that estimates test error, used to assess how well the model fits the data or to select the best model for the data.
LOOCV method - Leave One Out Cross-Validation
A resampling technique that estimates test error, used to assess how well the model fits the data or to select the best model for the data.
Validation-set method
A resampling technique that estimates test error, used to assess how well the model fits the data or to select the best model for the data.
Naive Bayes using caret package
iris data set; spam email data set
K-means clustering_2
from IE575_HW-11
K-means clustering
An unsupervised machine learning technique
KNN using PCA
Using principal components from PCA as inputs to KNN; calculating training error and test error (Diabetes data)
ANN for linear regression
Artificial Neural Network, Boston data for predicting median home value
Comparison_LogisticRegression_LDA_QDA
Examples: Weekly and Auto datasets (ISLR, MASS packages)
LDA / QDA - supervised classification methods
Linear Discriminant Analysis & Quadratic Discriminant Analysis
Logistic_Regression
modeling, prediction, evaluation
Linear Regression_Basics
From the book An Introduction to Statistical Learning, Chapter 3
Multiple Linear Regression_comparing diff models
AIC-based selection, principal-components-based models (PCR, PLSR), penalty-based models (lasso, ridge, elastic net)
Probability
Intro to Data - NYCFlight data
Inference for numerical data
Inference for categorical data
Confidence Interval
Bayes Inference
Analysis of Storm data from NOAA for finding the most harmful events
Exploratory data analysis