SeongJin

SeongJin Kim

Recently Published

Create stable R working environment
(Box-Cox Transformation) Transformation of Y_ Constructing Linear Model
Contains: 1) setting the CRAN mirror before installing R packages 2) code to separate categorical and numeric variables 3) the Box-Cox transformation that gives the best log-likelihood 4) fitting a normal distribution to the response variable Y
Survival Analysis _HW1
1. Discussion of the 5 functions for describing a survival model 2. IFR, IFRA
Weather Data Analysis _ EDA 1
Objective -> quick overview of the data 1. NA check 2. Correlation with response variable -> only numerical variables 3. Correlation within control variables -> check for multicollinearity
ESC Final Project_ EDA on imdb Data
1. Setup A. How to Replace NA -> EDA; you don't have to B. Duplicated Rows -> Eliminate 2. Y variables A. Significant correlations? -> Not really B. Multicollinearity -> Eliminate cast_total_facebook_likes, num_users_review 3. X variables -> Partition into Categories A. Old Movies B. Independent Films -> Low budget C. Facebook Likes -> if you insist on using them, Director_facebook_likes is significant to a slight degree; the rest are not
ESC Assignment_Ensemble Method_my own method
1. Bagging 2. Random Forest
ESC Assignment_Ensemble Method_Lab
Use packages "randomForest", "tree" 1. Bagging - randomForest( ..., mtry=number of independent variables, importance=TRUE) 2. Random Forest - mtry, ntree 3. Boosting - distribution="gaussian", n.trees(B), interaction.depth, shrinkage(lambda)
ESC Assignment_Decision Tree_Lab
Decision Tree A. Classification Tree B. Regression Tree
ESC Assignment_Regularization with Boston Data
Conducting regularization (shrinkage methods) with Boston Data -> How do ridge and lasso improve a prediction model? => Smaller test MSE A. Compare Best Subset Selection with shrinkage methods 1) test MSE 2) variance B. Compare ridge regression with lasso 1) coefficients 2) variance -> which method was better?
ESC Assignment_Regularization_Lab
1. Ridge Regression a) Construct a ridge regression model - difference from the least squares method b) Selecting the best lambda (CV) 2. The Lasso a) Construct a lasso model - difference from the least squares method / ridge regression b) Selecting the best lambda (CV)
EDA Assignment 4_ Probability Plot
EDA Chapter 6_ Probability Plot: examine whether the sample comes from a given theoretical distribution --> compare sample quantiles with theoretical quantiles
Statistical Computing_Chapter 6 Assignment_ Generating Dependent Random variables
Generating dependent random variables with the copula method 1) Generate standard normal random variables as intermediary random variables 2) Transform the standard normal random variables using a Gaussian copula 3) Apply the resulting values to the inverse normal CDF 4) Apply the resulting values to the inverse exponential CDF
ESC week 4 Assignment _ Logistic Regression analysis with 'binary' data
1. Construct Logistic Regression Model 2. Calculate odds and logits for each X 3. k-Fold Cross Validation
Statistical Computing_Chapter 4,5 Assignment
Generating Continuous Random variables
EDA_week3_Re-expression
1. Symmetrizing Re-expression 2. Variance Stabilizing Re-expression
ESC_week3_Subset Selection and Cross Validation with Boston Data
1. Subset Selection A. Best Subset Selection B. Forward Selection and Backward Selection * Results from forward selection and backward selection are very similar, whereas best subset selection yields quite different results. 2. Cross Validation A. LOOCV B. k-Fold Cross Validation
ESC_week3_Cross Validation Techniques _ Following Lab script
[Cross Validation Techniques] 1. Leave-One-Out Cross Validation * use cv.glm() * Cross validation with squared values of the predictor 2. k-Fold Cross Validation [Subset Selection Models] # use regsubsets() from package 'leaps' 1. Best Subset Selection 2. Forward and Backward Stepwise Selection * use the argument 'method="forward"' or 'method="backward"' in function 'regsubsets()'
ESC week4_Assignment. Logistic Regression Analysis
Conducting Logistic Regression Analysis with Smarket data A. Construct Logistic Regression model with function glm, argument (family=binomial) B. Function 'predict()'(argument 'type="response"') C. Separating Test data
ESC week2_My Own Regression Analysis with Boston Data
[Table of Contents] A. Setup B. Construct Regression Model 1) Eliminate Insignificant Predictors 2) Interaction Terms 3) Non-linear Terms C. Diagnosis 1) VIF 2) Plotting - Residual, Leverage
Week 1_Regression Analysis with Boston Data_ESC
Regression Analysis with Boston Data provided by package MASS 1) Constructing Regression Model a) Add Interaction Terms b) Add non-linear Terms 2) Plotting Regression Model a) Residual - Leverage Plot