SciEncephalon

Recently Published

Identifying Risky Loans using Ensemble Trees
Ensemble methods combine several decision trees to produce better predictive performance than a single decision tree. The main principle behind an ensemble model is that a group of weak learners comes together to form a strong learner. Bagging (Bootstrap Aggregation) is used when the goal is to reduce the variance of a decision tree: several subsets of the training data are drawn at random with replacement, each subset is used to train its own decision tree, and the result is an ensemble of different models. Averaging the predictions from the different trees gives a result more robust than any single tree. Random Forest is an extension of bagging. Boosting is another ensemble technique for building a collection of predictors. Here, learners are trained sequentially, with early learners fitting simple models to the data and later learners focusing on the errors. In other words, we fit consecutive trees (each on a random sample), and at every step the goal is to correct the net error of the prior tree. When a hypothesis misclassifies an input, that input's weight is increased so that the next hypothesis is more likely to classify it correctly. Combining the whole set at the end turns the weak learners into a better-performing model. Gradient Boosting is an extension of this boosting method.
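
A minimal sketch, using scikit-learn, of how the three ensemble techniques described above might be compared; the "risky loan" data here is a synthetic stand-in generated with make_classification, not a real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    BaggingClassifier,
    RandomForestClassifier,
    GradientBoostingClassifier,
)
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for loan records: label 1 = risky, 0 = safe.
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)

models = {
    # Bagging: many trees (the default base learner) trained on bootstrap
    # resamples of the training data; predictions are aggregated.
    "bagging": BaggingClassifier(n_estimators=100, random_state=42),
    # Random forest: bagging plus a random subset of features at each split.
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
    # Gradient boosting: trees fit sequentially, each one correcting the
    # residual errors left by the trees before it.
    "gradient_boosting": GradientBoostingClassifier(n_estimators=100, random_state=42),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```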
Identifying Risky Loans using Decision Tree
Decision tree learning is a supervised machine learning technique for inducing a decision tree from training data. A decision tree (also referred to as a classification tree or a regression tree) is a predictive model that maps observations about an item to conclusions about its target value. In the tree structure, leaves represent classifications, non-leaf nodes represent features, and branches represent conjunctions of features that lead to a classification.
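
As a concrete sketch, a decision tree can be induced from a small, hypothetical loan dataset with scikit-learn; the feature names and values below are invented purely for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical training data: [loan_amount, applicant_income, credit_history]
X = np.array([
    [5000, 30000, 1],
    [20000, 25000, 0],
    [8000, 60000, 1],
    [15000, 20000, 0],
])
y = np.array([0, 1, 0, 1])  # 0 = repaid, 1 = defaulted

tree = DecisionTreeClassifier(max_depth=3).fit(X, y)

# Internal nodes test features; leaves hold the classifications.
print(export_text(tree, feature_names=["loan_amount", "income", "credit_history"]))
```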
Assumptions of Linear Regression
We make a few assumptions when we use linear regression to model the relationship between a response and a predictor. These assumptions are conditions that should hold before we draw inferences about the model estimates or use the model to make predictions.
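
A minimal sketch of checking two such conditions, zero-mean residuals and roughly constant residual variance, on synthetic data; the split of the predictor range at 5 is an arbitrary illustrative choice, not a standard diagnostic.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 2.0 * X.ravel() + rng.normal(0, 1, size=200)  # linear signal + Gaussian noise

model = LinearRegression().fit(X, y)
residuals = y - model.predict(X)

# Linearity / zero-mean errors: residuals should average near zero.
print("mean residual:", residuals.mean())

# Homoscedasticity: residual spread should not grow with the predictor;
# comparing variance across halves of the predictor range is a crude check.
lo, hi = X.ravel() < 5, X.ravel() >= 5
print("variance (low X):", residuals[lo].var())
print("variance (high X):", residuals[hi].var())
```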