RPubs

by RStudio

kkrynska

Katarzyna Kryńska

Recently Published

Comparison of models for credit risk purposes - logistic regression vs random forest

In this research I compared the predictive power of Logistic Regression and Random Forest in the context of a Credit Scorecard. First I estimated a GLM in accordance with best practices, and then I trained a Random Forest, using cross-validation to optimise its hyperparameters. Random Forest seems to outperform Logistic Regression, although the difference is not large. For Credit Scorecard building, GLMs are still preferable because they show clearly how each trait of a client contributes to their Credit Score.

almost 5 years ago

Evolution of language in literature throughout ages

The goal of our project is to compare how the English language evolved over four consecutive centuries, from the 17th to the 20th. To this end, we collected books from each of these centuries from Project Gutenberg.

over 5 years ago

Advanced Visualisation Project

over 5 years ago

Using Association Rules for transactional data

Using Association Rules, I mine interesting relationships between items. I also create a simple recommender system in R Shiny.

about 6 years ago

Dimension reduction in Analysis of Human Interests

The main goal of this research was to examine whether human interests can be described by a smaller number of latent concepts obtained through dimension reduction. In the analysis I used Multidimensional Scaling with k-means and compared the results to Hierarchical Clustering.

about 6 years ago

Extreme Value Theory

A short introduction to Extreme Value Theory with a brief literature review.

over 6 years ago

Using K-means and PAM clustering for Customer Segmentation

In this article, I use data mining techniques such as K-means and PAM to divide customers into groups with different characteristics. The data come from a small online shop.

over 6 years ago
