RPubs

Segmenting with Mixed Type Data - A Case Study Using K-Medoids on Subscription Data

In this post I retrace the steps I took for a take-home analysis I was given while looking for new employment opportunities, and revisit clustering, one of my favourite analytic methods. Only this time the setup is a lot closer to a real-world situation, in that the data I had to analyse came with a mix of categorical and numerical features. Simply put, this could not be tackled with a bog-standard K-means algorithm, as it relies on Euclidean distances, which have no direct application to categorical data.

almost 5 years ago

Segmenting with Mixed Type Data - Initial data inspection and manipulation

In this post I retrace the steps I took for a take-home analysis I was given while looking for new employment opportunities, and revisit clustering, one of my favourite analytic methods. Only this time the setup is a lot closer to a real-world situation, in that the data I had to analyse came with a mix of categorical and numerical features. Simply put, this could not be tackled with a bog-standard K-means algorithm, as it relies on Euclidean distances, which have no direct application to categorical data.

almost 5 years ago

Using h2o and DALEX to Estimate Likelihood to Purchase a Financial Product - Abridged Version

In this project I'm analysing the result of a bank direct marketing campaign to sell term deposits in order to identify what type of customer is more likely to respond. This is the abridged version, combining EDA and data formatting, model estimation, evaluation and selection, and final profit optimisation.

almost 5 years ago

Using h2o and DALEX to Estimate Likelihood to Purchase a Financial Product - Part 3 of 3: Optimise Profit With the Expected Value Framework

In this project I'm analysing the result of a bank direct marketing campaign to sell term deposits in order to identify what type of customer is more likely to respond. The marketing campaigns were based on phone calls and more than one contact to the same client was required at times. In this third and final part, I take one final model that combines findings from the exploratory analysis and insight from model selection and use it to run a basic profit optimisation.

almost 5 years ago

Using h2o and DALEX to Estimate Likelihood to Purchase a Financial Product - Part 2 of 3: Estimate Several Models and Compare Their Performance Using a Model-agnostic Methodology

In this project I'm analysing the result of a bank direct marketing campaign to sell term deposits in order to identify what type of customer is more likely to respond. The marketing campaigns were based on phone calls and more than one contact to the same client was required at times. In this second of three parts, I estimate a number of models and assess their performance and fit to the data using a model-agnostic methodology that makes it possible to compare traditional “glass-box” models with “black-box” models.

almost 5 years ago

Using h2o and DALEX to Estimate Likelihood to Purchase a Financial Product - Part 1 of 3: Data Preparation and Exploratory Data Analysis

In this project I'm analysing the result of a bank direct marketing campaign to sell term deposits in order to identify what type of customer is more likely to respond. The marketing campaigns were based on phone calls and more than one contact to the same client was required at times. In this first part, I am going to carry out an extensive data exploration and use the results and insights to prepare the data for analysis.

almost 5 years ago

Time Series Machine Learning Analysis and Demand Forecasting with H2O & TSstudio

In this project I go through the various steps needed to build a time series machine learning pipeline and generate a weekly revenue forecast. I carry out a more “traditional” exploratory time series analysis with TSstudio and create a number of predictors using the insight I gather. I then train and validate an array of machine learning models with the open source library H2O, and compare the models’ accuracy using performance metrics and actual vs predicted plots.

about 5 years ago

Build Your Own Website with Hugo and blogdown

How I used RStudio, GitHub and Netlify to create and deploy my own webpage

over 5 years ago

A Practical Approach to Profile your Customer Base Using a Feature-rich Dataset

Steps and considerations to run a successful statistical segmentation with K-means, Principal Components Analysis and Bootstrap Evaluation

over 5 years ago

Loading_Merging_and_Joining_Datasets

This is the minimal code needed to assemble the various data feeds and take care of variable naming and new feature creation, plus some general housekeeping tasks

over 5 years ago

Diego Usai - Curriculum Vitae - R pagedown

Refreshed my CV using the R pagedown package

over 5 years ago

Modelling with Tidymodels and Parsnip - A Tidy Approach to a Classification Problem

I am using R tidymodels to create and execute a “tidy” modelling workflow to tackle a classification problem. My aim is to show how easy it is to fit a simple logistic regression in R’s glm and quickly switch to a cross-validated random forest using the ranger engine by changing only a few lines of code.

almost 6 years ago

K-Means Clustering for Customer Segmentation

I use the popular K-Means clustering algorithm to segment customers based on their response to a series of marketing campaigns. Data includes sales promotion data for a fictional wine retailer with details of 32 promotions (including wine variety, minimum purchase quantity, percentage discount, and country of origin) and a list of 100 customers and the promotions they responded to

almost 6 years ago

DiegoUsai

Diego Usai

Recently Published

Market Basket Analysis - Part 3 of 3

Market Basket Analysis - Part 2 of 3

Market Basket Analysis - Part 1 of 3
