Recently Published
Segmenting with Mixed Type Data - A Case Study Using K-Medoids on Subscription Data
In this post I retrace the steps I took for a take-home analysis I was given while looking for new employment opportunities, and revisit clustering, one of my favourite analytic methods.
Only this time the setup is a lot closer to a real-world situation, in that the data I had to analyse came with a mix of categorical and numerical features.
Simply put, this could not be tackled with a bog-standard K-means algorithm as it’s based on pairwise Euclidean distances and has no direct application to categorical data.
Segmenting with Mixed Type Data - Initial data inspection and manipulation
In this post I retrace the steps I took for a take-home analysis I was given while looking for new employment opportunities, and revisit clustering, one of my favourite analytic methods.
Only this time the setup is a lot closer to a real-world situation, in that the data I had to analyse came with a mix of categorical and numerical features. Simply put, this could not be tackled with a bog-standard K-means algorithm as it’s based on pairwise Euclidean distances and has no direct application to categorical data.
Using h2o and DALEX to Estimate Likelihood to Purchase a Financial Product - Abridged Version
In this project I'm analysing the result of a bank direct marketing campaign to sell term deposits in order to identify what type of customer is more likely to respond.
This is the abridged version, combining EDA and data formatting, model estimation, evaluation and selection, and final profit optimisation
Using h2o and DALEX to Estimate Likelihood to Purchase a Financial Product - Part 3 of 3: Optimise Profit With the Expected Value Framework
In this project I'm analysing the result of a bank direct marketing campaign to sell term deposits in order to identify what type of customer is more likely to respond. The marketing campaigns were based on phone calls and more than one contact to the same client was required at times.
In this third and final part, I take one final model that combines findings from the exploratory analysis with insights from model selection, and use it to run a basic profit optimisation.
Using h2o and DALEX to Estimate Likelihood to Purchase a Financial Product - Part 2 of 3: Estimate Several Models and Compare Their Performance Using a Model-agnostic Methodology
In this project I'm analysing the result of a bank direct marketing campaign to sell term deposits in order to identify what type of customer is more likely to respond. The marketing campaigns were based on phone calls and more than one contact to the same client was required at times.
In this second of three parts, I estimate a number of models and assess their performance and fit to the data using a model-agnostic methodology that makes it possible to compare traditional “glass-box” models with “black-box” models.
Using h2o and DALEX to Estimate Likelihood to Purchase a Financial Product - Part 1 of 3: Data Preparation and Exploratory Data Analysis
In this project I'm analysing the result of a bank direct marketing campaign to sell term deposits in order to identify what type of customer is more likely to respond. The marketing campaigns were based on phone calls and more than one contact to the same client was required at times.
In this first part, I am going to carry out an extensive data exploration and use the results and insights to prepare the data for analysis.
Time Series Machine Learning Analysis and Demand Forecasting with H2O & TSstudio
In this project I go through the various steps needed to build a time series machine learning pipeline and generate a weekly revenue forecast.
I carry out a more “traditional” exploratory time series analysis with TSstudio and create a number of predictors using the insight I gather. I then train and validate an array of machine learning models with the open source library H2O, and compare the models’ accuracy using performance metrics and actual vs predicted plots.
Build Your Own Website with Hugo and blogdown
How I used RStudio, GitHub and Netlify to create and deploy my own webpage
A Practical Approach to Profile your Customer Base Using a Feature-rich Dataset
Steps and considerations to run a successful statistical segmentation with K-means, Principal Components Analysis and Bootstrap Evaluation
Loading, Merging and Joining Datasets
This is the minimal coding necessary to assemble the various data feeds and sort out variable naming and new feature creation, plus some general housekeeping tasks
Diego Usai - Curriculum Vitae - R pagedown
Refreshed my CV using the R pagedown package
Modelling with Tidymodels and Parsnip - A Tidy Approach to a Classification Problem
I am using R tidymodels to create and execute a “tidy” modelling workflow to tackle a classification problem.
My aim is to show how easy it is to fit a simple logistic regression in R’s glm and quickly switch to a cross-validated random forest using the ranger engine by changing only a few lines of code.
K-Means Clustering for Customer Segmentation
I use the popular K-Means clustering algorithm to segment customers based on their response to a series of marketing campaigns.
Data includes sales promotion data for a fictional wine retailer with details of 32 promotions (including wine variety, minimum purchase quantity, percentage discount, and country of origin) and a list of 100 customers and the promotions they responded to
Market Basket Analysis - Part 3 of 3
Third and final part of a Market Basket Analysis project in which I apply an Improved Collaborative Filter implementation to power a Shiny App Product Recommender
Market Basket Analysis - Part 2 of 3
Second part of a Market Basket Analysis project in which I apply various machine learning algorithms for Product Recommendation and select the best performing model with the support of the recommenderlab package
Market Basket Analysis - Part 1 of 3
First part of a Market Basket Analysis project in which I source, explore and format a complex dataset suitable for modelling with recommendation algorithms.