xgiamakidis

Chris Giamakidis

Recently Published

Mall Customer Segmentation using Clustering Techniques
This report presents a detailed customer segmentation analysis using clustering techniques on the "Mall Customers Segmentation" dataset from Kaggle. The dataset contains demographic and behavioral information for 200 customers. The analysis begins with exploratory data analysis (EDA), including data validation, summary statistics, and visualizations to explore relationships between age, gender, annual income, and spending score. We then apply hierarchical clustering with Ward's method and K-means clustering, identify the optimal number of clusters, and interpret the resulting customer groups. The findings offer actionable insights into distinct customer profiles based on age, income, and spending behavior, which are valuable for targeted marketing and strategic decision-making in retail management. Overall, this project demonstrates how unsupervised learning can be used to segment retail customers and support data-driven business decisions.
Segmentation of Wholesale Clients with Hierarchical Clustering and K-means
This project performs an in-depth clustering analysis on the Wholesale Customers Dataset using R. The objective is to segment wholesale clients based on their annual spending on product categories such as fresh products, milk, grocery, frozen foods, detergents, and delicatessen. The analysis includes data preprocessing and normalization, exploratory data analysis with visual insights, hierarchical clustering to identify natural customer groupings, dimensionality reduction using Principal Component Analysis (PCA), and interpretation of the clusters from a business perspective. The findings reveal distinct client profiles, such as high-spending businesses and product-specific buyers, which can be used for targeted marketing, personalized service strategies, and optimized sales efforts.
Survival Prediction on the Titanic Dataset Using Logistic Regression and CART Decision Tree
This project explores the Titanic dataset using two popular classification techniques: Logistic Regression and Classification and Regression Trees (CART). The goal is to identify key factors that influenced passenger survival. The analysis includes data cleaning, variable transformation, model training, evaluation using confusion matrices and ROC curves, and a comparison of performance metrics such as accuracy and AUC. The report is designed to showcase practical machine learning skills in R, using real-world data.
Bank Marketing Logistic Regression Remodelled
In this project, we analyze the Bank Marketing dataset using logistic regression to predict whether a customer will subscribe to a term deposit. We cover data preprocessing, feature selection, and model evaluation using metrics such as accuracy, sensitivity, specificity, the ROC curve, and AUC. The model outperforms the baseline and reveals key factors influencing customer decisions, offering valuable insights for targeted marketing strategies.
Bank Marketing Logistic Regression
In this project, we analyze the Bank Marketing dataset using logistic regression to predict whether a customer will subscribe to a term deposit. We cover data preprocessing, feature selection, and model evaluation using metrics such as accuracy, sensitivity, specificity, the ROC curve, and AUC. The model outperforms the baseline and reveals key factors influencing customer decisions, offering valuable insights for targeted marketing strategies.
Real Estate Price Prediction and Linear Regression Analysis
This project explores a real estate dataset from Taiwan, applying exploratory data analysis and multiple linear regression models to understand the factors affecting housing prices per square meter. We visualize key relationships through scatterplots, histograms, and boxplots, and evaluate several regression models of increasing complexity. Model performance is assessed using R² and the sum of squared errors (SSE), and we also discuss important statistical assumptions such as normality and homoscedasticity. The analysis concludes with a comparison of the models and the selection of the best one for price prediction.
Online Retail Dataset Analysis
This R Markdown file analyzes an Online Retail Dataset (UK e-commerce, 2010-2011) with descriptive statistics (mean, variance), correlations (price vs. demand), visualizations (sales trends, top products), and key insights on pricing, seasonality, and markets. Goal: optimize pricing, inventory, and marketing strategies. (Author: Chris Giamakidis Kiosses | Date: 2025-04-02)