Recently Published
Student Dropout Prediction - Econometric and Machine Learning Modeling
This analysis identifies the factors that influence whether a student drops out or graduates. The dataset comes from a university in Portugal and describes its students while preserving their privacy. One econometric logistic model and four linear and non-linear machine learning models are applied and evaluated using three metrics.
MasterThesis_DataImport
The 69 selected stocks are fetched from the Yahoo Finance website and will be used for modeling.
DATA SELECTION OF MA THESIS (ANALYZING STOCK MARKET BEHAVIOR AND PRICE PREDICTIONS ACROSS GLOBAL MARKETS)
This file documents the data selection and preparation phase of the Master's dissertation. It begins by scraping a comprehensive list of stocks from the StockAnalysis website. After removing problematic entries, the initial dataset included 3,225 stocks, which was narrowed down to approximately 3,185. Applying a start date filter from the year 2000 further reduced the list to 967 stocks. From this refined pool, a final random selection of 69 companies was made for detailed analysis. Additionally, this file includes the Exploratory Data Analysis (EDA). The modeling and subsequent analysis are conducted separately in a Python script.
MA_Tesis_Data_Selection
This file documents the data selection and preparation phase of the Master's dissertation. It begins by scraping a comprehensive list of stocks from the StockAnalysis website. After removing problematic entries, the initial dataset included 3,225 stocks, which was narrowed down to approximately 3,185. Applying a start date filter from the year 2000 further reduced the list to 967 stocks. From this refined pool, a final random selection of 69 companies was made for detailed analysis. Additionally, this file includes the Exploratory Data Analysis (EDA). The modeling and subsequent analysis are conducted separately in a Python script.
Concrete Jungle: Unraveling the Complexities of Apartment Pricing in Baku’s Urban Landscape
Through this analysis, we seek to uncover the roles of location, amenities, economic conditions, and other relevant variables that contribute to the fluctuation of property values in Baku. By leveraging data sourced from Kaggle and scraping information from the local real estate platform Bina.az, the project will employ advanced statistical methods to generate actionable insights.
The outcome of this research will not only enhance our understanding of the Baku housing market but also provide critical policy recommendations for more informed urban planning and development strategies. Ultimately, this project aims to offer a comprehensive model that can guide future real estate investments and improve living conditions for residents in the city.
APPLIED FINANCE TASK - EUROPEAN OPTION PUT UP-and-OUT
The task is from the Applied Finance class at the University of Warsaw.
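As an illustration of what pricing a European up-and-out put can involve, here is a minimal Monte Carlo sketch. It is not the method used in the task itself: it assumes geometric Brownian motion under the risk-neutral measure and a barrier monitored only at discrete time steps, and the function name and all parameter values are hypothetical.

```python
import math
import random


def up_and_out_put_mc(s0, strike, barrier, rate, sigma, maturity,
                      n_steps=252, n_paths=20000, seed=42):
    """Monte Carlo price of a European up-and-out put (illustrative sketch).

    Assumptions: GBM dynamics, constant rate and volatility, and a barrier
    that is checked only at the simulated time steps.
    """
    rng = random.Random(seed)
    dt = maturity / n_steps
    drift = (rate - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)

    payoff_sum = 0.0
    for _ in range(n_paths):
        price = s0
        knocked_out = price >= barrier  # already at or above the barrier
        for _ in range(n_steps):
            if knocked_out:
                break
            price *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
            if price >= barrier:
                knocked_out = True
        if not knocked_out:
            # The put pays off only on paths that never touched the barrier.
            payoff_sum += max(strike - price, 0.0)

    # Discount the average payoff back to today.
    return math.exp(-rate * maturity) * payoff_sum / n_paths


# Hypothetical inputs, chosen only to show the call signature.
estimate = up_and_out_put_mc(s0=100.0, strike=100.0, barrier=120.0,
                             rate=0.05, sigma=0.2, maturity=1.0,
                             n_steps=50, n_paths=2000)
print(f"Estimated up-and-out put price: {estimate:.4f}")
```

With a very high barrier the knock-out never triggers, so the estimate should approach the plain European put price, which makes for a convenient sanity check.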
Which products are frequently purchased together in a given week?
This study uses data mining and analytics tools to examine the dynamics of product transactions over a 52-week period. The main goal is to find significant relationships between products, giving a detailed picture of the goods commonly co-purchased in a particular week. Using PAM and hierarchical clustering for pattern recognition, and the Apriori algorithm for association rule mining, the analysis offers a thorough understanding of customer behavior and product interactions.
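The idea behind the association rule step can be sketched in a few lines. The example below is a simplified stand-in for Apriori, limited to item pairs: it keeps only items that meet the support threshold, counts pairs among those, and computes the confidence of a rule. The baskets, function names, and threshold are hypothetical and are not taken from the study's data.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return {pair: support} for item pairs meeting min_support.

    Follows the Apriori pruning idea: a pair can only be frequent
    if both of its items are frequent on their own.
    """
    n = len(transactions)
    baskets = [set(t) for t in transactions]

    item_counts = Counter(item for basket in baskets for item in basket)
    frequent_items = {i for i, c in item_counts.items() if c / n >= min_support}

    pair_counts = Counter()
    for basket in baskets:
        kept = sorted(basket & frequent_items)
        pair_counts.update(combinations(kept, 2))

    return {pair: c / n for pair, c in pair_counts.items() if c / n >= min_support}


def rule_confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    with_antecedent = [set(t) for t in transactions if antecedent in t]
    if not with_antecedent:
        return 0.0
    both = sum(1 for basket in with_antecedent if consequent in basket)
    return both / len(with_antecedent)


# Hypothetical baskets for a single week.
week_baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
    ["bread", "milk"],
]

pairs = frequent_pairs(week_baskets, min_support=0.5)
print(pairs)  # {('bread', 'milk'): 0.6}
print(rule_confidence(week_baskets, "bread", "milk"))  # 0.75
```

Running the same functions separately on each week's baskets would give a per-week view of co-purchased products; a full Apriori implementation would also extend the search to itemsets larger than pairs.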
How can the telecommunications customer dataset’s fundamental structures and trends be identified using dimension reduction techniques like t-SNE and PCA?
This study explores customer churn in telecommunications, a major concern in a highly competitive market. It includes a thorough examination of the dataset and identifies trends, preferences, and behavior patterns of the target audience. The study develops research questions with a strategic focus on overcoming the difficulties caused by customer attrition. The dataset is carefully cleaned and standardized during the data preparation stage to provide a sound foundation for subsequent analysis. Statistical techniques are used to identify trends, associations, and critical metrics affecting churn. The dataset's interpretability is improved by the use of t-distributed Stochastic Neighbor Embedding (t-SNE) and Principal Component Analysis (PCA) for dimension reduction. A comprehensive summary connects the analytical results to their implications for the telecom sector.
Manhattan Neighborhood Segmentation: Unveiling Property Sales Dynamics in 2008 for Class 1-, 2-, and 3-Family Homes
This study analyzes Manhattan real estate sales in 2008 using an extensive dataset from the Department of Finance (DOF). Applying k-means clustering, the study concentrates on Class 1-, 2-, and 3-Family residences and offers new insights into the geographic distribution of home sales. The dataset includes all sales with a sale price of at least $150,000 that occurred between January 1st and December 31st, 2008.
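For readers unfamiliar with k-means, the following is a minimal sketch of the algorithm in plain Python. It is not the study's code: the two features (stand-ins for values such as standardized sale price and size), the sample points, and the deterministic farthest-point initialization are all assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(points, k):
    """Farthest-point initialization: a simple deterministic choice."""
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(farthest)
    return centroids


def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = init_centroids(points, k)
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids


# Hypothetical, already-scaled feature pairs for six sales.
sales = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
         (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centroids = kmeans(sales, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Because k-means relies on Euclidean distance, features on very different scales (for example dollars versus square feet) are normally standardized before clustering.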