znaboulsi

Zain Naboulsi

Recently Published

Google Data Analytics Capstone Case Study: How Does a Bike-Share Navigate Speedy Success?
Cyclistic, a popular bike-share company based in Chicago, strives to maximize its annual memberships for sustainable growth. The company has an existing user base composed of both casual riders (who opt for single-ride or full-day passes) and annual members. The focus of this case study is to understand the utilization patterns of these two user groups, as the company believes that converting casual riders into annual members can significantly enhance its profitability. To devise effective marketing strategies for this conversion, the case study examines how these two user groups use Cyclistic bikes differently, why casual riders might opt for annual memberships, and how digital media can facilitate this conversion.
Johns Hopkins Data Science Specialization Capstone - TextNex Presentation
Welcome to TextNex! This application utilizes cutting-edge natural language processing technology to predict the next word based on your input text.
Johns Hopkins University Data Science Capstone - Milestone Report
Welcome to the Milestone Report for the Johns Hopkins Data Science Specialization. We appreciate you taking time to review our work and give feedback on our progress as well as thoughts on fitting a prediction model. Also, any insight you would like to share on the final plan for our Shiny application would be welcome.
Iris Predictor - A Simple and Interactive Flower Classifier
Unveil the captivating universe of iris blossoms with our presentation. This eye-catching slide show unravels the complexity of machine learning algorithms by demonstrating how our Shiny application streamlines iris species identification for users of all backgrounds. Whether you're an aspiring learner, botanical enthusiast, or teaching professional, this presentation provides a wealth of knowledge about the real-life implications of machine learning. Seize the opportunity – immerse yourself in the world of iris blooms and experience the magic of prediction algorithms at your command!
Air Quality - Temperature vs. Ozone Concentration
This is a web page created with R Markdown featuring a plot created with Plotly. We used the "airquality" dataset that comes with R to show ozone concentration in relation to temperature.
Earthquake Map with Clustering
This R Markdown document demonstrates how to create an interactive map with clustering using the leaflet package and the quakes dataset available in R. The quakes dataset contains information about 1,000 earthquakes around Fiji since 1964.
Predicting Exercise Quality with Sensor Data
Activity recognition research has primarily centered on identifying different activities, or predicting “which” action occurred at a specific moment. However, the execution quality, or the “how (well),” has been largely overlooked, even though it can offer valuable insights for numerous applications. In this work, we focus on detecting execution mistakes as a vital aspect of qualitative activity recognition. Thanks to devices like Jawbone Up, Nike FuelBand, and Fitbit, collecting extensive personal activity data has become affordable and accessible. These devices are part of the quantified self movement, where individuals regularly monitor various aspects of their lives to improve their well-being, discover behavioral patterns, or simply because they enjoy technology. While people often track the quantity of an activity, assessing its quality remains rare. In this project, we aim to evaluate the quality of barbell lifts using data from accelerometers placed on participants’ belts, forearms, arms, and dumbbells. The six participants performed barbell lifts correctly and incorrectly in five different manners. We demonstrate our approach to assessing and providing feedback on weightlifting exercises using a sensor-based technique for qualitative activity recognition in this study.
Statistical Inference - Basic Inferential Data Analysis
In this analysis, the ToothGrowth dataset, containing data on tooth growth in guinea pigs, was explored and summarized. The dataset has three variables: tooth length (len), supplement type (supp), with either vitamin C (VC) or orange juice (OJ) as supplements, and dose (dose), with three levels: 0.5, 1, and 2 mg/day. Initial exploration of the dataset showed an increase in tooth growth with increasing dose levels for both supplement types, but the differences between the supplement types were less clear. To investigate this further, three independent t-tests were performed to compare tooth growth between the supplement types at each dose level. The results of the t-tests indicated no significant difference in tooth growth between vitamin C and orange juice supplements at any of the three dose levels. To interpret these results, some assumptions were made, including the independence and random sampling of the data, normal distribution of tooth growth measurements within each group, and equal variances of tooth growth measurements between the groups.
Statistical Inference - Simulation Exercise
In this project, we examined the exponential distribution in R and compared it with the predictions of the Central Limit Theorem using 1,000 simulations, each containing a sample of 40 exponentials with a constant rate parameter (lambda) of 0.2. Our investigation focused on the distribution of the average of 40 exponentials, determining the mean and variance of the simulated averages and comparing them with their corresponding theoretical values. The simulation analysis revealed that the sample mean and variance closely matched the theoretical mean and variance, suggesting that the simulation effectively captured the characteristics of the exponential distribution. Additionally, the project established that the distribution of the average of 40 exponentials approximates a normal distribution, in line with the Central Limit Theorem's predictions. The histogram of the simulated sample means mirrored a normal density curve, underscoring the distinction between the distribution of numerous random exponentials and that of numerous averages of 40 exponentials. The project's outcomes lend credence to the Central Limit Theorem, which posits that the distribution of sample means converges to a normal distribution as the sample size grows, regardless of the form of the population distribution, provided that its variance is finite.
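The simulation described above can be sketched in a few lines. This is a minimal illustration in Python rather than the R used in the original project, and it is not the author's code: the function name `simulate_sample_means`, the seed, and the variable names are assumptions made for this example. It draws 1,000 samples of 40 exponentials with rate 0.2 and compares the mean and variance of the sample averages with the theoretical values, 1/lambda for the mean and (1/lambda)^2 / n for the variance of the average.

```python
import random
import statistics


def simulate_sample_means(n_sims=1000, sample_size=40, rate=0.2, seed=0):
    """Return n_sims averages, each over sample_size exponential draws.

    The name, defaults, and seed are illustrative assumptions, not taken
    from the original report.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        statistics.fmean(rng.expovariate(rate) for _ in range(sample_size))
        for _ in range(n_sims)
    ]


rate, sample_size = 0.2, 40
means = simulate_sample_means(rate=rate, sample_size=sample_size)

# Theoretical values for the average of `sample_size` exponentials.
theoretical_mean = 1 / rate                      # population mean, 1/lambda
theoretical_var = (1 / rate) ** 2 / sample_size  # sigma^2 / n

print("mean of sample means:", statistics.fmean(means), "theory:", theoretical_mean)
print("variance of sample means:", statistics.variance(means), "theory:", theoretical_var)
```

With these parameters the theoretical mean is 5 and the theoretical variance of the average is 0.625; the simulated values should land close to both, and a histogram of `means` should look approximately bell-shaped.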
Negative Impact of Weather Events on Population Health and Economy
Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern. The U.S. National Oceanic and Atmospheric Administration (NOAA) maintains a storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage. In this paper, we will first map the most harmful weather events with respect to population health across the United States. We will then map weather events to their economic consequences.