Recently Published
Understanding linear regression with Perth house price data
Notes to help understand linear regression as taught in the Coursera Machine Learning specialisation.
England Recycling Rates - A cluster analysis
Can a cluster analysis of recycling rates by local authority in England provide insights to help improve overall recycling?
Farnham Food Data Profiling
Data profiling script for the Farnham Food Hygiene Ratings project
edX Statistics - Week 4 - Uncertainty in Data
Using samples to make predictions about a population brings uncertainty into our data. As the study of risk and uncertainty, probability is therefore key to understanding statistics. We introduce the ideas here for describing and quantifying uncertainty via probabilities.
edX Statistics - Week 3 - Collecting Data
We look at key methods of data collection, seeing how we generally use samples of a population to make predictions about the whole population. We learn about how to choose a representative sample, and how to set up a statistical experiment.
edX Statistics - Week 2 - Patterns in Data
We look further into the science of data analysis, focusing on finding and interpreting relationships between different data sets, and on using trends in data to make predictions.
Modeling With Data in the Tidyverse Chapter 4
In the previous chapters, you fit various models to explain or predict an outcome variable of interest. But how do you know which model to choose? Model assessment measures allow you to assess how well an explanatory model "fits" a set of data or how accurate a predictive model is. Based on these measures, you'll learn about criteria for determining which models are "best".
Modeling With Data in the Tidyverse Chapter 3
In the previous chapter, you learned about basic regression using either a single numerical or a single categorical predictor. But why limit yourself to only one variable to inform your explanations/predictions? You will now extend basic regression to multiple regression, which allows you to incorporate more than one explanatory/predictor variable in your models. You'll be modeling house prices using a dataset of houses in the Seattle, WA metropolitan area.
Modeling With Data in the Tidyverse Chapter 2
Equipped with your understanding of the general modeling framework, in this chapter you'll cover basic linear regression, keeping things simple by modeling the outcome variable y as a function of a single explanatory/predictor variable x. You'll use both numerical and categorical x variables. The outcome variable of interest in this chapter will be teaching evaluation scores of instructors at the University of Texas, Austin.
Modeling With Data in the Tidyverse Chapter 1
From DataCamp. This chapter will introduce you to some background theory and terminology for modeling: in particular, the general modeling framework, the difference between modeling for explanation and modeling for prediction, and the modeling problem. Furthermore, you’ll start performing your first exploratory data analysis, a crucial first step before any formal modeling.
Surrey/Hants Primary School Admissions Analysis
This document forms part of a project I am working on to analyse primary school admissions and performance data in Surrey and Hampshire and present it in a meaningful way to parents looking to choose a school for their child. The full project, including all source data and scripts, can be found on GitHub.
Moving Averages
What are moving averages and how can we calculate them in R?
Explore Weather Trends
Project submitted as part of the Data Analyst Nanodegree by Udacity
edX Statistics - Week 1 - Introducing Data
First in a series of notes adapted from the edX EdinburghX course on Statistics: Unlocking the World of Data.