Blog Posts

Making beautiful bar charts with ggplot

Bar charts (or bar graphs) are commonly used, but they’re also a simple type of graph where the defaults in ggplot leave a lot to be desired. This is a step-by-step description of how I’d go about improving them, describing the thought processess along the way. Every plot is different and the decisions you make need to reflect the message you’re trying to convey, so don’t treat this post as a recipe, treat it as some points to consider—and hopefully, a few tips that will help you achieve the look you want in your own plots.

Things which are equivalent to t-tests

If you have a continuous measurement and two groups you’d like to compare based on that measurement, what’s the first statistical test that comes to mind? Chances are it’s the two-sample t-test, sometimes known as Student’s t-test. It’s typically the first statistical test taught in an introductory statistics course, it’s well known and understood, and it has good theoretical properties—so if a t-test answers your research question, you should probably use it.

Understanding levels of variation and mixed models

Data that has some kind of hierarchical structure to it is very common in many fields, but is rarely discussed in introductory statistics courses. Terms used to describe this kind of data include hierarchical data, multi-level data, longitudinal data, split-plot designs or repeated measures designs. Statistical models used for these types of data include mixed-effects models (often abbreviated to just mixed models), repeated measures ANOVA and generalised estimating equations (GEEs).

Plotting multiple variables at once using ggplot2 and tidyr

In exploratory data analysis, it's common to want to make similar plots of a number of variables at once. Here is a way to achieve this using R and ggplot2.