Coffee Data Science

Espresso Coffee Distributors: More Data

Another angle across a few techniques

Robert McKeon Aloe
7 min read · Feb 6, 2024

Another dataset has come out on distribution tools for espresso. Lance Hedrick collected it to examine how several popular tools affect extraction yield. The data showed that one of the simplest techniques, the blind shaker, works the best. This result runs contrary to common community belief, but it is another example of how data can be useful for coffee.

The first dataset (8 shots across 7 distribution techniques) was criticized by some (not me, FYI, I thought it was fine) for being too small and for using a high-end grinder (the EG1). So he collected a second dataset (15 shots across 5 distribution techniques).

There is a small wrinkle in the dataset that isn’t quite explained by the data collected, but at a bare minimum, a blind shaker performs as well as, if not better than, the rest.

I pulled his data, and I took a look because I love data. I did direct paired comparisons of sorted data. I’ll look at both of his datasets, starting with the first.

Definitions

Total Dissolved Solids (TDS) is measured using a refractometer, and this number combined with the output weight of the shot and the input weight of the coffee is used to determine the percentage of coffee extracted into the cup, called Extraction Yield (EY). Typically, one aims for 18% to 22% extraction or sometimes higher, but it is difficult to get more than 30% EY.
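As a quick illustration of the definition above, here is a minimal sketch of the usual EY calculation (TDS times output weight divided by input weight). The function name and the example shot are my own placeholders, not values from Lance's data.

```python
def extraction_yield(tds_percent, output_g, input_g):
    """Return extraction yield (%) from TDS (%), shot weight (g), and dose (g)."""
    if input_g <= 0:
        raise ValueError("input_g must be positive")
    return tds_percent * output_g / input_g

# Hypothetical shot: 18 g in, 45 g out (a 2.5:1 ratio), 8.4% TDS
ey = extraction_yield(8.4, 45.0, 18.0)
print(f"EY = {ey:.1f}%")
```

At a fixed ratio, EY moves in step with TDS, which is why a refractometer reading plus two weights is enough to compare techniques.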

First Data Set

He collected 8 shots for each of seven distribution methods:

  1. Horizontal Tapping
  2. WW Blind Shaker
  3. Manual WDT
  4. BH Autocomb
  5. WW Moonraker
  6. 3D Printer Spirograph (similar to the WW Moonraker)
  7. NCD v3 (an OCD variant)

[Images: WW Blind Shaker, BH Autocomb, WW Moonraker, 3D Printer Spirograph, NCD v3]

First Dataset Analysis

I started by sorting all the shots and plotting the data. The WW Blind Shaker is sitting comfortably on top.

Then I compared all samples in a scatter plot to horizontal tapping, which is the baseline.
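For anyone who wants to reproduce this kind of scatter comparison, a minimal sketch of the pairing step is below: sort each technique's shots, then match lowest with lowest, and so on. The EY values are placeholders I made up, and the function name is hypothetical.

```python
def rank_matched_pairs(baseline, other):
    """Pair the i-th lowest baseline shot with the i-th lowest shot of the other technique."""
    if len(baseline) != len(other):
        raise ValueError("both techniques need the same number of shots")
    return list(zip(sorted(baseline), sorted(other)))

# Placeholder EY values (%), not Lance's measurements
tapping = [20.9, 21.2, 20.5, 21.0]
shaker = [22.1, 21.8, 22.5, 21.9]

pairs = rank_matched_pairs(tapping, shaker)
above = sum(1 for x, y in pairs if y > x)
print(pairs)
print(f"{above} of {len(pairs)} points sit above the y = x line")
```

Each pair becomes one point on the scatter plot, with the baseline on the x-axis. Points above the y = x line mean the other technique extracted more than the baseline at that rank.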

Things got fun when I compared all samples to the Blind Shaker.

Then I looked at a few variants compared to tapping.

Next, we have three tools that perform WDT with more automation, and I compared them to manual WDT. The Moonraker seemed to have a slight edge.

I also compared the Spirograph and Autocomb directly to the Moonraker, and the Moonraker seems to do better.

First Dataset: Statistical Significance

Lance originally compared this data using a one-tailed t-test of unpaired values. Those results only showed statistical significance for the blind shaker vs everything else and WW Moonraker vs BH Autocomb.

For example, in his results, the WW Moonraker vs Horizontal Tapping had a p-value of 0.236 for a one-tailed t-test. For a two-tailed t-test, the p-value was 0.472, and for a paired two-tailed t-test, the value was 0.046. Even p-values and t-tests can be approached in several ways, and with small samples, choosing among them is a challenge.
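The 0.236 and 0.472 figures differ by exactly a factor of two, which is what you expect when the same t statistic is read as one-tailed or two-tailed. Here is a rough sketch of that relationship using only the standard library. The numerical integration is my own shortcut, not how Lance or a stats package computes it, and the function names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def tail_p_values(t_stat, df, steps=2000):
    """Return (one_tailed, two_tailed) p-values for a t statistic.

    Integrates the density from 0 to |t| with Simpson's rule (steps must be even).
    The one-tailed value assumes the effect is in the hypothesized direction.
    """
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3       # area between 0 and |t|
    one_tailed = 0.5 - central_half    # area beyond |t| on one side
    return one_tailed, 2 * one_tailed

# With 8 paired shots there are 7 degrees of freedom; 2.365 is the textbook
# two-tailed 5% critical value for df = 7.
one, two = tail_p_values(2.365, 7)
print(f"one-tailed p = {one:.3f}, two-tailed p = {two:.3f}")
```

Switching from one-tailed to two-tailed only doubles the p-value. The much larger drop to 0.046 comes from pairing, which changes the t statistic itself.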

Since I paired all the sorted samples, I ran a two-tailed paired t-test. I like using those because the sample order shouldn’t matter, and then you are comparing the best to the best and the worst to the worst.
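To make the difference between the paired and unpaired approaches concrete, here is a small sketch that computes both t statistics on the same numbers. The EY values are invented placeholders, the function names are hypothetical, and the unpaired version uses Welch's formula as an assumption (with equal group sizes it gives the same statistic as the pooled-variance version).

```python
import math
import statistics

def sorted_paired_t(a, b):
    """Paired t statistic after matching shots by rank (lowest with lowest, and so on)."""
    if len(a) != len(b):
        raise ValueError("paired test needs equal sample sizes")
    diffs = [x - y for x, y in zip(sorted(a), sorted(b))]
    n = len(diffs)
    t_stat = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t_stat, n - 1  # t statistic, degrees of freedom

def welch_t(a, b):
    """Unpaired (Welch) t statistic, for contrast."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se

# Placeholder EY values (%), 8 shots per technique; not Lance's data
moonraker = [21.4, 20.6, 21.9, 20.9, 21.1, 22.3, 20.3, 21.6]
tapping = [21.2, 20.5, 21.6, 20.7, 21.0, 22.0, 20.2, 21.3]

T_CRIT_DF7 = 2.365  # two-tailed 5% critical value for 7 degrees of freedom

t_paired, df = sorted_paired_t(moonraker, tapping)
t_unpaired = welch_t(moonraker, tapping)
print(f"paired t = {t_paired:.2f} (df = {df}), significant: {abs(t_paired) > T_CRIT_DF7}")
print(f"unpaired t = {t_unpaired:.2f}")
```

Sorting both groups before pairing makes the differences very consistent from rank to rank, so the paired statistic ends up much larger than the unpaired one for the same average difference. That is worth keeping in mind when reading the significance tables.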

Most of the differences were statistically significant. Green means the result is statistically significant, and red means it is not. Remember that N=8, so a larger dataset may tell a different story. I would prefer a larger dataset.

This is the average difference with red boxes for data points that are statistically significant.

Second Data Set

This dataset used two grinders (EG1 and DF 64). I kept all data grouped with the same grinders, and the shots were taken across grinders over many hours of data collection. There were no EG1 samples for the BH Autocomb.

Starting with the raw data, I sorted the samples. It seemed Manual WDT was close to the Blind Shaker, but oddly enough, this changes for DF 64.

Looking at a scatter plot, the Blind Shaker still wins out.

The manual WDT samples were bothering me, so I wanted to look at the data in a few different ways. First, I looked at shot order:

Then I checked shot time. The shots were interleaved across the variables throughout the test, hopefully removing bias tied to machine temperature or changes in the beans. There is no pattern for the other techniques, and their range seemed smaller than the Blind Shaker’s.

There was the thought that the Blind Shaker had higher EY connected to the shot time. Looking at shot times, the Blind Shaker stands out, but there are no trends in the other techniques. The shot times were more similar for the EG1, and most of the lower shot times are from the DF 64.
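One simple way to check whether EY tracks shot time within a technique is a correlation coefficient. The sketch below computes Pearson's r by hand. The shot times and EY values are placeholders rather than the measured data, and I eyeballed the plots rather than running this exact check.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Placeholder shot times (s) and EY (%) for one technique
shot_time_s = [24, 26, 27, 29, 31]
ey_percent = [20.8, 21.1, 21.0, 21.6, 21.9]

r = pearson_r(shot_time_s, ey_percent)
print(f"r = {r:.2f}")
```

Running this per technique and per grinder would show whether the shot-time link is specific to the Blind Shaker, though with so few shots per group any r value should be read loosely.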

Second Round: Statistical Analysis

Let’s take a look at the overall differences. Almost all the differences are statistically significant in both a paired and unpaired t-test.

Average differences:

However, these results seem slightly skewed by the Blind Shaker doing so well against all the other techniques when using the DF 64. Let’s focus on just the EG1 and merge in the results from the first test set. I dropped the BH Autocomb because it didn’t have new data in the second data set.
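The merge itself is simple filtering and pooling. A minimal sketch is below, assuming each shot is stored as a record with a technique, a grinder, and an EY value. The record layout, the function name, and the numbers are all hypothetical.

```python
from collections import defaultdict

def merge_eg1_shots(*datasets, drop=("BH Autocomb",)):
    """Pool EY values per technique, keeping only EG1 shots and skipping dropped techniques."""
    pooled = defaultdict(list)
    for dataset in datasets:
        for shot in dataset:
            if shot["grinder"] == "EG1" and shot["technique"] not in drop:
                pooled[shot["technique"]].append(shot["ey"])
    return dict(pooled)

# Placeholder records; the EY numbers are invented for illustration
first_set = [
    {"technique": "WW Blind Shaker", "grinder": "EG1", "ey": 22.0},
    {"technique": "BH Autocomb", "grinder": "EG1", "ey": 21.0},
]
second_set = [
    {"technique": "WW Blind Shaker", "grinder": "EG1", "ey": 22.2},
    {"technique": "WW Blind Shaker", "grinder": "DF 64", "ey": 21.5},
]

pooled = merge_eg1_shots(first_set, second_set)
print(pooled)
```

The pooled lists per technique can then go through the same sort, pair, and test steps as before.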

These results seem closer, both in statistical significance and in the average differences.

This data flies in the face of actually using WDT in any shape, and the results question whether the cost of these tools is justified. Espresso does not have to be so expensive.

The data caveats are simple:

  1. Two coffee types
  2. Coffee dialed in (maybe these tools have more impact at non-ideal grind settings?)
  3. Coffee age is within the same day
  4. 2.5:1 shot ratio
  5. Two coffee machines (Unica, then Decent), two grinders, and one basket

It is quite possible any of these variables could alter the outcome, and until there is some data, it is hard to know where the benefits are for WDT-type tools. I suspect a broad range of tests on just WDT or some variant would help in understanding the space better.

I think there is still a hidden variable, quite possibly the temperature of the shaker and the filter basket. This is quite important in the second experiment where all the grounds were dosed directly into the filter basket except for the blind shaker.
