Coffee Data Science

Colombia Copa De Oro Competition Breakdown 2023

A look at cupping data

Robert McKeon Aloe


Christopher Feran was a judge at the Colombia Copa de Oro Competition, and afterwards he was able to get the scores from all the cuppers for all of the coffees. I took a look at this data to see how closely the cuppers scored, a topic Christopher also wrote about.

I looked at the data in a few ways to understand how closely different cuppers scored relative to each other. I also wanted to know how much the scores for coffees and cuppers changed from the two preliminary rounds to the final round.

Raw data

For these experiments, I dropped cupper 7 because they didn’t score a few coffees. I also dropped coffees that didn’t make it to the final round. The final round was determined by the average of the first two rounds.
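As a rough sketch of this filtering step, the snippet below keeps only the coffees that have a final-round score and then drops any cupper with a missing score among them. The data layout, the cupper and coffee names, and every number are made up for illustration. The order of the two steps is also an assumption, not necessarily how the original analysis was done.

```python
from statistics import mean

# Made-up scores: scores[cupper][coffee] = [round 1, round 2, final round].
# None marks a score that was not recorded.
scores = {
    "cupper_1": {"A": [86.0, 87.0, 88.0], "B": [84.0, 85.0, 85.5], "C": [82.0, 81.0, None]},
    "cupper_2": {"A": [85.5, 86.5, 87.0], "B": [83.0, 84.0, 86.0], "C": [81.5, 82.5, None]},
    "cupper_7": {"A": [87.0, None, 88.0], "B": [None, 84.5, 85.0], "C": [80.0, 81.0, None]},
}

def finalists(scores):
    """Coffees that received at least one final-round score."""
    return {
        coffee
        for by_coffee in scores.values()
        for coffee, (r1, r2, final) in by_coffee.items()
        if final is not None
    }

def clean(scores):
    """Keep finalist coffees, then drop cuppers with any missing score left."""
    keep = finalists(scores)
    cleaned = {}
    for cupper, by_coffee in scores.items():
        rows = {c: s for c, s in by_coffee.items() if c in keep}
        if all(v is not None for s in rows.values() for v in s):
            cleaned[cupper] = rows
    return cleaned

def preliminary(rounds):
    """Preliminary score: the average of rounds 1 and 2."""
    r1, r2, _final = rounds
    return mean([r1, r2])

cleaned = clean(scores)
print(sorted(cleaned))  # cuppers that remain after filtering
```

With complete data per cupper, every pair of cuppers has a score for the same coffees, which is what a paired comparison needs.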

Firstly, we have paired data for all of these cuppers across all of the rounds, which means we can apply a two-tailed paired t-test to determine if two cuppers scored coffees differently in a statistically meaningful way. I ran this test for every pair of cuppers. Green marks the pairs whose scores are statistically significantly different, meaning the p-value is less than 0.05.
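For readers who want to reproduce this kind of comparison, here is a minimal sketch of a two-tailed paired t-test run over every pair of cuppers, using only the Python standard library. The p-value comes from numerically integrating the Student's t density with Simpson's rule, which is a choice made for this sketch. The cupper names and scores are made up, and the original analysis may have used a different tool.

```python
import math
from itertools import combinations
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def paired_t_test(a, b, steps=2000):
    """Two-tailed paired t-test. Returns (mean difference, t, p-value)."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    d_bar = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t = d_bar / (sd / math.sqrt(n))
    df = n - 1
    # Simpson's rule for the integral of the density from 0 to |t|
    # (steps must be even).
    h = abs(t) / steps
    total = t_pdf(0.0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3
    p = max(0.0, 1 - 2 * central)  # probability mass in both tails
    return d_bar, t, p

def all_pairs(scores_by_cupper, alpha=0.05):
    """Run the test on every pair of cuppers; flag pairs with p < alpha."""
    results = {}
    for c1, c2 in combinations(sorted(scores_by_cupper), 2):
        d_bar, _t, p = paired_t_test(scores_by_cupper[c1], scores_by_cupper[c2])
        results[(c1, c2)] = (d_bar, p, p < alpha)
    return results

# Made-up scores; each list covers the same coffees in the same order.
example = {
    "a": [86.0, 84.5, 85.0, 87.0, 83.5],
    "b": [86.5, 84.0, 85.5, 86.5, 84.0],
    "c": [85.0, 83.3, 84.2, 85.9, 82.6],
}
for pair, (d_bar, p, significant) in all_pairs(example).items():
    print(pair, round(d_bar, 2), round(p, 4), significant)
```

The mean difference returned alongside the p-value is the signed average gap between the two cuppers, which helps separate a statistically significant difference from a large one.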

I then took the average differences in scores, and I colored them green for pairs of cuppers that are statistically different in how they score. As you can see, the actual differences are small in most cases, aside from cupper 18, who was a point off from most.

I then re-sorted the columns and rows to show a group of cuppers whose scores are not statistically different from one another. I will come back to this group later.

Final Round vs Preliminary Rounds

We can plot all the scores from all the rounds, but this is difficult to read. I plot it here to show you, but I have another way to view the data.

Example Graph

To talk about preliminary (average of rounds 1 & 2) vs final round, I want to use what is called a Waydogram. This shows a baseline in gray, a decrease in red, and an increase in green. So just gray means the final round score was the same as the preliminary score. Red means the final round score decreased by the size of the red. Green means there was an increase from the baseline of the gray.
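The description above can be turned into numbers with a small sketch like the one below. It splits each bar into gray, red, and green lengths. The function name and the scores are hypothetical, and the stacking rule (gray runs up to the lower of the two scores) is one reading of the description rather than a confirmed plotting recipe.

```python
def waydogram_segments(preliminary, final):
    """Split one bar into (gray, red, green) lengths.

    gray  : the baseline shared by both scores
    red   : how much the final score dropped below the preliminary score
    green : how much the final score rose above the preliminary score
    """
    gray = min(preliminary, final)
    red = max(preliminary - final, 0.0)
    green = max(final - preliminary, 0.0)
    return gray, red, green

# Made-up scores: (preliminary average, final round).
for prelim, final in [(85.0, 86.5), (86.0, 85.0), (84.0, 84.0)]:
    print(prelim, final, waydogram_segments(prelim, final))
```

At most one of red or green is nonzero for any bar, so each bar reads as either unchanged, improved, or decreased.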

First, we can look at all the coffees; most of them improved their score.

If we average just the cuppers who are statistically the same, we get something similar.

Here is the same view using only the cuppers that were statistically different from the rest.

The order of the coffees doesn’t change that much. Most changes were simple flips in position between two coffees. I collected the flips in a table.
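One way to find such flips is to rank the coffees by preliminary and by final score, then look for neighboring coffees that traded places. This is a minimal sketch with made-up coffee names and scores. The definition of a "simple flip" used here (two adjacent coffees swapping positions) is an assumption.

```python
def ranking(avg_scores):
    """Coffees ordered from highest to lowest average score."""
    return sorted(avg_scores, key=avg_scores.get, reverse=True)

def adjacent_flips(before, after):
    """Pairs of neighboring coffees in `before` that swapped places in `after`."""
    flips = []
    for i in range(len(before) - 1):
        if after[i] == before[i + 1] and after[i + 1] == before[i]:
            flips.append((before[i], before[i + 1]))
    return flips

# Made-up average scores per coffee.
prelim = {"A": 87.0, "B": 86.5, "C": 85.0, "D": 84.8}
final = {"A": 87.5, "B": 88.0, "C": 85.5, "D": 85.2}

print(adjacent_flips(ranking(prelim), ranking(final)))
```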

Per Coffee

Let’s look at all the cuppers per coffee, starting with the overall view and then the individual coffees in descending order.

I didn’t see anything too crazy; many coffees improved in how they tasted. I’m not sure if that was an artifact from some other variable.

There were some variations.

The last few coffees decreased in score more than others.

Each round was graded statistically significantly higher than the previous one.

Per Cuppers

I also thought each cupper might have some interesting differences that speak to their changes in preferences.

Nothing stood out, but maybe I didn’t look at the data long enough.

I thought maybe someone would tank a coffee’s total score or raise one up, but I didn’t see much evidence of anything suspicious.

Overall, this data was excellent for understanding how cuppers grade in competition and how well calibrated they are to one another. I hope to see more data of this sort.

Note: I originally described this data as being from Cup of Excellence, but it was from Copa de Oro.



