Coffee Data Science

Bad Coffee Roasting on Purpose

Tasting for defects

Mar 29, 2024

As I have been roasting, I have wanted a better intuition for roasting defects. The challenge for me is that I haven’t known exactly what counts as a baked roast or a roast with a particular problem. I have made roasts that tasted burned or grassy, but I didn’t have much supporting data to understand why.

So I made a bunch of bad roasting profiles to get a better idea.

I looked at five profiles:

Baseline
Baked
Spiked
HotColdHot
Slow Curve

The baked roast causes a drop in the Rate of Rise, making it go negative.

The spiked roast causes a fast temperature spike after first crack.

HotColdHot quickly raises the temperature, then drops it back down.

The Slow Curve aims for a slower rise to temperature.

We can first look at the roast data, then the roasted bean data, and then espresso shot data.

Roasting

I used Scott Rao’s roast profile that uses bean temperature as turning points for inlet temperature. This is my baseline profile.

I used a Colombia coffee and a Burundi coffee.

Roast Data

The bean temperature curves look as expected.

The Rate of Rise (RoR) shows the impact of these different profiles.
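For anyone who wants to compute RoR from their own roast logs, here is a minimal Python sketch. It assumes bean temperature is logged at known timestamps and takes RoR as the temperature change per minute between consecutive samples. The function names and sample numbers are made up for illustration and are not data from these roasts. Roasting software usually smooths RoR, which this sketch does not do.

```python
def rate_of_rise(times_s, temps_c):
    """Return RoR in degrees C per minute between consecutive samples.

    times_s: timestamps in seconds (must be strictly increasing)
    temps_c: bean temperatures in degrees C, same length as times_s
    """
    if len(times_s) != len(temps_c):
        raise ValueError("times_s and temps_c must have the same length")
    ror = []
    for i in range(1, len(temps_c)):
        dt_min = (times_s[i] - times_s[i - 1]) / 60.0
        if dt_min <= 0:
            raise ValueError("timestamps must be strictly increasing")
        ror.append((temps_c[i] - temps_c[i - 1]) / dt_min)
    return ror


def has_negative_ror(ror):
    """True if the bean temperature fell at any point, as in the Baked profile."""
    return any(r < 0 for r in ror)


# Hypothetical samples every 30 seconds (not measured data).
times_s = [0, 30, 60, 90, 120]
temps_c = [150.0, 160.0, 168.0, 167.0, 170.0]

ror = rate_of_rise(times_s, temps_c)
print(ror)
print("RoR went negative:", has_negative_ror(ror))
```

In this made-up series the temperature dips between the third and fourth samples, so the check flags a negative RoR.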

These profiles had a big impact on the number of detected cracks. If I had to guess, the Spiked profile should be more developed because of the higher end temperature and the number of cracks as well as the rate of cracks.

Roasted Coffee Metrics

I measured these metrics with a Syncfo 4in1 to see how each of these profiles impacted the bean.

As predicted, the Spiked profile had a higher weight loss for the Colombian, indicating it is more developed or darker.
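As a quick reference, weight loss is commonly expressed as the percentage of green coffee weight lost during the roast. The sketch below assumes that definition. The function name and the example weights are hypothetical and are not the numbers from these roasts.

```python
def weight_loss_percent(green_g, roasted_g):
    """Percent of the green coffee weight lost during roasting."""
    if green_g <= 0:
        raise ValueError("green_g must be positive")
    return 100.0 * (green_g - roasted_g) / green_g


# Hypothetical batch: 100 g of green coffee in, 86 g of roasted coffee out.
loss = weight_loss_percent(100.0, 86.0)
print(f"Weight loss: {loss:.1f}%")
```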

In terms of moisture, the results were all over the place. The Baked roast had more moisture as did the HotColdHot.

The coffee densities were consistent for Burundi but scattered for Colombia.

Roast color had some variations, particularly for the Spiked Colombian as predicted.

Tasting Equipment/Technique

Espresso Machine: Decent Espresso Machine, Thermal Pre-infusion

Coffee Grinder: Zerno

Coffee: Home Roasted Coffee, medium (First Crack + 1 Minute)

Pre-infusion: Long, ~25 seconds, 30 second ramp bloom, 0.5 ml/s flow during infusion

Filter Basket: 20 Wafo Spirit

Other Equipment: Acaia Pyxis Scale, DiFluid R2 TDS Meter

Metrics of Performance

I used two sets of metrics for evaluating the differences between techniques: Final Score and Coffee Extraction.

Final score is the average of a scorecard of 7 metrics (Sharp, Rich, Syrup, Sweet, Sour, Bitter, and Aftertaste). These scores were subjective, of course, but they were calibrated to my tastes and helped me improve my shots. There is some variation in the scores. My aim was to be consistent for each metric, but sometimes the granularity was difficult.
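To make the averaging concrete, here is a small Python sketch of the Final Score calculation. The metric names come from the scorecard above. The score values and the scale are made up for illustration, and the function name is hypothetical.

```python
from statistics import mean

# The seven scorecard metrics named in the text.
METRICS = ("Sharp", "Rich", "Syrup", "Sweet", "Sour", "Bitter", "Aftertaste")


def final_score(scorecard):
    """Average the seven scorecard metrics into one Final Score."""
    missing = [m for m in METRICS if m not in scorecard]
    if missing:
        raise ValueError(f"missing metrics: {missing}")
    return mean(scorecard[m] for m in METRICS)


# Hypothetical scores for one shot (not real tasting data).
example_shot = {
    "Sharp": 7, "Rich": 8, "Syrup": 7, "Sweet": 6,
    "Sour": 7, "Bitter": 8, "Aftertaste": 6,
}
score = final_score(example_shot)
print("Final score:", score)
```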

Total Dissolved Solids (TDS) is measured using a refractometer, and this number combined with the output weight of the shot and the input weight of the coffee is used to determine the percentage of coffee extracted into the cup, called Extraction Yield (EY).
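The relationship described above can be sketched in a few lines of Python. This uses the common simplification EY = TDS × output weight / input weight, with TDS and EY both in percent. The function name and the example shot numbers are assumptions for illustration, not measurements from these tests.

```python
def extraction_yield(tds_percent, output_g, input_g):
    """Extraction Yield (%) from TDS (%), shot output weight, and dose weight."""
    if input_g <= 0:
        raise ValueError("input_g must be positive")
    return tds_percent * output_g / input_g


# Hypothetical 1:1 shot: 20 g of coffee in, 20 g of espresso out, 21% TDS.
ey = extraction_yield(tds_percent=21.0, output_g=20.0, input_g=20.0)
print(f"EY: {ey:.1f}%")
```

For a 1:1 shot the output and input weights cancel, so EY equals the TDS reading.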

Shot Data

Each set of 5 shots was collected within a day of each other, and they are ordered in time post roast. These results are not conclusive, and I was hopeful that one of these defects would be so obvious that it could not be ignored even in 3 shots.

The Colombian variants did not have as large a taste difference as I would have thought. Spiked scored lower because it was more developed, so there was not as much sweetness. I was surprised by how HotColdHot performed.

For Burundi, the baseline was higher, but there were not great differences between the defects. The Slow Curve should be close to the baseline anyhow, similar to the Colombian coffees.

For Extraction Yield (EY), there were no great differences aside from one of the Spiked shots. That particular shot had bad flow, so I would discount it from the EY perspective. In terms of taste, it scored similarly to the other two Spiked shots, which says more about how I score with respect to roast development.

The Burundi EY numbers didn’t show major differences to explain taste differences aside from one outlier in the Baseline.

All of this tasting was for espresso because that is my focus. It is possible these defects come out differently in cupping or pourover, and I will leave that for someone else to discover. The same could be said for weaker espresso shots rather than a highly efficient 1:1 shot.

These results make me skeptical that roast defects are easy to detect in espresso, and I am really looking for ways to make a bad shot so I can understand why it is failing. Why something fails gives clues to how to avoid the failure or make it succeed.

In the meantime, I want to explore HotColdHot because it is such a strange one.
