There are a lot of different rating systems around the industry, and you may be wondering why we didn't just adopt one used elsewhere. Let's take a quick look at how we think about our ratings, as well as the benefits and drawbacks of popular systems.
Goals
Each rating system has different goals and meanings for both the reviewer and the reader. We need to be cognizant of these different aspects to make sure our ratings convey value to everyone.
Consistency
For reviewers themselves as well as long-time readers, rating systems are about consistency. They let us look at the historical statistics of our ratings, to understand a reviewer's average rating as well as the absolute range in which they've operated. This information is crucial to helping understand what a single rating means within the larger picture of a reviewer's individual taste.
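To make the idea of "historical statistics" concrete, here is a minimal sketch of how a reader might summarize a reviewer's past scores. The function names and the sample history are made up for illustration; they are not drawn from any real reviewer's data.

```python
from statistics import mean


def summarize_history(ratings):
    """Return the average, lowest, and highest rating in a review history."""
    return {"average": mean(ratings), "low": min(ratings), "high": max(ratings)}


def relative_to_average(rating, ratings):
    """How far one rating sits above (+) or below (-) the reviewer's average."""
    return rating - mean(ratings)


# Hypothetical review history on a 100-point scale (invented numbers).
history = [82, 85, 88, 90, 85]
summary = summarize_history(history)

print(summary)
print(relative_to_average(90, history))
```

For this invented reviewer, a 90 is both the top of their range and four points above their average, which says more than the number 90 does on its own.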
If a reviewer switches between different systems, there's a loss of fidelity as both the reader and the reviewer struggle to relate the two systems to each other. What may have made sense in the moment becomes more difficult to understand as time passes, and ratings are a historical record of the tasting experience. Consistency ensures that ratings still have meaning and value after time passes!
Comparisons
Using the same metrics for each review gives relative meaning to different reviews, providing a quantifiable measure for comparing different products. It is often difficult to understand how a reviewer's experience relates to their other tastings by words alone. At the end of the day, many readers want to know whether one product earned a better rating than another. In a world of many great products, there are sometimes minute differences that don't surface well in a reviewer's written thoughts. As such, being able to directly compare two ratings generated by the same, consistent system provides immediate value to everyone.
Quantitative Value
This is the goal everyone thinks of when they talk about ratings. Distilling many positive and negative thoughts into a single result (or sometimes multiple values) is a difficult process, but it's a valuable endeavor for both reader and reviewer alike. A quantitative rating provides an easily digestible measure of how good (or bad) a product is. This is great as a quick reference, as well as for comparing a reviewer's experience across many products, as we mentioned above. Of course, it's also a popular marketing tool; consumers love easy choices, and distilling the qualities of a product down to a rating means we don't have to think as hard!
Some people will contend that ratings are meaningless and only the tasting notes and written word matter, but we believe there is no "one size fits all" solution. At the end of the day, if you want to read people's thoughts and ignore their quantitative ratings, go for it! There are plenty of other people who find value in the numbers, so we feel it's important to provide a rating as a quick, accessible measure of our thoughts.
Rating Systems
Now that you understand the goals of a good rating system, let's take a look at some different systems you might see in use. We want to be up front that while the following takes a critical look at the benefits and drawbacks of each system in theory, this should not be viewed as a criticism of any other reviewer's approach. We're providing this to give more context to how we designed our own ratings, and so that everyone can be aware of the implications of choosing a particular scale. There is no perfect answer (that we know of!), just different options available.
100-Point Scale
The classic scale topping out at a perfect 100 is perhaps the most widely used rating system in the world today. In the world of wine, Robert M. Parker, Jr. is broadly credited with introducing a specific 100-point system for rating wines. The "Parker Scale", as we call it here at Hella Drunk, does not encompass the full range of 0-100 as one might assume at first. Rather, it starts at 50 and roughly corresponds to letter grades used by the American education system. To simplify it, 90+ is a great rating, 80+ is above average, 70+ is average, and anything below 60 is an outright unacceptable product.
Benefits
Now, there are a lot of advantages to a 100-point scale. For example, there's a lot of room to be particular about minute differences across such a large range of values. Using whole numbers to differentiate scores is generally easier to parse than decimal values. People also inherently understand scores out of 100 due to their frequent use in educational testing and the prevalence of percentage values for all manner of uses throughout society.
Additionally, there's always a benefit to using the same system as everyone else. Even though ratings are always subjective and specific to the reviewer, people still like to compare scores across publications.
Drawbacks
That said, even the top half of 100 points provides a lot of possible scores, which can sometimes seem overwhelming to process. There are debates over whether a difference of one point between two scores actually has meaning. While people like to think of these scores as percentages, they're very clearly not, since nothing can score less than 50. This creates a misleading impression for some readers who, for example, might see a 75 as a pretty great score. After all, if it were truly a percentage, 75% would suggest that 3 out of every 4 products score worse. However, in the way most people utilize a 100-point scale, 75 is straight up average: it sits at the midpoint of the 50-100 range, which makes it in some ways roughly analogous to 50% (but not quite). This can be confusing for people who aren't already familiar with the Parker Scale, and we believe that ratings should ultimately be accessible and intuitive.
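The arithmetic behind that 75-versus-50% comparison can be sketched in a few lines. This is only an illustration, assuming the scale runs linearly from 50 to 100; the function name is our own invention and real reviewers do not necessarily score linearly.

```python
def position_on_scale(score, low=50, high=100):
    """Return where `score` falls between `low` and `high`, as a percentage.

    Assumes a linear scale, which is a simplification for illustration.
    """
    if not low <= score <= high:
        raise ValueError(f"score must be between {low} and {high}")
    return (score - low) * 100 / (high - low)


for score in (50, 75, 90, 100):
    print(score, position_on_scale(score))
```

Under that assumption a 75 lands at 50.0, halfway up the usable range, rather than three quarters of the way.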
Finally, there is a lot of nuance lost in reporting just a single score out of 100. What were points lost for? How were they gained? Sure, you can read the text of the review to find the context, but having to do that to understand every rating undermines the convenience of the ratings themselves. You may notice that many implementations of this system tend toward grouping the vast majority of ratings around the mid-80s and low 90s, leaving readers with a lot of very similar ratings to consider. Two close ratings may have been achieved for wildly different reasons that matter more or less to the reader, and the 100-point scale doesn't do a great job of surfacing that information.
Letter Grade Scale
Some reviewers have eschewed the Parker Scale in favor of the standardized letter grade system it was patterned after. That is, ratings run from a top score of A+ down to the bottom of the barrel at F (or possibly even F-, if necessary). Sometime around the 1940s, letter grades became popular in the United States education system, anchoring letter values around a 100-point scale already in use at the time (A = 90+, B = 80+, and so on). It's interesting how that evolution came full circle, eh? Anyway, the letter grade system was also closely bound to existing 4-point scales, which you may recognize as the Grade Point Average (GPA) scale.
Benefits
Letter grade systems are easy to understand, at least for people who are familiar with education systems and other places they've been used for decades. They also create a clear sequence of quality, between the letter itself and the +/- modifiers that can be applied. Reducing the number of possible values from about 50 (in the Parker Scale) to 15 (5 letter grades + 10 modified letters between them) makes the comparisons between values intuitive and immediately obvious.
Drawbacks
The flip side to having so few values that can be assigned is that it's less useful for directly comparing two ratings. There are only so many buckets you can assign a product to, which makes it highly likely that a lot of them will have the same score. Any time two products have the same score, we assume they are of equal quality. However, many readers will still ask how they can make a choice between the two. The letter grade system greatly increases the probability of two products having the same score, directly because it tries to simplify the scoring system by removing intermediate scores from the 100-point scale.
Another potential drawback of the letter grade approach is its inflexibility when it comes to comparing it to other rating systems. While conversion is always suboptimal to an extent, numerical values can at least be converted between most systems. As long as there is a minimum and maximum to the scale, we can scale ratings up or down to match other systems and get a reasonable approximation. With letter grades, however, each letter naturally represents a range of numerical values along the 100-point scale. For example, A- is traditionally 90-93, A+ is often 97-100, and so on. If an A rating is assigned to a product, what exact numerical value would you convert that to? 94? 95? 96? While similar issues exist with scaling up smaller numerical systems to the 100-point version, the problem in those instances is less one of exactness and more of simply having fewer intervals.
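Here is a small sketch of why that conversion is lossy in one direction. It uses only the bands mentioned above (A- as 90-93, A+ as 97-100, with A taken to be the 94-96 left in between); the table and function names are our own, and other letters are left out rather than guessed.

```python
# Bands from the examples above. "A" = 94-96 is inferred from the gap
# between A- and A+; every other letter is omitted on purpose.
GRADE_BANDS = {
    "A+": (97, 100),
    "A": (94, 96),
    "A-": (90, 93),
}


def to_letter(score):
    """Return the letter whose band contains `score`, or None if none does."""
    for letter, (low, high) in GRADE_BANDS.items():
        if low <= score <= high:
            return letter
    return None


def to_scores(letter):
    """Return every whole-number score a letter could stand for."""
    low, high = GRADE_BANDS[letter]
    return list(range(low, high + 1))


print([to_letter(s) for s in (94, 95, 96)])  # three scores collapse to one letter
print(to_scores("A"))                        # one letter expands to three scores
```

Going from number to letter always gives a single answer, but going back gives a list of candidates, and nothing in the letter itself says which one the reviewer meant.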
10-Point Scale
A simple scale from 1 to 10. Easy, right? This scale is a bit of an enigma, mostly because it is utilized in divergent ways despite having the same minimum and maximum values. Some reviewers will adhere strictly to the integer numbers, while others prefer using decimal values in between. Some of those who use decimals will implicitly follow the Parker Scale by grouping their ratings around the top half of the scale. In either case, the 10-point scale is a fairly popular alternative to the full 100-point scale.
Benefits
It's easy to count from 1 to 10. The simple version of this scale has many of the same benefits as the letter grade system, in that it reduces scoring down to just a handful of possible values. Some people find this version more familiar due to the prevalence of 10-point scales in certain competitions. We can all think back to some time when we've seen a panel of judges hold up their scoring signs to show 8s, 9s, or even 10s! For those who want more granularity, however, half-step or even decimal intervals can be introduced. Using decimals more or less approximates the 100-point scale, just divided by 10. For those who use the full scale and don't want to get confused with the top-heavy Parker Scale, it's easy to see why they might choose this option instead.
Drawbacks
On the other hand, do we need a 10-point decimal scale if we can just use 100-point systems instead? Perhaps a reviewer feels it's easier to mentally decide their score out of 10, but then couldn't they still report it out of 100? If the integer scale is used instead, that would seem to have similar drawbacks to the letter grade system, in that there aren't enough intervals to differentiate reviews from one another.
5-Point Scale
We like to think of the 5-point scale as the "Star Scale", because that's how many reviewers choose to represent it. This scale often allows for decimal values, but only to indicate a half-step. That is, you might see a score of 3.5 or 4.5, but rarely do reviewers seem to use 3.7 or 2.8, for example. In our experience, this is probably the third most popular system for rating spirits, after the 100-point and 10-point scales.
Benefits
The immediately obvious value of this system is the way it translates to an intuitive set of symbols such as stars. A reviewer can say they're giving 4 1/2 stars, for example, and it's very clear what that means. The scale is small, yes, but it tracks well with other systems. For example, allowing half-step intervals makes the 5-point scale roughly equal to the 10-point integer scale, with the added benefits of clarity we just outlined. Additionally, since the aforementioned Parker Scale runs from 50-100, that's very similar to just scoring from 5-10, or really, 0-5. As such, the 5-point scale very nicely encapsulates a similar range to the 100-point scale without the confusing exclusion of half the numbers.
Drawbacks
As you're probably already guessing, this system has similar drawbacks to the 10-point scale or letter grades. Reducing the number of intervals necessarily creates a loss of fidelity, and you'll find a lot of products end up getting scores of 3.5, 4, or 4.5. This scale is also slightly harder to convert to the 100-point system. Do you assume a 2.5/5 is 50/100, or do you treat it like the Parker Scale and call it 75/100? All conversions require an inherent understanding of the reviewer's thought process for scoring, but this particular scale is a little more difficult than most.
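The two readings of a 2.5/5 can be worked through with a short sketch. It assumes stars run from 0 to 5 and that the mapping is a straight line; the function name and the `floor` parameter are hypothetical, chosen only to show the two interpretations side by side.

```python
def stars_to_100(stars, floor=0):
    """Linearly map a 0-5 star rating onto the range `floor`-100.

    floor=0 treats the target as a full 0-100 scale;
    floor=50 treats it like the Parker Scale, which starts at 50.
    """
    if not 0 <= stars <= 5:
        raise ValueError("stars must be between 0 and 5")
    return floor + stars * (100 - floor) / 5


for stars in (2.5, 3.5, 4.5):
    print(stars, stars_to_100(stars), stars_to_100(stars, floor=50))
```

The same 2.5 stars comes out as 50.0 on the full scale and 75.0 on the Parker-style one, so the conversion depends entirely on which floor the reviewer had in mind.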
Multi-Rating Systems
Some reviewers, including Hella Drunk, utilize a system where they publish ratings for multiple categories as well as an overall score. Each category represents a different aspect of the product, demonstrating levels of quality along each gradient. In some systems, the category ratings exist as mathematically independent values from the overall outlook, whereas others average them out or use a weighted calculation to derive the final score.
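As a rough illustration of the difference between averaging and weighting, consider the sketch below. The category names, scores, and weights are entirely hypothetical and are not a description of Hella Drunk's or anyone else's actual formula.

```python
def overall_score(category_scores, weights=None):
    """Combine per-category scores into a single number.

    With no weights this is a plain average; with weights, each category
    counts in proportion to its weight.
    """
    if weights is None:
        weights = {name: 1 for name in category_scores}
    total_weight = sum(weights[name] for name in category_scores)
    weighted_sum = sum(
        score * weights[name] for name, score in category_scores.items()
    )
    return weighted_sum / total_weight


# Hypothetical categories, scores, and weights (invented for this example).
scores = {"aroma": 8, "taste": 6, "finish": 7}
weights = {"aroma": 1, "taste": 2, "finish": 1}

print(overall_score(scores))           # plain average of the three categories
print(overall_score(scores, weights))  # taste counts double
```

With these made-up numbers the plain average is 7.0, while doubling the weight on the weakest category pulls the overall score down to 6.75. A system with mathematically independent category ratings would skip this step and assign the overall score separately.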
Benefits
The most obvious benefit of these systems is the ability to give consumers a deeper quantitative look at the product. Some categories are more important than others, and a reviewer may choose to have one influence the final score more than another. Readers may have different opinions on what is important, so this gives them a chance to peek behind the curtain and feel like the ratings are a little less opaque. They can then judge for themselves if the overall score makes sense based on which categories they care most about, which in turn empowers them to make better decisions as consumers!
Drawbacks
Of course, throwing multiple numbers (or letter grades) at your readers can also be confusing. After all, the goal of quantitative analysis is to provide a simple, intuitive look at each product. If not presented well, readers can get lost in the numbers and walk away with more questions than answers.
Conclusion
If you've made it this far, bravo! You just walked through a deep dive into the different rating systems you'll see out in the wild, and hopefully you now feel more empowered as a reader. As we've said before, no system (even ours!) is perfect, but the important thing is to understand the pros and cons of a particular system. Always keep in mind that reviews are subjective to an individual's taste and preferences, and a system is only ever as good as the reviewer's attempts to stay consistent and keep their scores meaningful. Now, go out into the world of rating systems and crunch some numbers!