Hi everyone, this is my first article, so please forgive the mistakes and the English (it's not my first language). My favorite comic book character is Superman, but I love many characters from DC, Marvel, Image and Vertigo (I don't consider it DC) as well. Usually I am not very vocal; I'd rather read the fights than fight them myself (I am a pacifist, a pussy if you will), but the obsession many users on this site have with Rotten Tomatoes Scores (RTS from now on) is driving me crazy, so I need to jump in.
For me, it all began with the BvS thread. Was this movie as successful as expected, or did it live up to the excitement leading up to it? Resoundingly, NO. Was it the piece of sh*t many users on the site make it out to be? Also, resoundingly, NO. I guess I am one of the few people who seem to have enjoyed the movie, but I still took it for what it was: an entertaining, visually striking movie with an average story.
Let's assume we are all nice and we all wanted the movie to be a great success, both critically and at the box office. Instead, we got what we got, and now many of us are disappointed with the movie. So far, no problem. My problem is when disappointed viewers use RTS as validation of their frustration, to sh*t on the movie and on the people that liked it when they are down. Why does it bother me? From a personal point of view, because I see this site as a community and I wish we were all more civil and less warlike toward each other; and from an objective point of view, because statistically speaking RTS is crap. Let's dig in.
RT collects scores from hundreds of critics and normalizes them to fit a 0-to-100 scale; for example, if a critic gave a movie 2.5 stars out of 5, his/her RT score is 50 out of 100 (50%). RT then takes all the critics' scores and aggregates them into a single metric, the RTS. Once at least 80 critics' scores have been aggregated, RT publishes a consensus RTS for critics, fans and users all over the world to see. An RTS equal to or greater than 60% means the movie is "Fresh"; otherwise it is "Rotten". Overall, this process is fine, unless you are a critic and do not like to see your nuanced opinion reduced to an arbitrary number, but that is their problem, not ours. My issue is not really that we are aggregating critics' scores into an RTS, but how we are doing it. The way RT scores are aggregated implicitly dismisses middle-ground opinions in favor of the extremes.
This is how RTS works. Before any reviews have come out for a movie, two counters are set to zero: Movie Positive Reviews = 0 and Movie Total Reviews = 0. RT considers that a critic likes the movie if his/her score is greater than or equal to 60% and counts the review as positive. For every positive review, the Movie Positive Reviews counter as well as the Movie Total Reviews counter are increased by one.
Movie Positive Reviews = Movie Positive Reviews + 1
Movie Total Reviews = Movie Total Reviews + 1
Note how scores of 60 and 100 are weighted equally in the Positive Reviews counter.
If a critic does not like the movie, i.e. he/she gave the movie a score < 60%, only the Total Reviews counter is increased by one.
Movie Positive Reviews = Movie Positive Reviews
Movie Total Reviews = Movie Total Reviews + 1
Note how scores of 0 and 59 have the same effect on the Positive Reviews counter (none).
The final RTS is calculated as the ratio between Movie Positive Reviews and Movie Total Reviews:
RTS = (Movie Positive Reviews)/(Movie Total Reviews).
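The tallying steps above can be sketched in a few lines. The example scores are made up; only the 60% "positive" threshold comes from RT's published rules:

```python
def rts(scores):
    """Tomatometer-style score for a list of 0-100 critic scores."""
    positive_reviews = 0
    total_reviews = 0
    for score in scores:
        if score >= 60:         # a "Fresh" review counts as positive,
            positive_reviews += 1   # whether it is a 60 or a 100
        total_reviews += 1      # every review counts toward the total
    return 100 * positive_reviews / total_reviews

print(rts([60, 100, 59, 0]))    # 50.0 -- the 60 and the 100 count the same
```

Notice that the only thing the final number preserves is which side of 60 each review landed on.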
This is an awful metric for movie quality or anything else -
Imagine 100 critics watch 2 movies. They all agree neither was great and neither was crap, but the second one was slightly better: 50 critics gave movie 1 a score of 57 and 50 critics gave it a score of 59, while 50 critics gave movie 2 a score of 60 and 50 critics gave it a score of 62. If we use RTS to evaluate the critics' opinion of the movies, we see that movie 1 has an RTS of 0 (all critics gave it a score < 60), while movie 2 has an RTS of 100 (all critics gave it a score of at least 60). Conclusion: critics think that movie 1 is the worst movie ever produced and movie 2 is the best movie ever produced. Nothing could be further from the truth. An honest metric would show us the average opinion of the critics together with the variation in their opinions; in statistics, these are the mean and standard deviation of a measurement. In the case of movie 1 the score would be 58 +- 1, and in the case of movie 2 the score would be 61 +- 1. These measurements show that the critics still consistently liked movie 2 slightly better than movie 1, that critical opinion is not divided, and that critics don't think movie 1 was crap or movie 2 the best movie ever made. These scores reflect reality much better than RTS (0 vs 100), and they would be equally easy for RT to calculate.
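Here is the same thought experiment as a quick sketch, comparing the RTS-style tally with the mean +- standard deviation for the two hypothetical movies:

```python
import statistics

# 50 critics at each score, per the thought experiment above
movie_1 = [57] * 50 + [59] * 50
movie_2 = [60] * 50 + [62] * 50

# RTS-style tally: fraction of reviews scoring at least 60
rts_1 = 100 * sum(s >= 60 for s in movie_1) / len(movie_1)   # 0.0
rts_2 = 100 * sum(s >= 60 for s in movie_2) / len(movie_2)   # 100.0

# Mean +- (population) standard deviation
mean_1, sd_1 = statistics.mean(movie_1), statistics.pstdev(movie_1)  # 58 +- 1
mean_2, sd_2 = statistics.mean(movie_2), statistics.pstdev(movie_2)  # 61 +- 1
```

The RTS numbers scream "worst movie ever vs best movie ever"; the mean +- stdev numbers say "two similar, middling movies, with the second slightly better", which is what the critics actually thought.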
We could have included only the mean value of the critic scores as a metric, but that would have only partially reflected the critical reception of a movie. Including the standard deviation together with the mean tells us not only what the consensus on a movie was, but also how divisive the movie was. Think of a movie like Man of Steel: its RTS is 56% (its mean score is 62%, which would have made it Fresh, although barely). RT does not provide the standard deviation on Man of Steel's mean score, and I am not willing to go through 291 reviews to calculate it. However, it won't stretch anyone's imagination to assume that the 162 positive reviews were very high (88%, for example) and the 129 negative reviews were quite low (30%, for example). I chose these numbers because they average to the actual mean score of Man of Steel ([162*88 + 129*30]/291 = 62), while they also reflect how divisive the movie was. Calculated with these oversimplified numbers, Man of Steel's score would be 62% +- 29%. This score indicates that on average people thought the movie was OK, but it also shows that the movie was very divisive, with people strongly liking or disliking it. From these numbers you could infer that if you don't like Snyder, or DC, or darker takes on superheroes, you may want to skip the movie; however, if you happen to like DC, Snyder or darker superhero movies, you may infer that this is the movie for you. All these are things that we already knew, but it's still a nice thought exercise. At least I think so!
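The back-of-the-envelope Man of Steel numbers work out like this (to be clear: the 88% and 30% scores are my assumptions, not actual review data; only the review counts, the RTS and the mean come from RT):

```python
import math

# Assumed scores: 162 positive reviews at 88%, 129 negative reviews at 30%
reviews = [88] * 162 + [30] * 129

mean = sum(reviews) / len(reviews)                           # ~62.3
variance = sum((s - mean) ** 2 for s in reviews) / len(reviews)
stdev = math.sqrt(variance)                                  # ~28.8

print(f"{round(mean)}% +- {round(stdev)}%")                  # 62% +- 29%
```

The big standard deviation is doing the real work here: it is what tells you the 62% average hides two camps, rather than 291 lukewarm reviews.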
To wrap up: I think RTS is an intellectually dishonest score. Using the mean +- stdev of the critics' and general audience's scores would reflect much better the average sentiment critics and the general audience have about a movie, while also providing information on how divisive it was. As with any score, there are flaws; in particular there is the danger of outliers: fanboys/haters giving a movie very high or very low scores to help/destroy the movie's score before they have even seen it. This can be addressed by eliminating the top and bottom 5% of the scores (i.e. all the 100s and 0s that come in before the movie is even out) when calculating the movie's final score metric.
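A minimal sketch of that trimming idea, with made-up scores that include some pre-release 0/100 bombing:

```python
import statistics

def trimmed_stats(scores, trim=0.05):
    """Mean +- stdev after dropping the top and bottom `trim` fraction of scores."""
    ordered = sorted(scores)
    k = int(len(ordered) * trim)          # how many scores to drop at each end
    trimmed = ordered[k:len(ordered) - k] if k else ordered
    return statistics.mean(trimmed), statistics.pstdev(trimmed)

# 90 honest middling reviews plus 5 zero-bombs and 5 perfect-100 brigades
scores = [55] * 45 + [65] * 45 + [0] * 5 + [100] * 5
mean, stdev = trimmed_stats(scores)       # the 0s and 100s are dropped first
```

With the default 5% trim, the ten extreme scores vanish and the result reflects only the 90 honest reviews; this is the standard "trimmed mean" trick, not anything RT actually does.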
Would you rather see RT use this instead of RTS? Do you agree with me that mean +- stdev is a fairer and more informative way to score a movie?
Thanks for reading : )
PS - the mean score of BvS is 49% - still not great, but far above 29%.