Posted April 05, 2012
(reposted from one of the many "rework the ratings" wishes)
Instead of a common average, make use of a Bayesian average in ratings. There are too many games getting too-high ratings from their fans. To give people not familiar with a title a better evaluation of its true quality, use of a Bayesian average would be crucial. It gives more emphasis to lower numbers in a given set of data. This is especially useful when the data set is small, as it is in GOG.com ratings. What good is an average to people unfamiliar with the title if only the positive votes are counted and most of the titles have a 4.5-5 out of 5 stars rating?
As of this writing, 32 out of 371 GOG games (their count) show up when you filter by 5 stars only. That's under 10%, which happens to be the percentage of As recommended for school bell curves. (Bell curves in school are actually horrible in general, but that's beside the point - the point is that people felt good about 10% of something getting the highest grade; the problem was applying that to living, breathing people instead of movies or games or whatever.)

Now, 8 of these games are new or unreleased on GOG (Grimrock hasn't been released at all yet, but rating unreleased games is a separate issue, which is probably being taken care of). However, based on how the new 5-star releases really took off in the all-time bestseller ratings (the whole catalog sorted by bestselling, not the top-selling section on the main page), I'm guessing they really are that good.
The so-called "Bayesian average" (or "Alfred Hitchcock presents" for wannabe math nerds - seriously, the term is made up, with Bayes' name tacked on for credibility) does not do what the OP thinks it does. It does not lend more weight to lower scores; it pulls every game's score toward a catalog-wide average, and the pull is strongest when there are few votes. In other words, it gets rid of flukes (small datasets with extreme averages, such as the newer games). So it would fold these new games into the 4.5 pile the OP is complaining about, and probably assign a 4 to MoO3 to boot. It does not increase diversity; it reduces diversity.
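To see why, here is a minimal sketch of the usual "Bayesian average" formula (really a shrinkage estimator). GOG's actual formula, if it even uses one, isn't public; the prior mean m and the pseudo-vote weight C below are made-up illustration values:

```python
def bayesian_average(votes, m=4.2, C=25):
    """Shrink a game's mean rating toward the catalog-wide mean m.

    votes: list of individual star ratings (1-5)
    m:     prior mean (assumed catalog-wide average; illustrative)
    C:     prior weight, expressed as a number of phantom votes at m
    """
    n = len(votes)
    return (C * m + sum(votes)) / (C + n)

# A brand-new game with three perfect 5-star votes gets pulled DOWN
# toward the catalog mean - not pushed toward its lowest scores:
print(round(bayesian_average([5, 5, 5]), 2))            # prints 4.29

# An established game with 200 votes averaging 4.5 barely moves:
print(round(bayesian_average([5] * 100 + [4] * 100), 2))  # prints 4.47
```

Both games end up in the same 4.5-ish pile: extreme small-sample scores get squashed toward the middle, which is exactly the opposite of "more emphasis on lower numbers".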
The current rating system is as perfect as a single number could be. The filter offers the "cream of the crop" (5 stars only, a collection of ~30 games) and "guaranteed not a turd" (4-5 stars). This is exactly what it should accomplish.
If you want more data, there are two schemes that do not suck outright:
1. An additional review-based rating, weighted by the helpfulness of the review in question - note that this should never displace the simple rating, because people would write even more junk reviews otherwise.
Disadvantage: encouraging junk reviews (people would still write them, even after having the math explained to them).
2. A breakdown by rating, as Amazon does.
Disadvantage: encouraging cause-voters, who would be able to see their cause-driven one-star votes take immediate effect.
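For what it's worth, scheme 1 could look something like the sketch below. This is my own illustration, not anything GOG has implemented; the weighting rule (1 + helpful votes) is an assumption:

```python
def helpfulness_weighted_rating(reviews):
    """Review-based rating where helpful reviews count more.

    reviews: list of (stars, helpful_votes) tuples.
    Each review gets weight 1 + helpful_votes, so every review counts
    at least once and well-regarded reviews count more.
    """
    total = sum((1 + h) * stars for stars, h in reviews)
    weight = sum(1 + h for _, h in reviews)
    return total / weight

# Three reviews: a well-liked 5-star, a middling 4-star, an ignored 1-star.
reviews = [(5, 12), (4, 3), (1, 0)]
print(round(helpfulness_weighted_rating(reviews), 2))  # prints 4.56
```

The junk-review incentive is visible right in the formula: writing any review at all, however worthless, adds at least weight 1 to your star vote.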
However, Antoine de Saint-Exupéry had this to say concerning "more data", and I happen to agree:
Grown-ups love figures. When you tell them that you have made a new friend, they never ask you about essential matters. They never say to you, "What does his voice sound like? What games does he love best? Does he collect butterflies?" Instead they demand: "How old is he? How many brothers has he? How much does he weigh? How much money does his father make?" Only from these figures do they think they have learned anything about him.
Which is to say, there are also descriptions, genre classifications, editorials, reviews, screenshots, forums, YouTube videos, Let's Plays. One or several human-readable numbers, be they 0-10 or 0-100, aren't going to replace all that, no matter whether you ascribe the rating method to Bayes or Einstein or Mandelbrot.
FINALLY:
How is a game getting a high user rating from its fans a problem, rather than the intended purpose? You're supposed to give good ratings if you're a fan - you're a fan for a reason, and that reason is "the game is good, I liked it". This would only be a problem if someone launched a "sign up for GOG and vote for my favorite game" campaign, which I've yet to see (it happens on IMDb and Amazon for different reasons). And even if one arises - awesome, more people signing up is good for business.