I found this article interesting, particularly from the perspective of a statistician, since some of the same problems that arise in "meta-analysis" studies are reflected in this article, which deals with the aggregation of video game reviews. In many ways, the problems mirror the weaknesses of a meta-analysis. Namely, different studies (or in this case, reviewers) have different purposes and designs for how they evaluate something, and trying to collapse them all into one nice simple number is just, well, too reductive.
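To see why a single averaged number can mislead, here is a minimal sketch with made-up scores (the games and numbers are hypothetical, not from the article): two games can receive the identical aggregate score even when one has near-unanimous reviews and the other sharply divides critics.

```python
# Hypothetical illustration: two games with the same average score
# but very different levels of reviewer agreement. The aggregate
# number hides the disagreement entirely.
from statistics import mean, stdev

# Reviewer scores, already rescaled to 0-100 as an aggregator would do.
consistent_game = [72, 75, 74, 73, 76]   # critics roughly agree
polarizing_game = [95, 40, 98, 45, 92]   # critics split sharply

for name, scores in [("consistent", consistent_game),
                     ("polarizing", polarizing_game)]:
    print(f"{name}: mean={mean(scores):.0f}, stdev={stdev(scores):.1f}")
```

Both lists average to the same score, so an aggregator would rank the games identically, even though the spread of opinion (the standard deviation) differs by more than an order of magnitude.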
Linked below for reference.
The world of game reviews is often difficult to navigate. Everyone uses different scores, and a large emphasis is placed on the single score given to games by Metacritic, a review-aggregation site. Metacritic uses a scale of 1 to 100 for reviews, a figure calculated by averaging multiple scores. What comes out after that averaging is seen as something akin to a gold standard for judging the quality of a game. We've been asked numerous times why we're not included in the game rankings given by Metacritic: our reviews aren't linked from the site, and we're not included in the final uber-score. That's by design.