Sunday, February 24, 2013

The RUR System - a proposed method for Really Useful Reviews

Background

I've been on the internet for almost 18 years now, and I've seen plenty of reviews all over the web. And, in general, it's difficult to trust just anyone's review, whether for a physical product, an app, a movie, a book, and so on. Sure, there are good, useful reviews out there, which suggests there is a proper way to make a review. In general, it should be as objective as possible, with as little subjectivity as possible. In many cases, a review is so subjective that it becomes not-so-useful if you have a very different background from the reviewer, or you only appreciate it because you happen to be in similar circumstances as the reviewer. In some cases, this attracts some much-unneeded internet hate, for example when a known Apple-loving journalist suddenly reviews a non-Apple product. Pageview-driven websites and egotistic website/blog owners like to make these kinds of reviews, of course, for the ad revenue or just for bragging rights. But for the rest of us, the bottomline is the most important thing. Is it useful for us? Is it worth our $$$ or at least our time? In this article, I first go over current review/rating methods, and then I propose a new review method/system that reviewers can use, one that would ultimately benefit the readers (or viewers) of those reviews.



Current Rating Methods

The most common way to do reviews is by awarding "stars." Let's call this one the 5-Star Rating (5SR) System. Usually, the best products get 5 stars and the worst products get 1 star. Perhaps it was first used for hotels. But then, they say there is one hotel that bills itself as a 7-star hotel. Marketing gimmick.

Anyway, in some cases, they even award a fraction of a star. For example, one product could be awarded 4.5 out of 5 stars. The problem with this system is subjectivity. Different reviewers would award different stars to the same product. And then, it is left to readers to assess how many stars are good enough for them. 3? 4? 5? Oftentimes, the rating doesn't matter. Or, what happens is, review readers consider a product if it has at least 3 stars, maybe, and then just assess it for themselves. So, to that point, what good was the review other than to introduce the product? It might as well have been a "news" article that just presents all the facts and features, and not necessarily a "review" article. But this is not always the case. Sometimes, a reviewer would award a product 2 stars or 5 stars, but a reader could try it out anyway and eventually have to assess it for himself.

There are methods similar to the 5SR system, which present the same problems. Some use a 4-star or 3-star system, or, instead of stars, reviewers might use "hearts," or in the case of Macworld, "mice." They're the same banana, of course.

A not-so-similar method is the thumbs-up-thumbs-down system, popularized perhaps by the show Siskel & Ebert. Two thumbs up, one each from Siskel and from Ebert, suggested the movie they were reviewing was quite good. Just one thumb up might mean you have to see it for yourself. No thumbs up from either of them means it's probably a movie not worth watching. I think their reviews were widely accepted, but of course many other reviewers would still differ in opinion, in the same way that Siskel and Ebert themselves might have different views on the same movie.

Now, let us call these earlier-mentioned review systems Single Rating systems. One problem with them, as mentioned, is subjectivity. Another set of reviewers would review the same thing differently. And then ultimately, the question is, do you share the same views as the reviewers? Only time will tell (i.e. after you've read their past reviews and become comfortable with them). And thus, the one way this Single Rating system works is if you're a follower of the reviewer and/or his institution, or you trust them enough. For example, long-time Mac fans might trust the views and opinions at Macworld. But again, it's subjective. Newer Apple fans might disagree with or completely misunderstand Macworld editors.

Multiple Review/Rating (MR) Methods and Review Aggregators

There is actually one existing method that removes a little bit of the subjectivity from reviews -- asking all the different users of the product (viewers in the case of movies, readers in the case of books) to write a review. The best example of this is perhaps Amazon, but it is now found almost everywhere -- on the Apple iTunes Store and App Store, and so on. Buyers/users of a certain product assign 1 to 5 stars (or the equivalent) and give their own comments. Their ratings are then averaged to come up with one rating for the product. The end result is actually a quite useful review. The one problem here is that it becomes a useful review only after enough people use the product and review it. I guess popular products will never run out of users and reviewers.

But sometimes you might get, say, a 4-star rating for a product because 3 people gave it 5 stars, 1 gave it 4 stars, and 1 gave it 1 star. It's easy to say that maybe the 1-star rating is an "anomaly," or a special case, and that this is really a 5-star product. But this might also be a case where people involved in the production of the product were the ones who gave it 5 stars. And because it has a 4-star rating, it becomes popular at first. Perhaps if, say, 30 people gave it 5 stars, 10 gave it 4, and 10 gave it 1 (and thus an average of 4 stars), then it might be more believable. But you'll never know. The other problem is how widely used the product is, and how many of its users are actually willing to review it. It might just be the "vocal minority" reviewing the product. Maybe it's a good product, used by many, but reviewed only by the few who absolutely disliked it.
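To make the averaging arithmetic above concrete, here is a minimal sketch in Python, using the made-up counts from the example, showing how two very different vote distributions can end up with exactly the same 4-star average:

    # Average star rating: each key is a star value, each value is how many
    # people gave that rating (counts taken from the hypothetical example above).
    def average_rating(counts):
        total_stars = sum(stars * votes for stars, votes in counts.items())
        total_votes = sum(counts.values())
        return total_stars / total_votes

    small_sample = {5: 3, 4: 1, 1: 1}     # 5 reviewers in total
    large_sample = {5: 30, 4: 10, 1: 10}  # 50 reviewers in total

    print(average_rating(small_sample))   # 4.0
    print(average_rating(large_sample))   # 4.0 -- same average, more believable sample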

In short, the problem posed by such MR methods is that, for them to work, they require a sufficient number of reviews from a sufficiently representative number of users. If you are reviewing a product, your one review article might not matter.

There is a way around this, and that is using a number of professional reviewers to come up with the MR rating. For example, Rotten Tomatoes collects reviews from a number of selected (i.e. trusted) review sources. Then, they give the percentage of those reviews that rated the product (in this case, a movie) positively. I usually like to watch movies that have at least a 70% positive rating on Rotten Tomatoes. I would call this the "Pro MR Method." In a way, Siskel & Ebert's method is another example, using the reviews of just two professional reviewers. In a way, getting nominated for "academy awards" types of things is another way to see which movies (and/or TV shows, music, etc.) are good or not. But mostly, you already know which movies these are. Being a "winner" in these awards is not always really useful information either, because again there is a bit of subjectivity and sometimes politics involved in why certain movies win and others don't. Sometimes there are two really good movies of different genres, but there has to be just one winner for the year. And if this year's movies are worse than last year's, it's too bad that last year's 2nd-best movie couldn't have won it this year instead. See the problems?

On the other hand, Facebook, YouTube, Vimeo, Tumblr, Google+ and Instagram, for example, use a voting system. It is therefore an MR method, using only a polarising vote. Like or Dislike. Yes or No. Or, in the case of Facebook, Vimeo, Google+ and Tumblr, just a "Like." Or a "+1." Or a thumbs up. It again presents the issue with MR methods, in that it requires a sufficient number of "reviews," but it's not really indicative of anything, except perhaps when a video or whatever else has far more dislikes than likes. In the case of Facebook and the others, except YouTube (where there is a dislike/thumbs-down option), you don't know how good a certain thing really is. It depends on the popularity of the product. For example, a professional guitarist's rendition of a song he wrote himself might get 10,000 likes. But a rather sexy-looking young amateur girl doing the same song might get 100,000 likes. And then, if an established celebrity does his/her own cover of the same song, that video gets 1,000,000 likes. The number of likes or +1s or thumbs up or hearts does not really make for a useful review. In the song video example, if you're not into that type of music, even 1,000,000 likes does not really mean anything to you. In short, there is an uncounted number of non-votes which could mean either "dislike" or "don't care / either way."

There are metrics that could be used, but you have to dig them up yourself: for example, what percentage of viewers liked the video, and what the ratio of likes to dislikes is. Hmmm, there's a good idea. The percentage of likes over views again presents some problems, but the ratio of likes to dislikes is useful, except that it's not directly shown on YouTube. And again, on the other sites there's no dislike option, so you really don't know. Of course, showing dislikes is not something they would ever consider, for example on Facebook, for marketing reasons. Who wants to see how many people dislike a certain company? Maybe many people would dislike Facebook on Facebook! And anyway, Facebook likes are not really meant to be a "review" system. As mentioned, they are really more for marketing purposes.

There is one way the Like-button-only system (as on Facebook) works, and that is by comparing the number of likes across products in the same category to choose the best one. For example, I checked out the number of likes for several Philippine technology blogs. But again, it's not really indicative of anything useful. One blog has a ton of likes because it's been in the business long enough and the blog owner has had some mass media exposure. Whereas, IMHO, there is another, newer blog with better writing but far fewer likes than the first one. If I were to review the blogs, though -- for example the first one -- and comment on how "different" its articles are from other, more professional reviews, I'd probably get a ton of hate from the blog's fans. But well, that's that. People are emotional. And that is exactly what we would like to remove from individual reviews: subjectivity.
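As a rough illustration of those dig-them-up-yourself metrics, here is a minimal Python sketch with purely hypothetical numbers for a video; the metric names are mine, not anything the sites actually expose:

    # Two rough metrics you could compute for a video (hypothetical numbers):
    views, likes, dislikes = 500000, 10000, 400

    like_rate = likes / views                      # fraction of viewers who bothered to like it
    like_dislike_ratio = likes / max(dislikes, 1)  # guard against zero dislikes

    print("{:.1%} of viewers liked it".format(like_rate))         # 2.0% of viewers liked it
    print("{:.0f} likes per dislike".format(like_dislike_ratio))  # 25 likes per dislike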

Hybrid Systems

Certain websites use a hybrid review/rating system. They have reviews done by their professional staff and reviews done by the audience, both following the 5-star rating system. A classic example I know of is CNET's Reviews. The problem is, they can't do it for all products. And, as you can imagine, it also gets more complicated. You have to judge for yourself whether a 3-star pro review is enough. Or, an editor might give a product only 3 stars, but users give it 5 stars. What to do? You're left to decide for yourself, in the end.

Other Systems

But actually, CNET's pro reviewers obviously don't just give a 5-star rating. Personally, I trust CNET reviews (when they review a product), because they assign experts to review the product. They don't assign an Android guy to an Apple product, nor do they assign an Apple guy to an Android product.

And more importantly, they provide a "Bottomline" statement, and a list of the pros and cons (i.e. the good and the bad) of the product. I also see this method over at iMore.com. Because really, that's what's important to us. The bottomline. The good. And the bad. Oh yeah, and the price.

Proposal for a Really Useful Review method

Now, if you noticed, the review system that really works best for more people is the "Pro MR Method." But there is a problem with this if you are a (lone) blogger who would just like to review a product. Do you need to gather and interview a bunch of other bloggers just to come up with a review? Well, there's another good idea. It is probably the ideal review method. Another issue, though, is how do you choose which reviewers to consider? I mean, how can you say that a certain reviewer has enough credibility to review a certain product? I've seen many "professional" bloggers/journalists who have many years of experience trying out many products, but they still don't seem to get certain products. For example, I've read many bloggers who regularly review different types of smartphones and tablets and the like. But when it comes to reviewing Apple products, they don't know s**t about them. They'd complain about it being "boring," being "too simple," and lacking in processor speed and memory. Yes, I'll say it again. They don't know s**t. So basically, it's a highly subjective thing. How useful is that kind of review for you, right? They're trying to influence others with their opinion -- i.e. being subjective. And you're basically being insulted, too... if you buy (or don't buy) the product. They were influenced by the marketing tactics for the product they're reviewing, and now they're unknowingly trying to market the same product to you, just so they know they're not alone in the world. What kind of a review is that, right?

Anyway, the best reviews are actually the simple ones: i.e. the polarising ones. Just a yes or a no. Just a thumbs up or a thumbs down. A like or a dislike. No N-star rating systems, where N = 3, 4, 5, etc. A +1 rating by itself is not enough; there has to be a -1 option. A like by itself is not enough; there has to be a dislike option.

In a way, the "pros and cons" reviews also give that simplistic polarising effect. Good or bad. Pro or con. Really useful reviews, no? The "bottomline" review is of course, a simple summary of those pros and cons.

And thus, I am proposing this as the appropriate, really useful review/rating method: state who will benefit, who won't benefit, and who would probably need to weigh the different considerations, perhaps make compromises, and ultimately decide for themselves.

This could be in the form of a +1, 0, -1 (plus one, zero, minus one) rating system. Or a thumbs up, thumbs down, or no-thumbs rating. Or a yes, maybe/depends, no. Or a Must-have, Meh, Definitely No. Or a two-star, one-star, or no-star rating. Or substitute stars with likes, thumbs up, hearts, or whatever else.

But the point is, you make cases for each. You don't just assign one rating. As an example, for the Apple iPhone 5, a review using this RUR system might say:
Yes
- for people who like the iPhone hardware and software design,
- for those who are already invested in the Apple ecosystem (apps, books, media, etc.), and
- for those who do not consider cost an issue.
No
- for people who have owned and still like their current non-iPhone phones,
- for people who are not willing to shell out $650 for an unlocked version or $50 per month on a 2-year contract, or
- for more advanced users who want more customisability but do not want to use unofficial (i.e. warranty-voiding) tools like "jailbreaking."
Maybe
- it comes down to whether you like the design of the product and whether you can afford it; if you are getting your first smartphone, it really doesn't matter much what you get.
The above is just a quick, short example, but I hope you get the point. It is difficult to remove all subjectivity, but at least with this method, you can see how a review might be more useful to more people. Reading the above, too, I think the "Maybe" (or 1-star or no-thumbs-up rating) sounds a lot like CNET's and others' "bottomline" review statements.

RUR reviewers could also incorporate a bit of the MR method by soliciting input from their readers. For example, on my quick, short iPhone 5 review above, a reader might comment, "I don't want a phone with a slower processor and less memory than my friends' phones," and thus I could add that to the "No" list.
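Just to make the idea concrete, here is a minimal sketch in Python of the iPhone 5 example above represented as data. This encoding is my own hypothetical illustration, not part of the proposal itself, but it shows how the yes/no/maybe cases, the price, and a reader's suggestion could all live in one small structure that an aggregator could later combine with other RUR reviews:

    # A hypothetical encoding of the RUR review above: one short list of cases
    # per verdict (+1 = yes, 0 = maybe/depends, -1 = no), plus the price.
    rur_review = {
        "product": "Apple iPhone 5",
        "price": "$650 unlocked, or $50/month on a 2-year contract",
        "+1": [
            "you like the iPhone hardware and software design",
            "you are already invested in the Apple ecosystem",
            "cost is not an issue for you",
        ],
        "-1": [
            "you still like your current non-iPhone phone",
            "you are not willing to pay the price above",
            "you want deep customisability without jailbreaking",
        ],
        "0": [
            "this is your first smartphone -- design and budget decide it",
        ],
    }

    # A reader's comment can simply be appended to the relevant list:
    rur_review["-1"].append("you want a faster processor and more memory than your friends' phones")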

And again, it is important to state prices/costs. This could be done separately, but above, you can see I worked the cost into the yes/no/maybe items.

Closing Remarks

If I review something from now on, I will probably use this RUR method. After some time, maybe there could even be aggregators who collect different reviews done with this RUR method. By using this RUR method, I think reviews can become more useful to more people, less subjective, and maybe attract less hate. The internet is a tool, and we want it to just be useful, right?

So, how about you guys? Would you use this method if you were reviewing a product? What do you think of this method of reviewing? Sound out in the comments section below.
