Designing a Better Rating Widget

After writing my previous blog post about calculating an average rating properly, it occurred to me that I’ve never really explained my views on rating systems and their relative merits.

Now, first, let me say that I won’t be getting into the whole objective vs. subjective voting mess. (The “It’s a classic and a masterpiece, but I don’t want it showing up in my recommendations” problem.) That’s a topic of its own. This post is purely about using good UI design to encourage consistent input across a large population of users, aside from that issue. That said, let’s get started.

These days, it seems like everyone either has or is planning to move to a simple up/down rating system. YouTube is probably the classic example. My understanding is that they moved from a star rating system to a like/dislike rating system because the former didn’t work out well with the general public. (Some people almost never give out high ratings, some people almost never give out low ones, some people found it too much effort to decide, and the result is a mess, mathematically.)

The fundamental problem with cutting things down to such a two-value system is that it doesn’t give you much data to work with, and it’s my hypothesis that it also encourages noisy, polarized data above and beyond the obvious, since people don’t typically take the time to rationally analyze their impression of middling content. (In essence, it’s throwing the baby out with the bath water.)

So, if we can’t reduce the number of choices users have by that much, how can we tweak the presentation so that users respond more consistently?

First, let’s look at where star ratings actually come from. According to Wikipedia’s Star (classification) page, the first instance of repeated symbols for ratings was in an 1820 guidebook by Mariana Stark, which used repeated exclamation points. Following that, Murray’s Handbooks for Travellers and the Baedeker Guides replaced the exclamation points with stars.

There’s something very important to notice here. Star ratings originated in the context of highlights! They wouldn’t give some semi-permanent pile of horse manure in a random London back alley a zero-star rating… they’d just omit it!

So, our first step should be to acknowledge that mismatch and bring ourselves into line with human psychology. When we have a low opinion of something, we don’t stop at zero. We use colourful words such as “hate”, “loathe”, “despise”, and “detest”, which express negative emotional value …so, let’s make zero the middle of our scale.

Notice something familiar about that change? We’ve reached upvote/downvote. It’s just a very primitive form of what we’re seeking. People may not agree on how much room to leave on each end of a scale for especially good or bad content, but everyone understands the meaning of a transition point between “like” and “dislike”.

So, what’s next? Well, how about ambivalence. In an upvote/downvote system, users are forced to take sides. There’s no way to express “I don’t really care”, “I have no opinion”, or “Its only noteworthy characteristic is how un-noteworthy it is.” …so we at least want three choices:

Thumbs up from Font Awesome by Dave Gandy.
Modified under a CC-BY-SA 3.0 Unported license.

I’m not a graphic designer, but you get the idea. With the addition of a middle choice, it already represents real human opinions much more effectively.

This is actually the bare minimum I consider for a viable system and, when forced to deal with like/dislike systems, I resort to using “abstained from voting” as a means of expressing a third value… and that in itself is a clue.

I coloured the middle values yellow to make them distinct from “unset”, but is that really necessary? What’s the difference between “I viewed it and didn’t vote” and “I viewed it and selected the neutral option”? I’d argue that drawing such a distinction is counter-productive hair-splitting, so I’ll use grey for the neutral option going forward.

So, what’s next? Well, how about degree? Humans aren’t stupid, so I’m not willing to give up on a 5-step rating system yet. If we’ve got a clear and obvious meaning for the middle point, it’s not hard for people to consistently answer the question “Did you like/dislike it a little or a lot?”, so let’s put two choices for like and two for dislike.

Using color/brightness on hover and/or selection to “light up” the icons spanning from the selected one to the center can reinforce the understanding that this is a “distance from the center” metric, but having everything grey exacerbates a problem that was growing with the face-based approach.

While a soft smile or frown can serve as a general “like” or “dislike” icon, one can dislike something for many reasons. Emotions like anger, disgust, and extreme sadness all have their own distinct facial expressions. It’s easy to mistake the face-based visualization as a request for a qualitative evaluation of one’s emotional state (anger vs. sadness) rather than a quantitative one.

Furthermore, faces are complex shapes which can be difficult to pick details out of at small sizes. What we need is a set of symbols which are generic, international, and scale well.

Historically, I wouldn’t recommend thumbs, since the meanings of gestures vary so widely around the world, but the thumbs up and down icons seem to have taken on enough of an international meaning online to keep them in the running.

…so what’s the alternative choice? Well, how about plus and minus symbols? Math is international and everyone can understand their meanings in context.

Doesn’t that look a lot easier to translate opinions into than a simple row of five stars? …and if not, tooltips can give that little extra boost.

(And there also seems to be support for this model from the experts whose salaries depend on doing this sort of thing. Every professionally administered survey I’ve ever taken has incorporated questions with the choices “Strongly Disagree”, “Disagree”, “Neither Agree Nor Disagree”, “Agree”, and “Strongly Agree”.)

In summary, don’t be too quick to sacrifice data and throw out a UI that isn’t working. Sometimes, all it needs is a little tweak.

Bonus Tip: Extra Precision

Let’s suppose that you’re trying to upgrade a system that uses out-of-10 rating or you need to serve a more experienced user base like me (who sometimes feel the need to rate something as being “great” rather than “good” or “excellent”). There is also a way to support this without falling back to the “5 stars to the right of zero” problems that started this whole mess.

The secret ingredient is decimals. Users will have a much easier time if you draw a distinction between normal (integer) and exceptional (decimal) rating precision. In fact, “Rate in the range from -2 to +2… use decimals if you need to” is not only easier than “Rate in the range from 1 to 10”, it’s also more powerful since, if necessary, it can be extended to however much or little decimal precision you need.

In my experience, though, it’s so rare for me to desire precision beyond “-2 to +2 in steps of 0.5” that I wouldn’t be concerned with it.

So, how do we produce a UI for this? Well, think about the psychological use of decimals. They’re extra precision that’s not normally needed, so they should be out of the main workflow where users don’t have to fret over them.

There are various ways to accomplish this, but the simplest way to visualize them would be a design inspired by the keys on a piano. Play all the white keys, and you can make perfectly good music in the key of C major… but the black keys are there when you want to do something more advanced.
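As a rough sketch of the piano-key idea, the button values for a “-2 to +2 in steps of 0.5” widget can be split into the whole-number “white keys” and the half-step “black keys”. The function name and its parameters are hypothetical, chosen only for illustration:

```python
def rating_buttons(lowest=-2.0, highest=2.0, step=0.5):
    """Split a zero-centred rating scale into main and extra-precision buttons.

    Returns (main, extra): whole-number values for the primary row of
    buttons, and fractional values for the optional in-between buttons.
    """
    count = int(round((highest - lowest) / step))
    values = [lowest + i * step for i in range(count + 1)]
    main = [v for v in values if v.is_integer()]        # the "white keys"
    extra = [v for v in values if not v.is_integer()]   # the "black keys"
    return main, extra


main, extra = rating_buttons()
print(main)
print(extra)
```

A mobile layout could render only the first list and tuck the second behind whatever secondary control suits the platform.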

…keeping in mind, of course, that, for mobile use, it’d probably be best to hide the half-step buttons and present some kind of alternative method for entering high-precision information, such as a hamburger button with a popup.

(As zero-centered designs lend themselves best to odd numbers of choices, you’ll have to decide whether the tenth choice should be omitted or added onto either end on a case-by-case basis.)

Posted in Geek Stuff | 2 Comments

Calculating and Sorting by Average Ratings: A Literature Review for the Lazy

TL;DR: Read Bayesian ranking of items with up and downvotes or 5 star ratings by Jules Jacobs. I’ll get back to you on how to tune the priors and utility values.

Years ago, when I first had the idea that prompted me to register ficfan.org as a placeholder page, I went looking for a mathematical solution to the problem of bias in story ratings built from very small samples.

Given that this explanation of Wilson scores brought it back to mind, I thought I might as well blog about the topic for anyone who wants to know how to calculate average ratings properly.

I’d recently finished a very basic statistics course and, when I stumbled across How Not To Sort By Average Rating by Evan Miller, I recognized the principle of using the lower bound of a confidence interval, but didn’t know how to generalize it to the multivariate data that is an out-of-5 rating and didn’t find any candidates that were as well-suited to use by a stats novice as Miller’s post.

Since then, that very question got asked on Cross Validated (the StackExchange site for statistics), and a good point was made: The “lower bound of a confidence interval” method will seriously under-estimate things with a low number of ratings.
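To make the under-estimation concrete, here is a minimal sketch of the lower bound of a Wilson score interval for simple up/down votes. The function name and the default z of 1.96 (roughly a 95% interval) are assumptions made for illustration, not something taken from the linked posts:

```python
import math


def wilson_lower_bound(upvotes, total, z=1.96):
    """Lower bound of the Wilson score interval for the fraction of upvotes."""
    if total == 0:
        return 0.0  # no votes yet, so there is nothing to be confident about
    p_hat = upvotes / total
    z2 = z * z
    centre = p_hat + z2 / (2 * total)
    margin = z * math.sqrt((p_hat * (1 - p_hat) + z2 / (4 * total)) / total)
    return (centre - margin) / (1 + z2 / total)


print(wilson_lower_bound(1, 1))     # one vote, 100% positive
print(wilson_lower_bound(90, 100))  # a hundred votes, 90% positive
```

The item with a single unanimous vote comes out at roughly 0.21, far below the roughly 0.83 of the item with a hundred votes, which is the kind of penalty on low vote counts that the Cross Validated discussion is pointing at.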

raegtin gives a good theoretical answer for methods of resolving that problem, but I didn’t yet have the much better stats textbook that’s now in my TODO pile and wasn’t in the mood to soldier through the theory on my own, so I kept looking.

Not long after that answer, Evan Miller came back with a less hacky solution for up/down ratings which makes good reading if you’re trying to learn the theory without a textbook… but still didn’t meet my “I don’t trust my math. Give me something someone else trusts.” needs.

Now, that said, some people do apply the Wilson score to multi-value data by scaling the range down to between 0 and 1, so a middling rating counts as half an up vote. (That’s what this MySQL solution and this Node.JS module do.) However, it has a big flaw. As Apocalisp pointed out when someone else thought of the idea, it leaves you with 300 3-star ratings being equivalent to 100 5-star ratings. He suggests calculating a Wilson confidence interval for each possible score, then working from there (and I’ve seen it suggested elsewhere), but I wouldn’t feel comfortable with that even if his suggestion had been around earlier, because I couldn’t find a detailed breakdown of why it would produce good results.

Ironically, the oldest resource (What is a better way to sort by a 5 star rating? on StackOverflow) is one I found just recently (Google Fu failure!). Despite that, it’s probably the most useful of the StackOverflow answers.

That said, let’s get on to the aforementioned resources it links.

In 2014, Evan Miller came back with “Ranking Items With Star Ratings“, which is a detailed look at the problem and how to solve it… unfortunately, I was sleep deprived when I encountered it and the massive wall of equations which didn’t end in sample code prompted me to shelve it for later. (Ironically, I can’t evaluate it right now either, because today is the one day this week that I slept terribly and I’m too busy to risk delaying this post until I have time.)

Finally, I came across “Bayesian ranking of items with up and downvotes or 5 star ratings” by Jules Jacobs (written in response to Evan Miller’s improved upvote/downvote code) which is in a form my sleep-fogged brain can handle.

It’s a simple, easy-to-understand explanation for people with minimal background in statistics, it guides you through thinking about the problem from the right direction (eg. what does the utility function really mean?), and it comes with Python example code for if you really can’t be bothered to do anything more than copy-paste code.
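For a feel of what priors and utilities mean here, the following is a minimal sketch of one common form of Bayesian averaging: every rating level starts with some pretend votes, and the score is the average utility over real and pretend votes together. It is not copied from Jacobs’s post, and the prior counts and utility values are made-up placeholders:

```python
def bayesian_score(votes, prior_votes, utilities):
    """Average utility per vote, counting pretend (prior) votes as real ones.

    votes[i]       -- real votes received at rating level i
    prior_votes[i] -- pretend votes at level i (must not all be zero)
    utilities[i]   -- how much a vote at level i is worth
    """
    total_utility = 0.0
    total_weight = 0.0
    for real, pretend, utility in zip(votes, prior_votes, utilities):
        total_utility += (real + pretend) * utility
        total_weight += real + pretend
    return total_utility / total_weight


# Hypothetical tuning: two pretend votes per level, utilities from -2 to +2.
PRIOR = [2, 2, 2, 2, 2]
UTILITY = [-2, -1, 0, 1, 2]

print(bayesian_score([0, 0, 0, 0, 1], PRIOR, UTILITY))    # one 5-star vote
print(bayesian_score([0, 0, 0, 0, 100], PRIOR, UTILITY))  # a hundred of them
```

With these placeholder numbers, a single top rating barely moves the score away from zero, while a hundred of them pull it close to the maximum, so items with few votes neither dominate the ranking nor get buried.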

As for making it efficient, the fact that it’s a modified arithmetic mean allows us to calculate it incrementally as long as we store both the score and the total number of votes with enough precision that we don’t have to worry about rounding errors.

  1. Multiply the average by the total vote count to reverse the final step of the process and produce what I’ll call the “expanded average”.
  2. Perform the weighting calculations for the new value
  3. Add it to the expanded average.
  4. Divide the expanded average by the new total number of votes.

This works because of two properties:

  • Addition is commutative, so it doesn’t matter which order you sum together the individual ratings.
  • The division and multiplication are symmetric, so 1 + 2 + 3 + 4 + 5 and ((1 + 2 + 3 + 4) / 4 * 4) + 5 are mathematically equivalent.
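The four steps above can be sketched as follows. The function name is hypothetical, and the `weight` parameter is only a stand-in for whatever weighting calculation is actually in use (it defaults to leaving the value unchanged, which gives a plain arithmetic mean):

```python
def add_rating(average, count, new_value, weight=lambda value: value):
    """Fold one new rating into a stored (average, count) pair."""
    expanded = average * count       # 1. recover the "expanded average"
    weighted = weight(new_value)     # 2. placeholder weighting step
    expanded += weighted             # 3. add the new value
    new_count = count + 1
    return expanded / new_count, new_count  # 4. divide by the new total


average, count = 0.0, 0
for rating in [1, 2, 3, 4, 5]:
    average, count = add_rating(average, count, rating)

print(average, count)  # prints "3.0 5", the same as sum([1, 2, 3, 4, 5]) / 5
```

Only the running average and the vote count need to be stored per item, so there is no need to re-read every individual rating when a new vote arrives.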

It’s not perfect, since you don’t have a nice averaged number in the range from 0 to 5 to display, but it’s definitely a good start if sorting and graphical visualizations are your goal. (Evan Miller’s approach is probably best if you need to display numbers.)

That leaves only one question: How do you tune your priors and utilities? …I’ll get back to you on that one after I have time to research it.

Posted in Geek Stuff | Leave a comment

Fanfiction – Harry Potter & the Curse’s Cure

How about a Harry Potter fic that, despite being a blend of various clichés, was still interesting enough for me to re-read it?

Harry Potter & the Curse’s Cure by Dragon-Raptor is a harem fic where Hermione and two other girls sleep with Harry at the beginning of the story arc. It has Slytherin House being written as what I call “implied suffering porn”. It has Weasley bashing. It has a manipulative “road to hell is paved with good intentions” Dumbledore. It has some kind of soul-mate bond. It has a veela bond. It has Lily Potter not actually being dead. It gives Harry a massive Potter Estate. It kills off family members of some of Harry’s friends in a rather formulaic way. It has OCs who threaten Voldemort’s significance as a villain. It has an unprofessional use of an ampersand in its title.

…and, despite all of that, it does well enough and has enough novel elements that I remembered it, sought it out again, and re-read it.

It justifies and redeems Ginny… but not in too “happily ever after” a way. It has things getting bad enough that it shocks Dumbledore into re-evaluating himself… yet he struggles with slipping back into old habits and modes of thought. It gives Luna a cat-sized pet dragon. It actually has some entertainingly novel ideas for what the Potter estate encompasses. It gives Hermione an uncle freshly retired from the military who is an enjoyable character in his own right. It gives Lily Potter a “not dead” setup that I’ve not seen done before or since. …and, in general, it starts out tolerable and progresses into being very engaging.

All in all, I’d give it a 4 out of 5 (+1 on a scale of -2 to +2) and say that, once it gets going, both the harem element and the “bringing the muggles in on it” element feel as if they took some small amount of inspiration from Effects and Side-Effects and benefited from doing so.

That said, I do wish it didn’t take its dragon lore from How to Train Your Dragon. When Harry’s animagus form is revealed to be a Night Fury, the fact that it’s a random detail borrowed from another fandom (that’s not being crossed over with) shouts “Hey! Author laziness here!” to me, which needlessly wears on the story’s sense of immersion.

Posted in Fanfiction | 2 Comments

Fanfiction – Delenda Est

Aaand back to reviewing things I liked enough to re-read.

Today’s review is Delenda Est by Lord Silvere and Claihm Solais, a Harry Potter fic that’s probably the only Harry-Bellatrix story which held my interest beyond the plot synopsis.

It’s a time-travel fic, in which a Bellatrix Lestrange who has fallen out of Voldemort’s favour convinces a captured Harry Potter to assist her in suicide and accidentally sends him back in time to her Hogwarts years… in just the right place to pique her younger self’s interest.

Naturally, Harry’s desire to hold onto his animosity toward Bellatrix Lestrange means he doesn’t want anything to do with her, but, through persistence and making herself invaluable to him, she ever-so-slowly drags details out of him.

In a more consequential vein, Harry also finds himself aligned with the less radical patriarchs of the Black and Malfoy family, who were assassinated in some unspecified manner in the original timeline. (While, at the same time, having his heroic tendencies interfering with his desire to maintain a low profile.)

Larger-scale events begin to move in chapter 12, with the first encounter with Lord Voldemort before his rise, when Harry finally tells Bellatrix of her potential self’s allegiance and suicide and, armed with a justification for his knowledge, finally informs his “backers” of Voldemort’s existence and motives.

(On a related note, when Harry suggests Tom as the “alias” to call Lord Voldemort by, it made me wish there was a third central character named Richard so the “Tom, Dick, and Harry” actual-name-as-an-alias set could be complete.)

I’m not sure I want to spoil exactly how the rest unfolds, but I will say that the story progresses in recognizably distinct phases and, after a novel or so worth of text, Harry and Bellatrix wind up taking turns playing vigilante apparition against the death eaters.

That said, I can say that the story is split into two parts, with the second part taking place after an accidental re-activation of the method of time-travel sends Harry and Bellatrix “back” to the (now changed) future where the Potters are still alive.

Writing-wise, the story feels like a fairly standard 4 out of 5. It has few scenes which have a chance of sticking in your memory (one being when James Potter gets mad at some over-eager vanishing ink), and it does rely on a couple of spells that one could argue to be OP, but it has no trouble keeping me reading.

Nonetheless, for all the OCs that it uses and canon characters it has to flesh out or reinterpret, I quite like their characterization in most cases, so this is certainly a case of “standard” being much more than merely “average”.

Unfortunately, the catch is that “in most cases”. I’m not overly fond of the Potter and Black children post-timeskip, compared to the Black and Malfoy elders pre-timeskip, and, given that I’ve seen others make generalized complaints about the direction the story went in its latter half, I can conclude I’m not alone in disliking how the timeskip changed things. (I find Rose insufferable in a way I never felt from Hermione and the others just bore me.) Nonetheless, I did still read to the end despite that.

In total, the story clocks in at 392,449 words, it’s complete, and there is a sequel in progress. (The sequel’s plot has a bit more of a “pulp fiction” feel to it but it more than makes up for it by having three of the main characters be master pranksters who provide many more humorous scenes than in the first story. I’d give said sequel a 4.5 out of 5 for how many times it made me laugh.)

Posted in Fanfiction | 1 Comment

Fanfiction – Stupid Portal

Continuing on last week’s theme of decent Buffy the Vampire Slayer fics that I happened to be reading, this is another crossover incorporating Star Trek.

Stupid Portal by elementalv

However, this time, Buffy and Spike wind up on Picard’s Enterprise because the three evil nerds (Warren Mears and co.) essentially want their own private crossover TV show to spy on.

What makes this story noteworthy is its blend of character and setting, with a story developing around a prophecy that Buffy and several others will save a race of demons who fled to the Star Trek universe. (Where their use of magic was written off as a weak variant of the kinds of powers used by entities like the Q.)

On the character front, having Spike stuck with Buffy on the Enterprise, later joined by Giles, allows for some engaging focus on Spike’s character and his relationship with Buffy. On the setting front, we get to see the Slayer spirit’s backstory fleshed out and explored with Buffy being the avatar of a primal goddess named Sendaru.

I also like the three original characters that the story introduces: Lieutenant Meg Burns (who volunteers to be a sparring partner for Buffy while she’s learning swordplay), Sendaru, and DB (short for Data’s Bastard), a program cobbled together by Data from Federation and Borg algorithms who became self-aware without anyone noticing. (I especially enjoyed the scenes with DB.)

When it comes to more specific events, I found it very entertaining to see Buffy tell Q exactly what her opinion is of those who associate with the Powers That Be.

That said, I do see two flaws:

First, until the last scene left the story feeling complete, I was wishing quite strongly that there was more time spent exploring DB as a character. A minor nit-pick, but something that could have been avoided with more skill.

Second, while it does peter out pretty quickly and it does have its uses to the story in the middle, having Worf develop an unrequited crush on Buffy in the omitted pre-story portion after Buffy and Spike fall through the portal feels like sloppy writing. (Though, once I made peace with how it was set up, the areas where it came into play turned out fine.)

All in all, given the work put into the characterization and backstory (especially DB and Sendaru) and how the ending left me feeling, I’d rate this around 4.7 out of 5. The final scene even felt sort of like the ending of a Buffy episode.

Posted in Fanfiction | Leave a comment

Sneaking Around The Urge to Procrastinate

About four years ago, I wrote about an epiphany I had while learning Prolog.

Well, now I’m learning Free Pascal as a more typesafe alternative to C for programming my DOS-based retro-gaming PC and a bit of supplementary insight came to mind:

Specifically, my impulses called this exercise pointless… yet it’s no different from the Rust exercises on Exercism that I’ve been using to make up for any skill biases in my hobby projects.

When I realized that, I also realized how to get around that kind of impulse: Saying something is pointless is fundamentally the same as using “What’s the point?” as a rhetorical question… but that question doesn’t have to be rhetorical.

If you ask yourself “What’s the point?”, you can answer “To prove you know it” to trick your baser desires into “conversing”. When they “say” that they do already know it, you can “respond” with “OK, then prove it. Explain, step-by-step, how you’d implement this.”

It was at that point that I realized why I was so resistant to it. The tutorial hadn’t covered collections (ie. arrays, vectors, etc.) yet, and, because I knew Pascal had them, my lazy brain was trying to distract me from how long it has been since I’ve had to group sequences without arrays, generators, slicing, or other fancy functional constructs.

Well, once I realized what my brain was trying to brush under the rug, it was easy to take the newly revealed challenge by the horns and then end with “OK, you say that algorithm is a solution… Translating it to code is the work of a minute or two at most. Translate it to Pascal and prove it to me.”

So, what is today’s lesson? If you don’t want to do something because you think you already know how, try demanding an explanation of yourself. Maybe it really is that easy and you’ll convince yourself to just get it over with, or maybe your intuition is trying to keep you from realizing where the real effort lies. Either way, separating the actual mental work from the physical/digital manifestation of it may be all it takes to trick your lazy brain into getting stuff done.

Posted in Web Wandering & Opinion | Leave a comment

Fanfiction – Infinity Box

It’s been a busy week and I didn’t have time to do my usual “go back and re-read something which stuck with me as good” routine, so here’s something reasonably worthwhile from my fresh reads:

Infinity Box by HMaxMarius

It’s a response to Zaion’s Ship of the Line challenge and, for those not familiar with it, the challenge has to do with writing Buffy the Vampire Slayer fanfiction in which, rather than just some knowledge of the French language, some kind of sci-fi ship sticks around when everything is over. …such fics are a bit of a guilty pleasure for me.

While I don’t have time to build a recommendation list, Infinity Box is one I happened to read over the weekend which is complete enough and decent enough to serve as a good sampler. (If you want more, the challenge page contains a listing of responses.)

As with all Ship of the Line stories, it begins with the events of Buffy the Vampire Slayer, Season 2, Episode 6 (“Halloween”), in which everyone gets temporarily turned into their costumes by Ethan Rayne’s idea of a prank. In this story, however, the relevant characters decided to dress to a Star Trek: The Next Generation theme instead.

The fic then does two things which aren’t specified in the challenge, but most fics seem to wind up doing:

  • What exists feels like just the first act in a longer story, though, unlike so many, it’s actually complete.
  • It’s a crossover with Stargate: SG-1 in which they ally themselves with the SGC while maintaining political independence.

Notice that I said most fics wind up doing that. If you want something more complete, there are a handful of things that progressed further and you can choose the “Length (Desc)” sort order on the listing to find them.

The writing in Infinity Box is generally good as far as Halloween fics go (they have a whole filterable tag to themselves on Twisting the Hellmouth), HMaxMarius is smart enough to do as little rehashing of canon events as possible, and I’d give it a 4 out of 5, which would be equivalent to +1 on a scale from -2 to +2.

In other words, it’s about average for what I’d find on the Fanfiction.net favourites list of someone like me. (Someone who has taste but also isn’t stingy with their faves.)

On a related note, given that this and alternative Mass Effect first contact fics (which don’t waste time following boring human soldiers on the ground) are both guilty pleasures of mine, can anyone suggest another class of fic I might enjoy?

Posted in Fanfiction | Leave a comment