Wednesday, 8 May 2013

Score Conditions

Something I've been thinking about with respect to 868-HACK (yes I'm still working on it, the new version has a new name, I don't know when it will be done) is how players can approach score in a game differently depending on their goals, even when the score system itself is fixed. Some of this comes from conversations with @rocketcatgames and @stiknork.

Zaga-33 has a very minimal score system. The main focus is on the win condition of killing the end-boss; the score essentially measures progress towards that condition (by counting which dungeon level you've reached), with a bonus when you do win for how efficiently you achieved it (by counting items remaining in your inventory). I discussed this in a previous post (which also touched on some of the other things I'm writing about here).
Before you've completed the game, playing for score is quite effective: it helps you measure your progress towards the ultimate goal. But it turned out that once you can complete it, playing for score wasn't interesting for very long. Whether you could get a high score depended largely on random factors: whether items and level configurations were favourable. Then once you'd achieved the maximum possible score, there was nothing further to achieve. (This is why I didn't include an online scoreboard, see another post.)
But that wasn't the end of it. @rocketcatgames found another way to play: trying to get a streak of as many wins as possible in a row. I'd tried to balance the game to almost always be completable, and he put that to the test. So I patched in a streak counter that shows up when you win the game, to keep track of this for him (and anyone else who wanted to try that challenge).
In one of the previous posts I talked about how taking a very coarse measurement for score helps to average out the effects of randomness; here we have the coarsest measurement of all - win or lose - added up over multiple plays, turning out to be the best way to display skill.
The interesting thing that came out of this for me was: playing to get the maximum score and playing to get the longest streak required subtly different priorities - similar skills, but applied differently. To maximise your score requires taking reckless risks and succeeding at them, while streaking requires minimising risk as much as possible. The conservative player uses an unidentified item under controlled conditions, where the consequences won't be too bad if it turns out not to be useful; the risky player gets into a situation where only one item will do and then guesses right - or dies, which is okay because they wouldn't have gotten the top score anyway.

Consider the sequence of scores (or wins/losses) across multiple games, instead of just a single number from one game. One way to interpret that sequence is to just keep the biggest one in it, the "high score". That's fine, but there are other possibilities:
Counting streaks (where there's no win condition, you could look at streaks of scores larger than N).
Average score - Drop7 tracks this, and it affects how it's played (although it's a bit broken because you can cancel a game before the end without it contributing to the average). This is potentially really deep because you have to work out to what extent the higher scores you may get by taking extreme risks balance out the lower scores you get when those chances don't work out. In a game with an exponential score system it might be worth getting nothing most of the time in exchange for the occasional zillion-point game - or it might not.
Also there are different types of average you could take: median, geometric mean, harmonic mean... Or you could look at average streaks, or something weirder: what if the ideal way to display your skill at a game is to lose every seventh time you play and win the rest, or to guarantee that each score is higher than the last - a streak of increasing (or decreasing!) scores?
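To make the different readings of a score sequence concrete, here is a minimal Python sketch. The function names and the two sample players are made up for illustration; nothing here comes from Drop7 or any other real game. It only shows that the same sequence can look good or bad depending on which summary you pick.

```python
from statistics import mean, median, geometric_mean, harmonic_mean

def longest_streak(scores, threshold):
    """Longest run of consecutive scores strictly above `threshold`."""
    best = current = 0
    for s in scores:
        current = current + 1 if s > threshold else 0
        best = max(best, current)
    return best

def longest_increasing_run(scores):
    """Longest run of games in which each score beats the one before it."""
    best = current = 0
    previous = None
    for s in scores:
        current = current + 1 if previous is not None and s > previous else 1
        best = max(best, current)
        previous = s
    return best

def summarise(scores, threshold):
    """Several ways of boiling a sequence of scores down to one number.

    Assumes a non-empty list of positive scores; the geometric mean
    is only meaningful for positive values.
    """
    return {
        "high_score": max(scores),
        "mean": mean(scores),
        "median": median(scores),
        "geometric_mean": geometric_mean(scores),
        "harmonic_mean": harmonic_mean(scores),
        "longest_streak": longest_streak(scores, threshold),
        "longest_increasing_run": longest_increasing_run(scores),
    }

# Two hypothetical players: one steady, one who gambles on a big payoff.
steady = [10, 10, 10, 10]
gambler = [1, 1, 1, 100]

for name, scores in (("steady", steady), ("gambler", gambler)):
    print(name, summarise(scores, threshold=5))
```

With these sample numbers the gambler comes out ahead on high score and arithmetic mean, while the steady player comes out ahead on median, geometric mean, harmonic mean and streak length. Which of those the game chooses to display decides which of the two is "better".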

Measuring streaks of wins is pretty similar to measuring your win percentage - a high proportion of wins implies long streaks; long streaks could come from playing badly but many, many times, but are most likely to come from being able to win reliably. So win percentage might be a slightly more accurate way of expressing skill, but I prefer to look at streaks because they feel more exciting - there's a psychological difference. Tension builds as you win multiple games in a row, and you may change your style to be more conservative to try not to break the streak. "I won 30 times in a row" sounds more impressive than "I win 83% of the time". (Also the average becomes hard to change once you've played a lot - though this can be solved by averaging across a rolling window of the last N plays.)
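The three measurements mentioned here (overall win percentage, longest streak, and a rolling window of the last N plays) can be sketched in a few lines of Python. This is only an illustration with made-up function names, where a history is a list of booleans (True for a win); it is not how any of the games discussed actually track results.

```python
from collections import deque

def win_rate(results):
    """Fraction of games won over the whole (non-empty) history."""
    return sum(results) / len(results)

def longest_win_streak(results):
    """Length of the longest run of consecutive wins."""
    best = current = 0
    for won in results:
        current = current + 1 if won else 0
        best = max(best, current)
    return best

def rolling_win_rate(results, window):
    """Win rate over the most recent `window` games, after each game.

    Early entries average over fewer games, until the window fills up.
    """
    recent = deque(maxlen=window)  # old results fall off automatically
    rates = []
    for won in results:
        recent.append(won)
        rates.append(sum(recent) / len(recent))
    return rates

history = [True, True, False, True]
print(win_rate(history))             # 0.75
print(longest_win_streak(history))   # 2
print(rolling_win_rate(history, 2))  # [1.0, 1.0, 0.5, 0.5]
```

The rolling window keeps the number responsive: a single loss moves it just as much on the thousandth game as on the tenth, which the lifetime average cannot do.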

A similar consideration comes up in multiplayer games: are you trying to maximise your chances of being in first place, or maximise your expected position among the players? Often you get situations where anyone could attack the player in the lead to bring them down, but it would cost them their own position and give the game to someone else (a form of kingmaking) - and sparks can fly when players have different implicit ideas about what they should be playing for. And when there's a score, if you're behind do you take risks to try to win or do you aim to maximise your score to lose "by less"? Does the absolute value of your score mean anything at all, or is it just about how it compares to other players? Tournament structures around the game can answer this one way or another - in an elimination tournament you might just want to be first (or to not be last), whereas in Poker you typically only care about your score.

SpaceChem does something interesting with its scores, which the developer spoke about at GDC this year. It offers multiple criteria by which your solution is evaluated, so as well as completing levels you can try to improve your solution along each of these axes. But the criteria aren't independent: improving on one axis will cost you on the others. One really clever side-effect of this he pointed out: optimising for one type of score means you're likely to be below average on others, but all scores are added to all leaderboards regardless of which one the player was aiming at, which means everyone can get to be above average on one of them.

868-HACK, like Zaga-33, has both a score and a binary win condition. And, also like Zaga-33, it's possible to achieve the win condition almost every time with cautious skilled play. But unlike it, your score is not closely tied to the win condition at all. You might get 24 points and then die in the first sector, or get to the end with no points at all. There's often a choice between the two - getting points is risky and reduces your chances of survival; sometimes you can guarantee a higher score if you don't try to get out *ALIVE* as well.
The first thing I did, which is in the 7-day version, was to order the scoreboard so that any winning score - even 0 - is above any losing score. This kind of says that your score means nothing if you're dead. And this is interesting: it's a deeper challenge to figure out exactly how many points you can safely get to the exit with than to just grab the highest numbers you can find.
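That ordering rule is easy to express as a two-part sort key: first whether the run was won, then the score. The sketch below is a hypothetical Python illustration (the `Run` record and `rank` function are invented names), not the game's actual scoreboard code.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Run:
    player: str
    score: int
    won: bool

def rank(runs):
    """Order runs so that every win beats every loss, then by score."""
    # Tuples compare element by element, and True sorts above False,
    # so sorting in descending order puts all wins first.
    return sorted(runs, key=lambda r: (r.won, r.score), reverse=True)

board = rank([
    Run("a", 24, won=False),  # lots of points, but died
    Run("b", 0, won=True),    # escaped with nothing
    Run("c", 12, won=True),
])
for run in board:
    print(run.player, run.score, "win" if run.won else "loss")
```

Here the 0-point win still ranks above the 24-point death, which is exactly the "your score means nothing if you're dead" reading.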
But over time it became clear that it suffered from a similar problem to Zaga-33: getting the highest scores was largely a matter of luck. I already knew a solution: streaking technology. But since this time scoring and winning are not coupled, the length of a streak alone wasn't enough: instead I'm tracking the cumulative score. It presents a difficult problem: how many points should you get in each game to maximise your score across a streak? This works really well.
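The post doesn't spell out the exact rules for the cumulative streak score, so the following Python sketch rests on assumptions: a streak is a run of consecutive wins, its value is the sum of the points from those wins, and a loss resets the running total to zero without contributing its own points. The function name is invented for illustration.

```python
def best_streak_score(games):
    """Highest cumulative score reached within any unbroken run of wins.

    `games` is a sequence of (score, won) pairs in the order played.
    Assumption: a loss ends the streak and its points are discarded.
    """
    best = current = 0
    for score, won in games:
        current = current + score if won else 0
        best = max(best, current)
    return best

# A cautious player banks a few points every game and keeps winning.
cautious = [(6, True)] * 5
# A greedy player grabs more per game but dies in the middle.
greedy = [(20, True), (20, False), (20, True)]

print(best_streak_score(cautious))  # 30
print(best_streak_score(greedy))    # 20
```

Under these assumed rules every extra point taken in one game is weighed against the risk of throwing away everything banked so far, which is the tension the question above describes.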
The expected sequence of mastery is: trying to get to the end, trying to get a high score, trying to get a high streak score. Note that of course I don't expect all players to be interested in this - I'm perfectly happy if someone stops at an earlier tier. If you just get to the end and are satisfied with that, that's completely fine, but if anyone wants to keep on playing then there's a greater challenge to measure themselves against.

When putting a score in a game, don't just say "try to get the biggest number" and be done with it. Consider whether it's a score that makes sense to compare against other people, or if it's (like Zaga-33's) better just as a measure of personal progress. Consider the context around the score: how different goals across the sequence of repeated plays can shape how players approach each individual play. And when you find the most interesting goal to aim for, consider how to present it to players. Show people a number and someone will care about it and try to make it bigger. Be open to players inventing their own approaches; a game can accommodate different styles of play driven by different goals.


  1. Ooh, cumulative streak score sounds nice. I'm glad you're not measuring *all* scores, as that discourages casual play or even just letting a friend have a go.

    I have only two victories in 868 - one was my second or third game, the other my last one. If someone got a higher score then I'd likely play again, but I don't have a huge incentive otherwise. Kinda sad I guess, since I do enjoy just playing the game, but I suppose I burned myself out on it a bit using very scummy and boring tactics to get absolute highest score. The steadier pace of trying for careful but maximised streaks would be far more enjoyable.

  2. Nice thoughts, thanks.

    Another interesting thing about winning streaks in roguelikes is they can act as proof that a game is 'easy enough', even if it might seem very difficult indeed.

    It was mrivan's 23 win streak in Nethack that gave me the inspiration required to change my play-style and ascend that game for the first time.

  3. I'm glad I'm not obsessive about high scores because going for streaks sounds like it would explode my heart. That said, very interesting post and definitely food for thought.

  4. Another thing to consider in multiplayer competitive games is how scoring affects play styles, especially in tournament play. At the first PAX where I showed SpyParty, we did a little tournament on the last day, and we assigned points to Spy missions, thinking we needed to give some score to a Spy who accomplished some stuff before running out of time. Well, this completely warped the way people played the game, they'd leave the AI in control and run the clock down and then pop out and do a single mission right at the end, and stuff like that. I realized that simple win/loss was much more true to the aesthetics and design goals of the game, especially if I'd spent a ton of time balancing for win/lose, so tournaments since then have simply been best n of m games, no score at all.

  5. Yeah, there's a sense in which getting these incentives right matters way more for multiplayer games. In 1p if you can gain an advantage by grinding (or otherwise playing in a tedious way) you're only wasting your own time. In MP all incentives are amplified by the desire to beat your opponent(s), players are less likely to avoid "un-fun" strategies if they think they can get ahead with them, and it wastes everyone's time.

    Worse, this applies as long as players *believe* something is to their advantage, whether or not it actually is.