
Level 1: Intro to Game Balance

This Week's Topic

This week is probably going to start a bit slow for those of you who are experienced game designers (or those who are hoping to dive deep into the details). Instead, I want to use this week mostly to get everyone into the mindset of a game designer presented with a balance task, and to lay out some basic vocabulary terms so we can communicate about game balance properly. You can think of this week like a tutorial level. The difficulty and pacing of this course will ramp up in the following weeks.

What is Game Balance?

I would start by asking the question "what is game balance?" but I answered it in the teaser video already. While perhaps an oversimplification, we can say that game balance is mostly about figuring out what numbers to use in a game.

This immediately brings up the question: what if a game doesn't have any numbers or math involved? The playground game of Tag has no numbers, for example. Does that mean the concept of game balance is meaningless when applied to Tag? The answer is that Tag does in fact have numbers: how fast and how long each player can run, how close the players are to each other, the dimensions of the play area, how long someone is "it." We don't really track any of these stats because Tag isn't a professional sport, but if it were, you'd better believe there would be trading cards and websites with all kinds of numbers on them!

So, every game does in fact have numbers (even if they are hidden or implicit), and the purpose of those numbers is to describe the game state.

How do you tell if a game is balanced?

Knowing if a game is balanced is not always trivial. Chess, for example, is not entirely balanced: it has been observed that there is a slight advantage to going first. However, it hasn't been definitively proven whether this imbalance is mechanical (that is, there is a bona fide tactical/strategic advantage to the first move) or psychological (players assume there is a first-move advantage, so they trick themselves into playing worse when they go second). Interestingly, this first-move advantage disappears at lower skill levels; it is only observed at championship tournaments. Keep in mind that this is a game that has been played, in some form, for thousands of years. And we still don't know exactly how unbalanced it is!

In the case of Chess, a greater degree of player skill makes the game more unbalanced. In some cases it works the other way around, where skilled players can correct an inherent imbalance through clever play. For example, in Settlers of Catan, much of the game revolves around trading resources with other players. If a single player has a slight gameplay advantage due to an improved starting position, the other players can agree to simply not trade with that player for a time (or only offer unfair trades at that player's expense) until the starting positions equalize. This would not happen in casual games, as the players would be unable to recognize a slight early-game advantage; at the tournament level, however, players would be more likely to spot an inherent imbalance in the game and act accordingly.

In short, game balance is not an easy or obvious task. (But you probably could have figured that out, given that I'm going to talk for ten straight weeks on the subject!)

Towards a critical vocabulary

Just like last summer, we need to define a few key terms that we'll use as we talk about different kinds of balance.

Determinism

For our purposes, I define a deterministic game as one where, if you start with a given game state and perform a particular action, it will always produce the same resulting new game state. Chess and Go and Checkers are all deterministic. You never have a situation where you move a piece, but due to an unexpected combat die roll the piece gets lost somewhere along the way. (Unless you're playing a non-deterministic variant, anyway.)

Candyland and Chutes & Ladders are not deterministic. Each has a random mechanism for moving players forward, so you never know quite how far you'll move next turn. Poker is not deterministic, either. You might play several hands where you appear to have the same game state (your hand and all face-up cards on the table are the same), but the actual results of the hand may be different because you never know what the opponents' cards are. Rock-Paper-Scissors is not deterministic, in the sense that any given throw (like Rock) will sometimes win, sometimes lose, and sometimes draw, depending on what the opponent does.

Note that there are deterministic elements to all of these games. For example, once you have rolled your die in Chutes & Ladders, called the hand in Poker, or made your throw in Rock-Paper-Scissors, resolving the turn is done by the (deterministic) rules of the game. If you throw Rock and your opponent throws Paper, the result is always the same.

Non-determinism

The opposite of a deterministic game is a non-deterministic game. The easiest way to illustrate the difference is by comparing the arcade classic Pac-Man with its sequel, Ms. Pac-Man. The original Pac-Man is entirely deterministic. The ghosts follow an AI that is purely dependent on the current game state. As a result, following a pre-defined sequence of controller inputs on a given level will always produce the exact same results, every time. Because of this deterministic property, some players were able to figure out patterns of movements; the game changed from one of chasing and being chased to one of memorizing and executing patterns. This ended up being a problem: arcade games required that players play for 3 minutes or less, on average, in order to remain profitable. Pattern players could play for hours. In Ms. Pac-Man, an element of non-determinism was added: sometimes the ghosts would choose their direction randomly. As a result, Ms. Pac-Man returned the focus of gameplay from pattern execution to quick thinking and reaction, and (at the championship levels, at least) the two games play quite differently.

Now, this is not to say that a non-deterministic game is always better. Remember, Chess and Go are deterministic games that have been played for thousands of years; as game designers today, we count ourselves lucky if our games are played a mere two or three years from the release date. So my point is not that one method is superior to the other, but rather that analyzing game balance is done differently for deterministic versus non-deterministic games. Deterministic games can theoretically undergo some kind of brute-force analysis, where you look at all the possible moves and determine the best one. The number of moves to consider may be so large (as with the game Go) that a brute-force solve is impossible, but in at least some cases (typically early-game and end-game positions) you can do a bit of number-crunching to figure out optimal moves. Non-deterministic games don't work that way. They require you to use probability to figure out the odds of winning for each move, with the understanding that any given playthrough might give a different actual result.

Solvability

This leads to a discussion of whether a game is solvable. When we say a game is solvable, in general, we mean that the game has a single, knowable best action to take at any given point in play, and it is possible for players to know what that move is. In general, we find solvability to be an undesirable trait in a game. If the player knows the best move, they aren't making any interesting decisions; every decision is obvious. That said, there are lots of kinds of solvability, and some kinds are not as bad as others.

Trivial solvability

Normally, when we say a game is solvable in a bad way, we mean that it is trivially solvable: it is a game where the human mind can completely solve the game in real time. Tic-Tac-Toe is a common example of this; young children who haven't solved the game yet find it endlessly fascinating, but at some point they figure out all of the permutations, solve the game, and no longer find it interesting.

We can still talk about the balance of trivially solvable games. For example, given optimal play on both sides, we know that Tic-Tac-Toe is a draw, so we could say in this sense that the game is balanced. However, if you look at all possible games of Tic-Tac-Toe that could be played, you'll find that there are more ways for X to win than O, so you could also say it is unbalanced because there is a first-player advantage (although that advantage can be negated through optimal play by both players). These are the kinds of balance considerations for a trivially solvable game.

Theoretical complete solvability

There are games like Chess and Go which are theoretically solvable, but in reality there are so many permutations that the human mind (and even computers) can't realistically solve the entire game. Here is a case where games are solvable but still interesting, because their complexity is beyond our capacity to solve them. It is hard to tell if games like this are balanced, because we don't actually know the solution and don't have the means to actually solve it. We must rely on our game designer intuition, the (sometimes conflicting) opinions of expert players, or tournament stats across many championship-level games merely to get a good guess as to whether the game is balanced. (Another impractical way to balance these games is to sit around and wait for computers to become powerful enough to solve them within our lifetimes, knowing that this may or may not happen.)

Solving non-deterministic games

You might think that only deterministic games can be solved. After all, non-deterministic games have random or unknown elements, so optimal play does not guarantee a win (or even a draw). However, I would say that non-deterministic games can still be solved; it's just that the solution looks a lot different: a solution in this case is a set of actions that maximize your probability of winning.

The card game Poker provides an interesting example of this. You have some information about what is in your hand, and what is showing on the table. Given this information, it is possible to compute the exact odds of winning with your hand, and in fact championship players are capable of doing this in real time. Because of this, all bets you make are either optimal, or they aren't. For example, if you compute you have a 50/50 chance of winning a $300 pot, and you are being asked to pay $10 to stay in, that is clearly an optimal move for you; if you lost $10 half of the time and won $300 the other half, you would come out ahead. In this case, the solution is to make the bet.

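To make the arithmetic concrete, here is a minimal sketch of that expected-value calculation. The function and the dollar amounts are just the example above, not a general Poker solver; real pot odds also shift as future bets go into the pot.

```python
# Sketch of the pot-odds arithmetic from the example above (illustrative numbers only).

def call_ev(win_probability: float, pot: float, cost_to_call: float) -> float:
    """Expected value of calling: win the pot some fraction of the time,
    lose the cost of the call the rest of the time."""
    return win_probability * pot - (1 - win_probability) * cost_to_call

# A 50/50 chance at a $300 pot, for a $10 call:
print(call_ev(0.5, 300, 10))   # 145.0 -- positive, so the "solved" play is to call
```
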
You might wonder, if Poker is solvable, what stops it from becoming a boring grind of players computing odds with a calculator and then betting or not based on the numbers? From a game balance perspective, such a situation is dangerous: not only do players know what the best move is (so there are only obvious decisions), but sometimes optimal play will end in a loss, effectively punishing a player for their great skill at odds computation! In games like this, you need some kind of mechanism to get around the problem of solvability leading to player frustration.

The way Poker does this, and the reason it's so interesting, is that players may choose to play suboptimally in order to bluff. Your opponent's behavior may influence your decisions: if the guy sitting across from you is betting aggressively, is it because he has a great hand and knows something you don't know? Or is he just bad at math? Or is he good at math, and betting high with a hand that can't really win, trying to trick you into thinking his hand is better than it really is? This human factor is not solvable, but the solvable aspects of the game are used to inform players, which is why at the highest levels Poker is a game of psychology, not math. It is these psychological elements that prevent Poker from turning into a game of pure luck when played by skilled individuals.

Solving intransitive games

"Intransitive games" is a fancy way of saying "games like Rock-Paper-Scissors." Since the outcome depends on a simultaneous choice between you and your opponent, there does not appear to be an optimal move, and therefore there is no way to solve it. But in fact, the game is solvable; it's just that the solution looks a bit different from other kinds of games. The solution to Rock-Paper-Scissors is a ratio of 1:1:1, meaning that you should throw about as many of each type as any other. If you threw more of one type than the others (say, for example, you favored Paper), your opponent could throw the thing that beats your preferred throw (Scissors) more often, which lets them win slightly more than average. So in general, the solution to RPS is to throw each symbol with equal frequency in the long term.

Suppose we made a rules change: every win with Rock counts as two wins instead of one. Then we would have a different solution where the ratios are uneven. There are mathematical ways to figure out exactly what this new ratio would be, and we will talk about how to do that later in this course (a small sketch of one such calculation follows below). You might find this useful, for example, if you're making a real-time strategy game with some units that are strong against other unit types (in an intransitive way), but you want certain units to be more rare and special in gameplay than others. So, you might change the relative capabilities to make certain units more cost-efficient or more powerful overall, which in turn would change the relative frequencies of each unit type appearing (given optimal play).

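As a preview of that math, here is a minimal sketch of how the "Rock wins count double" variant could be solved numerically. It assumes a zero-sum scoring model (a normal win worth 1 point, a Rock win worth 2); a different payoff model would give different ratios.

```python
# Sketch: find the mixed strategy where every throw does equally well against it.
# Assumes zero-sum scoring with normal wins worth 1 and Rock wins worth 2.
import numpy as np

# Row player's payoff for (my throw, opponent's throw); order: Rock, Paper, Scissors.
payoff = np.array([
    [ 0, -1,  2],   # Rock:     ties Rock, loses to Paper, double win vs Scissors
    [ 1,  0, -1],   # Paper:    beats Rock, ties Paper, loses to Scissors
    [-2,  1,  0],   # Scissors: double loss to Rock, beats Paper, ties Scissors
])

# In a symmetric zero-sum game, the optimal mix gives payoff 0 against every
# pure throw, and the frequencies sum to 1. Solve those four equations at once.
system = np.vstack([payoff, np.ones(3)])
target = np.array([0.0, 0.0, 0.0, 1.0])
mix, *_ = np.linalg.lstsq(system, target, rcond=None)

for throw, freq in zip(["Rock", "Paper", "Scissors"], mix):
    print(f"{throw}: {freq:.2f}")   # 0.25, 0.50, 0.25 under this payoff model
```

Under these assumptions the ratio works out to 1:2:1 (Rock:Paper:Scissors): making Rock wins more valuable makes opponents fear Rock, so Paper, not Rock, becomes the most common throw.
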
Perfect information

A related concept to solvability is that of information availability. In a game with perfect or complete information, all players know all elements of the game state at all times. Chess and Go are obvious examples. You might be able to see, then, that any deterministic game with perfect information is, at least theoretically, completely solvable.

Other games have varying degrees of incomplete information, meaning that each player does not know the entire game state. Card games like Hearts or Poker work this way; in these games, each player has privileged information where they know some things the opponents don't, and in fact part of the game is trying to figure out the information that the other players know. With Hearts in particular, the sum of player information is the game state; if players combined their information, the game would have perfect information.

Yet other games have information that is concealed from all of the players. An example of this is the card game Rummy. In this game, all players know what is in the discard pile (common information), each player knows what is in his or her own hand but no one else's hand (privileged information), and no player knows what cards remain in the draw deck or what order those cards are placed in (hidden information).

Trading card games like Magic: the Gathering offer additional layers of privileged information, because players have some privileged information about the possibility space of the game. In particular, each player knows the contents of their own deck but not their opponent's, although neither player knows the exact order of cards in their own draw pile. Even more interesting, there are some cards that can give you limited information on all of these things (such as cards that let you peek at your opponent's hand or deck), and part of the challenge of deck construction is deciding how important it is to gain information versus how important it is to actually attack or defend.

Symmetry

Another concept that impacts game balance is whether a game is symmetric or asymmetric. Symmetric games are those where all players have exactly the same starting position and the same rules. Chess is almost symmetric, except for that pesky little detail about White going first. Could you make Chess symmetric with a rules change? Yes: for example, if both players wrote down their moves simultaneously, then revealed and resolved the moves at the same time, the game would be completely symmetric (and in fact there are variants along these lines). Note that in this case, symmetry requires added complexity; you need extra rules to handle cases where two pieces move into or through the same square, or when one piece enters a square just as another piece exits it.

In one respect, you could say that perfectly symmetric games are automatically balanced. At the very least, you know that no player is at an advantage or disadvantage from the beginning, since they have the exact same starting positions. However, symmetry alone does not guarantee that the game objects or strategies within the game are balanced; there may still be certain pieces that are much more powerful than others, or certain strategies that are clearly optimal, and symmetry doesn't change that. Perfect symmetry is therefore not an easy way out for designers who want to make a balanced game.

The Metagame

The term "metagame" literally means "the game surrounding the game" and generally refers to the things players do when they're not actively playing the game but that still affect their chances to win their next game. Trading card games like Magic: the Gathering are a clear example of this: in between games, players construct a deck, and the contents of that deck affect their ability to win. Another example would be championship-level Poker or even world-tournament Rock-Paper-Scissors, where players analyze the common behaviors and strategies of their opponents. Professional sports have all kinds of things going on in between games: scouting, drafting, trading, training, and so on.

For games that have a strong metagame, balance of the metagame is an important consideration. Even if the game itself is balanced, a metagame imbalance can destroy the balance of the game. Professional sports are a great example. Here is a positive feedback loop that is inherent in any professional sport: teams that win more games get more money; more money lets them attract better players, which further increases their chance of winning more games. (With apologies to anyone who lives in New York, this is the reason everyone else hates the Yankees.) Other sports have metagame mechanics in place to control this positive feedback. American Football includes the following:

Drafts. When a bunch of players leave their teams to be picked up by other teams, the weakest team gets to choose first. Thus, the weakest teams pick up the strongest players each year.

Salary caps.

If there is a limit to how much players can make, it prevents a single team from being able to throw infinite money at the problem. Even weaker teams are able to match the max salary for a few of their players.

Player limits. There are a finite number of players allowed on any team; a good team can't just have an infinite supply of talent.

These metagame mechanics are not arbitrary or accidental. They were put in place on purpose, by people who know something about game balance, and they are part of the reason why, on any given Sunday, the weakest team in the NFL might be able to beat the strongest team.

From this, you might think that fixing the metagame is a great way to balance the game. Trading card games offer two examples of where this tactic fails.

First, let's go back to the early days of Magic: the Gathering. Some cards are rarer than others, and some of those rare cards ended up being flat-out better than their more common counterparts. Richard Garfield clearly thought that rarity itself was a way to balance the game. (In his defense, this was not an unreasonable assumption at the time. He had no way of knowing that some people would spend thousands of dollars on cards just to get a full set of rares, nor did he know that people would largely ignore the rules for ante, which served as an additional balancing factor.) Today, trading card game designers are more aware of this problem; while one does occasionally see games where "more rare = more powerful," players are (thankfully) less willing to put up with those kinds of shenanigans.

Second, TCGs have a problem that video games don't have: once a set of cards is released, it is too late to fix it with a patch if some kind of gross imbalance is discovered. In drastic cases they can restrict or outright ban a card, or issue some kind of errata, but in most cases this is not practical; the designers are stuck. Occasionally you might see a designer who tries to balance an overpowered card in a previous set by creating a counter-card in the next set. This is a metagame solution: if all the competitive decks use Card X, then a new Card Y that punishes the opponent for playing Card X gives players a new metagame option, but if Card Y does nothing else, it is only useful in the context of the metagame. This essentially turns the metagame into Rock (dominant deck), Paper (deck with counter-card), Scissors (everything else). This may be preferable to a metagame with only one dominant strategy, but it's not much better, and it mostly shifts the focus from the actual play of the game to the metagame: you may as well just show your opponent your deck and determine a winner that way. This is admittedly an extreme example, and there are other ways to work around an imbalance like this. The counter-card might have other useful effects. The game overall might be designed such that player choices during the game contribute greatly to the outcome, where the deck is more of an influence on your play style than a fixed strategy. Still, some games have gone so far as to print a card that says "When your opponent plays [specific named card], [something really bad happens to them]" with no other effect, so I thought this was worth bringing up.

Game balance versus metagame balance

In professional sports, metagame fixes make the game more balanced. In TCGs, metagame fixes feel like a hack. Why the difference? The reason is that in sports, the imbalance exists in the metagame to begin with, so a metagame fix for that imbalance is appropriate. In TCGs, the imbalance is either part of the game mechanics or individual game objects (i.e. specific cards); the metagame imbalances that result from this are a symptom and not the root cause. As a result, a metagame fix for a TCG is a response to a symptom, while the initial problem continues unchecked.

The lesson here is that a game balance problem in one part of a game can propagate to and manifest in other areas, so the problems you see during playtesting are not always the exact things that need to be fixed. When you identify an imbalance, before slapping a fix on it, ask yourself why this imbalance is really happening and what is actually causing it; and then what is causing that, and what is causing that, and so on, as deep as you can go.

Game Balance Made Easy, For Lazy People

I'm going to try to leave you each week with some things you can do right now to improve the balance of a game you're working on, and then some homework that you can do to improve your skills. Since we just talked about vocabulary (symmetry, determinism, solvability, perfect information, and the metagame) this week, there's not a lot to do, so instead I'm going to start by saying what not to do. If you're having trouble balancing a game, the easiest way to "fix" it is to get your players to do this for you.

One way to do this is auction mechanics. There is nothing wrong with auctions as a game mechanic, mind you (they are often very compelling and exciting), but they can be used as a crutch to cover up a game imbalance, and you need to be careful of that. Let me give an example of how this works. Suppose you're a designer at Blizzard working on Warcraft IV, and you have an Orcs-vs-Humans two-player game that you want to balance, but you think the Orcs are a little more powerful than the Humans (but not much). You decide the best way to balance this is to reduce the starting resources of the Orcs; if the Humans start with, say, 100 Gold, maybe the Orcs start with a little less. How much less? Well, that's what game balance is all about, but you have no idea how much less. Here's a solution: make players bid their starting Gold on the right to play the Orcs at the start of the game. Whoever bids the most loses their bid (and plays the Orcs with whatever Gold remains); the other player starts with the full 100 Gold and plays the weaker Humans. Eventually, players will reach a consensus and start bidding about the same amount of Gold, and this will make things balanced. I say this is lazy design because there is a correct answer here, but instead of taking the trouble to figure it out, you shift that burden to the players and make them balance the game for you. Note that this can actually be a great tool in playtesting. Feel free to add an auction in a case like this, let your testers come to a consensus of how much something is worth, then just cost it accordingly in the final version (without including the auction).

Here's another way to get players to balance your game for you: in a multiplayer free-for-all game, include mechanics that let the players easily gang up on the leader. That way, if one player finds a game imbalance, the other players can cooperate to bring them down. Of course, this brings other gameplay problems with it. Players may sandbag (play suboptimally on purpose) in order to not attract too much attention. Players who do well (even without rules exploits) may feel like the other players are punishing them for being good players. Kill-the-leader mechanics serve as a strong negative feedback loop, and negative feedback has other consequences: the game tends to take longer, early-game skill is not as much of a factor as late-game skill, and some players may feel that the outcome of the game is decided more by their ability to not be noticed than by their actual game skill. Again, there is nothing inherently wrong with giving players the ability to form alliances against each other, but doing it for the sole purpose of letting players deal with your poor design and balancing skills should not be the first and only solution.

Okay, is there anything you can do right now to improve the balance of a game you're working on? I would say: examine your game to see if you are using your players as a game balance crutch (through auctions, kill-the-leader mechanics, or similar). Try removing that crutch and see what happens. You might find out that these mechanics are covering up game imbalances that will become more apparent when they're removed.

When you find the actual imbalances that used to be obscured, you can fix them and make the game stronger. (You can always add your auctions or kill-the-leader mechanics back in later, if they are important to the gameplay.)

Homework

I'll go out on a limb and guess that if you're reading this, you are probably playing at least one game in your spare time. If you work in the game industry as a designer, you may be playing a game at your day job for research. Maybe you have occasion to watch other people play, either while playtesting your own game, or on television (such as watching a game show or a professional sports match). As you play (or watch) these games this week, don't just play or watch for fun. Instead, think about the actions in the game and ask yourself whether you think the game is balanced or not. Why do you think that? If you feel it's not, where are the imbalances? What are the root causes of those imbalances, and how would you change them if you wanted to fix them? Write down your thoughts if it helps.

The purpose of this is not to actually improve the game you're examining, but to give you some practice in thinking critically about game balance. It's emotionally easier to find problems in other people's games than in your own (even if the actual process is the same), so start by looking at the balance or imbalance in other people's games first.

Level 2: Numeric Relationships


This Week's Topic

This week, I'm going to talk about the different kinds of numbers you see in games and how to classify them. This is going to be important later, because you can't really know how to balance a game or how to choose the right numbers unless you first know what kinds of numbers you're dealing with. Sometimes, a balance change is as simple as replacing one kind of number with another, so understanding what kinds of numbers there are and getting an intuition for how they work is something we need to cover before anything else.

In particular, we're going to be examining relationships between numbers. Numbers in games don't exist in a vacuum. They only have meaning in relation to each other. For example, suppose I tell you that the main character in a game does 5 damage when he attacks. That tells you nothing unless you know how much damage enemies can take before they keel over dead. Now you have two numbers, Damage and Hit Points, and each one only has meaning in relation to the other. Or, suppose I tell you that a sword costs 250 Gold. That has no meaning until I tell you that the player routinely finds bags with thousands of Gold lying around the countryside, and then you know the sword is cheap. Or, I tell you that the player gets at most 1 Gold from winning each combat, and then it's really expensive. Even within a game, the relative value of something can change; maybe 250 Gold is a lot at the start of the game but pocket change at the end. In World of Warcraft, 1 Gold used to be a tidy sum, but today it takes tens or hundreds to buy the really epic loot. With all that said, what kinds of ways can numbers be related to each other?

Identity and Linear Relationships

Probably the simplest type of relationship, which math geeks would call an identity relationship, is where two values change in exactly the same way: add +1 to one value, and it's equivalent to adding +1 to the other. For game balance purposes, you can treat the two values as identical. You would think that in such a case you might just use a single value, but there are some cases where it makes sense to have two different values that just happen to have a one-to-one conversion. As an example, Ultima III: Exodus has Food, something that each character needs to not starve to death in a dungeon. You never get Food as an item drop, and can only buy it from food vendors in towns. Food decreases over time and has no other value (and cannot be sold or exchanged for anything else); its only purpose is to act as a continual slow drain on your resources. Each character also has Gold, something that they find while adventuring. Unlike Food, Gold doesn't degrade over time, and it is versatile (you can use it to bribe guards, buy hints, or purchase weapons, armor, or Food). While these are clearly two separate values that serve very different purposes within the game, each unit of Food costs 1 Gold (10 Food costs 10 Gold, 1000 Food costs 1000 Gold, and so on). Food and Gold have an identity relationship, although it is one-way in this case, since you can convert Gold to Food but not vice versa.

A more general case of an identity relationship is the linear relationship, where the conversion rate between two values is a constant. If a healing spell always costs 5 MP and heals exactly 50 HP, then there is a 1-to-10 linear relationship between MP and HP. If you can spend 100 Gold to gain +1 Dexterity, there's a 100-to-1 linear relationship between Gold and Dexterity. And so on.

Note that we are so far ignoring cases where a relationship is partly random (maybe that healing spell heals somewhere between 25 and 75 HP, randomly chosen each time). Randomness is something we'll get into in a few weeks, so we're conveniently leaving that out of the picture for now.

Exponential and Triangular Relationships

Sometimes, a linear relationship doesn't work for your game. You may have a relationship with either increasing or diminishing returns.

For example, suppose a player can pay resources to gain additional actions in a turn-based strategy game. One extra action might be a small boost, but three or four extra actions might be like taking a whole extra turn; it might feel like a lot more than 3 or 4 times as powerful as a single action. This is increasing returns: each extra action is more valuable than the last. You would therefore want the cost of each extra action to increase as you buy more of them.

Or, maybe you have a game where players have an incentive to spend all of their in-game money every turn to keep pace with their opponents, and hoarding cash has a real opportunity cost (that is, they miss out on opportunities they would have had if they'd spent it instead). In this case, buying a lot of something all at once is actually not as good as buying one at a time, so it makes sense to give players a discount for buying in bulk, as it were. Here we have diminishing returns, where each extra item purchased is not as useful as the last.

In such cases, you need a numeric relationship that increases or decreases its rate of exchange as you exchange more or less at a time. The simplest way to do this is an exponential relationship: when you add to one value, multiply the other one. An example is doubling: for each +1 you give to one value, double the other one. This gives you a relationship where buying 1, 2, 3, 4, or 5 of something costs 1, 2, 4, 8, or 16, respectively. As you can see, the numbers get really big, really fast when you do this. Because the numbers get prohibitively large very quickly, you have to be careful when using exponential relationships. For example, nearly every card in any collectible card game I've played that has the word "double" on it somewhere (as in, one card doubles some value on another card) ends up being too powerful. I know offhand of one exception, and that was an all-or-nothing gamble: it doubled your attack strength but then made you lose at the end of the turn if you hadn't won already! The lesson here is to be very, very careful when using exponentials.

What if you want something that increases, but not as fast as an exponential? A common pattern in game design is the triangular relationship. If you're unfamiliar with the term, you have probably at least seen this series: 1, 3, 6, 10, 15, 21, 28, ... That is the classic triangular pattern (so called because several ways to visualize it involve triangles). In our earlier example, maybe the first extra action costs 1 resource; the next costs 2 (for a running total of 3), the next costs 3 (for a total of 6), and so on. An interesting thing to notice about triangular numbers is what happens when you look at the difference between each successive pair. The difference between the first two numbers (1 and 3) is 2. The difference between the next two numbers (3 and 6) is 3. The next difference (between 6 and 10) is 4. So the successive differences are linear: they follow the pattern 1, 2, 3, 4, ... Triangular numbers usually make a pretty good first guess for increasing costs.

What if you want a decreasing cost, where something starts out expensive and gets cheaper? In that case, figure out how much the first one should cost, then make each one after that cost 1 less. For example, suppose you decide the first Widget should cost 7 Gold. Then try making the second cost 6 Gold (for a total of 13), the third 5 Gold (for a total of 18), and so on. Note that in this case, you will eventually reach a point where each successive item costs zero (or even a negative amount), which gets kind of ridiculous. This is actually a pretty common thing in game balance: if you have a math formula, the game balance will break at the mathematical extremes. The design solution is to set hard limits on the formula so that you never reach those extremes. In our Widget example above, maybe the players are simply prevented from buying more than 3 or 4 Widgets at a time.

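Here is a minimal sketch of those cost patterns, using the numbers from the text. The function names and the hard cap of 4 Widgets are illustrative choices rather than recommendations.

```python
# Sketch of the cost patterns described above.

def doubling_cost(n):       # cost of the nth purchase doubles: 1, 2, 4, 8, 16, ...
    return 2 ** (n - 1)

def triangular_total(n):    # running total of 1 + 2 + ... + n: 1, 3, 6, 10, 15, ...
    return n * (n + 1) // 2

def widget_total(count, first_cost=7, limit=4):
    """Decreasing costs (7, then 6, then 5, ...) with a hard cap so the
    price never hits zero; here we assume players may buy at most 4."""
    count = min(count, limit)
    return sum(first_cost - i for i in range(count))

print([doubling_cost(n) for n in range(1, 6)])      # [1, 2, 4, 8, 16]
print([triangular_total(n) for n in range(1, 6)])   # [1, 3, 6, 10, 15]
print([widget_total(n) for n in range(1, 6)])       # [7, 13, 18, 22, 22]
```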

Other Numeric Relationships

While linear and triangular relationships are among the most common in games, they are not the only ones available. In fact, there is an infinite number of potential numeric relationships. If none of the typical relationships work for your game, come up with your own custom relationship! Maybe you have certain cost peaks, where certain thresholds cost more than others because they have in-game significance. For example, if everything in your game has 5 hit points, there is actually a huge difference between doing 4 and 5 damage, so that 5th point of damage will probably cost a lot more than you would otherwise expect. You might have oscillations, where several specific quantities are particularly cheap (or expensive). You can create any ratio between two values that you want, but do so with some understanding of what effect it will have on play!

Relationships Within Systems

Individual values in a game usually exist within larger systems. By analyzing all of the different numbers and the relationships between them in a game's systems, we can gain a lot of insight into how the game is balanced. Let us take a simple example: the first Dragon Warrior game for the NES. In the game's combat system, you have four main stats: Hit Points (HP), Magic Points (MP), Attack and Defense. This is a game of attrition; you are exploring game areas, and every few steps you get attacked by an enemy. You lose if your HP is ever reduced to zero. How are all of these numbers related?

Random encounters are related to HP: each encounter reduces HP (you can also say it the other way: by walking around getting into fights, you can essentially convert HP into encounters). This is an inverse relationship, as more encounters means less HP. There's a direct relationship between HP and Defense: the more Defense you have, the less damage you take, which means your HP lasts longer. Effectively, increasing your Defense is equivalent to giving yourself a pile of extra HP. We see the same relationship between HP and Attack. The higher your Attack stat, the faster you can defeat an enemy. If you defeat an enemy faster, it has less opportunity to damage you, so you take less damage. Thus, you can survive more fights with higher Attack.

MP is an interesting case, because you can use it for a lot of things. There are healing spells that directly convert MP into HP. There are attack spells that do damage (hopefully more than you'd do with a standard attack); like a higher Attack stat, these finish combats earlier, which means they preserve your HP. There are buff/debuff spells that likewise reduce the damage you take in a combat. There are teleport spells that take you across long distances, so that you don't have to get in fights along the way; these again act to preserve your HP. So even though MP is versatile, virtually all of the uses for it involve converting it (directly or indirectly) into HP.

If you draw this all out on paper, you'll see that everything (Attack, Defense, MP, monster encounters) is linked directly to HP. As the loss condition for the game, the designers put the HP stat in the middle of everything! This is a common technique, making a single resource central to all of the others, and it is best to make this central resource either the win or loss condition for the game. Now, there's one additional wrinkle here: the combat system interacts with two other systems in the game through the monster encounters.

After you defeat a monster, you get two things: Gold and Experience (XP). These interact with the economic and leveling systems in the game, respectively.

Let's examine the leveling system first. Collect enough XP and you'll level up, which increases all of your stats (HP, MP, Attack and Defense). As you can see, this creates a feedback loop: defeating enemies causes you to gain a level, which increases your stats, which lets you defeat more enemies. And in fact, this would be a positive feedback loop that would cause the player to gain high levels of power very fast, if there weren't some kind of counteracting force in the game. That counteraction comes in the form of an increasing XP-to-Level relationship, so it takes progressively more and more XP to gain a level. Another counteracting force is player time; while the player could maximize their level by just staying in the early areas of the game beating on the weakest enemies, the gain is so slow that they are incentivized to take some risks so they can level a little faster.

Examining the economic system, Gold is used for a few things. Its primary use is to buy equipment which permanently increases the player's Attack or Defense, thus effectively converting Gold into extra permanent HP. Gold can also be used to buy consumable items, most of which mimic the effects of certain spells; thus you can (on a limited basis, since you only have a few inventory slots) convert Gold to temporary MP. Here we see another feedback loop: defeating monsters earns Gold, which the player uses to increase their stats, which lets them defeat even more monsters. In this case, what prevents this from being a positive feedback loop is that it's limited by progression: you have a limited selection of equipment to buy, and the more expensive stuff requires that you travel to areas that you are just not strong enough to reach at the start of the game. And of course, once you buy the most expensive equipment in the game, extra Gold doesn't do you much good.

Another loop linked to the economic system is that of progression itself. Many areas in the game are behind locked doors, and in order to open them you need to use your Gold to purchase magic keys. You defeat monsters, get Gold, use it to purchase Keys, and use those Keys to open new areas which have stronger monsters (which then let you get even more Gold and XP). Of course, this loop is itself limited by the player's stats; unlocking a new area with monsters that are too strong to handle does not help the player much.

How would a designer balance things within all these systems? By relating everything back to the central value of HP, and then comparing. For example, say you have a healing spell and a damage spell, and you want to know which is better. Calculate the amount of HP that the player would no longer lose as a result of using the damage spell and ending the combat earlier, and compare that to the amount of HP actually restored by the healing spell. Or, say you want to know which is better, a particular sword or a particular piece of armor. Again, figure out how much extra HP each would save you.

Now, this does not mean that everything in the game must be exactly equal to be balanced. For example, you may want spells that are learned later in the game to be more cost-effective, so that the player has a reason to use them. You may also want the more expensive equipment to be less cost-effective, in order to make the player really work for it. However, at any given time in the game, you probably want the choices made available at that time to be at least somewhat balanced with each other. For example, if the player reaches a new town with several new pieces of equipment, you would expect those to be roughly equivalent in terms of their HP-to-cost ratios.

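As a rough illustration of that healing-versus-damage comparison, here is a minimal sketch that converts a damage spell into "HP the player avoids losing." All of the numbers, and the simplification that the enemy deals fixed damage once per round, are made up for the example; they are not Dragon Warrior's actual values.

```python
# Sketch: express a damage spell's value in HP, so it can be compared to a heal.
import math

def hp_value_of_damage_spell(enemy_hp, normal_attack_dmg, spell_dmg, enemy_dmg_per_round):
    """HP the player avoids losing because the spell ends the fight sooner."""
    rounds_without_spell = math.ceil(enemy_hp / normal_attack_dmg)
    rounds_with_spell = math.ceil(enemy_hp / spell_dmg)
    return (rounds_without_spell - rounds_with_spell) * enemy_dmg_per_round

damage_spell_hp = hp_value_of_damage_spell(
    enemy_hp=40, normal_attack_dmg=8, spell_dmg=20, enemy_dmg_per_round=6)
healing_spell_hp = 25    # HP restored directly by the healing spell (hypothetical)

print(damage_spell_hp, healing_spell_hp)   # 18 vs 25: the heal wins at these numbers
```
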
Another Example

You might wonder: if this kind of analysis works for a stat-driven game like an RPG, is it useful for any other kind of game? The answer is yes. Let's examine an action title, the original Super Mario Bros. (made popular by the arcade and NES versions). What kinds of resources do we have in Mario? There are Lives, Coins, and Time (from a countdown timer). There's an actual numeric Score. And then there are objects within the game (coin blocks, enemies, and so on) which can sometimes work for or against you depending on the situation. Let us proceed to analyze the relationships.

Coins: there is a 100-to-1 relationship between Coins and Lives, since collecting 100 Coins awards an extra life. There is a 1-to-200 relationship between Coins and Score, since collecting a Coin gives 200 points. There is a relationship between coin blocks and Coins, in that each block gives you some number of Coins.

Time: there is a 100-to-1 relationship between Time and Score, since you get a time bonus at the end of each level. There is also an inverse relationship between Time and Lives, since running out of time costs you a life.

Enemies: there is a relationship between Enemies and Score, since killing enemies gives you from 100 to 1000 points (depending on the enemy). There is an inverse relationship between Enemies and Lives, since sometimes an enemy will cost you a life. (In a few select levels there is potentially a positive relationship between Enemies and Lives, as stomping enough enemies in a combo will give extra lives, but that is a special case.)

Lives: there is a strange relationship between Lives and everything else, because losing a life resets the Coins, Time and Enemies on a level. Note that since Coins give you extra Lives, and losing a Life resets Coins, any level with more than 100 Coins would provide a positive feedback loop where you could die intentionally, collect more than 100 Coins, and repeat to gain infinite lives. The original Super Mario Bros. did not have any levels like this, but Super Mario Bros. 3 did.

Lives and Score: there is no direct link between Lives and Score. However, losing a Life resets a bunch of things that give scoring opportunities, so indirectly you can convert a Life to Score. Interestingly, this does not happen the other way around; unlike other arcade games of the time, you cannot earn extra Lives by reaching a sufficiently high Score.

Looking at these relationships, we see that Score is actually the central resource in Super Mario Bros., since everything is tied to Score. This makes sense in the context of early arcade games, since the win condition is not "beat the game" but rather "get the highest score."

How would you balance these resources with one another? There are a few ways. You can figure out how many enemies you kill and their relative risks (that is, which enemies are harder to kill and which are more likely to kill you). Compare that with how many Coins you find in a typical level, and how much Time you typically complete the level with. Then, you can either change the amount of Score granted to the player for each of these things (making a global change throughout the game), or you can vary the number of Coins and Enemies, the amount of Time, or the length of a level (making a local change within individual levels). Any of these techniques could be used to adjust a player's expected total Score, and also how much each of these activities (coin collecting, enemy stomping, time completion) contributes to the final Score.

When you're designing a game, note that you can change your resources around, and even eliminate a resource or change the central resource to something else. The Mario series survived this quite well; the games that followed the original eliminated Score entirely, and everything was later related to Lives.

Interactions Between Relationships

When you form chains or loops of resources and relationships between them, the relationships stack with each other. They can either combine to become more intense, or they can cancel each other out (completely or partially). We just saw one example of this in the Mario games, with Lives and Coins. If you have a level that contains 200 Coins, then the "100 Coins to 1 Life" relationship combines with "1 Life to 200 Coins in that level" to create a doubling effect where you convert 1 Life into 2 Lives in a single iteration.

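That loop check is easy to express as arithmetic: multiply the conversion rates around the loop and see whether you come back with more than you started. A tiny sketch, using a hypothetical 200-coin level:

```python
# Sketch: multiply exchange rates around a resource loop; a result above 1
# means a positive feedback loop (here, spend 1 Life, get 2 Lives back).

def loop_gain(rates):
    gain = 1.0
    for rate in rates:
        gain *= rate
    return gain

# 1 Life -> 200 Coins (replaying a hypothetical 200-coin level), 100 Coins -> 1 Life.
print(loop_gain([200, 1 / 100]))   # 2.0
```
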
Here's another example, from the PS2 game Baldur's Gate: Dark Alliance. In this action-RPG, you get XP from defeating enemies, which in turn causes you to level up. The XP-to-Level relationship is triangular: going from Level 1 to Level 2 requires 1000 XP, Level 2 to Level 3 costs 2000 XP, rising to Level 4 costs another 3000 XP, and so on. Each time you level up, you get a number of upgrade points to spend on special abilities. These also follow a triangular progression: at Level 2 you get 1 upgrade point; at Level 3 you get 2 points; the next level gives you 3 points, then the next gives you 4 points, and so on. However, these relationships chain together, since XP gives you Levels and Levels give you Upgrade Points. Since XP is the actual resource the player is earning, it is the XP-to-Points ratio we care about, and the two triangular relationships actually cancel each other out to form a linear relationship of 1000 XP to 1 Upgrade Point. While the awarding of these upgrade points is staggered based on levels, on average you are earning them at a constant XP rate.

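Here is a minimal sketch of that cancellation, tabulating the running totals from the progressions quoted above and the resulting XP-per-point ratio.

```python
# Sketch: two triangular progressions (XP per level, points per level) cancel
# out to a flat 1000 XP per upgrade point. Numbers are the ones quoted above.

def total_xp_to_reach(level):       # 1000 + 2000 + 3000 + ... up to the given level
    return sum(1000 * n for n in range(1, level))

def total_points_at(level):         # 1 + 2 + 3 + ... awarded by the given level
    return sum(n - 1 for n in range(2, level + 1))

for level in range(2, 8):
    xp, points = total_xp_to_reach(level), total_points_at(level)
    print(f"Level {level}: {xp:>6} XP, {points:>2} points, {xp // points} XP per point")
# Every row comes out to 1000 XP per point.
```
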
How does Time fit into this (as in, the amount of time the player spends on the game)? If the player were fighting the same enemies over and over for the same XP rewards, there would be a triangular increase in the amount of time it takes to earn a level (and a constant amount of time to earn each Upgrade Point, on average). However, as with most RPGs, there is a system of increasing XP rewards as the player fights stronger monsters. This increasing XP curve doesn't rise as fast as the triangular progression of level-ups, which means it doesn't completely cancel out the triangular effect, but it does partly reduce it; in other words, you level up slightly faster in the early game and slower in the late game, but the play time between level gains doesn't increase as fast as a triangular relationship. Note, however, the way this interacts with Upgrade Points. Since the XP-to-Point ratio is linear, and the player gets an increasing amount of XP per unit time, they are actually getting an increasing rate of Upgrade Point gain!

This kind of system has some interesting effects. By changing the rate of XP gain (that is, exactly how fast the XP rewards increase for defeating enemies) you can change both the rate of leveling up and the rate of Upgrade Point gains. If the XP rewards increase faster than the triangular rate of the levels themselves, the player will actually level up faster as the game progresses. If the XP rewards increase more slowly than the rate of level-ups, the player will level faster in the early game and slower in the late game (which is usually what you want, as it gives the player frequent rewards early on and starts spacing them out once they've committed to continued play). If the XP rewards increase at exactly the same rate, the player will level up at a more or less constant rate.

Suppose you decide to have the player gain levels faster in the early game and slower in the late game, but you never want them to go longer than an hour between levels. How would you balance the XP system? Simple: figure out what level they will be at in the late game, scale the XP gains to take about an hour per level-up at that point, and then work your way backwards from there.

Note another useful property this leveling system has: it provides a negative feedback loop that keeps the player in a narrow range of levels at each point in the game. Consider two situations:

Over-leveling: The player has done a lot of level-grinding and is now too powerful for the enemies in their current region. For one thing, they'll be able to defeat the nearby enemies faster, so they don't have to stick around too long. For another, the XP gains aren't that good if their level is already high; they are unlikely to gain many additional levels by defeating weaker enemies. The maximum level a player can reach is effectively limited by the XP-reward curve.

Under-leveling: Suppose instead the opposite case, where the player has progressed quickly through the game and is now at a lower level than the enemies in the current region. In this case, the XP gains will be relatively high (compared to the player's level), and the player will only need to defeat a few enemies to level up quickly.

In either case, the game's systems push the player's level towards a narrow range in the middle of the extremes. It is much easier to balance a combat system to provide an appropriate level of challenge when you know what level the player will be at during every step of the way!

How Relationships Interact

How do you know how two numeric relationships will stack together? Here's a quick-reference guide:

Two linear relationships combine: multiply them together. If you can turn 1 Resource A into 2 Resource B, and 1 Resource B into 5 Resource C, then there is a 1-to-10 conversion between A and C (2×5).

A linear relationship combines with an increasing (triangular or exponential) relationship: the increasing relationship just gets multiplied by a bigger number, but the nature of the curve stays the same.

A linear relationship counteracts an increasing relationship: if the linear conversion is large, it may dominate early on, but eventually the increasing relationship will outpace it. Exactly where the two curves meet and the game shifts from one to the other depends on the exact numbers, and tweaking these can provide an interesting strategic shift for the players.

Two increasing relationships combine: you end up with an increasing relationship that's even faster than either of the two individually.

Two increasing relationships counteract one another: it depends on the exact relationships. In general, an exponential relationship will dominate a triangular one (how fast this happens depends on the exact numbers used). Two identical relationships (such as two pure triangulars) will cancel out to form a linear or identity relationship.

If You're Working On a Game Now

Are you designing your own game right now? Try this: make a list of every resource or number in your game on a piece of paper. Put a box around each, and spread the boxes out. Then, draw arrows between each pair of boxes that has a direct relationship in your game, and label the arrow with the kind of relationship (linear, triangular, exponential, etc.). Use this diagram to identify a few areas of interest in the balance of your game:

Do you see any loops where a resource can be converted to something else, then maybe something else, and then back to the original? If you get back more of the original than you started with by doing this, you may have just identified a positive feedback loop in your game.

Do you see a central resource that everything else seems tied to? If so, is that central resource either the win or loss condition, or does it seem kind of arbitrary? If not, does it make sense to create a new central resource, perhaps by adding new relationships between resources?

You can then use this diagram to predict changes to gameplay. If you change the nature of a relationship, you might be able to make a pretty good guess at what other relationships will also change as a result, and what effect that might have on the game's systems overall. If your game is a single-player game with some kind of progression system, Time (as in, the amount of time the player spends actually playing the game) should be one of your resources, and you can use your diagram to see if the rewards and power gains the player gets from playing are expected to increase, decrease, or remain constant over time.

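The same diagram can also be kept as plain data and queried, which scales better than paper once a game has many resources. Here is a minimal sketch; the relationship list reuses the Dragon Warrior-style resources from earlier as placeholder content, and the "most connected resource" count is just a quick heuristic for spotting a candidate central resource.

```python
# Sketch: the resource diagram as data -- each arrow is (source, destination, type).
from collections import Counter

relationships = [
    ("Encounters", "HP", "inverse"),
    ("Defense", "HP", "linear"),
    ("Attack", "HP", "linear"),
    ("MP", "HP", "linear"),
    ("Encounters", "XP", "linear"),
    ("Encounters", "Gold", "linear"),
    ("XP", "Level", "triangular"),
    ("Level", "Attack", "linear"),
    ("Gold", "Attack", "linear"),
]

# Count how many arrows touch each resource; the most connected one is a
# candidate for the central resource discussed above.
degree = Counter()
for source, destination, _ in relationships:
    degree[source] += 1
    degree[destination] += 1

resource, count = degree.most_common(1)[0]
print(f"Most connected resource: {resource} ({count} relationships)")   # HP (4)
```
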
Homework

Here's your game balance challenge for this week. First, choose any single-player game that you've played and are familiar with that has progression mechanics. Examples of games with progression are action-adventure games (Zelda), action-RPGs (Diablo), RPGs (Final Fantasy), or MMORPGs (World of Warcraft). I'll recommend that you choose something relatively simple, such as an NES-era game or earlier. You're going to analyze the numbers in this game, and as you've seen from the earlier examples here, even simple games can have pretty involved systems.

In these games, there is some kind of progression where the player gains new abilities and/or improves their stats over time. As the player progresses, enemies get stronger; again, this could just mean they have higher stats, or they might also gain new abilities that require better strategy and tactics to defeat.

Start by asking yourself this question: overall, what was the difficulty curve of the game like? Did it start off easy and get slowly, progressively harder? Or did you notice one or more of these undesirable patterns:

A series of levels that seemed to go by very slowly, because the player was underpowered at the time and did not gain enough power fast enough to compensate, so you had to grind for a long time in one spot.

A sudden spike in difficulty: one dungeon that had much more challenging enemies than those that came immediately before or after.

A dungeon that was much easier than was probably intended, allowing you to blast through it quickly since you were much more powerful than the inhabitants by the time you actually reached it.

The hardest point in the game was not at the end, but somewhere in the middle. Perhaps you got a certain weapon, ally, or special ability that was really powerful and made you effectively unbeatable from that point on until the end of the game.

So far, all you're doing is using your memory and intuition, and it probably takes you all of a few seconds to remember the standout moments of epic win and horrible grind in your chosen game. It's useful to build intuition, but it is even better to make your intuition stronger by backing it up with math. So, once you've written down your intuitive guesses at the points where the game becomes unbalanced, let's start analyzing.

First, seek out a strategy guide or FAQ that gives all of the numbers for the game. A web search may turn up surprisingly detailed walkthroughs that show you every number and every resource in the game, and exactly how they are all related. Next, make a list on paper of all of the resources in the game. Using the FAQ as your guide, also show all relationships between the resources (draw arrows between them, and label the arrows with the relationship type).

From this diagram, you may be able to identify exactly what happened. For example, maybe you seemed to level up a lot in one particular dungeon, gaining a lot of power in a short time. In such a case, you might start by looking at the leveling system: perhaps there is a certain range of levels where the XP requirements to gain a level are much lower than the rest of the progression curve. You might also look at the combat reward system: maybe you just gain a lot more XP than expected from the enemies in that dungeon. As another example, maybe the game felt too easy after you found a really powerful weapon. In this case you'd look at the combat system: look at how much damage you do versus how much damage enemies can take, as separate curves throughout the game, and identify the sudden spike in power when you get that weapon. You may be able to graphically see the relationship of your power level versus that of the enemies over time.

Lastly, if you do identify unbalanced areas of the game from this perspective, you should be able to use your numbers and curves to immediately suggest a change. Not only will you know exactly which resource needs to be changed, but also by how much.

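For the "really powerful weapon" case, the curve comparison can be as simple as tabulating hits-to-kill area by area. The numbers below are made up purely to show the shape of the analysis:

```python
# Sketch: player damage per hit vs. typical enemy HP in each successive area.
import math

player_damage = [5, 7, 9, 24, 26, 28]    # big jump at area 4: a powerful weapon
enemy_hp      = [15, 25, 35, 45, 60, 80]

for area, (damage, hp) in enumerate(zip(player_damage, enemy_hp), start=1):
    print(f"Area {area}: {math.ceil(hp / damage)} hits to defeat a typical enemy")
# The drop from 4 hits (area 3) to 2 hits (area 4) is the kind of spike to flatten,
# by weakening the new weapon or strengthening the enemies that follow it.
```
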
This exercise will probably take you a few hours, as researching a game and analyzing the numbers is not a trivial task (even for a simple game). However, after doing this, you will be much more comfortable with identifying resources and relationships in games, and with using your understanding of a game's systems to improve the balance of those systems.
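If you would rather let a computer do the loop-hunting from the diagram exercise above, here is a minimal sketch of the idea in Python. The resources and conversion rates are made up purely for illustration; the point is just that any cycle whose conversion rates multiply out to more than 1 is a candidate positive feedback loop.

```python
# Sketch: represent the "boxes and arrows" diagram as conversion rates and
# look for cycles that return more of a resource than you started with.
from itertools import permutations

conversions = {                 # hypothetical numbers, not from any real game
    ("Gold", "Troops"): 0.5,    # 2 Gold buys 1 Troop
    ("Troops", "Land"): 0.2,    # 5 Troops conquer 1 Land
    ("Land", "Gold"): 12.0,     # each Land yields 12 Gold
}

resources = {r for edge in conversions for r in edge}
for cycle_len in range(2, len(resources) + 1):
    for cycle in permutations(resources, cycle_len):
        if cycle[0] != min(cycle):
            continue            # skip rotations of a cycle we already checked
        edges = list(zip(cycle, cycle[1:] + cycle[:1]))
        if all(e in conversions for e in edges):
            rate = 1.0
            for e in edges:
                rate *= conversions[e]
            if rate > 1:
                print("possible positive feedback loop:",
                      " -> ".join(cycle), f"(x{rate:.2f} per trip around)")
```

In a real game you would fill in the conversions from your own diagram, and treat anything this flags as a place to look more closely rather than as an automatic problem.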

Level 3: Transitive Mechanics and Cost Curves


This Week's Topic

This week is one of the most exciting for me, because we really get to dive deep into nuts-and-bolts game balance in a very tangible way. We'll be talking about something that I've been doing for the past ten years, although until now I've never really written down this process or tried to communicate it to anyone. I'm going to talk about how to balance transitive mechanics within games.

As a reminder, intransitive is like Rock-Paper-Scissors, where everything is better than something else and there is no single best move. In transitive games, some things are just flat-out better than others in terms of their in-game effects, and we balance that by giving them different costs, so that the better things cost more and the weaker things cost less in the game. How do we know how much to cost things? That is a big problem, and that is what we'll be discussing this week.

Examples of Transitive Mechanics

Just to contextualize this, what kinds of games do we see that have transitive mechanics? The answer is, most of them. Here are some examples:

- RPGs often have currency costs to upgrade your equipment and buy consumable items. Leveling is also transitive: a higher-level character is better than a lower-level character in nearly every RPG I can think of.
- Shooters with progression mechanics, like BioShock and Borderlands, include similar mechanics. In BioShock, for example, there are costs to using vending machines to get consumable items, and you also spend ADAM to buy new special abilities; higher-level abilities are just better (e.g. doing more damage) than their lower-level counterparts, but they cost more to buy.
- Professional sports in the real world do this with monetary costs: a player who is better at the game commands a higher salary.
- Sim games (like The Sims and Sim City) have costs for the various objects you can buy, and often these are transitive. A really good bed in The Sims costs more than a cheap bed, but it also performs its function of restoring your needs much more effectively.
- Retro arcade games generally have a transitive scoring mechanism. The more dangerous or difficult an enemy, the more points you get for defeating it.
- Turn-based and real-time strategy games may have a combination of transitive and intransitive mechanics. Some unit types might be strong or weak against others inherently (in an intransitive fashion), like the typical "footmen beat archers, archers beat fliers, fliers beat footmen" comparison. However, you also often see a class of units that all behave similarly, but with stronger and more expensive versions of weaker ones, such as light infantry vs. heavy infantry.
- Tower Defense games are often intransitive in that certain tower types are strong against certain kinds of attacks (splash damage is strong against enemies that come clustered together, for example), but in most of these games the individual towers are upgradeable to stronger versions of themselves, and the stronger versions cost more (transitive).
- Collectible card games are another example where there may be intransitive mechanics (and there are almost always intransitive elements to the metagame, thank goodness; that is, a better deck isn't just more expensive), but the individual cards themselves generally have some kind of cost, and they are all balanced according to that cost, so that more expensive cards are more useful or powerful.
You might notice something in common with most of these examples: in nearly all cases, there is some kind of resource that is used to buy stuff: Gil in Final Fantasy, Mana in Magic: the Gathering, ADAM in BioShock. Last week we talked about relating everything to a single resource in order to balance different game objects against each other, and as you might expect, this is an extension of that concept. However, another thing we said last week is that a central resource should be the win or loss condition for the game, and we see that is no longer the case here (the loss condition for an RPG is usually running out of Hit Points, not running out of Gold Pieces). In games that deal with costs, it is common to make the central resource something artificially created for that purpose (some kind of currency in the game) rather than a win or loss condition, because everything has a monetary cost.

Costs and Benefits

With all that said, let's assume we have a game with some kind of currency-like resource, and we want to balance two things where one might be better than the other but it costs more. I'll start with a simple statement: in a transitive mechanic, everything has a set of costs and a set of benefits, and all in-game effects can be put in terms of one or the other.

When we think of costs we're usually thinking in terms of resource costs, like a sword that costs 250 Gold. But when I use this term, I'm defining it more loosely to be any kind of drawback or limitation. So it does include resource costs, because that is a setback in the game. But for example, if the sword is only half as effective against demons, that is part of a cost as well, because it's less powerful in some situations. If the sword can only be equipped by certain character classes, that's a limitation (you can't just buy one for everyone in your party). If the sword disintegrates after 50 encounters, does 10% of damage dealt back to the person wielding it, or prevents the wielder from using magic spells, I would call all of these things costs, because they are drawbacks or limitations to using the object that we're trying to balance.

If costs are everything bad, then benefits are everything good. Maybe it does a lot of damage. Maybe it lets you use a neat special ability. Maybe it offers some combination of increases to your various stats.

Some things are a combination of the two. What if the sword does 2x damage against dragons? This is clearly a benefit (it's better than normal damage sometimes), but it's also a limitation on that benefit (it doesn't do double damage all the time, only in specific situations). Or maybe a sword prevents you from casting Level 1 spells (obviously a cost), but if most swords in the game prevent you from casting all spells, this is a less limiting limitation that provides a kind of net benefit.

How do you know whether to call something a cost or a benefit? For our purposes, it doesn't matter: a negative benefit is the same as a cost, and vice versa, and our goal is to equalize everything. We want the costs and benefits to be equal, numerically. Whether you add to one side or subtract from the other, the end result is the same. Personally, I find it easiest to keep all numbers positive and not negative, so if something would be a negative cost I'll call it a benefit. That way I only have to add numbers and never subtract them. Adding is easier. But if you want to classify things differently, go ahead; the math works out the same anyway.

So, this is the theory. Add up the costs for an object. Add up the benefits. The goal is to get those two numbers to be equal. If the costs are less than the benefits, it's too good: add more costs or remove some benefits.
If the costs are greater than the benefits, it's too weak: remove costs or add benefits.

You might be wondering how we would relate two totally different things (like a Gold cost and the number of Attack Points you get from equipping a sword). We will get to that in a moment. But first, there's one additional concept I want to introduce.

Overpowered vs. Underpowered vs. Overcosted vs. Undercosted

Let's assume for now that we can somehow relate everything back to a single resource cost so they can be directly compared. And let's say that we have something that provides too many benefits for its costs. How do we know whether to reduce the benefits, increase the costs, or both?

In most cases, we can do either one. It is up to the designer what is more important: having the object stay at its current cost, or having it retain its current benefits. Sometimes it's more important that you have an object within a specific cost range, because you know that's what the player can afford when they arrive in the town that sells it. Sometimes you just have this really cool effect that you want to introduce to the game, and you don't want to mess with it. Figure out what you want to stay the same and then change the other thing.

Sometimes, usually when you're operating at the extreme edges of a system, you don't get a choice. For example, if you have an object that's already free, you just can't reduce the cost any more, so it is possible you've found an effect that is just too weak at any cost. Reducing the cost further is impossible, so you have no choice: you must increase the benefits. We have a special term for this: we say the object is underpowered, meaning that it is specifically the level of benefits (not the cost) that must be adjusted.

Likewise, some objects are just too powerful to exist in the game at any cost. If an object has an automatic "I win / you lose" effect, it would have to have such a high cost that it would be essentially unobtainable. In such cases we say it is overpowered, that is, the level of benefits must be reduced (and a simple cost increase is not enough to solve the problem).

Occasionally you may also run into some really unique effects that can't easily be added to, removed, or modified; the benefits are a package deal, and the only thing you can really do is adjust the cost. In this case, we might call the object undercosted if it is too cheap, or overcosted if it is too expensive.

I define these terms because it is sometimes important to make the distinction between something that is undercosted and something that's overpowered. In both cases the object is too good, but the remedy is different. There is a more general term for an object that is simply too good (where either the cost or the benefits could be adjusted): we say it is above the curve. Likewise, an object that is too weak is below the curve. What do curves have to do with anything? We'll see as we talk about our next topic.

Cost Curves

Let's return to the earlier question of how to relate things as different as Gold, Attack Points, Magic Points, or any other kinds of stats or abilities we attach to an object. How do we compare them directly? The answer is to put everything in terms of the resource cost. For example, if we know that each point of extra Attack provides a linear benefit and that +1 Attack is worth 25 Gold, then it's not hard to say that a sword that gives +10 Attack should cost 250 Gold. For more complicated objects, add up all the costs (after putting them in terms of Gold), add up all the benefits (again, converting them to their equivalent in Gold), and compare.

How do you know how much each resource is worth? That is what we call a cost curve. Yes, this means you have to take every possible effect in the game, whether it be a cost or a benefit, and find the relative values of all of these things. Yes, it is a lot of work up front. On the bright side, once you have this information about your game, creating new content that is balanced is pretty easy: just put everything into your formula, and you can pretty much guarantee that if the numbers add up, it's balanced.
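As a quick illustration of that add-up-and-compare bookkeeping, here is a minimal sketch in code. The 25 Gold per point of Attack comes from the sword example above; the Defense rate and the specific items are invented numbers, purely for illustration.

```python
# Sketch: put every benefit in terms of the central resource (Gold) and
# compare the total against the object's actual price.
GOLD_PER_ATTACK = 25      # from the example: +1 Attack is worth 25 Gold
GOLD_PER_DEFENSE = 15     # hypothetical exchange rate for a second stat

def benefit_in_gold(attack=0, defense=0):
    return attack * GOLD_PER_ATTACK + defense * GOLD_PER_DEFENSE

def balance_report(name, gold_cost, **stats):
    delta = benefit_in_gold(**stats) - gold_cost
    if delta == 0:
        print(f"{name}: on the curve")
    elif delta > 0:
        print(f"{name}: above the curve by {delta} Gold (too strong or too cheap)")
    else:
        print(f"{name}: below the curve by {-delta} Gold (too weak or too expensive)")

balance_report("+10 Attack sword", 250, attack=10)                        # on the curve
balance_report("+10 Attack sword", 300, attack=10)                        # overcosted
balance_report("+5 Attack, +5 Defense shield", 150, attack=5, defense=5)  # above the curve
```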
Creating a Cost Curve

The first step, and the reason it's called a cost curve and not a cost table or cost chart or cost double-entry accounting ledger, is that you need to figure out a relationship between increasing resource costs and increasing benefits. After that, you need to figure out how all game effects (positive and negative) relate to your central resource cost. Neither of these is usually obvious.

Defining the relationship between costs and benefits

The two might scale linearly: +1 cost means +1 benefit. This relationship is pretty rare.

Costs might be on an increasing curve, where each additional benefit costs more than the last, so incremental gains get more and more expensive as you get more powerful. You see this a lot in RPGs, for example. The amount of currency you receive from exploration or combat encounters increases over time. As a result, if you're getting more than twice as much Gold per encounter as you used to earlier, then even if a new set of armor costs twice as much as your old one, it would actually take you less time to earn the gold to upgrade. Additionally, the designer might want incremental gains to cost more for other design reasons, such as to create more interesting choices. For example, if all stat gains cost the same amount, it's usually an obvious decision to dump all of your gold into increasing your one or two most important stats while ignoring the rest; but if each additional point in a stat costs progressively more, players might consider exploring other options. Either way, you might see an increasing curve (such as a triangular or exponential curve), where something twice as good actually costs considerably more than twice as much.

Some games have costs on a decreasing curve instead. For example, in some turn-based strategy games, hoarding resources has an opportunity cost. In the short term, everyone else is buying stuff and advancing their positions, and if you don't make purchases to keep up with them, you could fall hopelessly behind. This can be particularly true in games where purchases are limited: wait too long to buy your favorite building in Puerto Rico and someone else might buy it first; or wait too long to build new settlements in Settlers of Catan and you may find that other people have built in the best locations. In cases like this, if the designer wants resource-hoarding to be a viable strategy, they must account for this opportunity cost by making something that costs twice as much be more than twice as good.

Some games have custom curves that don't follow a simple, single formula or relationship. For example, in Magic: the Gathering, your primary resource is Mana, and you are generally limited to playing one Mana-generating card per turn. If a third of your deck is cards that generate Mana, you'll get (on average) one Mana-generating card every three card draws. Since your opening hand is 7 cards and you typically draw one card per turn, this means a player would typically gain one Mana per turn for the first four turns, and then one Mana every three turns thereafter. Thus, we might expect to see a shift in the cost curve at or around five Mana, where suddenly each additional point of Mana is worth a lot more, which would explain why some of the more expensive cards have crazy-huge gameplay effects.

In some games, any kind of cost curve will be potentially balanced, but different kinds of curves have different effects. For example, in a typical Collectible Card Game, players are gaining new resources at a constant rate throughout the game. If a game has an increasing cost curve where higher costs give progressively smaller gains, it puts a lot of focus on the early game: cheap cards are almost as good as the more expensive ones, so bringing out a lot of forces early on provides an advantage over waiting until later to bring out only slightly better stuff.
If instead you feature a decreasing cost curve where the cheap stuff is really weak and the expensive stuff is really powerful, this puts the emphasis on the late game, where the really huge things dominate. You might have a custom curve that has sudden jumps or changes at certain thresholds, to guide the play of the game into definite early-game, mid-game and late-game phases. None of these is necessarily right or wrong in a universal sense. It all depends on your design goals, in particular your desired game length, number of turns, and overall flow of the gameplay. At any rate, this is one of your most important tasks when balancing transitive systems: figuring out the exact nature of the cost curve, as a numeric relationship between costs and benefits.

Defining basic costs and benefits

The next step in creating a cost curve is to make a complete list of all costs and benefits in your game. Then, starting with common ones that are used a lot, identify those objects that only do one thing and nothing else. From there, try to figure out how much that one thing costs. (If you were unsure about the exact mathematical nature of your cost curve, something like this will probably help you figure that out.)

Once you've figured out how much some of the basic costs and benefits are worth, start combining them. Maybe you know how much it costs to have a spell that grants a damage bonus, and also how much it costs to have a spell that grants a defense bonus. What about a spell that gives both bonuses at the same time? In some games, the cost for a combined effect is more than the sum of the separate costs, since you get multiple bonuses for a single action. In other games, the combined cost is less than the separate costs, since both bonuses are not always useful in combination or might be situational. In other games, the combined cost is exactly the sum of the separate costs. For your game, get a feel for how different effects combine and how that influences their relative costs.

Once you know how to cost most of the basic effects in your game and how to combine them, this gives you a lot of power. From there, continue identifying how much new things cost, one at a time. At some point you will also start identifying non-resource costs (drawbacks and limitations) to determine how much they are worth. Approach these the same way: isolate one or more objects where you know the numeric costs and benefits of everything except one thing, and then use basic arithmetic (or algebra, if you prefer) to figure out the missing number.

Another thing you'll eventually need to examine is benefits or costs that have limitations stacked on them. If a benefit only works half of the time because of a coin flip whenever you try to use it, is that really half the cost compared to if it worked all the time, or is it more or less than half? If a benefit requires you to meet conditions that have additional opportunity costs ("you can only use this ability if you have no Rogues in your party"), what is that tradeoff worth in terms of how much it offsets the benefit?

An Example: Cost Curves in Action

To see how this works in practice, I'm going to use some analysis to derive part of the cost curve for Magic 2011, the most recent set for Magic: the Gathering. The reason I'm choosing this game is that CCGs are among the most complicated games to balance in these terms (a typical base or expansion set may have hundreds of cards that need to be individually balanced), so if we can analyze Magic then we can use this for just about anything else. Note that by necessity, we're going into spoiler territory here, so if you haven't seen the set and are waiting for the official release, consider this your spoiler warning.

For convenience, we'll examine Creature cards specifically, because they are the type of card that is the most easily standardized and directly compared: all Creatures have a Mana cost (this is the game's primary resource), Power and Toughness, and usually some kind of special ability. Other card types tend to only have special, unique effects that are not easily compared.

For those of you who have never played Magic before, that is fine for our purposes. As you'll see, you won't need to understand much of the rules in order to go through this analysis. For example, if I tell you that the Flying ability gives a benefit equivalent to 1 mana, you don't need to know (or care) what Flying is or what it does; all you need to know is that if you add Flying to a creature, the mana cost should increase by 1. If you see any jargon that you don't recognize, assume you don't need to know it.
For those few parts of the game you do need to know, I'll explain as we go.

Let us start by figuring out the basic cost curve. To do this, we first examine the most basic creatures: those with no special abilities at all, just a Mana cost, Power and Toughness. Of the 116 creatures in the set, 11 of them fall into this category (I'll ignore artifact creatures for now, since those have extra metagame considerations).

Before I go on, one thing you should understand about Mana costs is that there are five colors of Mana: White (W), Green (G), Red (R), Black (B), and Blue (U). There's actually a sixth type, called colorless, which can be paid with any color you want. Thus, something with a cost of G4 means five mana, one of which must be Green, and the other four can be anything (Green or otherwise). We would expect that colored Mana has a higher cost than colorless, since it is more restrictive.

Here are the creatures with no special abilities:

W, 2/1 (that is, a cost of one White mana, power of 2, toughness of 1)
W4, 3/5
W1, 2/2
U4, 2/5
U1, 1/3
B2, 3/2
B3, 4/2
R3, 3/3
R1, 2/1
G1, 2/2
G4, 5/4

Looking at the smallest creatures, we immediately run into a problem with three of them (I'm leaving the names off, since names aren't relevant when it comes to balance):

W, 2/1
R1, 2/1
G1, 2/2

Apparently, all colors are not created equal: you can get a 2/1 creature for either W (one mana) or R1 (two mana), so an equivalent creature is cheaper in White than Red. Likewise, R1 gets you a 2/1 creature, but the equivalent-cost G1 gets you a 2/2, so you get more creature for Green than Red. This complicates our analysis, since we can't use different colors interchangeably. Or rather, we could, but only if we assume that the game designers made some balance mistakes. (Such is the difficulty of deriving the cost curve of an existing game: if the balance isn't perfect, and it's never perfect, your math may be slightly off unless you make some allowances.) Either way, it means we can't assume every creature is balanced on the same curve.

In reality, I would guess the designers did this on purpose to give some colors an advantage with creatures, to compensate for them having fewer capabilities in other areas. Green, for example, is a color that's notorious for having really big creatures and not much else, so it's only fair to give it a price break since it's so single-minded. Red and Blue have lots of cool spell toys, so their creatures might reasonably be made weaker as a result.

Still, we can see some patterns here just by staying within colors:

W, 2/1
W1, 2/2

B2, 3/2
B3, 4/2

Comparing the White creatures, adding 1 colorless mana is equivalent to adding +1 Toughness. Comparing the Black creatures, adding 1 colorless mana is equivalent to adding +1 Power. We might guess, then, that 1 colorless mana (cost) = 1 Power (benefit) = 1 Toughness (benefit). We can also examine similar creatures across colors to take a guess:

W, 2/1
R1, 2/1

W4, 3/5
U4, 2/5

From these comparisons, we might guess that Red and Blue have an inherent -1 Power or -1 Toughness cost compared to White, Black and Green.

Is the cost curve linear, with +1 benefit for each additional colorless mana? It seems to be, up to a point, but there appears to be a jump around 4 or 5 mana:

W, 2/1 (3 power/toughness for W)
W4, 3/5 (5 additional power/toughness for 4 additional colorless mana)

G1, 2/2 (4 power/toughness for G1)
G4, 5/4 (5 additional power/toughness for 3 additional colorless mana)

As predicted earlier, there may be an additional cost bump at 5 mana, since getting your fifth mana on the table is harder than the first four. Green seems to get a larger bonus than White.

From all of this work, we can take our first guess at a cost curve. Since we have a definite linear relationship between colorless mana and increased power/toughness, we will choose colorless mana to be our primary resource, with each point of colorless representing a numeric cost of 1. We know that each point of power and toughness provides a corresponding benefit of 1. Our most basic card, W for a 2/1, shows a total of 3 benefits (2 power, 1 toughness). We might infer that W must have a cost of 3. Or, using some knowledge of the game, we might instead guess that W has a cost of 2, and that all cards have an automatic cost of 1 just for existing: the card takes up a slot in your hand and your deck, so it should at least do something useful, even if its mana cost is zero, to justify its existence.

Our cost curve, so far, looks like this: a cost of 0 provides a benefit of 1. Increased total mana cost provides a linear benefit, up to 4 mana. The fifth point of mana provides a double benefit (triple for Green), presumably to compensate for the difficulty of getting that fifth mana on the table.

Our costs are:

Baseline cost = 1 (start with this, just for existing)
Each colorless mana = 1
Each colored mana = 2
Total mana cost of 5 or more = +1 (or +2 for Green creatures)

Our benefits are:

+1 Power or +1 Toughness = 1
Being a Red or Blue creature = 1 (apparently this is some kind of metagame privilege)

We don't have quite enough data to know if this is accurate. There may be other valid sets of cost and benefit numbers that would also fit our observations. But if these are accurate, we could already design some new cards.

How much would a 4/3 Blue creature cost? The benefit is 1 (Blue) + 4 (Power) + 3 (Toughness) = 8. Our baseline cost is 1, our first colored mana (U) is 2, and if we add four colorless mana that costs an extra 4; but that also makes for a total mana cost of 5, which would give an extra +1 to the cost, for a total of 8. So we would expect the cost to be U4.

What would a 4/1 Green creature cost? The benefit is 5 (4 Power + 1 Toughness). A mana cost of G2 provides a cost of 5 (1 as a baseline, 2 for the colored G mana, and 2 for the colorless mana).

What if I proposed this card: W3 for a 1/4 creature. Is that balanced? We can add it up: the cost is 1 (baseline) + 2 (W) + 3 (colorless) = 6. The benefit is 1 (power) + 4 (toughness) = 5. So this creature is exactly 1 below the curve, and could be balanced by either dropping the cost to W2 or increasing its stats to 2/4 or 1/5.
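Here is the same arithmetic as a small code sketch, using only the working numbers we have so far (baseline 1, colorless mana 1, one colored mana 2, the 5-mana bump, and the apparent Red/Blue allowance). These are estimates from the analysis above, not official data, and the functions only know about creatures with a single colored mana in their cost.

```python
# Sketch: the vanilla-creature cost curve derived so far.

def creature_cost(colored, colorless, color):
    total_mana = colored + colorless
    cost = 1                      # baseline: every card costs 1 just for existing
    cost += 2 * colored           # colored mana (only one handled so far)
    cost += 1 * colorless         # each colorless mana
    if total_mana >= 5:
        cost += 1                 # high-cost bump at 5 total mana
        if color == "G":
            cost += 1             # Green appears to get a double bump
    return cost

def creature_benefit(power, toughness, color):
    benefit = power + toughness
    if color in ("R", "U"):
        benefit += 1              # apparent allowance for Red and Blue creatures
    return benefit

# The three worked examples from the text:
examples = [
    ("U", 1, 4, 4, 3),   # 4/3 Blue at U4 -> cost 8, benefit 8 (on curve)
    ("G", 1, 2, 4, 1),   # 4/1 Green at G2 -> cost 5, benefit 5 (on curve)
    ("W", 1, 3, 1, 4),   # 1/4 White at W3 -> cost 6, benefit 5 (1 below curve)
]
for color, colored, colorless, power, toughness in examples:
    c = creature_cost(colored, colorless, color)
    b = creature_benefit(power, toughness, color)
    print(f"{color} {power}/{toughness}: cost={c} benefit={b} delta={b - c}")
```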

So you can see how a small amount of information lets us do a lot, but also how we are limited: we don't know what happens when we have several colored mana, we don't know what happens when we go above 5 (or below 1) total mana, and we don't know how to cost any special abilities. We could take a random guess based on our intuition of the game, but first let's take a look at some more creatures. In particular, there are 18 creature cards in this set that only have standard special abilities on them:

W3, 3/2, Flying
WW3, 5/5, Flying, First Strike, Lifelink, Protection from Demons and Dragons
WW2, 2/3, Flying, First Strike
WW3, 4/4, Flying, Vigilance
W1, 2/1, Flying
WW, 2/2, First Strike, Protection from Black
W2, 2/2, Flying
U3, 2/4, Flying
BB, 2/2, First Strike, Protection from White
B2, 2/2, Swampwalk
B1, 2/1, Lifelink
R3, 3/2, Haste
GG5, 7/7, Trample
GG, 3/2, Trample
G3, 2/4, Reach
GG3, 3/5, Deathtouch
G, 0/3, Defender, Reach
GG4, 6/4, Trample

How do we proceed here? The easiest targets are those with only a single ability, like all the White cards with just Flying. It's pretty clear from looking at all of those that Flying has the same benefit as +1 power or +1 toughness, which in our math is a benefit of 1. We can also make some direct comparisons to the earlier list of creatures without abilities, to derive the benefits of several special abilities:

B2, 3/2
B2, 2/2, Swampwalk

R3, 3/3
R3, 3/2, Haste

Swampwalk and Haste (whatever those are) also have a benefit of 1. And we can guess from the B1, 2/1, Lifelink card and our existing math that Lifelink is also a benefit of 1.

We run into something curious when we examine some Red and Blue creatures at 4 mana. Compare the following:

W3, 3/2, Flying
W4, 3/5 (an extra +1 cost but +2 benefit, due to crossing the 5-mana threshold)
U3, 2/4, Flying (identical total cost to the W3, but +1 benefit in Blue?)
R3, 3/3 (identical total cost and benefit to the W3, but Red?)

It appears that perhaps Red and Blue get their high-cost-mana bonus at a threshold of 4 mana rather than 5. Additionally, Flying may be cheaper for Blue than it is for White; but given that it would then seem to have a benefit of zero here, we might instead guess that the U3 creature is slightly above the curve. We find another strange comparison in Green:

G3, 2/4, Reach (cost of 6, benefit of 6 + Reach?)
G4, 5/4 (cost of 8, benefit of 9?)

At first glance, both of these would appear to be above the curve by 1. Alternatively, since the extra bonus seems to be consistent, this may have been intentional. We might guess that Green gets a high-cost bonus not just at 5 total mana, but also at 4 total mana, assuming that Reach (like the other abilities we've seen) has a benefit of 1. (In reality, if you know the game, Reach gives part of the bonus of Flying but not the other part, so it should probably give about half the benefit of Flying. Unfortunately, Magic does not offer half-mana costs in standard play, so the poor G3 is probably destined to be either slightly above or below the curve.)

Let's assume, for the sake of argument, that the benefit of Reach is 1 (or at least that the original designers intended this to be the benefit and balanced the cards accordingly). Then we can examine this card to learn about the Defender special ability:

G, 0/3, Defender, Reach

The cost is 1 (baseline) + 2 (G mana) = 3. The benefit is 3 (toughness) + 1 (Reach) + ? (Defender). From this, it would appear Defender would have to have a benefit of negative 1 for the card to be balanced. What's going on? If you've played Magic, this makes sense. Defender may sound like a special ability, but it's actually a limitation: it means the card is not allowed to attack. We could therefore consider it as an additional cost of 1 (rather than a benefit of -1), and the math works out.

We've learned a lot, but there are still some things out of our immediate grasp right now. We'd love to know what happens when you have a second colored mana (does it also have a +2 cost like the first one?), and we'd also like to know what happens when you get up to 6 or 7 total mana (are there additional high-cost bonus adjustments?). While we have plenty of cards with two colored mana in their cost, and a couple of high-cost Green creatures, all of these also have at least one other special ability that we haven't costed yet. We can't derive the costs and benefits for something when there are multiple unknown values; even if we figured out the right total level of benefits for our GG4 creature, for example, we wouldn't know how much of that benefit was due to the second Green mana cost, how much came from being 6 mana total, and how much came from its Trample ability.

Does this mean we're stuck? Thankfully, we have a few ways to proceed. One trick is to find two cards that are the same except for one thing. Those cards may have several things we don't know, but if we can isolate just a single difference then we can learn something. For example, look at these two cards:

GG4, 6/4, Trample
GG5, 7/7, Trample

We don't know the cost of GG4 or GG5, and we don't know the benefit of Trample, but we can see that adding the one colorless mana that takes us from 6 to 7 total gives us a power+toughness benefit of 4. A total cost of 7 must be pretty hard to get to! We can also examine these two cards that have the same mana cost:

WW3, 5/5, Flying, First Strike, Lifelink, Protection from Demons and Dragons
WW3, 4/4, Flying, Vigilance

From here we might guess that Vigilance is worth +1 power, +1 toughness, First Strike, Lifelink, and the Protection ability, making Vigilance a really freaking awesome special ability that has a benefit of at least 4. Or, if we know the game and realize Vigilance just isn't that great, we can see that the 5/5 creature is significantly above the curve relative to the 4/4.
We still don't know how much two colored mana costs, so let's use another trick: making an educated guess, then trying it out through trial and error. As an example, let's take this creature:

GG, 3/2, Trample

We know the power and toughness benefits are 5, and since most other single-word abilities (Flying, Haste, Swampwalk, Lifelink) have a benefit of 1, we might guess that Trample also has a benefit of 1, giving a total benefit of 6. If that's true, we know that the cost is 1 (baseline) + 2 (first G), so the second G must cost 3. Intuitively, this might make sense: having two colored mana places more restrictions on your deck than just having one. We can look at this another way, comparing two similar creatures:

G1, 2/2
GG, 3/2, Trample

The cost difference between G1 and GG is the difference between a cost of 1 (colorless) and the cost of the second G. The benefit difference is 1 (for the extra power) + 1 (for Trample, we guess). This means the second G costs 2 more than a colorless mana, which is a cost of 3.

We're still not sure, though. Maybe the GG creature is above the curve, or maybe Green has yet another creature bonus we haven't encountered yet. Let's look at the double-colored-mana White creatures to see if the pattern holds:

WW, 2/2, First Strike, Protection from Black
WW2, 2/3, Flying, First Strike
WW3, 4/4, Flying, Vigilance

Assuming that Protection from Black, First Strike, and Vigilance each have a +1 benefit (similar to other special abilities), most of these seem on the curve. WW is an expected cost of 6; 2/2, First Strike, Protection from Black seems like a benefit of 6. WW3 is a cost of 10 (remember the +1 for being a total of five mana); 4/4, Flying, Vigilance is also probably 10. The math doesn't work as well with WW2 (cost of 8); the benefits of 2/3, Flying and First Strike only add up to 7. So, this card might be under the curve by 1.

Having confirmed that the second colored mana is probably a cost of +3, we can head back to Green to figure out this Trample ability. GG, 3/2, Trample indeed gives us a benefit of 1 for Trample, as we guessed earlier.

Now that we know Trample and the second colored mana, we can examine our GG4 and GG5 creatures again to figure out exactly what's going on at the level of six or seven total mana. Let's first look at GG4, 6/4, Trample. This has a total benefit of 11. The parts we know of the cost are: 1 (baseline) + 2 (first G) + 3 (second G) + 4 (colorless) + 1 (above 4 mana) + 1 (above 5 mana) = 12, so not only does the sixth mana apparently have no extra benefit, but we're already below the curve. (Either that, or Trample is worth more when you have a really high power/toughness; we haven't considered combinations of abilities yet.)

Let's compare to GG5, 7/7, Trample. This has a benefit of 15. Known costs are 1 (baseline) + 2 (first G) + 3 (second G) + 5 (colorless) + 1 (above 4 mana) + 1 (above 5 mana) = 13, so going from five to seven total mana has an apparent additional benefit of +2. We might then guess that the benefit is +1 for 6 mana and another +1 for 7 mana, and that the GG4 is just a little below the curve.

Lastly, there is the Deathtouch ability, which we can figure out from the creature GG3, 3/5, Deathtouch. The cost is 1 (baseline) + 2 (first G) + 3 (second G) + 3 (colorless) + 1 (above 4 mana) + 1 (above 5 mana) = 11. The benefit is 8 (power and toughness) + Deathtouch, which implies Deathtouch has a benefit of 3. This seems high when all of the other abilities are only costed at 1, but if you've played Magic you know that Deathtouch really is a powerful ability, so perhaps the high number makes sense in this case.

From here, there are an awful lot of things we can do to make new creatures.
Just by going through this analysis, we've already identified several creatures that seem above or below the curve.
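If you want to automate that check, here is a sketch that folds in the rest of the working numbers (the second colored mana, the 4- and 5-mana color bumps, +1 per total mana above 5, Defender as a cost, and the ability benefits estimated above). Again, these are this analysis's estimates rather than official values, and a couple of the cards below are expected to land off the curve.

```python
# Sketch: evaluate creatures against the working cost curve derived above.
ABILITY_BENEFIT = {
    "Flying": 1, "First Strike": 1, "Lifelink": 1, "Protection": 1,
    "Vigilance": 1, "Swampwalk": 1, "Haste": 1, "Trample": 1,
    "Reach": 1, "Deathtouch": 3,
}

def cost(color, colored, colorless, abilities=()):
    total = colored + colorless
    c = 1 + colorless                 # baseline + colorless mana
    c += 2 if colored >= 1 else 0     # first colored mana
    c += 3 if colored >= 2 else 0     # second colored mana
    if total >= 4 and color in "RUG":
        c += 1                        # Red/Blue/Green bump at 4 total mana
    if total >= 5 and color in "WBG":
        c += 1                        # White/Black/Green bump at 5 total mana
    c += max(0, total - 5)            # +1 per total mana above 5
    if "Defender" in abilities:
        c += 1                        # Defender is a limitation, so it adds cost
    return c

def benefit(color, power, toughness, abilities=()):
    b = power + toughness
    if color in "RU":
        b += 1                        # apparent Red/Blue allowance
    b += sum(ABILITY_BENEFIT.get(a, 0) for a in abilities)
    return b

cards = [
    ("G", 2, 5, 7, 7, ("Trample",)),                                   # GG5, 7/7
    ("G", 2, 4, 6, 4, ("Trample",)),                                   # GG4, 6/4
    ("W", 2, 3, 5, 5, ("Flying", "First Strike", "Lifelink", "Protection")),
    ("W", 2, 3, 4, 4, ("Flying", "Vigilance")),                        # WW3, 4/4
    ("G", 1, 0, 0, 3, ("Defender", "Reach")),                          # G, 0/3
]
for color, cd, cl, p, t, ab in cards:
    delta = benefit(color, p, t, ab) - cost(color, cd, cl, ab)
    print(f"{color} {p}/{t} {ab}: delta from curve = {delta}")
```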

(Granted, this is an oversimplification. Some cards are legacy from earlier sets and may not be balanced along the current curve. And every card has keywords which don't do anything on their own, but some other cards affect them, so there is a metagame benefit to having certain keywords. For example, if a card is a Goblin, and there's a card that gives all Goblins a combat bonus, that's something that makes the Goblin keyword useful; so in some decks that card might be worth using even if it is otherwise below the curve. But keep in mind that this means some cards may be underpowered normally but overpowered in the right deck, which is where metagame balance comes into play. We're concerning ourselves here only with transitive balance, not metagame balance, although we must understand that the two do affect each other.)

From this point, we can examine the vast majority of other cards in the set, because nearly all of them are just a combination of cost, power, toughness, maybe some basic special abilities we've identified already, and maybe one other custom special ability. Since we know all of these things except the custom abilities, we can look at almost any card to evaluate the benefit of its ability (or at least, the benefit assigned to it by the original designer). While we may not know which cards with these custom abilities are above or below the curve, we can at least get a feel for what kinds of abilities are marginally useful versus those that are really useful. We can also put numbers to them, and compare the values of each ability to see if they feel right.

Name That Cost!

Let's take an example: W1, 2/2, and it gains +1 power and +1 toughness whenever you gain life. How much is that ability worth? Well, the cost is 4 and the power/toughness benefit is 4, so that means this ability is free: either it's nearly worthless, or the card is above the curve. Since there's no intrinsic way to gain life in the game without using cards that specifically allow it, and since gaining life tends to be a weak effect on its own (it doesn't bring you closer to winning), we might guess this is a pretty minor effect, and perhaps the card was specifically designed to be slightly above the curve in order to give a metagame advantage to the otherwise underpowered mechanic of life-gaining.

Here's another: W4, 2/3, and when it enters play you gain 3 life. Cost is 8; power/toughness benefit is 5. That means the life-gain benefit is apparently worth 3 (+1 benefit per point of life).

Another: UU1, 2/2, and when it enters play you return a target creature to its owner's hand. The cost here is 7; known benefits are 5 (4 for power/toughness, 1 for being Blue), so the return effect has a benefit of 2.

And another: U1, 1/1, tap to force a target enemy creature to attack this turn if able. Cost is 4, known benefit is 3 (again, 2 for power/toughness, 1 for Blue), so the special ability is costed as a relatively minor benefit of 1.

Here's one with a drawback: U2, 2/3, Flying, can only block creatures with Flying. Benefit is 5 (power/toughness) + 1 (Blue) + 1 (Flying) = 7. Mana cost is 1 (baseline) + 2 (U) + 2 (colorless) = 5, suggesting that the blocking limitation is a +2 cost. Intuitively, that seems wrong when Defender (the complete inability to attack) is only a +1 cost, suggesting that this card is probably a little above the curve.

Another drawback: B4, 4/5, enters play tapped. Benefit is 9. Mana cost is 1 (baseline) + 2 (B) + 4 (colorless) + 1 (above 5 mana) = 8, so the additional drawback must have a cost of 1.
Here's a powerful ability: BB1, 1/1, tap to destroy a target tapped creature. Mana cost is 7. Power/toughness benefit is 2, so the special ability appears to be worth 5. That seems extremely high; on the other hand, it is a very powerful ability and it combos well with a lot of other cards, so it might be justified. Or we might argue it's strong (maybe a benefit of 3 or 4) but not quite that good, or maybe that it's even stronger (a benefit of 6 or 7), based on seeing it in play and comparing it to other strong abilities we identify in the set; but this at least gives us a number for comparison.

So, you can see here that the vast majority of cards can be analyzed this way, and we could use this technique to get a pretty good feel for the cost curve of what is otherwise a pretty complicated game. Not all of the cards fit on the curve, but if you play the game for a while you'll have an intuitive sense of which cards are balanced and which feel too good or too weak. By using those "feels balanced" creatures as your baseline, you could then propose a cost curve and a set of numeric costs and benefits, and then verify that those creatures are in fact on the curve (and that anything you've identified as intuitively too strong or too weak is correctly shown by your math as above or below the curve). Using what you do know, you can then take pretty good guesses at what you don't know, to identify other cards (those you don't have an opinion on yet) as being potentially too good or too weak. In fact, even if you're a player and not a game designer, you can use this technique to help you identify which cards you're likely to see at the tournament/competitive level.

Rules of Thumb

How do you know if your numbers are right? A lot of it comes down to figuring out what works for your particular game, through a combination of your designer intuition and playtesting. Still, I can offer a couple of basic pieces of advice.

First, a limited or restricted benefit is never a cost, and its benefit is always at least a little bit greater than zero. If you have a sword that does extra damage to Snakes, and there are only a few Snakes in the game in isolated locations, that is a very small benefit, but it is certainly not a drawback.

Second, if you give the player a choice between two benefits, the cost of the choice must be at least the cost of the more expensive of the two benefits. Worst case, the player takes the better (more expensive) benefit every time, so the choice should be costed at least as much as what the player will choose. In general, if you give players a choice, try to make those choices give about the same benefit; if it is a choice between two equally good things, that choice is a lot more interesting than choosing between an obviously strong and an obviously weak effect.

Lastly, sometimes you have to take a guess, and you're not in a position to playtest thoroughly. Maybe you don't have a big playtest budget. Maybe your publisher is holding a gun to your head, telling you to ship now. Whatever the case, you've got something that might be a little above or a little below the curve, and you might have to err on one side or the other. If you're in this situation, it's better to make an object too weak than to make it too strong. If it's too weak, the worst thing that happens is no one uses it, but all of the other objects in the game can still be viable; this isn't optimal, but it's not game-breaking. However, if one object is way too strong, it will always get used, effectively preventing everything else that's actually on the curve from being used, since the balanced objects are too weak by comparison. A sufficiently underpowered object is ruined on its own; a sufficiently overpowered object ruins the balance of the entire game.

Cost curves for new games

So far, we've looked at how to derive a cost curve for an existing game, a sort of design reverse-engineering to figure out how the game is balanced. This is not necessarily an easy task, as it can be quite tedious at times, but it is at least relatively straightforward. If you're making a new game, creating a cost curve is much harder.
Since the game doesn't exist yet, you haven't played it in its final form, which means you don't have as much intuition for what the curve is or what kinds of effects are really powerful or really weak. This means you have to plan on doing a lot of heavy playtesting for balance purposes after the core mechanics are fairly solidified, and you need to make sure the project is scheduled accordingly.

Another thing that makes it harder to create a cost curve for a new game is that you have the absolute freedom to balance the numbers however you want. With an existing game you have to keep all the numbers in line with everything that you've already released, so you don't have many degrees of freedom; you might have a few options on how to structure your cost curve, but only a handful of options will actually make any sense in the context of everything you've already done. With a new game, however, there are no constraints; you may have thousands of valid ways to design your cost curve, far more than you'll have time to playtest. When making a new game, you'll need to grit your teeth, do the math where you can, take your best initial guess, and then get something into your playtesters' hands as early as you can, so you have as much time as possible to learn about how to balance the systems in your game.

There's another nasty problem when designing cost curves for new games: changes to the cost curve are expensive in terms of design time. As an example, let's say you're making a 200-card set for a CCG, one of the new mechanics you're introducing is the ability to draw extra cards, and 20 cards in the set use this mechanic in some way or other. Suppose you decide that drawing an extra card is a benefit of 2 at the beginning, but after some playtesting it becomes clear that it should actually be a benefit of 3. You now have to change all twenty cards that use that mechanic. Keep in mind that you will get the math wrong, because no one ever gets game balance right on the first try, and you can see how multiple continuing changes to the cost curve mean redoing the entire set several times over.

If you have infinite time to playtest, you can just make these changes meticulously, one at a time, until your balance is perfect. In the real world, however, this is an unsolved problem. The most balanced CCG that I've ever worked on was a game where the cost curve was generated after three sets had already been released; it was the newer sets released after we derived the cost curve that were really good in terms of balance (and they were also efficient in terms of development time, because the basic "using the math and nothing else" cards didn't even need playtesting). Since then, I've tried to develop new games with a cost curve in mind, and I still don't have a good answer for how to do this in any kind of reasonable way.

There's one other unsolved problem, which I call the "escalation of power" problem, that is specific to persistent games that build on themselves over time: CCGs, MMOs, sequels where you can import previous characters, Facebook games, expansion sets for strategy games, and so on. Anything where your game has new stuff added to it over time, rather than just being a single standalone product. The problem is, in any given set, you are simply not going to be perfect. Every single object in your game will not be perfectly balanced along the curve. Some will be a little above, others will be a little below. While your goal is to get everything as close to the cost curve as possible, you have to accept right now that a few things will be a little better than they're supposed to be, even if the difference is just a microscopic rounding error. Over time, with a sufficiently large and skilled player base, the things that give an edge (no matter how slight that edge) will rise to the top and become more common in use. And players will adapt to an environment where the best-of-the-best is what is seen in competitive play, and they will become accustomed to that as the standard cost curve.

Knowing this, the game designer faces a problem. If you use the old cost curve and produce a new set of objects that is (miraculously) perfectly balanced, no one will use it, because none of it is as good as the best (above-the-curve) stuff from previous sets.
In order to make your new set viable, you have to create a new cost curve that's balanced with respect to the best objects and strategies in previous sets. This means that, over time, the power level of the cost curve increases. It might increase quickly or slowly depending on how good a balancing job you do, but you will see some non-zero level of power inflation over time.

Now, this isn't necessarily a bad thing, in the sense that it basically forces players to keep buying new stuff from you to stay current: eventually their old strategies, the ones that used to be dominant, will fall behind the power curve, and they'll need to get the new stuff just to remain competitive. And if players keep buying from us on a regular basis, that's a good thing. However, there's a thin line here, because when players perceive that we are purposefully increasing the power level of the game just to force them to buy new stuff, that gives them an opportunity to exit our game and find something else to do. We're essentially giving an ultimatum, "buy or leave," and doing that is dangerous because a lot of players will choose the "leave" option. So, the escalation-of-power problem is not an excuse for lazy design; while we know the cost curve will increase over time, we want that to be a slow and gradual process so that older players don't feel overwhelmed, and of course we want the new stuff we offer them to be compelling in its own right (because it's fun to play with, not just because it's uber-powerful).

If You're Working On a Game Now

If you are designing a game right now, and that game has any transitive mechanics that involve a single resource cost, see if you can derive the cost curve. You probably didn't need me to tell you that, but I'm saying it anyway, so nyeeeah. Keep in mind that your game already has a cost curve, whether you are aware of it or not. Think of this as an opportunity to learn more about the balance of your game.

Homework

I'll give you three choices for your homework this week. In each case, there are two purposes here. First, you will get to practice the skill of deriving a cost curve for an existing game. Second, you'll get practice applying that curve to identify objects (whether those be cards, weapons, or whatever) that are too strong or too weak compared to the others.

Option 1: More Magic 2011

If you were intrigued by the analysis presented here on this blog, continue it. Find a spoiler list for Magic 2011 online (you shouldn't have to look that hard), and starting with the math we've identified here, build as much of the rest of the cost curve as you can. As you do this, identify the cards that you think are above or below the curve. For your reference, here's the math we have currently (note that you may decide to change some of this as you evaluate other cards):

Mana cost: 1 (baseline); 1 for each colorless mana; 2 for the first colored mana, and 3 for the second colored mana.
High-cost bonus: +1 cost if the card requires 4 or more mana (Red, Blue and Green creatures only); +1 cost if the card requires 5 or more mana (White, Black, and Green creatures only; yes, Green gets both bonuses); and an additional +1 cost for each total mana required above 5.
Special costs: +1 cost for the Defender special ability.
Benefits: 1 per point of power and toughness. 1 for Red and Blue creatures.
Special benefits: +1 benefit for Flying, First Strike, Trample, Lifelink, Haste, Swampwalk, Reach, Vigilance, Protection from White, Protection from Black. +3 benefit for Deathtouch.

You may also find some interesting reading in Mark Rosewater's archive of design articles for Magic, although finding the relevant general design stuff in the sea of articles on the minutiae of specific cards and sets can be a challenge (and it's a big archive!).

Option 2: D&D

If CCGs aren't your thing, maybe you like RPGs. Take a look at whatever Dungeons & Dragons Player's Handbook edition you've got lying around, and flip to the section that gives a list of equipment, particularly all the basic weapons in the game, along with their Gold costs. Here you'll have to do some light probability that we haven't talked about yet, to figure out the average damage of each weapon (hint: if you roll an n-sided die, the average value of that die is (n+1)/2, and yes, that means the average may be a fraction; if you're rolling multiple dice, compute the average for each individual die and then add them all together). Then relate the average weapon damage to the Gold cost, and try to figure out the cost curve for weapons. Note that depending on the edition, some weapons may have special abilities, like longer range or doing extra damage against certain enemy types.
Remember to only try to figure out the math for something when you know all but one of the costs or benefits, so start with the simple melee weapons; once you've got a basic cost curve, then try to derive the more complicated ones.
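Here is a small sketch for the dice-averaging part of this option. The (n+1)/2 rule is the hint from above; the weapon names and Gold costs in the table are placeholders, since you will be filling those in from your own Player's Handbook.

```python
# Sketch: average damage of a dice expression, and Gold paid per point of it.
import re

def average_damage(expr):
    """Average of an expression like '1d8' or '2d6': each die averages (n+1)/2."""
    count, sides = map(int, re.fullmatch(r"(\d+)d(\d+)", expr).groups())
    return count * (sides + 1) / 2

weapons = {                      # placeholder entries, not actual handbook data
    "Example dagger": ("1d4", 2),
    "Example longsword": ("1d8", 15),
    "Example greatsword": ("2d6", 50),
}
for name, (dice, gold) in weapons.items():
    avg = average_damage(dice)
    print(f"{name}: {dice} averages {avg}, at {gold} gp that is {gold / avg:.1f} gp per point")
```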

If you find that this doesn't take you very long and you want an additional challenge, do the cost curve for armors in the game as well, and see if you can find a relation between damage and AC.

Option 3: Halo 3

If neither of the other options appeals to you, take a look at the FPS genre, in particular Halo 3. This is a little different because there isn't really an economic system in the game, so there's no single resource used to purchase anything. However, there is a variety of weapons, and each weapon has a lot of stats: effective range, damage, fire rate, and occasionally a special ability such as area-effect damage or dual-wield capability. For this exercise, use damage per second (dps) as your primary resource. You'll have to find a FAQ (or experiment by playing the game, or carefully analyze gameplay videos on YouTube) to determine the dps for each weapon; to compute dps, multiply the damage per shot by the fire rate (in shots per second). Relate everything else to dps to try to figure out the tradeoffs between dps and accuracy, range, and each special ability. (For some things like accuracy that can't be easily quantified, you may have to fudge things a bit by just making up some numbers.)
Then, analyze. Which weapons feel above or below the curve based on your cost curve? How much dps would you add or remove from each weapon to balance it? And of course, is this consistent with your intuition (either from playing the game, or reading comments in player forums)?
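If it helps to organize the data, here is a tiny sketch of the dps calculation. All of the numbers are invented placeholders, to be replaced with whatever you pull from a FAQ or your own testing.

```python
# Sketch: dps = damage per shot x shots per second, plus notes for everything
# that will have to be traded off against it.
weapons = [
    # (name, damage per shot, shots per second, notes) -- placeholder values
    ("Example rifle", 10, 6.0, "medium range"),
    ("Example pistol", 25, 2.0, "dual-wieldable"),
    ("Example launcher", 150, 0.5, "area effect"),
]
for name, damage, rate, notes in weapons:
    dps = damage * rate
    print(f"{name}: {dps:.0f} dps ({notes})")
```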

Level 4: Probability and Randomness


Readings/Playings

Read this article on Gamasutra by designer Tyler Sigman. I affectionately refer to it as the "Orc Nostril Hair" article, but it provides a pretty good primer for probabilities in games.

This Week's Topic

Up until now, nearly everything we've talked about was deterministic, and last week we really went deep into transitive mechanics and took them about as far as I can go with them. But so far we've ignored a huge aspect of many games, the non-deterministic aspects: in other words, randomness. Understanding the nature of randomness is important for game designers because we make systems in order to craft certain player experiences, so we need to know how those systems work. If a system includes a random input, we need to understand the nature of that randomness and how to modify it to get the results that we want.

Dice

Let's start with something simple: a die roll. When most people think of dice, they are thinking of six-sided dice, also known as d6s. But most gamers have encountered plenty of other dice: four-sided (d4), eight-sided (d8), d12, d20, and if you're really geeky, you might have a d30 or d100 lying around. In case you haven't seen this terminology before, "d" with a number after it means a die with that many sides; if there is a number before the "d", that is the number of dice rolled. So, for example, in Monopoly you roll 2d6.

Now, when I say "dice" here, that is shorthand. We have plenty of other random-number generators that aren't globs of plastic but that still serve the same function of generating a random number from 1 to n. A standard coin can be thought of as a d2. I've seen two designs for a d7: one looks like a die, and the other looks more like a seven-sided wooden pencil. A four-sided dreidel (also known as a teetotum) is equivalent to a d4. The spinner that comes with Chutes & Ladders that goes from 1 to 6 is equivalent to a d6. A random-number generator in a computer might create a random number from 1 to 19 if the designer wants it to, even though there's no 19-sided die inside there (I'll actually talk a bit more about numbers from computers next week). While all of these things look different, really they are all equivalent: you have an equal chance of choosing one of several outcomes.

Dice have some interesting properties that we need to be aware of. The first is that each side is equally likely to be rolled (I'm assuming you're using a fair die and not a rigged one). So, if you want to know the average value of a roll (also known as the "expected value" by probability geeks), just add up all the sides and divide by the number of sides. The average roll for a standard d6 is 1+2+3+4+5+6 = 21, divided by the number of sides (6), which means the average is 21/6 = 3.5. This is a special case, because we assume all results are equally likely.

What if you have custom dice? For example, I've seen one game with a d6 with special labels: 1, 1, 1, 2, 2, 3, so it behaves sort of like a weird d3 where you're more likely to get a 1 than a 2, and more likely to get a 2 than a 3. What's the average roll for this die? 1+1+1+2+2+3 = 10, divided by 6, equals 5/3 or about 1.66. So if you've got that custom die and you want players to roll three of them and add the results, you know they'll roll an average total of about 5, and you can balance the game on that assumption.

Dice and Independence

As I said before, we're going on the assumption that each roll is equally likely. This is true no matter how many dice are rolled.
Each die roll is what we call independent, meaning that previous rolls do not influence later rolls. If you roll dice enough times you definitely will see streaks of numbers, like a run of high or low numbers or something, and well talk later about why that is, but
it doesnt mean the dice are hot or cold; if you roll a standard d6 and get two 6s in a row, the probability of rolling another 6 is exactly 1/6. It is not more likely because the die is running hot. It is not less likely because we already got two 6s, so were due for something else. (Of course, if you roll it twenty times and get 6 on each time, the odds of getting a 6 on the twenty-first roll are actually pretty good because it probably means you have a rigged die!) But assuming a fair die, each roll is equally likely, independent of the others. If it helps, assume that were swapping out dice each time, so if you roll a 6 twice, remove that hot die from play and replace with a new, fresh d6. For those of you that knew this already, I apologize, but I need to be clear about that before we move on. Making Dice More or Less Random Now, lets talk about how you can get different numbers from different dice. If youre only making a single roll or a small number of rolls, the game will feel more random if you use more sides of the dice. The more you roll a die, or the more dice you roll, the more things will tend towards the average. For example, rolling 1d6+4 (that is, rolling a standard 6-sided die and adding 4 to the result) generates a number between 5 and 10. Rolling 5d2 also generates a number between 5 and 10. But the single d6 roll has an equal chance of rolling a 5, or 8, or 10; the 5d2 roll will tend to create more rolls of 7 and 8 than any other results. Same range, and even the same average value (7.5 in both cases), but the nature of the randomness is different. Wait a minute. Didnt I just say that dice dont run hot or cold? And now Im saying if you roll a lot of them, they tend towards an average? Whats going on? Let me back up. If you roll a single die, each roll is equally likely. That means if you roll a lot, over time, youll roll each side about as often as the others. The more you roll, the more youll tend towards the average, collectively. This is not because previous numbers force the die to roll what hasnt been rolled before. Its because a small streak of 6s (or 20s or whatever) ends up not being much influence when you roll another ten thousand times and get mostly average rolls so you might get a bunch of high numbers now, but you might also get a bunch of low numbers later, and over time it all tends to go towards the mean. Not because the die is being influenced by previous rolls (seriously, the die is a piece of plastic, it doesnt exactly have a brain to be thinking gosh, I havent come up 2 in a while), but because thats just what tends to happen in large sets of rolls. Your small streak in a large ocean of die-rolls will be mostly drowned out. So, doing the math for a random die roll is pretty straightforward, at least in terms of finding the average roll. There are also ways to quantify how random something is, a way of saying that 1d6+4 is more random than 5d2 because it gives a more even spread; mostly you do that by computing something called the standard deviation (the larger it is, the more random the result), but that takes more computation than I want to get into today (Ill get into it later on). All Im asking you to know is that in general, fewer dice rolled = more random. While Im on the subject, more faces on a die is also more random since you have a wider spread.
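If you want to see that difference between 1d6+4 and 5d2 concretely rather than take my word for it, here is a minimal C++ sketch of my own (not from any particular engine or library) that enumerates every possible outcome of each roll and prints how many ways there are to reach each total; the 5d2 counts pile up in the middle while the 1d6+4 counts stay flat:

#include <iostream>

int main() {
    int ways1d6plus4[11] = {0};   // ways1d6plus4[t] = number of ways 1d6+4 totals t
    int ways5d2[11]      = {0};   // ways5d2[t] = number of ways 5d2 totals t

    // 1d6+4: six equally likely outcomes, giving totals 5 through 10
    for (int roll = 1; roll <= 6; roll++)
        ways1d6plus4[roll + 4]++;

    // 5d2: enumerate all 2x2x2x2x2 = 32 equally likely combinations
    for (int a = 1; a <= 2; a++)
        for (int b = 1; b <= 2; b++)
            for (int c = 1; c <= 2; c++)
                for (int d = 1; d <= 2; d++)
                    for (int e = 1; e <= 2; e++)
                        ways5d2[a + b + c + d + e]++;

    for (int total = 5; total <= 10; total++)
        std::cout << "Total " << total << ": 1d6+4 has " << ways1d6plus4[total]
                  << " way(s) out of 6, 5d2 has " << ways5d2[total]
                  << " way(s) out of 32\n";
    return 0;
}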
Computing Probability through Counting
You might be wondering: how can we figure out the exact probability of getting a specific roll? This is actually pretty important in a lot of games, because if youre making a die roll in the first place there is probably some kind of optimal result. And the answer is, we count two things. First, count the total number of ways to roll dice (no matter what the result). Then, count the number of ways to roll dice that get the result you actually want. Divide the second number by the first and youve got your probability; multiply by 100 if you want the percentage. Examples Heres a very simple example. You want to roll 4 or more on 1d6. There are 6 total possible results (1, 2, 3, 4, 5, or 6). Of those, 3 of the results (4, 5, or 6) are a success. So your probability is 3 divided by 6, or 0.5, or 50%.

Heres a slightly more complicated example. You want to roll an even number on 2d6. There are 36 total results (6 for each die, and since neither die is influenced by the other you multiply 6 results by 6 to get 36). The tricky thing with questions like this is that its easy to double-count. For example, there are actually two ways to roll the number 3 on 2d6: 1+2 and 2+1. Those look the same, but the difference is which number appears on the first die, and which appears on the second die. If it helps, think of each die as having a different color, so maybe you have a red die and a blue die in this case. And then you can count the ways to roll an even number: 2 (1+1), 4 (1+3), 4 (2+2), 4 (3+1), 6 (1+5), 6 (2+4), 6 (3+3), 6 (4+2), 6 (5+1), 8 (2+6), 8 (3+5), 8 (4+4), 8 (5+3), 8 (6+2), 10 (4+6), 10 (5+5), 10 (6+4), 12 (6+6). It turns out there are exactly 18 ways to do this out of 36, also 0.5 or 50%. Perhaps unexpected, but kind of neat. Monte Carlo Simulations What if you have too many dice to count this way? For example, say you want to know the odds of getting a combined total of 15 or more on a roll of 8d6. There are a LOT of different individual results of eight dice, so just counting by hand takes too long. Even if we find some tricks to group different sets of rolls together, it still takes a really long time. In this case the easiest way to do it is to stop doing math and start using a computer, and there are two ways to do this. The first way gets you an exact answer but takes a little bit of programming or scripting. You basically have the computer run through every possibility inside a for loop, evaluating and counting up all the iterations total and also all the iterations that are a success, and then have it spit out the answers at the end. Your code might look something like this:

int wincount=0, totalcount=0;
for (int i=1; i<=6; i++) {
  for (int j=1; j<=6; j++) {
    for (int k=1; k<=6; k++) {
      // insert more loops here, one for each additional die,
      // and add the new loop variables into the sum below
      if (i+j+k >= 15) {
        wincount++;
      }
      totalcount++;
    }
  }
}
float probability = (float)wincount/totalcount; // cast to float so the integer division doesnt just round down to zero

If you dont know programming but you just need a ballpark answer and not an exact one, you can simulate it in Excel by having it roll 8d6 a few thousand times and take the results. To roll 1d6 in Excel, use this formula:

=FLOOR(RAND()*6,1)+1

When you dont know the answer so you just try it a lot, theres a name for that: Monte Carlo simulation, and its a great thing to fall back on when youre trying to do probability calculations and you find yourself in over your head. The great thing about this is, we dont have to know the math behind why it works, and yet we know the answer will be pretty good because like we learned before, the more times you do something, the more it tends towards the average.
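If you would rather get that ballpark answer in code instead of Excel, here is a minimal Monte Carlo sketch in C++ (my own illustration using the standard <random> library, with the trial count chosen arbitrarily) that rolls 8d6 a million times and reports what fraction of those rolls came to 15 or more:

#include <iostream>
#include <random>

int main() {
    std::mt19937 rng(std::random_device{}());       // pseudorandom number engine
    std::uniform_int_distribution<int> d6(1, 6);    // one fair six-sided die

    const int trials = 1000000;   // more trials = closer to the true probability
    int wins = 0;
    for (int t = 0; t < trials; t++) {
        int total = 0;
        for (int die = 0; die < 8; die++)
            total += d6(rng);                       // roll 8d6 and sum them
        if (total >= 15)
            wins++;
    }
    std::cout << "Estimated probability: " << (double)wins / trials << "\n";
    return 0;
}

Run it a few times; if the estimates agree with each other to a couple of decimal places, the trial count is probably big enough.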
Combining Independent Trials
If youre asking for several repeated but independent trials, so the result of one roll doesnt affect other rolls, we have an extra trick we can use that makes things a little easier. How do you tell the difference between something thats dependent and something thats independent? Basically, if you can isolate each individual die-roll (or series) as a separate event, then it is independent. For example, rolling a total of 15 on 8d6 is not something that can be split up into several independent rolls. Since youre summing the dice together, what you get on one die affects the required results of the others, because its only all of them added together to give you a single result. Heres an example of independent rolls: say you have a dice game where you roll a series of d6s. On the first roll, you have to get 2 or higher to stay in the game. On the second roll you have to get 3 or higher. Third roll requires 4 or higher, fourth roll is 5 or higher, and fifth roll requires a 6. If you make all five rolls successfully, you win. Here, the rolls are independent. Yes, if you fail one roll it affects the outcome of the entire dice game, but each individual roll is not influenced by the others; for example, if you roll really well on your second roll, that doesnt make you any more or less likely to succeed on future rolls. Because of this, we can consider the probability of each roll separately. When you have separate, independent probabilities and you want to know what is the probability that all of them will happen, you take each of the individual probabilities and multiply them together. Another way to think about this: if you use the word and to describe several conditions (as in, what is the probability that some random event happens and that some other independent random event happens?), figure out the individual probabilities and multiply. No matter what you do, do not ever add independent probabilities together. This is a common mistake. To see why it doesnt work, imagine a 50/50 coin flip, and youre wondering what is the probability that youll get Heads twice in a row. The probability of each is 50%, so if you add those together youd expect a 100% chance of getting Heads, but we know thats not true, because you could get Tails twice. If instead you multiply, you get 50%*50% = 25%, which is the correct probability of getting Heads twice. Example Lets go back to our d6 game where you have to roll higher than 2, then higher than 3, and so on up to 6. In this series of 5 rolls, what are the chances you make all of them? As we said before, these are independent trials, so we just compute the odds of each roll and then multiply together. The first roll succeeds 5/6 times. The second roll, 4/6. The third, 3/6. The fourth, 2/6, and the fifth roll 1/6. Multiplying these together, we get about 1.5% So, winning this game is pretty rare, so youd want a pretty big jackpot if you were putting that in your game. Negation Heres another useful trick: sometimes its hard to compute a probability directly, but you can figure out the chance that the thing wont happen much easier. Heres an example: suppose we make another game where you roll 6d6, and you win if you roll at least one 6. What are your chances of winning? There are a lot of things to compute here. You might roll a single 6, which means one of the dice is showing 6 and the others are all showing 1-5, and there are 6 different ways to choose which die is showing 6. 
And then you might roll two 6s, or three, or more, and each of those is a separate computation, and it gets out of hand pretty quickly. However, theres another way to look at this, by turning it around. You lose if none of the dice are showing 6. Here we have six independent trials, each of which has a probability of 5/6 (the die can roll anything except 6). Multiply those together and you get about 33%. So you have about a 1 in 3 chance of losing.
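If you want to double-check that multiplication, here is a tiny C++ sketch (just the arithmetic, nothing game-specific) that multiplies the six independent 5/6 chances together; it prints roughly 0.335:

#include <cmath>
#include <iostream>

int main() {
    // Six independent trials, each with a 5/6 chance of avoiding a 6.
    double loseChance = std::pow(5.0 / 6.0, 6.0);
    std::cout << "Chance that no die shows a 6: " << loseChance << "\n";
    return 0;
}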

Turning it around again, that means a 67% (or 2 in 3) chance of winning. The most obvious lesson here is that if you take a probability and negate it, just subtract from 100%. If the odds of winning are 67%, the odds of not winning are 100% minus 67%, or 33%. And vice versa. So if you cant figure out one thing but its easy to figure out the opposite, figure out that opposite and then subtract from 100%. Combining Conditions Within a Single Independent Trial A little while ago, I said you should never add probabilities together when youre doing independent trials. Are there any cases where you can add probabilities together? Yes, in one special situation. When you are trying to find the probability for several non-overlapping success criteria in a single trial, add them together. For example, the probability of rolling a 4, 5 or 6 on 1d6 is equal to the probability of rolling 4 plus the probability of rolling 5 plus the probability of rolling 6. Another way of thinking about this: when you use the word or in your probability (as in, what is the probability that you will get a certain result or a different result from a single random event?), figure out the individual probabilities and add them together. One important trait here is that you add up all possible outcomes for a game, the combined probabilities should add up to exactly 100%. If they dont, youve done your math wrong, so this is a good reality check to make sure you didnt miss anything. For example, if you were analyzing the probability of getting all of the hands in Poker, if you add them all up you should get exactly 100% (or at least really close if youre using a calculator you might get a very slight rounding error, but if youre doing exact numbers by hand it should be exact). If you dont, it means there are probably some hands that you havent considered, or that you got the probabilities of some hands wrong, so you need to go back and check your figures. Uneven Probabilities So far, weve been assuming that every single side on a die comes up equally often, because thats how dice are supposed to work. But occasionally you end up with a situation where there are different outcomes that have different chances of coming up. For example, theres this spinner in one of the Nuclear War card game expansions that modifies the result of a missile launch: most of the time it just does normal damage plus or minus a few, but occasionally it does double or triple damage, or blows up on the launchpad and damages you, or whatever. Unlike the spinner in Chutes & Ladders or A Game of Life, the Nuclear War spinner has results that are not equally probable. Some results have very large sections where the spinner can land so they happen more often, while other results are tiny slivers that you only land on rarely. Now, at first glance this is sort of like that 1, 1, 1, 2, 2, 3 die we were talking about earlier, which was sort of like a weighted 1d3, so all we have to do is divide all these sections evenly, find the smallest unit that everything is a multiple of, and then make this into a d522 roll (or whatever) with multiple sides of the die showing the same thing for the more common results. And thats one way to do it, and that would technically work, but theres an easier way. Lets go back to our original single standard d6 roll. For a normal die, we said to add up all of the sides and then divide by the number of sides, but what are we really doing there? We could say this another way. 
For a 6-sided die, each side has exactly 1/6 chance of being rolled. So we multiply each sides result by the probability of that result (1/6 for each side in this case), then add all of these together. Doing this, we get (1*1/6) + (2*1/6) + (3*1/6) + (4*1/6) + (5*1/6) + (6*1/6), which gives us the same result (3.5) as we got before. And really, thats what were doing the whole time: multiplying each outcome by the probability of that outcome. Can we do this with the Nuclear War spinner? Sure we can. And this will give us the average result if we add these all together. All we have to do is figure out the probability of each spin, and multiply by the result.
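As a quick sanity check that this multiply-and-add approach works, here is a minimal C++ sketch that treats the 1-1-1-2-2-3 custom die from earlier as a weighted d3 (result 1 with probability 3/6, result 2 with probability 2/6, result 3 with probability 1/6) and multiplies each outcome by its probability; it prints the same 5/3 average we computed before. I dont have the actual Nuclear War spinner in front of me, but swap in its results and section sizes and the same loop gives you its average spin:

#include <iostream>

int main() {
    // The custom 1-1-1-2-2-3 die, treated as a weighted d3.
    double results[]       = { 1.0,     2.0,     3.0 };
    double probabilities[] = { 3.0/6.0, 2.0/6.0, 1.0/6.0 };   // these must sum to 1

    double expectedValue = 0.0;
    for (int i = 0; i < 3; i++)
        expectedValue += results[i] * probabilities[i];       // each outcome times its probability

    std::cout << "Average roll: " << expectedValue << "\n";   // prints about 1.667
    return 0;
}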

Another Example This technique of computing expected value by multiplying each result by its individual probability also works if the results are equally probable but weighted differently, like if youre rolling dice but you win more on some rolls than others. As an example, heres a game you might be able to find in some casinos: you place a wager, and roll 2d6. If you roll the lowest three numbers (2, 3 or 4) or the highest four numbers (9, 10, 11 or 12), you win an amount equal to your wager. The extreme ends are special: if you roll 2 or 12, you win double your wager. If you roll anything else (5, 6, 7 or 8), you lose your wager. This is a pretty simple game. But what is the chance of winning? We can start by figuring out how many times you win: There are 36 ways to roll 2d6, total. How many of these are winning rolls? Theres 1 way to roll two, and 1 way to roll twelve. There are 2 ways to roll three and eleven. There are 3 ways to roll four, and 3 more ways to roll ten. There are 4 ways to roll nine. Adding these all up, there are 16 winning rolls out of 36.

So, under normal conditions, you win 16 times out of 36 slightly less than 50%. Ah, but two of those times, you win twice as much, so thats like winning twice! So if you play this game 36 times with a wager of $1 each time, and roll each possible roll exactly once, youll win $18 total (you actually win 16 times, but two of those times it counts as two wins). Since you play 36 times and win $18, does that mean these are actually even odds? Not so fast. If you count up the number of times you lose, there are 20 ways to lose, not 18. So if you play 36 times for $1 each, youll win a total of $18 from the times when you win but youll lose $20 from the twenty times you lose! As a result, you come out very slightly behind: you lose $2 net, on average, for every 36 plays (you could also say that on average, you lose 1/18 of a dollar per play). You can see how easy it is to make one misstep and get the wrong probability here! Permutations So far, all of our die rolls assume that order doesnt matter. Rolling a 2+4 is the same as rolling a 4+2. In most cases, we just manually count the number of different ways to do something, but sometimes thats impractical and wed like a math formula. Heres an example problem from a dice game called Farkle. You start each round by rolling 6d6. If youre lucky enough to roll one of each result, 1-2-3-4-5-6 (a straight), you get a huge score bonus. Whats the probability that will happen? There are a lot of different ways to have one of each! The answer is to look at it this way: one of the dice (and only one) has to be showing 1. How many ways are there to do that? Six there are 6 dice, and any of them can show the 1. Choose that and put it aside. Now, one of the remaining dice has to show 2. Theres five ways to do that. Choose it and put it aside. Continuing along these lines, four remaining dice can show 3, three dice of the remaining ones after that can show 4, two of the remaining dice after that can show 5, and at the end youre left with a single die that must show 6 (no choice involved in that last one). To figure out how many ways there are to roll a straight, we multiply all the different, independent choices: 6x5x4x3x2x1 = 720 that seems like a lot of ways to roll a straight. To get the probability of rolling a straight, we have to divide 720 by the number of ways to roll 6d6, total. How many ways can we do that? Each die can show 6 sides, so we multiply 6x6x6x6x6x6 = 46656 (a much larger number!). Dividing 720/46656 gives us a probability of about 1.5%. If you were designing this game, thats good to know so you can design the scoring system accordingly. We can see why Farkle gives you such a high score bonus for rolling a straight; it only happens very rarely!
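If the 6x5x4x3x2x1 argument feels too slick to trust, you can verify it by brute force; this C++ sketch (a plain exhaustive count, nothing clever) walks through all 46,656 possible rolls of 6d6 and counts the ones showing exactly one of each face, which should come out to 720, or about 1.5%:

#include <iostream>

int main() {
    int straights = 0, total = 0;
    int dice[6];
    // Enumerate every possible roll of 6d6 (6^6 = 46,656 outcomes).
    for (dice[0] = 1; dice[0] <= 6; dice[0]++)
     for (dice[1] = 1; dice[1] <= 6; dice[1]++)
      for (dice[2] = 1; dice[2] <= 6; dice[2]++)
       for (dice[3] = 1; dice[3] <= 6; dice[3]++)
        for (dice[4] = 1; dice[4] <= 6; dice[4]++)
         for (dice[5] = 1; dice[5] <= 6; dice[5]++) {
             total++;
             int seen[7] = {0};                    // seen[f] = how many dice show face f
             for (int i = 0; i < 6; i++)
                 seen[dice[i]]++;
             bool straight = true;
             for (int f = 1; f <= 6; f++)
                 if (seen[f] != 1) straight = false;   // a straight needs exactly one of each face
             if (straight) straights++;
         }
    std::cout << straights << " straights out of " << total
              << " rolls (" << 100.0 * straights / total << "%)\n";
    return 0;
}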

This result is interesting for another reason. It shows just how infrequently we actually roll exactly according to probability in the short term. Sure, if we rolled a few thousand dice, we would see about as many of each of the six numbers on our rolls. But rolling just six dice, we almost never roll exactly one of each! We can see from this, another reason why expecting dice to roll what hasnt been rolled yet because we havent rolled 6 in awhile so were about due is a fools game. Dude, Your Random Number Generator Is Broken This brings us to a common misunderstanding of probability: the assumption that everything is split evenly in the short term, which it isnt. In a small series of die-rolls, we expect there to be some unevenness. If youve ever worked on an online game with some kind of random-number generator before, youve probably heard this one: a player writes tech support to tell you that your random number generator is clearly broken and not random, and they know this because they just killed 4 monsters in a row and got 4 of the exact same drop, and those drops are only supposed to happen 10% of the time, so this should almost never happen, so clearly your die-roller is busted. You do the math. 1/10 * 1/10 * 1/10 * 1/10 is 1 in 10,000, which is pretty infrequent. This is what the player is trying to tell you. Is there a problem? It depends. How many players are on your server? Lets say youve got a reasonably popular game, and you get 100,000 daily players. How many of those kill four monsters in a row? Maybe all of them, multiple times per day, but lets be conservative and say that half of them are just there to trade stuff in the auction house or chat on the RP servers or whatever, so only half of them actually go out monster-hunting. Whats the chance this will happen to someone? On a scale like that, youd expect it to happen several times a day, at least! Incidentally, this is why it seems like every few weeks at least, someone wins the lottery, even though that someone is never you or anyone you know. If enough people play each week, the odds are youll have at least one obnoxiously lucky sap somewhere but that if you play the lottery, youve got worse odds of winning than your odds of being hired at Infinity Ward. Cards and Dependence Now that weve talked about independent events like die-rolling, we have a lot of powerful tools to analyze the randomness of many games. Things get a little more complicated when we talk about drawing cards from a deck, because each card you draw influences whats left in the deck. If you have a standard 52-card deck and draw, say, the 10 of Hearts, and you want to know the probability that the next card is also a heart, the odds have changed because youve already removed a heart from the deck. Each card that you remove changes the probability of the next card in the deck. Since each card draw is influenced by the card draws that came before, we call this dependent probability. Note that when I say cards here I am talking about any game mechanic where you have a set of objects and you draw one of them without replacing, so in this case deck of cards is mechanically equivalent to a bag of tiles where you draw a tile and dont replace it, or an urn where youre drawing colored balls from (Ive never actually seen a game that involves drawing balls from an urn, but probability professors seem to have a love of them for some reason). 
Properties of Dependence Just to be clear, with cards Im assuming that you are drawing cards, looking at them, and removing them from the deck. Each of these is an important property. If I had a deck with, say, six cards numbered 1 through 6, and I shuffled and drew a card and then reshuffled all six cards between card draws, that is equivalent to a d6 die roll; no result influences the future ones. Its only if I draw cards and dont replace them that pulling a 1 on my first roll makes it more likely Ill draw 6 next time (and it will get more and more likely until I finally draw
it, or until I reshuffle). The fact that we are looking at the cards is also important. If I pull a card from the deck but dont look at it, I have no additional information, so the probabilities havent really changed. This is something that may sound counterintuitive; how does just flipping a card over magically change the probabilities? But it does, because you can only calculate the probability of unknown stuff based on what you do know. So, for example, if you shuffle a standard deck, reveal 51 cards and none of them is the Queen of Clubs, you know with 100% certainty that this is what the missing card is. If instead you shuffle a standard deck, and take 51 cards away without revealing them, the probability that the last card is the Queen of Clubs is still 1/52. For each additional card you reveal, you get more information. Calculating probabilities for dependent events follows the same principles as independent, except its a little trickier because the probabilities are changing whenever you reveal a card. So you have to do a lot of multiplying different things together rather than multiplying the same thing against itself if you want to repeat a challenge. Really, all this means is we have to put together everything weve done already, in combination. Example You shuffle a standard 52-card deck, and draw two cards. Whats the probability that youve drawn a pair? There are a few ways to compute this, but probably the easiest is to say this: whats the probability that the first card you draw makes you totally ineligible to draw a pair? Zero, so the first card doesnt really matter as long as the second card matches it. No matter what we draw for our first card, were still in the running to draw a pair, so we have a 100% chance that we can still get a pair after drawing the first card. Whats the probability that the second card matches? There are 51 cards remaining in the deck, and 3 of them match (normally itd be 4 out of 52, but you already removed a matching card on your first draw!) so the probability ends up being exactly 1/17. (So, the next time that guy sitting across the table from you in Texas Hold Em says wow, another pocket pair? Must be my lucky day you know theres a pretty good chance hes bluffing.) What if we add two jokers so its now a 54-card deck, and we still want to know the chance of drawing a pair? Occasionally your first card will be a joker, and there will only be one matching card in the rest of the deck, rather than 3. How do we figure this out? By splitting up the probabilities and then multiplying each possibility. Your first card is either going to be a Joker, or Something Else. Probability of a Joker is 2/54, probability of Something Else is 52/54. If the first card is a Joker (2/54), then the probability of a match on the second card is 1/53. Multiplying these together (we can do that since theyre separate events and we want both to happen), we have 1/1431 less than a tenth of a percent. If the first card is Something Else (52/54), the probability of a match on the second card is up to 3/53. Multiplying these together, we have 78/1431 (a little more than 5.5%). What do we do with these two results? Since they do not overlap, and we want to know the probability of either of them, we add! 79/1431 (still around 5.5%) is the final answer. 
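Dependent probabilities like this are easy to get subtly wrong, so its worth checking the arithmetic with a quick simulation; here is a minimal C++ sketch (trial count arbitrary) that shuffles a 54-card deck with two Jokers, draws the top two cards, and counts how often they match in rank; the result should land near the 79/1431 (roughly 5.5%) we just computed:

#include <algorithm>
#include <iostream>
#include <random>
#include <vector>

int main() {
    std::mt19937 rng(std::random_device{}());

    // Build a 54-card deck by rank only: 13 ranks of four cards each, plus two Jokers (rank 13).
    std::vector<int> deck;
    for (int rank = 0; rank < 13; rank++)
        for (int copy = 0; copy < 4; copy++)
            deck.push_back(rank);
    deck.push_back(13);
    deck.push_back(13);

    const int trials = 1000000;
    int pairs = 0;
    for (int t = 0; t < trials; t++) {
        std::shuffle(deck.begin(), deck.end(), rng);   // drawing without replacement
        if (deck[0] == deck[1])                        // top two cards match in rank
            pairs++;
    }
    std::cout << "Estimated pair probability: " << (double)pairs / trials << "\n";
    return 0;
}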
If we really wanted to be careful, we could calculate the probability of all other possible results: drawing a Joker and not matching, or drawing Something Else and not matching, and adding those together with the probability of winning, and we should get exactly 100%. I wont do the math here for you, but feel free to do it yourself to confirm. The Monty Hall Problem This brings us to a pretty famous problem that tends to really confuse people, called the Monty Hall problem. Its called that because there used to be this game show called Lets Make a Deal, with
your host, Monty Hall. If youve never seen the show, it was sort of like this inverse The Price Is Right. In The Price Is Right, the host (used to be Bob Barker, now its Drew Carey? Anyway) is your friend. He wants to give away cash and fabulous prizes. He tries to give you every opportunity to win, as long as youre good at guessing how much their sponsored items actually cost. Monty Hall wasnt like that. He was like Bob Barkers evil twin. His goal was to make you look like an idiot on national television. If you were on the show, he was the enemy, you were playing a game against him, and the odds were stacked in his favor. Maybe Im being overly harsh, but when your chance of being selected as a contestant seems directly proportional to whether youre wearing a ridiculous-looking costume, I tend to draw these kinds of conclusions. Anyway, one of the biggest memes from the show was that youd be given a choice of three doors, and they would actually call them Door Number 1, Door Number 2 and Door Number 3. Theyd give you a door of your choice for free! Behind one door, youre told, is a fabulous prize like a Brand New Car. Behind the other doors, theres no prize at all, no nothing, those other two doors are worthless. Except the goal is to humiliate you, so they wouldnt just have an empty door, theyd have something silly-looking back there like a goat, or a giant tube of toothpaste, or something something that was clearly not a Brand New Car. So, youd pick your door, and Monty would get ready to reveal if you won or not but wait, before we do that, lets look at one of the other doors that you didnt choose. Since Monty knows where the prize is, and theres only one prize and two doors you didnt choose, no matter what he can always reveal a door without a prize. Oh, you chose Door Number 3? Well, lets reveal Door Number 1 to show you that there was no prize there. And now, being the generous guy he is, he gives you the chance to trade your Door Number 3 for whatevers behind Door Number 2 instead. And heres where we get into probability: does switching doors increase your chance of winning, or decrease it, or is it the same? What do you think? The real answer is that switching increases your chance of winning from 1/3 to 2/3. This is counterintuitive. If you havent seen this problem before, youre probably thinking: wait, just by revealing a door weve magically changed the odds? But as we saw with our card example earlier, that is exactly what revealed information does. Your odds of winning with your first pick are obviously 1/3, and I think everyone here would agree to that. When that new door is revealed, it doesnt change the odds of your first pick at all its still 1/3 but that means the other door now has a 2/3 chance of being the right one. Lets look at it another way. You choose a door. Chance of winning: 1/3. I offer to swap you for both of the other doors, which is basically what Monty Hall is doing. Sure, he reveals one of them to not be a prize, but he can always do that, so that doesnt really change anything. Of course youd want to switch! If youre still wondering about this and need more convincing, clicking here will take you to a wonderful little Flash app that lets you explore this problem. 
You can actually play, starting with something like 10 doors and eventually working down your way to 3; theres also a simulator where you can give it any number of doors from 3 to 50 and just play on your own, or to have it actually run a few thousand simulations and give you how many times you would have won if you stayed versus when you switched. Monty Hall, Redux Now, in practice on the actual show, Monty Hall knew this, because he was good at math even if his contestants werent. So heres what hed do to change the game a little. If you picked the door with the prize behind it, which does happen 1/3 of the time, hed always offer you the chance to switch. After all, if youve got a car and then you give it away for a goat, youre going to look pretty dumb, which is exactly what he wants, because thats the kind of evil guy he is. But if you pick a door with no prize behind it, hell only offer you the chance to switch about half of those times, and the other
half hell just show you your Brand New Goat and boot you off the stage. Lets analyze this new game, where Monty can choose whether or not to give you the chance to switch. Suppose he follows this algorithm: always let you switch if you picked the door with the car, otherwise he has a 50/50 chance of giving you your goat or giving you the chance to switch. Now what are your chances of winning? 1/3 of the time, you pick the prize right away and he offers you the switch. Of the remaining 2/3 of the time (you pick wrong initially), half of the time hell offer to switch, half the time he wont. Half of 2/3 is 1/3, so basically 1/3 of the time you get your goat and leave, 1/3 of the time you picked wrong and he offers the switch, and 1/3 of the time you picked right and he offers the switch. If he offers an exchange, we already know that the 1/3 of the time when he gives you your goat and you leave didnt happen. That is useful information, because it means our chance of winning has now changed. Of the 2/3 of the time where were given a choice, 1/3 means we guessed right, and the other 1/3 means we guessed wrong, so if were given a choice at all it means our probability of winning is now 50/50, and theres no mathematical advantage to keeping or switching. Like in Poker, this is no longer a game of math but a game of psychology. Did Monty offer you a choice because he thinks youre a sucker who doesnt know that switching is the right choice, and that youll stubbornly hold onto the door you picked because psychologically its worse to have a car and then lose it? Or does he think youre smart and that youll switch, and hes offering you the chance because he knows you guessed right at the beginning and youll take the bait and fall into his trap? Or maybe hes being uncharacteristically nice, and goading you into doing something in your own best interest, because he hasnt given away a car in a while and his producers are telling him the audience is getting bored and hed better give away a big prize soon so their ratings dont drop? In this way, Monty manages to offer a choice (sometimes) while still keeping the overall probability of winning at 1/3. Remember, a third of the time youll just lose outright. A third of the time youll guess right initially, and 50% of that time youll win (1/3 x 1/2 = 1/6). And a third of the time, youll guess wrong initially but be given the choice to switch, and 50% of that time youll win (also 1/6). Add the two non-overlapping win states together and you get 1/3, so whether you switch or stay your overall odds are 1/3 throughout the whole game: no better than if you just guessed and he showed you the door, without any of this switching business at all! So the point of offering to switch doors is not done for the purpose of changing the odds, but simply because drawing out the decision makes for more exciting television viewing. Incidentally, this is one of the same reasons Poker can be so interesting: most of the formats involve slowly revealing cards in between rounds of betting (like the Flop, Turn and River in Texas Hold Em), because you start off with a certain probability of winning and that probability is changing in between each betting round as more cards are revealed.
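Going back to the original version of the game where Monty always offers the switch: if neither the argument above nor the Flash app convinces you, here is a minimal C++ simulation sketch of my own (trial count arbitrary) that plays a few hundred thousand rounds and reports how often a stay strategy wins versus a switch strategy; the numbers should come out near 1/3 and 2/3:

#include <iostream>
#include <random>

int main() {
    std::mt19937 rng(std::random_device{}());
    std::uniform_int_distribution<int> door(0, 2);   // three doors, numbered 0 through 2

    const int trials = 300000;
    int stayWins = 0, switchWins = 0;
    for (int t = 0; t < trials; t++) {
        int prize = door(rng);      // the car is behind a random door
        int pick  = door(rng);      // you pick a door at random

        // Monty opens a door that is neither your pick nor the prize (he always can).
        int open = 0;
        while (open == pick || open == prize)
            open++;

        int switchTo = 3 - pick - open;          // the one remaining unopened door
        if (pick == prize)     stayWins++;       // staying wins only if your first guess was right
        if (switchTo == prize) switchWins++;     // switching wins whenever your first guess was wrong
    }
    std::cout << "Win rate if you stay:   " << (double)stayWins / trials << "\n";
    std::cout << "Win rate if you switch: " << (double)switchWins / trials << "\n";
    return 0;
}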
The Sibling Problem
And that brings us to another famous problem that tends to throw people, the Siblings problem. This is about the only thing Im writing about today that isnt directly related to games (although I guess that just means I should challenge you to come up with a game mechanic that uses this). Its more a brain teaser, but a fun one, and in order to solve it you really have to be able to understand conditional probability like weve been talking about. The question is this: I have a friend with two kids, and at least one of them is a girl. What is the probability that the other one is also a girl? Assume that in the normal human population, theres a 50/50 chance of having a boy or a girl, and assume that this is universally true for any child (in reality some men actually do produce more X or Y sperm, so that would skew the odds a bit: if you know one of their kids is already a girl, the odds are slightly higher theyll have more
girls, and then there are conditions like hermaphrodism, but for our purposes lets ignore that and assume that each kid is an independent trial with an equal chance of being male or female). Intuitively, since were dealing with a core 1/2 chance, we would expect the answer would be something like 1/2 or 1/4 or some other nice, round number thats divisible by 2. The actual answer is 1/3. Wait, what? The trick here is that the information we were given narrows down the possibilities. Lets say the parents are Sesame Street fans and so no matter what the sex, they name their kids A and B. Under normal conditions, there are four possibilities that are equally likely: A and B are both boys, A and B are both girls, A is boy and B is girl, or A is girl and B is boy. Since we know at least one of them is a girl, we can eliminate the possibility that A and B are both boys, so we have three (still equally likely) scenarios remaining. Since theyre equally likely and there are three of them, we know each one has a probability of 1/3. Only one of those three scenarios involves two girls, so the answer is 1/3. The Sibling Problem, Redux It gets weirder. Suppose instead I tell you my friend has two children, and one is a girl who was born on a Tuesday. Assume that under normal conditions, a child is equally likely to be born on any of the seven days of the week. Whats the probability the other child is also a girl? Youd think the answer would still be 1/3; what does Tuesday have to do with anything? But again, intuition fails. The actual answer is 13/27, which isnt just unintuitive, its plain old weird-looking. Whats going on here? Tuesday actually changes the odds, again because we dont know which child it was, or if both children were born on Tuesday. By the same logic as earlier, we count all valid combinations of children where at least one is a Tuesday girl. Again assuming the children are named A and B, the combinations are: A is a Tuesday girl, B is a boy (there are 7 possibilities here, one for each day of the week that B could be born on). B is a Tuesday girl, A is a boy (again, 7 possibilities). A is a Tuesday girl, B is a girl born on a different day of the week (6 possibilities). B is a Tuesday girl, A is a non-Tuesday girl (again, 6 possibilities). A and B are both girls born on Tuesday (1 possibility, but we have to take care not to double-count this). Adding it up, there are 27 different, equally likely combinations of children and days with at least one Tuesday girl. Of those, 13 possibilities involve two girls. Again, this is totally counterintuitive, and apparently designed for no other reason than to make your brain hurt. If youre still scratching your head, ludologist Jesper Juul has a nice explanation of this problem on his website. If Youre Working on a Game Now If a game youre designing has any randomness, this is a great excuse to analyze it. Choose a random element you want to analyze. For that element, first ask yourself what kind of probability youre expecting to see, what makes sense to you in the context of the game. For example, if youre making an RPG and looking at the probability that the player will hit a monster in combat, ask yourself what to-hit percentage feels right to you. Usually in console RPGs, misses by the player are very frustrating, so you wouldnt usually want them to miss a lot maybe 10% of the time or less? If youre an RPG designer you probably know better than I, but you should have some basic idea of what feels right. 
Then, ask yourself if this is something thats dependent (like cards) or independent (like dice). Break down all possible results, and the probabilities of each. Make sure your probabilities sum to 100%. And lastly, of course, compare the actual numbers to the numbers you were expecting. Is this
particular random die-roll or card-draw acting how you want it to, or do you see signs that you need to adjust the numbers? And of course, if you do find something to adjust, you can use these same calculations to figure out exactly how much to adjust it! Homework Your homework this week is meant to help you practice your probability skills. I have two dice games and a card game for you to analyze using probability, and then a weird mechanic from a game I once worked on that provides a chance to try out a Monte Carlo simulation. Game #1: Dragon Die This is a dice game that I invented with some co-workers one day (thanks Jeb Havens and Jesse King!) specifically to mess with peoples heads on probability. Its a simple casino game called Dragon Die, and its a dice gambling contest between you and the House. You are given a standard 1d6, and you roll it. Youre trying to roll higher than the House. The House is given a non-standard 1d6: its similar to yours, but instead of a 1 it has a Dragon on it (so the House die is Dragon-2-3-4-5-6). If the House rolls a Dragon, then the House automatically wins and you automatically lose. If you both roll the same number, its a push, and you both re-roll. Otherwise, the winner is whoever rolls highest. Obviously, the odds are slightly against the player here, because the House has this Dragon advantage. But how much of an advantage is it? Youre going to calculate it. But first, before you do, exercise your intuition. Suppose I said this game was offered with a 2 to 1 payout. That is, if you win, you keep your bet and get twice your bet in winnings. So, if you bet $1 and win, you keep your $1 and get $2 extra, for a total of $3. If you lose, you just lose your standard bet. Would you play? That is, intuitively, do you think the odds are better or worse than 2 to 1? Said another way, for every 3 games you play, do you expect to win more than once, or less than once, or exactly once, on average? Once youve used your intuition, do the math. There are only 36 possibilities for both dice, so you should have no problem counting them all up. If youre not sure about this 2 to 1 business, think of it this way: suppose you played the game 36 times (wagering $1 each time). A win nets you $2 up, a loss causes you to lose $1, and a push is no change. Count up your total winnings and losses and figure out if you come out ahead or behind. And then ask yourself how close your intuition was. And then realize how evil I am. And yes, if youre wondering, the actual dice-roll mechanics here are something Im intentionally obfuscating, but Im sure youll all see through that once you sit down and look at it. Try and solve it yourself. Ill post all answers here next week. Game #2: Chuck-a-Luck There is a gambling dice game called Chuck-a-Luck (also known as Birdcage, because sometimes instead of rolling dice theyre placed in a wire cage that somewhat resembles a Bingo cage). This is a simple game that works like this: place your bet (say, $1) on any number from 1 to 6. You then roll 3d6. For each die that your number shows up on, you get $1 in winnings (and you get to keep your original bet). If no dice show your number, the house takes your $1 and you get nothing. So, if you place on 1 and you roll triple 1s, you actually win $3. Intuitively, it seems like this is an even-odds game. Each die is individually a 1/6 chance of winning, so adding all three should give you a 3/6 chance of winning.
But of course, if you calculate that way youre adding when these are separate die-rolls, and remember, youre only allowed to add if youre talking about separate win conditions from the same die. You need to be multiplying something. When you count out all possible results (youll probably find it easier to do this in Excel than by hand since there are 216 results), it still looks at first like an even-odds game. But in reality, the
odds of winning are actually slightly in favor of the House; how much? In particular, on average, how much money do you expect to lose each time you play this game? All you have to do is add up the gains and losses for all 216 results, then divide by 216, so this should be simple but as youll see, there are a few traps you can fall into, which is why Im telling you right now that if you think its even-odds, youve got it wrong. Game #3: 5-Card Stud Poker When youve warmed up with the previous two exercises, lets try our hand at dependent probability by looking at a card game. In particular, lets assume Poker with a 52-card deck. Lets also assume a variant like 5-card Stud where each player is dealt 5 cards, and thats all they get. No ability to discard and draw, no common cards, just a straight-up you get 5 cards and thats what you get. A Royal Flush is the 10-J-Q-K-A in the same suit, and there are four suits, so there are four ways to get a Royal Flush. Calculate the probability that youll get one. One thing Ill warn you about here: remember that you can draw those five cards in any order. So you might draw an Ace first, or a Ten, or whatever. So the actual way youll be counting these, there are actually a lot more than 4 ways to get dealt a Royal Flush, if you consider the cards to be dealt sequentially! Game #4: IMF Lottery This fourth question is one that cant easily be solved through the methods weve talked about today, but you can simulate it pretty easily, either with programming or with some fudging around in Excel. So this is a way to practice your Monte Carlo technique. In a game I worked on that Ive mentioned before called Chron X, there was this really interesting card called IMF Lottery. Heres how it worked: youd put it into play. At the end of your turn, the game would roll a percentile, and there was a 10% chance it would leave play, and a random player would gain 5 of each resource type for every token on the card. The card didnt start with any tokens, but if it stayed around then at the start of each of your turns, it gained a token. So, there is a 10% chance youll put it into play, end your turn, and itll leave and no one gets anything. If that doesnt happen (90% chance), then there is a further 10% chance (actually 9% at this point, since its 10% of 90%) that on the very next turn, itll leave play and someone will get 5 resources. If it leaves play on the turn after that (10% of the remaining 81%, so 8.1% chance) someone gets 10 resources, then the next turn it would be 15, then 20, and so on. The question is, what is the expected value of the number of total resources youll get from this card when it finally leaves play? Normally, wed approach this by finding the probability of each outcome, and multiplying by the outcome. So there is a 10% chance you get 0 (0.1*0 = 0). Theres a 9% chance you get 5 resources (thats 9%*5 = 0.45 resources). Theres an 8.1% chance you get 10 resources (8.1%*10 = 0.81 resources total, expected value). And so on. And then we add all of these up. Now, you can quickly see a problem: there is always going to be a chance that it will not leave play, so this could conceivably stay in play without leaving forever, for an infinite number of turns, so theres no actual way to write out every single possibility. The techniques we learned today dont give us a way to deal with infinite recursion, so well have to fake it. If you know enough programming or scripting to feel comfortable doing this, write a program to simulate this card. 
You should have a while loop that initializes a variable to zero, rolls a random number, and 10% of the time it exits the loop. Otherwise it adds 5 to the variable, and iterates. When it finally breaks out of the loop, have it increment the total number of trials by 1, and the total number of resources by whatever the variable ended up as. Then, re-initialize the variable and try again. Run this a few thousand times. At the end, divide total number of resources by total number of trials, and thats your Monte Carlo expected value. Run the program a few times to see if the numbers youre getting are about the same; if theres still a lot of variation in your final numbers, increase the number of iterations in the outer loop until you start getting some consistency. And you can be pretty sure that whatever you come up with is going to be about right.
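If it helps to see that loop written out, here is one way it might look in C++ (a sketch of the structure described above, using the standard <random> library and an arbitrary number of trials); running it and making sense of the number it prints is still your job:

#include <iostream>
#include <random>

int main() {
    std::mt19937 rng(std::random_device{}());
    std::uniform_real_distribution<double> percentile(0.0, 1.0);

    const int trials = 100000;
    long long totalResources = 0;
    for (int t = 0; t < trials; t++) {
        int resources = 0;                 // what this copy of the card pays out when it leaves
        while (percentile(rng) >= 0.1)     // 10% chance each turn that the card leaves play
            resources += 5;                // it survived this turn, so the next payout is 5 higher
        totalResources += resources;
    }
    std::cout << "Average payout: " << (double)totalResources / trials << "\n";
    return 0;
}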

If you dont know programming (or even if you do), this is an excuse to exercise your Excel skills. You can never have enough Excel skills as a game designer. Here youll want to make good use of the IF and RAND statements. RAND takes no values, it just returns a random decimal number between 0 and 1. Usually we combine it with FLOOR and some plusses or minuses to simulate a die roll, as I mentioned earlier. In this case, though, we just have a 10% check for the card leaving play, so we can just check if RAND is less than 0.1 and not mess with this other stuff. IF takes in three values. In order: a condition thats either true or false, and then a value to return if its true, and then a value to return if its false. So the following statement will return 5 ten percent of the time, and 0 the other ninety percent of the time:

=IF(RAND()<0.1,5,0)

There are a lot of ways to set this up, but if I were doing it, Id use a formula like this for the cell that represents the first turn, lets say this is cell A1:

=IF(RAND()<0.1,0,-1)

Here Im using negative one as shorthand for this card hasnt left play and given out any resources yet. So if the first turn ended and the card left play right away, A1 would be zero; otherwise its -1. For the next cell, representing the second turn:

=IF(A1>-1, A1, IF(RAND()<0.1,5,-1))

So if the first turn ended and the card left play right away, A1 would be 0 (number of resources), and this cell would just copy that value. Otherwise A1 would be -1 (hasnt left play yet), and this cell proceeds to roll randomly again: 10% of the time it returns 5 (for the 5 resources), the rest of the time it is still -1. Continuing this formula for additional cells simulates additional turns; just remember to increase the payout value by 5 in each successive column (5 for the second turn, 10 for the third, 15 for the fourth, and so on) so it matches the growing number of tokens on the card. Whatever cell is at the end gives you a final result (or -1 if it never left play after all of the turns youre simulating; if you see any -1s left in that final column, either simulate more turns or throw those rows out so they dont drag down your average). Take this row of cells, which represents a single play of the card, and copy and paste for a few hundred (or a few thousand) rows. We might not be able to do an infinite test for Excel (there are only so many cells that fit in a spreadsheet), but we can at least cover the majority of cases. Then, have a single cell where you take the average of the results of all the rows (Excel helpfully provides the AVERAGE() function for this). In Windows, at least, you can hit F9 to reroll all your random numbers. As before, do that a few times and see if the values you get are similar to each other. If theres too much variety, double the number of trials and try again. Unsolved Problems If you happen to have a Ph.D. in Probability already and the problems above are too easy for you, here are two problems that Ive wondered about for years, but I dont have the math skills to solve them. If you happen to know how to do these, post as a comment; Id love to know how. Unsolved #1: IMF Lottery The first unsolved problem is the previous homework. I can do a Monte Carlo simulation (either in C++ or Excel) pretty easily and be confident of the answer for how many resources you get, but I dont actually know how to come up with a definitive, provable answer mathematically (since this is an infinite series). If you know how, post your math after doing your own Monte Carlo simulation to verify the answer, of course. Unsolved #2: Streaks of Face Cards

This problem, and again this is way beyond the scope of this blog post, is a problem I was posed by a fellow gamer over 10 years ago. They witnessed a curious thing while playing Blackjack in Vegas: out of an eight-deck shoe, they saw ten face cards in a row (a face card is 10, J, Q or K, so there are 16 of them in a standard 52-card deck, which means there are 128 of them in a 416-card shoe). What is the probability that there is at least one run of ten or more face cards, somewhere, in an eight-deck shoe? Assume a random, fair shuffle. (Or, if you prefer, what are the odds that there are no runs of ten or more face cards, anywhere in the sequence?) You can simplify this. Theres a string of 416 bits. Each bit is 0 or 1. There are 128 ones and 288 zeros scattered randomly throughout. How many ways are there to randomly interleave 128 ones and 288 zeros, and how many of those ways involve at least one clump of ten or more 1s?
Every time Ive sat down to solve this problem, it seems like it should be really easy and obvious at first, but then once I get into the details it suddenly falls apart and becomes impossible. So before you spout out a solution, really sit down to think about it and examine it, work out the actual numbers yourself, because every person Ive ever talked to about this (and this includes a few grad students in the field) has had that same its obvious no, wait, its not reaction. This is a case where I just dont have a technique for counting all of the numbers. I could certainly bruteforce it with a computer algorithm, but its the mathematical technique that Id find more interesting to know.
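I still dont have the exact math, but the brute-force estimate is at least easy to set up; here is a minimal C++ Monte Carlo sketch (shuffling the 416-bit string a hundred thousand times, trial count picked arbitrarily) that estimates the answer, even if it doesnt give the closed-form counting technique Im after:

#include <algorithm>
#include <iostream>
#include <random>
#include <vector>

int main() {
    std::mt19937 rng(std::random_device{}());

    // The 416-card shoe reduced to bits: 128 face cards (1) and 288 non-face cards (0).
    std::vector<int> shoe(416, 0);
    for (int i = 0; i < 128; i++)
        shoe[i] = 1;

    const int trials = 100000;
    int shoesWithRun = 0;
    for (int t = 0; t < trials; t++) {
        std::shuffle(shoe.begin(), shoe.end(), rng);   // one random, fair arrangement of the shoe
        int run = 0;
        bool found = false;
        for (int card : shoe) {
            run = (card == 1) ? run + 1 : 0;           // length of the current streak of face cards
            if (run >= 10) { found = true; break; }
        }
        if (found) shoesWithRun++;
    }
    std::cout << "Estimated probability of a 10+ face-card run: "
              << (double)shoesWithRun / trials << "\n";
    return 0;
}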

Level 5: Probability and Randomness Gone Horribly Wrong


Answers to Last Weeks Questions If you want to check your answers from last week: Dragon Die First off, note that the so-called Dragon Die is really just a 1d6+1 in disguise. If you think of the Dragon as a 7 since it always wins, the faces are 2-3-4-5-6-7, so really this is just asking how a +1 bonus to a 1d6 die roll affects your chance of rolling higher. It turns out the answer is: a lot more than most people think! If you write out all 36 possibilities of 2-7 versus 1-6, you find there are 21 ways to lose to the House (7-1, 7-2, 7-3, 7-4, 7-5, 7-6, 6-1, 6-2, 6-3, 6-4, 6-5, 5-1, 5-2, 5-3, 5-4, 4-1, 4-2, 4-3, 3-1, 3-2, 2-1), 5 ways to draw (6-6, 5-5, 4-4, 3-3, 2-2), and 10 ways to win (5-6, 4-5, 4-6, 3-4, 3-5, 3-6, 2-3, 2-4, 2-5, 2-6). We ignore draws, since a draw is just a re-roll, which we would keep repeating until we got a win or loss event, so only 31 results end in some kind of resolution. Of those, 21 are a loss and 10 are a win, so this game only gives a 10/31 chance of winning. In other words, you win slightly less than 1 time in 3. Chuck-a-luck Since there are 216 ways to get a result on 3d6, it is easier to do this in Excel than by hand. If you count it up, though, you will find that if you choose the number 1 to pay out, there are 75 ways to roll a single win (1-X-X where X is any of the five losing results, so there are 25 ways to do this; then X-1-X and X-X-1 for wins with the other two dice, counting up to 75 total). There are likewise 15 ways to roll a double win (1-1-X, 1-X-1, X-1-1, five each of three different ways = 15), and only one way to roll a triple win (1-1-1). Since all numbers are equally likely from 1 to 6, the odds are the same no matter which of the six numbers you choose to pay out. To get expected value, we multiply each of the 216 results by its likelihood; since all 216 die roll results are equally likely, we simply add all results together and divide by 216 to get the expected win or loss percentage. (75 ways to win * $1 winnings) + (15 ways for double win * $2) + (1 way for triple win * $3) = $108 in winnings. Since we play 216 times and 108 is half of 216, at first this appears to be evenodds. Not so fast! We still must count up all the results when we lose, and we lose more than 108 times. Out of the 216 ways to roll the dice, there are only 91 ways to win, but 125 ways to lose (the reason for the difference is that while doubles and triples are more valuable when they come up on your number, there are a lot more ways to roll doubles and triples on a number that isnt yours). Each of those 125 losses nets you a $1 loss. Adding everything up, we get an expected value of negative $17 out of 216 plays, or an expected loss of about 7.9 cents each time you play this game with a $1 bet. That may not sound like much (7.9 cents is literally pocket change), but keep in mind this is per dollar, so the House advantage of this game is actually 7.9%, one of the worst odds in the entire casino! Royal Flush In this game, youre drawing 5 cards from a 52-card deck, sequentially. The cards must be 10-J-QK-A but they can be in any order. The first card can be any of those 5 cards of any suit, so there are 20 cards total on your first draw that make you potentially eligible for a royal flush (20/52). For the second card, it must match the suit of the first card you drew, so there are only 4 cards out of the remaining 51 in the deck that you can draw (4/51). For the third card, there are only 3 cards out of the remaining 50 that you need for that royal flush (3/50). 
The fourth card has 2 cards out of 49, and
the final card must be 1 card out of 48. Multiplying all of these together, we get 480 / 311,875,200, or 1 in 649,740. If you want a decimal, this divides to 0.0000015 (or if you multiply by 100 to get a percentage, 0.00015%, a little more than one ten-thousandth of one percent). For most of us, this means seeing a natural Royal Flush from 5 cards is a once in a lifetime event, if that. IMF Lottery If you check the comments from last week, several folks found solutions for this that did not require a Monte Carlo simulation. The answer is 45 resources, i.e. the card will stay in play for an average of 10 turns. Since it has a 10% chance of leaving play each turn, this actually ends up being rather intuitive but, as weve seen from probability, most things are not intuitive. So the fact that this one problem ends up with an answer thats intuitive, is itself counterintuitive. This Weeks Topic Last week, I took a brief time-out from the rest of the class where weve been talking about how to balance a game, to draw a baseline for how much probability theory I think every designer needs to know, just so we can compute some basic odds. This week Im going to blow that all up by showing you two places where the true odds can go horribly wrong: human psychology, and computers. Human Psychology When I say psychology, this is something I touched on briefly last week: most people are really terrible at having an intuition for true odds. So even if we actually make the random elements of our games perfectly fair, which as well see is not always trivial, an awful lot of players will perceive the game as being unfair. Therefore, as game designers, we must be careful to understand not just true probability, but also how players perceive the probability in our games and how that differs from reality, so that we can take that into account when designing the play we want them to experience. Computers Most computers are deterministic machines; they are just ones and zeros, following deterministic algorithms to convert some ones and zeros into other ones and zeros. Yet somehow, we must get a nondeterministic value (a random number) from a deterministic system. This is done through some mathematical sleight-of-hand to produce what we call pseudorandom numbers: numbers that sorta kinda look random, even though in reality they arent. Understanding the difference between random and pseudorandom has important implications for video game designers, and even board game designers if they ever plan to make a computer version of their game (which happens often with hit board games), or if they plan to include electronic components in their board games that have any kind of randomness. But First Luck Versus Skill Before we get into psychology and computers, theres this implicit assumption that weve mostly ignored for the past week, thats worth discussing and challenging. The assumption is that adding some randomness to a game can be good, but too much randomness is a bad thing and maybe we have some sense that all games fall along some kind of continuum between two extremes of 100% skill-based (like Chess or Go) and 100% luck-based (like Chutes & Ladders or Candyland). If these are the only games we look at, we might go so far as to think of a corresponding split between casual and hardcore: the more luck in a game, the more casual the audience; the more a games outcome relies on skill, the more we think of the game as hardcore. This is not always the case, however. 
For example, Tic-Tac-Toe has no randomness at all, but we don't normally think of it as a game that requires a lot of skill. Meanwhile, each hand of Poker is highly random, but we still think of it as a game where skill dominates. Blackjack is also random, yet aside from counting cards, we see it as more a game of chance than Poker.

Then we get into physical contests like professional sports. On the one hand, we see these as games of skill. Yet enthusiasts track all kinds of statistics on players and games, we talk about the percentage chance of a player making a goal or missing a shot, and sports gamblers make cash bets on the outcomes of games, as if these were not games of skill but games of chance. What's going on here? There are a few explanations.
Poker vs. Blackjack
Why the difference in how we perceive Poker and Blackjack? The difference is in when the player makes their bet, and what kind of influence the player's choices have on the outcome of the game. In Poker, a successful player computes the odds to come up with a probability calculation that they have the winning hand, and they factor that into their bet along with their perceived reactions of their opponents. As more cards are revealed, the player adjusts their strategy. The player's understanding of the odds and their ability to react to changes has a direct relation to their performance in the game. In Blackjack, by contrast, you place a bet at the beginning of the hand before you know what cards you're dealt, and you generally don't have the option to raise or fold as you see more cards revealed. Blackjack does have some skill, but it's a very different kind of skill than Poker. Knowing when to hit or stand or split or double down based on your total, the dealer's showing card, and (if you're counting cards) the remaining percentage of high cards in the deck: these things are skill in the same sense as skill at Pac-Man. You are memorizing and following a deterministic pattern, but you are not making any particularly interesting decisions. You simply place your bet according to an algorithm, and expect that over time you do as well as you can, given the shuffle of the cards. It's the same reason we don't think of the casino as having a lot of skill at Craps or Roulette, just because it wins more than it loses.
Professional Sports
What about the sports problem, where a clearly skill-based game seems like it involves random die-rolls? The reason for the paradox is that it all depends on your frame of reference. If you are a spectator, by definition you have no control over the outcome of a game; as far as you're concerned the outcome is a random event. If you are actually a player on a sports team, the game is won or lost partly by your level of skill. This is why on the one hand sports athletes get paid based on how much they win (it's not a gamble for them), but sports fans can still gamble on the random (from their perspective) outcome of the game.
Action Games
Randomness works a little differently in action-based video games (like typical First-Person Shooter games), where the players are using skill in their movement and aiming to shoot their opponents and avoid getting shot. We think of these as skill-based games, and in fact these games seem largely incompatible with randomness. There is enough chaos in the system without random die-rolls: if I'm shooting at a moving target, I'll either hit or miss, based on a number of factors that are difficult to have complete control over. Now, suppose instead the designer thought they'd be clever and add a small perturbation to bullet fire from a cheap pistol, to make it less accurate. You line up someone in your sights, pull the trigger and miss, because the game decided to randomly make the bullet fly too far to the left. How would the player react to that? Well, they might not notice.
The game is going so fast, you're running, they're running, you squeeze off a few shots, you miss, and you figure you must have just not been as accurate as you thought (or that they dodged well). But if they're standing still, you sneak up behind them, you're sure you have the perfect shot and you still miss, you'll feel like the game just robbed you; that's not fun and it doesn't make the game more interesting, it just makes you feel like you're being punished
arbitrarily for being a good shot. Does that mean luck plays no role in action games? I think you can increase the luck factor to even the playing field, but you have to be very careful about how you do it. Here's a common example of how a lot of FPSs increase the amount of luck in the game: headshots. The idea here is that if you shoot someone in the head rather than the rest of the body, it's an instant kill, or something like that. Now, you might be thinking: wait, isn't that a skill thing? You're being rewarded for accuracy, hitting a small target and getting bonus damage because you're just that good, right? In some cases that is true, depending on the game, but in a lot of games (especially older ones) that kind of accuracy just really isn't possible in most situations; you're moving, they're moving, the head has a tiny hit box, and the gun doesn't let you zoom in enough at a distance to be really sure you aren't off by one or two pixels in any direction. So from a distance, at least, a headshot isn't something that most players can plan on. Sometimes it'll happen anyway, just by accident, if you're shooting in the right general direction; so sometimes, through no fault of the player, they'll get a headshot. This evens the playing field slightly; without headshots, if it takes many shots to score a frag, the more skilled players will almost always win because they are better at dodging, circle-strafing, and any number of other techniques that allow them to outmaneuver and outplay a weaker player. With headshots, the weaker player will occasionally just get an automatic kill by accident, so it makes it more likely that the weaker player will see some successes, which is usually what you want as a designer.
Shifting Between Luck and Skill
As we just saw with action games, adding more of a luck factor to the game is tricky but still possible, by creating these unlikely (but still possible to perform by accident) events. With slower, more strategic games, adding luck (or skill) is more straightforward. To increase the level of luck in the game:
- Change some player decisions into random events.
- Reduce the number of random events in the game (this way the Law of Large Numbers doesn't kick in as much, thus the randomness is less likely to be evenly distributed).
- Increase the impact that random events have on the game state.
- Increase the range of randomness, such as changing a 1d6 roll to a 1d20 roll (a short simulation below illustrates this last point).
If you want to increase the level of skill in the game, do the reverse of any or all of the above. What is the best mix of luck and skill for any given game? That depends mostly on your target audience. Very young children may not be able to handle lots of choices and risk/reward or short/long-term tradeoffs, but they can handle throwing a die or spinning a spinner and following directions. Competitive adults and core gamers often prefer games that skew more to the skill side of things so they can exercise their dominance, but (except in extreme cases) not so much that they feel like they can't ever win against a stronger opponent. Casual gamers may see the game as a social experience more than a chance to exercise the strategic parts of their brains, so they actually prefer to think less and make fewer decisions so they can devote more of their limited brainpower to chatting with their friends.
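To make that last point concrete, here is a small Monte Carlo sketch in Python. The +2 skill bonus, the die sizes, and the tie-reroll rule are illustrative assumptions rather than numbers from any particular game; the point is only that the same flat bonus is worth much less on a wide die than on a narrow one.

```python
import random

def upset_rate(sides, skill_bonus=2, trials=100_000):
    """Estimate how often an unskilled player beats a player with a flat bonus
    when both roll the same die, higher total wins, and ties are rerolled."""
    upsets = 0
    decided = 0
    for _ in range(trials):
        strong = random.randint(1, sides) + skill_bonus
        weak = random.randint(1, sides)
        if strong == weak:
            continue  # treat ties as rerolls
        decided += 1
        if weak > strong:
            upsets += 1
    return upsets / decided

for sides in (6, 20):
    print(f"1d{sides}+2 vs 1d{sides}: weaker player wins {upset_rate(sides):.1%}")
```

With these numbers the weaker player wins roughly 19% of decided contests on a d6 but around 40% on a d20; widening the random range drowns out the same skill edge.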
There is no single right answer here for all games; recognize that a certain mix of luck and skill is best for your individual game, and part of your job as a game designer is to listen to your game to find out where it needs to be. Sometimes that means adding more randomness, sometimes it means removing it, and sometimes it means keeping it there but changing the nature of the randomness. The tools we gained from last week should give you enough skills to be able to assess the nature of randomness in your game, at least in part, and make appropriate changes. And Now Human Psychology As we saw last week, human intuition is generally terrible when it comes to odds estimation. You
might have noticed this with Dragon Die or Chuck-a-Luck; intuitively, both games seem to give better odds of winning than they actually do. Many people also have a flawed understanding of how probability works, as we saw last week with the gambler's fallacy (expecting that previous independent events like die-rolls have the power to influence future ones). Let's dig into these errors of thought, and their implications for gamers and game designers.
Selection Bias
When asked to do an intuitive odds estimation, where does our intuition come from? The first heuristic most people use is memory recall: how easy is it to recall different events? The easier it is to remember many examples, the more likely or probable we assume that event is. This usually gives pretty good results; if you're rolling a weighted die a few hundred times and seem to remember the number 4 coming up more often, you'll probably have a decent intuition for how often it actually comes up. As you might guess, this kind of intuition will fail whenever it's easier to recall a rare event than a common one. Why would it be easier to recall rare events than common events? For one thing, rare events that are sufficiently powerful tend to stick in our minds (I bet you can remember exactly where you were when the planes struck the towers). We also see rare events reported more often than common ones due to media coverage. For example, many people are more afraid of dying in a plane crash than dying in an auto accident, even though an auto fatality is far more likely. There are a few reasons for this, but one is that any time a plane crashes anywhere, it's international news; car crashes, by contrast, are so common that they're not reported, so it is much easier to remember a lot of plane crashes than a lot of car crashes. Another example is the lottery; lotto winners are highly publicized while the millions of losers are not, leading us to assume that winning is more probable than it actually is. What does any of this have to do with games? For one thing, we tend to remember our epic wins much more easily than we remember our humiliating losses (another trick our brains play on us just to make life more bearable). People tend to assume they're above average at most things, so absent actual hard statistics, players will tend to overestimate their own win percentage and skill. This is dangerous in games where players can set their difficulty level or choose their opponents. In general, we want a player to succeed a certain percentage of the time, and tune the difficulty of our games accordingly; if a player chooses a difficulty that's too hard for them, they'll struggle more and be more likely to give up in frustration. By being aware of this tendency, we can try (for example) to force players into a good match for their actual skills through automated matchmaking, dynamic difficulty adjustment, or other tricks.
Self-Serving Bias
There's a certain point where an event is unlikely but still possible, and players will assume it is much less likely than it actually is (effectively impossible). In Sid Meier's GDC keynote this year, he placed this from experience at somewhere around 3:1 or 4:1; that is, if the player had a 75 to 80% chance of winning or greater, and they did win exactly that percentage of the time, it would feel wrong to them, like they were losing more than they should. His playtesters expected to win nearly all of the time (I'd guess around 95% of the time) if the screen displayed a 75% or 80% chance.
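One response to this mismatch, which comes up again below under Skewing the Odds, is to show the player the honest-sounding number but quietly resolve the check against a more generous one, and to soften losing streaks. Here is a minimal sketch of that idea; the padding values, the class name, and the streak rule are illustrative assumptions, not the implementation of any particular game.

```python
import random

class ForgivingRoller:
    """Resolve checks against padded odds, with extra 'pity' after failures.
    (The specific numbers here are assumed purely for illustration.)"""
    def __init__(self, padding=0.15, pity_per_failure=0.05):
        self.padding = padding                    # flat boost over the displayed odds
        self.pity_per_failure = pity_per_failure  # extra boost per consecutive failure
        self.consecutive_failures = 0

    def roll(self, displayed_chance):
        actual = min(1.0, displayed_chance + self.padding
                     + self.pity_per_failure * self.consecutive_failures)
        if random.random() < actual:
            self.consecutive_failures = 0
            return True
        self.consecutive_failures += 1
        return False

roller = ForgivingRoller()
results = [roller.roll(0.75) for _ in range(100_000)]
print(sum(results) / len(results))  # prints roughly 0.9, not 0.75
```

With these particular numbers, a check displayed as 75% actually succeeds about 90% of the time and can never fail three times in a row, which is much closer to how those playtesters expected a 75% chance to feel.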
Players also have a self-serving bias, that probably ties into what I said before about how everyone thinks theyre above-average. So while players are not okay with losing a quarter of the time when they have a 75% win advantage, they are perfectly okay winning a quarter of the time if they are at a 1:3 disadvantage. Attribution Bias In general, players are much more likely to accept a random reward than a random setback or punishment. And interestingly, they interpret these random events very differently. With a random reward, players have a tendency to internalize the event, to believe that they earned
the reward through superior decision-making and strategy in play. Sure, maybe it was a lucky die roll, but they were the ones who chose to make the choices that led to the die roll, and their calculated risk paid off, so clearly this was a good decision on their part. With a random setback, players tend to externalize the event; they blame the dice or cards, they say they were just unlucky. If it happens too much, they might go so far as to say that they don't like the game because it's unfair. If they're emotionally invested enough in the game, such as a high-stakes gambling game, they might even accuse other players of cheating! With video games the logic and random-number generation are hidden, so we see some even stranger player behavior. Some players will actually believe that the AI is peeking at the game data or altering the numbers behind their back and cheating on purpose, because after all it's the computer, so it can theoretically do that. Basically, people handle losing very differently from winning, in games and in life.
Anchoring
Another way people get odds wrong is the phenomenon called anchoring. The idea is, whatever the first number people see, they latch onto it and overvalue it. For example, if you go to a casino and look at any random slot machine, probably the biggest, most attention-grabbing thing on there is the number of coins you can win with a jackpot. People look at that and concentrate on it, and it gives them the idea that their chance of winning is much bigger than it actually is. Sid Meier mentioned a curious aspect of this during his keynote. Playtesters (the same ones that were perfectly happy losing a third of the time when they had a 2:1 advantage, just like they were supposed to) would feel the game was unfair if they lost a third of the time when they had a 20:10 advantage. Why? Because the first number they see is that 20, which seems like a big number, so they feel like they've got a lot of power there; it feels a lot bigger than 10, so they feel like they should have an overwhelming advantage. (Naturally, if they have a 10:20 disadvantage, they are perfectly happy to accept one win in three.) It also means that a player who has, say, a small amount of base damage and then a bunch of bonuses may underestimate how much damage they do.
The Gambler's Fallacy
Now we return to the gambler's fallacy, which is that people expect random numbers to look random. Long streaks make people nervous and make them question whether the numbers are actually random. One statistic I found in the literature is that if you ask a person to generate a random list of coin flips from their head, the list tends to not be very random. Specifically, if a person's previous item was Heads, they have a 60% chance of picking Tails for the next flip, and vice versa (this assumes you're simply asking them to say Heads or Tails off the top of their head when instructed to come up with a random result, not that they're actually flipping a real coin). In a string of merely 10 coin flips, it is actually pretty likely you'll get 4 Heads or 4 Tails in a row (the probability that any given run of 4 flips comes up all the same is 1 in 8, and a string of 10 flips contains several such runs), but if you ask someone to give you ten random numbers that are either 0 or 1 from their head, they will probably not give you even 3 in a row. This can lead players astray in less obvious ways. Here's an example, also from Sid's keynote. Remember how players feel like a 3:1 advantage means they'll almost always win, but they're okay with losing a 2:1 contest about a third of the time?
It turns out that if they lose two 2:1 contests in a row, this will feel wrong to a lot of people; they dont expect unlikely events to happen multiple times in a row, even though by the laws of probability they should. Heres another example, to show you why as game designers we need to be keenly aware of this. Suppose you design a video game that involves a series of fair coin flips as part of its core mechanics (maybe you use this to determine who goes first each turn, or something). Probability
tells you that in 1 out of every 32 plays, the first six coin flips will be exactly the same result. If a player sees this as their first introduction to your game, they may perceive this event as so unlikely that the random number generator in the game must be busted somehow. If the coins come up in their favor, they won't complain; but when regression to the mean kicks in and they start losing half the time like they're supposed to, they'll start to feel like the game is cheating, and it will take them a while to un-learn their (incorrect) first impression that they are more likely to win a flip. Worse, if the player sees six losses in a row, right from the beginning, you can bet that player is probably going to think your game is unfair. To see how much of a problem this can potentially become, suppose your game is a modest hit that sells 3.2 million units. In that case, one hundred thousand players are going to experience a 6-long streak as their first experience on their first game. That's a lot of players that are going to think your game is unfair! The Gambler's Fallacy is something we can exploit as gamers. People assume that long streaks do not appear random, so when trying to play randomly they will actually change values more often than not. Against a non-championship opponent, you can win more than half the time at Rock-Paper-Scissors by knowing this. Insist on playing to best 3 of 5, or 4 of 7, or something. Since you know your opponent is unlikely to repeat their last throw, on subsequent rounds you should throw whatever would have lost to your opponent's last throw; because your opponent probably won't do the same thing twice, you probably won't lose (as long as they do switch, the worst you can do is draw).
The Hot-Hand Fallacy
There's a variant of the Gambler's Fallacy that mostly applies to sports and other action games. The Hot-Hand Fallacy is so called because in the sport of Basketball, fans started getting this idea that if a player made two or three baskets in a row, they were running hot and more likely to score additional baskets and not miss. (We even see this in sports games like NBA Jam, where becoming "on fire" is actually a mechanic that gives the player a speed and accuracy advantage and some cool effects like making the basket explode in a nuclear fireball.) When probability theorists looked at this, their first reaction was that each shot is an independent event, like rolling dice, so there's no reason why previous baskets should influence future ones at all. They expected that a player would be exactly as likely to make a basket, regardless of what happened in the player's previous attempts. Not so fast, said Basketball fans. Who says they're completely independent events? Psychology plays a role in sports performance. Maybe the player has more confidence after making a few successful shots, and that causes them to play better. Maybe the fans cheering them on gives them a little extra mental energy. Maybe the previous baskets are a sign that the player is hyper-focused on the game and in a really solid flow state, making it more likely they'll continue to perform well. Who knows? Fair enough, said the probability theorists, so they looked at actual statistics from a bunch of games to see if previous baskets carried any predictive value for future performance. As it turned out, both the theorists and the sports fans were wrong.
If a player made several baskets in a row, it slightly increased their chance of missing next time; the longer the streak, the greater the chance of a miss (relative to what would be expected by random chance). Why? I don't think we know for sure, but presumably there is some kind of negative psychological effect. Maybe the player got tired. Maybe the other team felt that player was more of a threat, and played a more aggressive defense when that player had the ball. Maybe the crowd's cheering broke the player's flow state, or maybe the player got overconfident and started taking more unnecessary risks. Whatever the case, this is something that works against us when players can build up a win streak in our games, especially if we tie that to social rewards, such as achievements, trophies, or leaderboards that give special attention to players with long streaks. Why is this dangerous? Because at best, even if each game is truly an independent random event, we know that a win streak is anomalous. If a player's performance overall falls on some kind of probability curve (usually a
bell curve) and they happen to achieve uncharacteristically high performance in a single game or play session or whatever, odds are their next game will fall lower on the curve. The player is probably erroneously thinking their skills have greatly improved; when they start losing again, they'll feel frustrated, because they know they can do better. Thus, when the streak inevitably comes to an end, the whole thing is tainted in the player's mind. It's as if the designer has deliberately built a system that automatically punishes the player after every reward.
Houston, We Have a Problem
To sum up, here are the problems we face as designers, when players encounter our random systems:
- Selection bias: improbable but memorable events are perceived as more likely than they actually are.
- Self-serving bias: an unlikely loss is interpreted as a nearly impossible loss when the odds are in the player's favor. However, an unlikely win is still correctly interpreted as an unlikely but possible win when the odds are against the player.
- Attribution bias: a positive random result is assumed to be due to player skill; a negative random result is assumed to be bad luck (or worse, cheating).
- Anchoring: players overvalue the first or biggest number seen.
- Gambler's fallacy: assuming that a string of identical results reduces the chance that the string will continue.
- Hot-hand fallacy: assuming that a string of identical results increases the chance that the string will continue.
The lesson here is that if you expose the actual probabilities of the game to your players, and your game produces fair, random numbers, players will complain because, according to them and their flawed understanding of probability, the game feels wrong. As designers, what do we do about this? We can complain to each other about how all our stupid players are bad at math. But is there anything we can do to take advantage of this knowledge, that will let us make better games?
When Designers Turn Evil
One way to react to this knowledge is to exploit it in order to extract large sums of money from people. Game designers who turn to the Dark Side of the Force tend to go into the gambling industry, marketing and advertising, or political strategy. (I say this with apologies to any honest designers who happen to work in these industries.)
Gambling
Lotteries and casinos regularly take advantage of selection bias by publicizing their winners, making it seem to people like winning is more likely than it really is. Another thing they can do if they're more dishonest is to rig their machines to give a close-but-not-quite result more often than would be predicted by random chance, such as having two Bars and then a blank come up on a slot machine, or having four out of five cards for a royal flush come up in a video poker game. These give the players a false impression that they're closer to winning more often than they actually are, increasing their excitement and anticipation of hitting a jackpot and making it more likely they'll continue to play.
Marketing and Advertising
Marketers use the principle of anchoring all the time to change our expectations of price. For example, your local grocery store probably puts big discount stickers all over the place to call your attention to the lower prices on select items, and our brains will assume that the other items around them are also less expensive by comparison, even if they're actually not. Another example of anchoring is a car dealership that might put two nearly identical models next to
each other, one with a really big sticker price and one with a smaller (but still big) price. Shoppers see the first big price and anchor to that, then they see the smaller price and feel like by comparison theyre getting a really good deal even though theyre actually getting ripped off. Political Strategy There are a ton of tricks politicians can use to win votes. A common one these days is to play up peoples fears of vastly unlikely but well-publicized events like terrorism or hurricanes, and make campaign promises that theyll protect you and keep your family safe. Odds are, theyre right, because the events are so unlikely that they probably wont happen again anyway. Scam Artists The really evil game designers use their knowledge of psychology to do things that are highly effective, and highly illegal. One scam Ive heard involves writing a large number of people offering your investment advice and telling them to watch a certain penny stock between one day and the next. Half of the letters predict the stock will go up, the other half say itll go down. Then, whatever actually happens, you take that half where you guessed right, and make another prediction. Repeat that four or five times, and eventually you get down to a handful of people for whom youve predicted every single thing right. And those people figure theres no way that could just be random chance, you must have a working system, and they give you tons of money, and you skip town and move to Fiji or something. What About Good Game Designers? All of that is fine and good for those of us who want to be scam artists (or for those of us who dont want to get scammed). But what about those of us who still want to, you know, make games for entertainment? We must remember that were crafting a player experience. If we want that experience to be a positive one, we need to take into account that our players will not intuitively understand the probabilities expressed in our game, and modify our designs accordingly. Skewing the Odds One way to do this is to tell our players one thing, and actually do something else. If we tell the player they have a 75% chance of winning, under the hood we can actually roll it as if it were a 95% chance. If the player gets a failure, we can make the next failure less likely, cumulatively; this makes long streaks of losses unlikely, and super-long streaks impossible. Random Events We can use random events with great care, especially those with major game-changing effects, and especially those that are not in the players favor. In general, we can avoid hosing the player to a great degree from a single random event; otherwise, the player may think that they did something wrong (and unsuccessfully try to figure out what), or they may feel annoyed that their strategy was torn down by a single bad die-roll and not want to play anymore. Being very clear about why the bad event happened (and what could be done to prevent it in future plays) helps to keep the player feeling in control. Countering the Hot Hand To counter the hot-hand problem (where a streak of wins makes it more likely that a player will screw up), one thing to do is to downplay the importance of streaks in our games, so that the players dont notice when theyre on a streak in the first place (and therefore wont notice, or feel as bad, when the streak ends). 
If we do include a streak mechanism, one thing we can do is embed it in a positive feedback loop, giving the player a gameplay advantage to counteract the greater chance of a miss after a string of hits. For example, in Modern Warfare 2, players get certain bonuses if they continue a kill
streak (killing multiple enemies without being killed themselves), including better weapons, air support, and even a nuclear strike. With each bonus, its more likely their streak will continue because they are now more powerful. Summary In short, we know that players have a flawed understanding of probability. If we understand the nature of these flaws, we can change our games behavior to conform to player expectations. This will make the game feel more fun and more fair to the players. This was, essentially, one of the big takeaways from Sid Meiers GDC keynote this year. A Question of Professional Ethics Now, this doesnt sit well with everyone. Post-GDC, there were at least a few people that said: wait, isnt that dishonest? As game designers, we teach our players through the games designed systems. If we bend the rules of probability in our games to reinforce players flawed understanding of how probability works, are we not doing a disservice to our players? Are we not taking something that we already know is wrong and teaching it to players as if it were right? One objection might be that if players arent having fun (regardless of whether that comes from our poor design, or their poor understanding of math), then the designer must design in favor of the player especially if they are beholden to a developer and publisher that expect to see big sales. But is this necessarily valid? Poker, for example, is incredibly popular and profitable even though it will mercilessly punish any player that dares harbor any flaws in how they think about probability. To this day, I think it is still an open question, and something each individual designer must decide for themselves, based on their personal values and the specific games they are designing. Take a moment to think about these questions yourself: Are the probabilities in our games something that we can frame as a question of professional ethics? How do we balance the importance of giving the player enjoyment (especially when they are paying for it), versus giving them an accurate representation of reality? Whats more important: how the game actually works, or how players perceive it as working? One More Solution I can offer one other solution that works in some specific situations, and thats to expose not just the stated probabilities to the player, but the actual results as well. For example, if you ask players to estimate their win percentage of a game when its not tracked by the game, theyll probably estimate higher than the real value. If their wins, losses and percentage are displayed to them every time they go to start a game, they have a much more accurate view of their actual skill. What if you have a random number generator? See if this sounds familiar to you: youre playing a game of Tetris and you become convinced at some point that the game is out to get you. You never seem to get one of those long, straight pieces until just after you need it, right? My favorite version of Tetris to this day, the arcade version, had a neat solution to this: in a single-player game, your game took up the left half of the screen, and it used the right half to keep track of how many of each type of brick fell down. So if you felt like you were getting screwed, you could look over there and see if the game was really screwing you, or if it was just your imagination. 
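Offering the same reassurance in your own game takes very little bookkeeping. Here is a rough sketch; the uniform piece generator and the one-line text read-out are assumptions for illustration (the arcade game's real code obviously is not reproduced here).

```python
import random
from collections import Counter

PIECES = ["I", "O", "T", "S", "Z", "J", "L"]  # the seven standard tetrominoes

class PieceFeed:
    """Deal pieces uniformly at random and keep running totals that the
    game could draw on screen, like the arcade counter described above."""
    def __init__(self):
        self.counts = Counter()

    def next_piece(self):
        piece = random.choice(PIECES)
        self.counts[piece] += 1
        return piece

    def stats_line(self):
        return "  ".join(f"{p}:{self.counts[p]}" for p in PIECES)

feed = PieceFeed()
for _ in range(300):
    feed.next_piece()
print(feed.stats_line())   # e.g. I:45  O:39  T:41 ... (visibly even over time)
```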
And if you kept an eye on those over time, youd see that yes, occasionally you might get a little more of one piece than another on average over the course of a single level, but over time it would balance out, and most of the time youd get about as much of each brick as any other. The game was fair, and it could prove it with cold, hard facts displayed to the player in real time. How could this work in other games? In a Poker video game against AI opponents, you could let the player know after each hand if they actually held the winning hand or not, and keep ongoing
track of their percentage of winning hands, so that they know the deck shuffling is fair. (This might be more controversial with human opponents, as it gives you some knowledge of your opponents bluffing patterns.) If youre making a version of the board game RISK that has simulated die rolls in it, have the game keep track of how frequently each number or combination is rolled, and let the player access those statistics at any time. And so on. These kinds of things are surprisingly reassuring to a player who can never know for sure if the randomness inside the computer is fair or not. When Randomness Isnt Random This brings us to the distinction of numbers that are random versus numbers that are pseudorandom. Pseudorandom literally means fake random and in this case it means its not actually entirely random, it just looks that way. Now, most of the things we use even in physical games for randomness are not perfectly random. Balls with numbers painted on them in a hopper thats used for Bingo might be slightly more or less likely to come up because the paint gives them slightly different weights. 6-sided dice with recessed dots may be very slightly weighted towards one side or another since theyve actually got matter thats missing from them, so their center of gravity is a little off. Also, a lot of 6-sided dice have curved edges, and if those curves are slightly different then itll be slightly more likely to keep rolling when it hits certain faces, and thus a little more or less likely to land on certain numbers. 20sided dice can be slightly oblong rather than perfectly round (due to how theyre manufactured), making it slightly less likely to roll the numbers on the edges. All this is without considering dice that are deliberately loaded, or the person throwing the dice has practiced how to throw what they want! What about cards? All kinds of studies have shown that the way we shuffle a deck of cards in a typical shuffle is not random, and in fact if you shuffle a certain way for a specific number of times (I forget the exact number but its not very large), the deck will return almost exactly to its original state; card magicians use this to make it look like theyre shuffling a deck when theyre actually stacking it. Even an honest shuffle isnt perfectly random; if you think about it, for example, depending on how you shuffle either the top or bottom card will probably stay in the same position after a single riffle-shuffle, so you have to shuffle a certain number of times before the deck is sufficiently randomized. Even without stacking the deck deliberately, the point is that all shuffles are not equally likely. In Las Vegas, they have to be very careful about these things, which is why youll notice that casino dice are a lot different from those white-colored black-dotted d6s you have at home. One slightly unfair die can cost the casino its gambling license (or cost them money from sufficiently skilled players who know how to throw dice unfairly), which is worth billions of dollars just from their honest, long-term House advantage, which is why they want to be very sure that their dice are as close to perfectly random as possible. Shuffling cards is another thing casinos have to be careful of. If the cards are shuffled manually, you can run into a few problems with randomness (aside from complaints of carpal tunnel from your dealers). 
A dealer who doesnt shuffle enough to sufficiently randomize the deck because theyre trying to deal more hands so they get lazy with the shuffling can be exploited by a player who knows theres a better-than-average chance of some cards following others in sequence. And thats not even counting dealers who collude with players to do this intentionally. Casinos have largely moved to automated shufflers to deal with these problems, but those have problems of their own; for example, mechanical shuffles are potentially less random than human shuffles, so a careful gambler can use a hidden camera to analyze the machine and figure out which cards are more likely to clump together from a fresh deck, giving themselves an advantage. These days, some of the latest automated shufflers dont riffle-shuffle, they actually stack the deck according to a randomized computer algorithm, but as well see shortly, even those algorithms can
have problems, and those problems can cost the casinos a lot of money if they're not careful. The point here is that even the events in physical games that we think of as random aren't always as random as we give them credit for. There's not necessarily a lot we can do about this, mind you, at least not without going to great expense to get super-high-quality game components. Paying through the nose for machine-shopped dice is a bit much for most of us who just want a casual game of Catan, so as gamers we have to accept that our games aren't always completely random and fair; but they're close enough, and they affect all players equally, so we disregard it.
Pseudorandom Numbers
Computers have similar problems because, as I mentioned in the introduction, a computer doesn't have any kind of randomness inside it. It's all ones and zeros, high and low voltage going through wires or being stored electromagnetically in memory or on a disk somewhere; it's completely deterministic. And unless you're willing to get some kind of special hardware that measures some varying physical phenomenon (like a Geiger counter tracking the directions of released radiation), which most of us aren't, your computer is pretty much stuck with this problem: you have to use a deterministic machine to play a non-deterministic game of chance. We do this through a little bit of mathematics that I won't cover here (you can do a Google search for pseudorandom number algorithms on your own if you care). All you need to know is that there are some math functions that behave very erratically, without an apparent pattern, and so you just take one of the results from that function and call it a random number. How do you know which result to take from the function? Well, you determine that randomly. Just kidding; as we said, you can't really do that. So instead you have to tell the computer which one to take, and then it'll start with that one, and the next time you need a random number it'll take the next one in sequence, then the next, and so on. But because we told it where to start, this is no longer actually random, even though it might look that way to a casual player. The number you tell the computer to start with is called a random number seed, and once you give the computer one seed, it just starts picking random numbers with its formula sequentially from there. So you only have to seed it once. But this is important: if you give it the same seed, you'll get exactly the same results. Remember, it's deterministic! Usually we get around this by picking a random number seed that's hard for a player to intentionally replicate, like the number of milliseconds that have elapsed since midnight or something. You have to choose carefully, though. If, for example, you pick a random number seed that's just the fractional milliseconds in the system clock (from 0 to 999), then really your game only has 1000 ways of shuffling, which is enough that over repeated play a player might see two games that seem suspiciously identical. If your game is meant to be played competitively, a sufficiently determined player could study your game and reach a point where they could predict which random numbers happen and when, and use that to gain an unfair advantage. So we have to be careful when creating these systems.
Pseudorandomness in Online Games: Keeping Clients in Sync
You have to be extra careful with random numbers in an online game, if your players' machines are generating their own random numbers.
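To see why a shared seed keeps two machines agreeing, and how easily they fall out of agreement, here is a tiny sketch using Python's random module; the seed value and the 1d20 rolls are arbitrary stand-ins for whatever your game actually rolls.

```python
import random

# Both clients build their generator from the same seed (in a real game the seed
# would be chosen by one side and sent over the network at match start; the value
# here is arbitrary).
SEED = 123456789
client_a = random.Random(SEED)
client_b = random.Random(SEED)

# As long as both clients consume rolls in lockstep, they agree perfectly:
print([client_a.randint(1, 20) for _ in range(3)])
print([client_b.randint(1, 20) for _ in range(3)])   # identical list

# But if one client ever makes a roll the other never hears about (the kind of
# bug described below), the two streams diverge from that point on:
client_a.randint(1, 20)                               # "forgotten" roll on A only
print(client_a.randint(1, 20), client_b.randint(1, 20))  # usually no longer equal
```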
Ive worked on games at two companies that were architected that way, for better or worse. What could happen was this: you would of course have one player (or the server) generate a random number seed, and that seed would be used for both players in a head-to-head game. Then, when either player needed a random number, both machines would have to generate that number so that their random number seed was kept in sync. Occasionally due to a bug, one player might generate a random number and forget to inform their opponent, and now their random number generators are out of sync. The game continues for a few turns until suddenly, one player takes an action that requires a random element, their machine rolls a success, the opponents machine (with a different random number) rolls a failure, the two clients
compare checksums and fail because they now have a different game state, and both players become convinced the other guy is trying to cheat. Oops. (A better way to do this for PC games is to put all the game logic on the server and use thin clients; for networked handheld console or phone games where there is no server and its direct-connect, designate one players device to handle all these things and broadcast the game state to the other players.) Pseudorandomness in Single-Player Games: Saving and Loading You also have to be careful with pseudorandom numbers in a single-player game, because of the potential for exploits. This is a largely unsolved problem in game design. You cant really win here, but you can at least pick your poison. Save Anywhere Suppose you have a game where the player can save anywhere, any time. Many games do this, because it is convenient for the player. However, nothing stops the player from saving just before they have to make a big random roll, maybe something where theyre highly unlikely to succeed but theres a big payoff if they do, and keep reloading from save until they succeed. If you regenerate your random number seed each time they reload from save, they will eventually succeed, and theyre not really playing the game that you designed at that point but on the other hand, theyre using the systems you designed, so theyre not really cheating either. Your carefully balanced probabilities suddenly become unbalanced when a player can keep rerolling until they win. Save Anywhere, Saved Seed Okay, so you say, lets fix that: what if we save the random number seed in the saved game file? Then, if you try to save and reload, youll get the same result every time! First, that doesnt eliminate the problem, it just makes it a little harder; the player just has to find one other random thing to do, like maybe drinking a potion that restores a random number of HP, or maybe choosing their combat actions in a different order or something, and keep trying until they find a combination of actions that works. Second, youve now created a new problem: after the player saves, they know exactly what the enemy AI will do on every turn, because once you start with the same random number seed the game now becomes fully deterministic! Sometimes this foreknowledge of exactly how an enemy will act in advance is even more powerful than being able to indefinitely reroll. Save Points So you say, okay, lets limit where the player can save, so that theyll have to go through some nontrivial amount of gameplay between saves. Now they can theoretically exploit the save system, but in reality they have to redo too much to be able to fully optimize every last action. And then we run into a problem with the opposite type of player: while this mechanism quells the cheating, the honest players now complain that your save system wont let them walk away when they want, that the game holds them hostage between save points. Quicksave Maybe you think to try a save system where the player can save any time, but it erases their save when they start, so they cant do the old save/reload trick. This seems to work until the power goes out just as an honest player reaches the final boss, and now they have to start the entire game over from scratch. And they hire a hit man to kill you in your sleep because you deserve it, you evil, evil designer. Save Anywhere, Limited Times You give the player the ability to save anywhere, but limit the total number of saves. The original Tomb Raider did this, for example. 
This allows some exploits, but at least not on every last die-roll. Is this a good compromise? Oh, by the way, I hope you gave the player a map and told them exactly how far apart they can save
on average, and drew some BIG ARROWS on the map pointing to places where the big battles are going to happen, so a player doesn't have to replay large sections of the map just because they didn't know ahead of time where the best locations to save were. And then your players will complain that the game is too easy because it gives them all this information about where the challenge is.
Pick Your Poison
As I said, finding the perfect save system is one of those general unsolved problems in game design, just from the perspective of what kind of system is the most fun and enjoyable for the player, and that's just for deterministic games! When you add pseudorandom numbers, you can see how the problem gets much thornier, so this is something you should be thinking about as a designer while designing the load/save system; because if you don't, then it'll be left to some programmer to figure out, God help you, and it'll probably be based on whatever's easiest to code and not what's best for the game or the player.
When Pseudorandom Numbers Fail
Even if you choose a good random number seed, and even if you ignore player exploits and bugs, there are other ways that randomness can go wrong if you choose poor algorithms. For example, suppose you have a deck of cards and want to shuffle them. Here's a naive algorithm that most budding game programmers have envisioned at some point:
1. Start with an unshuffled deck.
2. Generate a pseudorandom number that corresponds to a card in the deck (so if the deck is 52 cards, generate a whole number between 1 and 52, or 0 and 51, depending on what language you're using). Call this number A.
3. Generate a second pseudorandom number, the same way. Call this number B.
4. Swap the cards in positions A and B in the deck.
5. Repeat steps 2-4 lots and lots of times.
The problem here is, first off, it takes an obnoxiously long time to get anything resembling a random shuffle. Second, because the card positions start out fixed, and you're swapping random pairs one at a time, no matter how many times you repeat this there is always a slightly greater chance that you'll find each card in its original position in the deck than anywhere else. Think of it this way: if a card is ever swapped, it'll swap to a random position, so all positions are equally likely for any card that has been swapped at all. Swapping multiple times makes no difference; you're going from one random place to another, so you still end up equally likely to be in any position. So any card that's swapped is equally likely to end up anywhere, with equal frequency, as a baseline (this is good). However, there is some non-zero chance that a card will not be swapped, in which case it will remain in its original position, so it is that much more likely to stay where it is. The more times you perform swaps, the lower the chance that a card will remain in its original position, but no matter how much you swap you can never get that probability all the way down to zero. Ironically, this means that the single most likely shuffle from this algorithm is to see all cards in exactly the same position they started! So you can see that even if the pseudorandom numbers generated for this shuffle are perfectly random (or close enough), the shuffle itself isn't.
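You can verify that argument numerically. The sketch below runs the naive algorithm on a tiny 3-card deck with only a few swaps so the bias is easy to see; those sizes are my choices for the demonstration, not part of the algorithm. Adding more swaps shrinks the bias, but as argued above it never disappears entirely.

```python
import random
from collections import Counter

def naive_swap_shuffle(deck, swaps=3):
    """The flawed algorithm above: repeatedly pick two positions at random
    (possibly the same one) and swap them."""
    deck = list(deck)
    for _ in range(swaps):
        a = random.randrange(len(deck))
        b = random.randrange(len(deck))
        deck[a], deck[b] = deck[b], deck[a]
    return "".join(deck)

tally = Counter(naive_swap_shuffle("ABC") for _ in range(600_000))
for order, count in tally.most_common():
    print(order, count)
# A fair shuffler would give each of the six orderings about 100,000 hits;
# here the original order ABC reliably comes out on top.
```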
Are Your Pseudorandom Numbers Pseudorandom Enough?
There is also, of course, the question of whether your pseudorandom number generator function itself actually produces numbers that are pretty random, or whether some numbers are actually more or less likely than others, either due to rounding error or just a poor algorithm. A simple test for this is to use your generator to generate a few thousand pairs of random coordinates on a 2D graph, and plot that graph to see if there are any noticeable patterns (like the numbers showing up in a lattice pattern, or with noticeable clusters of results or empty spaces). You can expect to see some
clustering, of course, because as we learned, that's how random numbers work. But if you repeat the experiment a few times you should see clusters in different areas. This is a way of using Monte Carlo simulation to do a quick visual test of whether your pseudorandom numbers are actually random. (There are other, more mathematical ways to calculate the exact level of randomness of your generator, but that requires actual math; this is an easier, quick-and-dirty, good-enough test for game design purposes.) Most of the time this won't be an issue. Most programming languages and game libraries come with their own built-in pseudorandom number generation functions, and those all use established algorithms that are known to work, and that's what most programmers use. But if your programmer, for some reason, feels the need to implement a custom pseudorandom number generation function, this is something you will want to test carefully!
Homework
In past weeks, I've given you something you can do right now to improve the balance of a game you're working on, and also a task you can do later for practice. This week I'm going to reverse the order. Get some practice first, then apply it to your project once you're comfortable with the idea. For this week's homework I'm going to go over two algorithms for shuffling cards. I have seen some variant of both of these used in actual working code in shipped games before (I won't say which games, to protect the innocent). In both cases, we saw the perennial complaints from the player base that the deck shuffler was broken, and that there were certain hotspots where if you placed a card in that position in your deck, it was more likely to get shuffled to the top and show up in your opening hand. What I want you to do is think about both of these algorithms logically, and then figure out if they work. I'll give you a hint: one works and one doesn't. I'll actually give you the source code, but I'll also explain both for you in case you don't know how to program. Keep in mind that this might look like a programming problem, but really it's a probability calculation: count the different ways a shuffling algorithm can shuffle, and see whether those ways line up evenly with the different permutations of cards in a deck.
Algorithm #1
The first algorithm looks like this:
1. Start with an unshuffled deck.
2. Choose a random card from all available cards (so if the deck has 60 cards, choose a pseudorandom number between 1 and 60).
3. Take the card in that position, and swap it with card #60.
Essentially, this means: choose one card randomly to put on the bottom of the shuffled deck, then lock it there in place. Now, take a random card from the remaining cards (between 1 and 59), and swap that card with position #59, putting it on top of the previous one. Then take another random card from the remaining ones (between 1 and 58), swap that with position #58, and so on. Keep repeating this until eventually you get down to position #1, which swaps with itself (so it does nothing), and then we're done. This is clearly different from how humans normally shuffle a deck, but remember, the purpose here isn't to emulate a human shuffle; it's to get a random shuffle, that is, a random ordering of cards.
Algorithm #2
The second algorithm is similar, but with two minor changes:
1. Start with an unshuffled deck.
2. Choose a random card from all available cards (in a 60-card deck, that means a card from 1 to 60). Swap it with position #60, putting it on the bottom of the deck.
3. Choose a random card from all available cards, including those that have already been chosen (so, choose another random card from 1 to 60). Swap it with position #59.
4. Choose another random card from 1 to 60, and swap it with position #58.
5. Keep repeating this until eventually you choose a random number from 1 to 60 and swap that card with position #1, and you're done.
Oh yeah, one last thing: repeat this entire process (steps 2-5) fifty times. That'll make it more random. (One possible implementation of both algorithms is sketched at the end of this level.)
Hints
How do you approach this, when there are too many different shuffles in a 60-card deck to count? The answer is that you start much simpler. Assume a deck with only three cards in it (assume these cards are all different; call them A, B and C if you want). First, figure out how many ways there are to order a 3-card deck. There are mathematical tricks for doing this that we haven't discussed, but you should be able to do this just by trial and error. Next, look at both algorithms and figure out how many different ways there are for each algorithm to produce a shuffle. So in the first case, you're choosing from 3 cards, then choosing from 2, then choosing from 1. In the second case you're choosing from 3, then choosing from 3, then choosing from 3. Take the list of actual possible orderings of the deck, that is, all the different unique ways the deck can be shuffled, then compare it to all the different ways a deck can be shuffled by these algorithms. You'll find that one of them produces a random shuffle (well, as random as your pseudorandom number generator is, anyway) and one actually favors certain shuffles over others. And if it works that way for a 3-card deck, assume it's similarly random (or not) for larger decks. If you want to go through the math with larger decks, be my guest, but you shouldn't have to.
If You're Working on a Game Now
Video Games
Once you've done that, if you're working on a game that involves computers, take a look at the pseudorandom numbers and how your program uses them. In particular, if you use a series of pseudorandom numbers to do something like deck shuffling, make sure you're doing it in a way that's actually random; and if you're using a nonstandard way to generate pseudorandom numbers, check it by graphing a bunch of pairs of random coordinates and looking for undesirable patterns. Lastly, examine how your random number seed is stored in the game state. If it's a multiplayer game, is it stored separately in different clients or on a single server? If it's a single-player game, does your save-game system work in such a way that a player can save, attempt a high-risk random roll, and keep reloading from save until it succeeds?
All Games, Digital or Non-Digital
Another thing to do, whether you're designing a board game or video game, is to take a look at the random mechanics in your game (if there are any) and ask yourself some questions:
- Is the game dominated more by skill or luck, or is it an even mix?
- Is the level of skill and luck in the game appropriate to the game's target audience, or should the game lean a bit more to one side or the other?
- What kinds of probability fallacies are your players likely to observe when they play? Can you design your game differently to change the player perception of how fair and how random your game is? Should you?
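Since this copy of the text does not reproduce the source code mentioned in the homework, here is one plausible Python rendering of the two algorithms as described above. The deck passed in, the function names, and the trial count are my own assumptions, and the 3-card tally at the end simply mirrors the hint; skip it if you would rather work through the counting by hand first.

```python
import random
from collections import Counter

def algorithm_one(deck):
    """Swap a random not-yet-locked card into the last open slot, then shrink
    the open portion by one (the prose's final self-swap at #1 is a no-op)."""
    deck = list(deck)
    for last in range(len(deck) - 1, 0, -1):
        pick = random.randint(0, last)        # only choose among unlocked cards
        deck[pick], deck[last] = deck[last], deck[pick]
    return "".join(deck)

def algorithm_two(deck):
    """Swap a card chosen from the WHOLE deck (already-placed cards included)
    into each slot in turn. The homework also says to repeat the whole pass
    fifty times; this sketch tallies a single pass, which is what the hint
    asks you to count."""
    deck = list(deck)
    for last in range(len(deck) - 1, -1, -1):
        pick = random.randint(0, len(deck) - 1)
        deck[pick], deck[last] = deck[last], deck[pick]
    return "".join(deck)

for shuffler in (algorithm_one, algorithm_two):
    tally = Counter(shuffler("ABC") for _ in range(60_000))
    print(shuffler.__name__, sorted(tally.items()))
```

One tally should come out roughly even across all six orderings and the other noticeably lopsided, which is exactly the comparison the hint asks you to make.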

Level 6: Situational Balance


Answers to Last Weeks Question If you want to check your answer from last week: Analyzing Card Shuffles For a 3-card deck, there are six distinct shuffling results, all equally likely. If the cards are A, B, and C, then these are: ABC, ACB, BAC, BCA, CAB, CBA. Thus, for a truly random shuffler, we would expect six outcomes (or a multiple of six), with each of these results being equally likely. Analyzing Algorithm #1: First you choose one of three cards in the bottom slot (A, B, or C). Then you choose one of the two remaining cards to go in the middle (if you already chose A for the bottom, then you would choose between B and C). Finally, the remaining card is put on top (no choice involved). These are separate, (pseudo)random, independent trials, so to count them we multiply: 3x2x1 = 6. If you actually go through the steps to enumerate all six possibilities, youll find they correspond to the six outcomes above. This algorithm is correct, and in fact is one of the two standard ways to shuffle a deck of cards. (The other algorithm is to generate a pseudorandom number for each card, then put the cards in order of their numbers. This second method is the easiest way to randomly order a list in Excel, using RAND(), RANK() and VLOOKUP().) Analyzing Algorithm #2: First of all, if a single shuffle is truly random, then repeating it 50 times is not going to make it any more random, so this is just a waste of computing resources. And if the shuffle isnt random, then repeating may or may not make it any better than before, and youd do better to fix the underlying algorithm rather than covering it up. What about the inner loop? First we choose one of the three cards to go on bottom, then one of the three to go in the middle, and then one of the three to go on top. As before these are separate independent trials, so we multiply 3x3x3 = 27. Immediately we know there must be a problem, since 6 does not divide evenly into 27. Therefore, without having to go any further at all, we know that some shuffles must be more likely than others. So it would be perfectly valid to stop here and declare this algorithm buggy. If youre sufficiently determined, you could actually trace through this algorithm all 27 times to figure out all outcomes, and show which shuffles are more or less likely and by how much. A competitive player, upon learning the algorithm, might actually run such a simulation for a larger deck in order to gain a slight competitive advantage. This Week This is a special week. We spent two weeks near the start of the course talking about balancing transitive games, and then two more weeks talking about probability. This week were going to tie the two together, and put a nice big bow on it. This week is about situational balancing. What is situational balancing? What I mean is that sometimes, we have things that are transitive, sort of, but their value changes over time or depends on the situation. One example is area-effect damage. You would expect something that does 500 damage to multiple enemies at once is more valuable than something that does 500 damage just to a single target, other things being equal. But how much more valuable is it? Well, it depends. If youre only fighting a single enemy, one-on-one, it isnt any more valuable. If youre fighting fifty enemies all clustered together in a swarming mass, its 50x more valuable. Maybe at some points in your game you have swarms of 50 enemies, and other times youre only fighting a single lone boss. How do you balance
something like that? Or, consider an effect that depends on what your opponent does. For example, theres a card in Magic: the Gathering called Karma, that does 1 damage to your opponent each turn for each of their Swamps in play. Against a player who has 24 Swamps in their deck, this single card can probably kill them very dead, very fast, all on its own. Against a player with no Swamps at all, the card is totally worthless. (Well, its worthless unless you have other cards in your deck that can turn their lands into Swamps, in which case the value of Karma is dependent on your ability to combine it with other card effects that you may or may not draw.) In either case, the cards ability to do damage changes from turn to turn and game to game. Or, think of healing effects in most games, which are completely worthless if youre fully healed already, but which can make the difference between winning and losing if youre fighting something thats almost dead, and youre almost dead, and you need to squeeze one more action out of the deal to kill it before it kills you. In each of these cases, finding the right cost on your cost curve depends on the situation within the game, which is why I call it situational balancing. So it might be balanced, or underpowered or overpowered, all depending on the context. How do we balance something that has to have a fixed cost, even though it has a benefit that changes? The short answer is, we use probability to figure out the expected value of the thing, which is why Ive spent two weeks building up to all of this. The long answer is its complicated, which is why Im devoting an entire tenth of this course to the subject. Playtesting: The Ultimate Solution? There are actually a lot of different methods of situational balancing. Unfortunately, since the answer to how valuable is it? is always it depends! the best way to approach this is thorough playtesting to figure out where various situations land on your cost curve. But as before, we dont always have unlimited playtest budgets in the Real World, and even if we do have unlimited budgets we still have to start somewhere, so we at least need to make our best guess, and there are a few ways to do that. So playtest, playtest, playtest is good advice, and a simple answer, but not a complete answer. A Simple Example: d20 Lets start with a very simple situation. This is actually something I was asked once on a job interview (and yes, I got the job), so I know it must be useful for something. What follows is a very, very oversimplified version of the d20 combat system, which was used in D&D 3.0 and up. Heres how it works: each character has two stats, their Base Attack Bonus (or BAB, which defaults to 0) and their Armor Class (or AC, which defaults to 10). Each round, each character gets to make one attack against one opponent. To attack, they roll 1d20, add their BAB, and compare the total to the targets AC. If the attackers total is greater or equal, they hit and do damage; otherwise, they miss and nothing further happens. So, by default with no bonuses, you should be hitting about 55% of the time. Heres the question: are BAB and AC balanced? That is, if I gave you an extra +1 to your attack, is that equivalent to +1 AC? Or is one of those more powerful than the other? If I were interviewing you for a job right now, what would you say? Think about it for a moment before reading on. Whats the central resource? Heres my solution (yours may vary). 
First, I realized that I didn't know how much damage you did, or how many hit points you had (that is, how many times you could survive being hit, and how many times you would have to hit something else to kill it). But assuming these are equal (or equivalent), it doesn't actually matter. Whether you have to hit an enemy once or 5 times or 10 times to kill it, as long as you are equally vulnerable, you're going to hit the enemy a certain percentage of the time. And they're going to hit you a certain percentage of the time. What it comes down to is this: you want your hit percentage to be higher than theirs. Hit percentage is the central resource that everything has to be balanced against. If the enemy and I both have a 5% chance of hitting each other, on average we'll both hit each other very infrequently. If we both have a 95% chance of hitting each other, we'll hit each other just about every turn. But either way, we'll exchange blows about as often as not, so there's no real advantage to one or the other.

Using the central resource to derive balance
So, are AC and BAB balanced? +1 BAB gives me +5% to my chance to hit, and +1 AC gives me -5% to my opponent's chance to hit, so if I'm fighting against a single opponent on my own, one on one, the two are indeed equivalent. Either way, our relative hit percentages are changed by exactly the same amount. (One exception is if either hit percentage goes above 100% or below 0%, at which point extra plusses do nothing for you. This is probably why the defaults are +0 BAB and 10 AC, so that it would take a lot of bonuses and be exceedingly unlikely to ever reach that point. Let's ignore this extreme for the time being.) What if I'm not fighting one on one? What if my character is alone, and there are four enemies surrounding me? Now I only get to attack once for every four times the opponents attack, so +1 AC is much more powerful here, because I'm making a roll that involves my AC four times as often as I make a roll involving my BAB. Or what if it's the other way around, and I'm in a party of four adventurers ganging up on a lone giant? Here, assuming the giant can only attack one of us at a time, +1 BAB is more powerful, because each of us is attacking every round, but only one of us is actually getting attacked. In practice, in most D&D games, GMs are fond of putting their adventuring party in situations where they're outnumbered; it feels more epic that way. (This is from my experience, at least.) This means that in everyday use, AC is more powerful than BAB; the two stats are not equivalent on the cost curve, even though the game treats them as if they were. Now, as I said, this is an oversimplification; it does not reflect on the actual balance of D&D at all. But we can see something interesting even from this very simple system: the value of attacking is higher if you outnumber the opponent, and the value of defending is higher if you're outnumbered. And if we attached numerical cost and benefit values to hit percentage, we could even calculate how much more powerful these values are, as a function of how much you outnumber or are outnumbered.

Implications for Game Design
If you're designing a game where you know what the player will encounter ahead of time (say, an FPS or RPG with hand-designed levels), then you can use your knowledge of the upcoming challenges to balance your stats. In our simplified d20 system, for example, if you know that the player is mostly fighting combats where they're outnumbered, you can change AC on your cost curve to be more valuable and thus more costly. Another thing you can do, if you want AC and BAB to be equivalent and balanced with each other, is to change the mix of encounters in your game so that the player is outnumbered about half the time, and outnumbers the enemy about half the time.
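To make the outnumbering effect concrete, here is a small sketch of the kind of calculation you could run (Python; the helper names and the notion of "net hits per round" are my own illustration, and it assumes the oversimplified d20 system above, with no critical hits or automatic misses).

```python
# Rough sketch: comparing the marginal value of +1 BAB vs. +1 AC in the
# simplified d20 system, as a function of how badly you are outnumbered.
# All numbers here are illustrative assumptions, not actual D&D balance data.

def hit_chance(bab, target_ac):
    """Chance that 1d20 + BAB meets or beats the target's AC."""
    needed = target_ac - bab           # minimum die roll that hits
    successes = 21 - needed            # rolls from 'needed' up to 20
    return max(0, min(20, successes)) / 20.0

def net_hits_per_round(my_bab, my_ac, enemy_bab, enemy_ac, num_enemies):
    """Expected hits I land minus expected hits I take, per round,
    against num_enemies identical opponents (I attack once, they each attack once)."""
    dealt = hit_chance(my_bab, enemy_ac)                 # my one attack
    taken = num_enemies * hit_chance(enemy_bab, my_ac)   # their attacks
    return dealt - taken

for enemies in (1, 2, 4):
    base = net_hits_per_round(0, 10, 0, 10, enemies)
    bab1 = net_hits_per_round(1, 10, 0, 10, enemies)     # spend the bonus on +1 BAB
    ac1  = net_hits_per_round(0, 11, 0, 10, enemies)     # spend the bonus on +1 AC
    print(f"{enemies} enemies: +1 BAB is worth {bab1 - base:+.3f} net hits/round, "
          f"+1 AC is worth {ac1 - base:+.3f}")
```

Against a single opponent the two bonuses come out identical (+0.05 each); against four opponents the +1 AC is worth four times as much, which is the intuition above expressed in numbers.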
Aside from making your stats more balanced, changing the mix of encounters this way also adds some replay value to your game: going through the game with a high BAB is going to give a very different experience than going through the game with a high AC; in each case, some encounters are going to be a lot harder than others, giving the player a different perspective and a different level of challenge in each encounter.

The Cost of Switching
What if D&D worked in such a way that you could freely convert AC to BAB at the start of a
combat, and vice versa? Now all of a sudden they are more or less equivalent to each other, and suddenly a +1 bonus to either one is much more powerful and versatile relative to any other bonuses in the rest of the game. Okay, maybe you can't actually do that in D&D, but there are plenty of games where you can swap out one situational thing for another. First-Person Shooters are a common example, where you might be carrying several weapons at a time: maybe a rocket launcher against big slow targets or clusters of enemies, a sniper rifle to use at a distance against single targets, and a knife for close-quarters combat. Each of these weapons is situationally useful some of the time, but as long as you can switch from one to another with minimal delay, it's the sum of weapon capabilities that matters rather than individual weapon limitations. That said, suppose we made the cost of switching weapons higher, maybe a 10-second delay to put one weapon in your pack and take out another (which, when you think about it, seems a lot more realistic; I mean, seriously, if you're carrying ten heavy firearms around with you and can switch without delay, where exactly are you carrying them all?). Now all of a sudden the limitations of individual weapons play a much greater role, and a single general-purpose weapon may end up becoming more powerful than a smorgasbord of situational weapons. But if instead you have instant real-time weapon change, a pile of weapons where each is the perfect tool for a single situation is much better than a single jack-of-all-trades, master-of-none weapon. What's the lesson here? We can mess with the situational balance of a game simply by modifying the cost of switching between different tools, weapons, stat distributions, or overall strategies. That's fine as a general theory, but how do the actual numbers work? Let's see.

Example: Inability to Switch
Let's take one extreme case, where you can't switch strategies at all. An example might be an RPG where you're only allowed to carry one weapon and equip one armor at a time, and whenever you acquire a new one it automatically gets rid of the old. Here, the calculation is pretty straightforward, because this is your only option, so we have to look at it across all situations. It's a lot like an expected value calculation. So, you ask: in what situations does this object have a greater or lesser value, and by how much? How often do you encounter those situations? Multiply and add it all together. Here's a simple, contrived example to illustrate the math: suppose you have a sword that does double damage against Dragons. Suppose 10% of the meaningful combats in your game are against Dragons. Let's assume that in this game, damage has a linear relationship to your cost curve, so doubling the damage of something makes it exactly twice as good. So, 90% of the time the sword is normal, 10% of the time it's twice as good. 90%*1.0 + 10%*2.0 = 110% of the cost. So in this case, double damage against Dragons is a +10% modifier to the base cost. Here's another example: you have a sword that is 1.5x as powerful as the other swords in its class, but it only does half damage against Trolls. And let's further assume that half damage is actually a huge liability; it takes away your primary way to do damage, so you have to rely on other sources that are less efficient, and it greatly increases the chance you're going to get yourself very killed if you run into a troll at a bad time.
So in this case, let's say that half damage actually makes the sword a net negative. But let's also say that trolls are pretty rare; maybe only 5% of the encounters in the game are against trolls. So if a typical sword at this level has a benefit of 100 (according to your existing cost curve), a 1.5x-powerful sword would have a benefit of 150, and maybe a sword that doesn't work actually has a cost of 250, because it's just that deadly to get caught with your sword down, so to speak. The math says: 95%*150 + 5%*(-250) = 130. So this sword has a benefit of 130, or 30% more than a typical
sword. Again, you can see that there are actually a lot of ways you can change this, a lot of design knobs you can turn to mess with the balance here. You can obviously change the cost and benefit of an object, maybe adjusting the damage in those special situations to make it better or worse when you have those rare cases where it matters, or adjusting the base abilities to cover every other situation, just as you normally would with transitive mechanics. But with situational balance, you can also change the frequency of situations, say by increasing the number of trolls or dragons the player encounters either in the entire game, or just in the area surrounding the place where theyd get that special sword. (After all, if the player is only expected to use this sword that loses to trolls in one region in the game that has no trolls, even if the rest of the game is covered in trolls, its not really much of a drawback, is it?) Another Example: No-Cost Switching Now lets take the other extreme, where you can carry as many situational objects as you want and use or switch between them freely. In this case, the limitations dont matter nearly as much as the strengths of each object, because there is no opportunity cost to gaining a new capability. In this case, we look at the benefits of all of the players objects collected so far, and figure out what this new object will add that cant be done better by something else. Multiply the extra benefit by the percentage of time that the benefit is used, and there is your additional benefit from the new object. So it is a similar calculation, except in most cases we ignore the bad parts, because you can just switch away from those. In practice, its usually not that simple. In a lot of games, the player may be able to use suboptimal strategies if they havent acquired exactly the right thing for this one situation (in fact, its probably better for most games to be designed that way). Also, the player may pick up new objects in a different order on each playthrough. End result: you dont actually know how often something will be used, because it might be used more or less often depending on what other tools the player has already acquired, and also how likely they are to use this new toy in situations where its not perfect (they havent got the perfect toy for that situation yet) but its at least better than their other toys. Lets take an example. Maybe you have a variety of swords, each of which does major extra damage against a specific type of monster, a slight bump in damage against a second type of monster, and is completely ineffective against a third type of monster. Suppose there are ten of these swords, and ten monster types in your game, and the monster types are all about as powerful and encountered about as frequently. It doesnt take a mathematician to guess that these swords should all cost the same. However, we run into a problem. Playing through this game, we would quickly realize that they do not actually give equal value to the player at any given point in time. For example, say Ive purchased a sword that does double damage against Dragons and 1.5x damage against Trolls. Now theres a sword out there that does double damage against Trolls, but that sword is no longer quite as useful to me as it used to be; Im now going from a 1.5x multiplier to 2x, not 1x to 2x, so theres less of a gain there. 
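Here is a quick sketch of that marginal-gain problem (Python; the monster list, frequencies, and multipliers are invented for illustration, not taken from any real game): the value of a new sword is computed against whatever multipliers the player already owns.

```python
# Sketch (my own illustration): the marginal value of buying a new situational
# sword depends on what multipliers you already own. Frequencies and multiplier
# values are made-up numbers, not data from any actual game.

def expected_multiplier(owned_swords, frequencies):
    """Expected damage multiplier per encounter, assuming the player always
    swaps to whichever owned sword is best against the current monster type."""
    total = 0.0
    for monster, freq in frequencies.items():
        best = max((sword.get(monster, 1.0) for sword in owned_swords), default=1.0)
        total += freq * best
    return total

# Ten monster types, encountered equally often.
frequencies = {m: 0.1 for m in ["dragon", "troll", "ogre", "ghost", "slime",
                                "wolf", "bandit", "golem", "harpy", "lich"]}

dragon_sword = {"dragon": 2.0, "troll": 1.5}   # double vs. dragons, 1.5x vs. trolls
troll_sword  = {"troll": 2.0, "ogre": 1.5}     # double vs. trolls, 1.5x vs. ogres

base  = expected_multiplier([dragon_sword], frequencies)
both  = expected_multiplier([dragon_sword, troll_sword], frequencies)
fresh = expected_multiplier([troll_sword], frequencies)

print(f"Troll sword bought first:  +{fresh - expected_multiplier([], frequencies):.3f}")
print(f"Troll sword bought second: +{both - base:.3f}")
```

The same troll sword adds about 0.15 to the expected multiplier as a first purchase but only about 0.10 once the dragon sword is already owned, which is exactly the diminishing return the pricing has to account for.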
If I fully optimize, I can probably buy about half the swords in the game and have at least some kind of improved multiplier against most or all monsters, and from that point, extra swords have diminishing returns. How do you balance a system like that? There are a few methods for this that I could think of, and probably a few more I couldnt. It all depends on whats right for your game. Give a discount: One way is to actually change costs on the fly. Work it into your narrative that the more swords you buy from this merchant, the more he discounts additional swords because youre such a good customer (you could even give the player a customer loyalty card in the game and have the merchant put stamps on it; some players love that kind of thing).

Let the player decide: Or, you could balance everything assuming the player has nothing, which means that yes, there will be a law of diminishing returns here, and its up to the player to decide how many is enough; consider that part of the strategy of the game. Let the increasing money curve do the discount work for you: Maybe if the player is getting progressively more money over time, keeping the costs constant will itself be a diminishing cost to compensate, since each sword takes the player less time to earn enough to buy it. Tricky! Discount swords found later in the game: Or, you can spread out the locations in the game where the player gets these swords, so that you know theyll probably buy certain ones earlier in the game and other ones later. You can then cost them differently because you know that when the player finds certain swords, theyll already have access to other ones, and you can reduce the costs of the newer ones accordingly. Obviously, for games where you can switch between objects but theres some cost to switching (a time cost, a money cost, or whatever), youll use a method that lies somewhere between the cant change at all and can change freely and instantly extreme scenarios. The Cost of Versatility Now weve touched on this concept of versatility from a player perspective, if they are buying multiple items in the game that make their character more versatile and able to handle more situations. What about when the objects themselves are versatile? This happens a lot in real-time and turn-based strategy games, for example, where individual units may have several functions. So, maybe archers are really strong against fliers and really weak against footmen (a common RTS design pattern), but maybe you want to make a new unit type who are strong against both fliers and footmen, but not as strong as archers. So maybe an archer can take down a flier and take next to no damage, but this new unit would go down to about half HP in combat with a flier (it would win, but at a cost). This new unit isnt as good against fliers, but it is good for other things, so theyre more versatile. Taking another example, in a first-person shooter, knives and swords are usually the best weapons when youre standing next to an opponent, while sniper rifles are great from a distance, but a machine gun is moderately useful at most ranges (but not quite as good as anything else). So youll never get caught with a totally ineffective weapon if youve got a machine gun, but youll also never have the perfect weapon for the job if youre operating at far or close range a lot. How much are these kinds of versatility worth? Heres the key: versatility has value in direct proportion to uncertainty. If you know ahead of time youre playing on a small map with tight corridors and lots of twists and turns, knives are going to be a lot more useful than sniper rifles. On a map with large, open space, its the other way around. If you have a single map with some tight spaces and some open areas, a versatile weapon that can serve both roles (even if only mediocre) is much more valuable. Suppose instead you have a random map, so theres a 50/50 chance of getting either a map optimized for close quarters or a map optimized for distance attacks. Now whats the best strategy? Taking the versatile weapon thats mildly useful in each case but not as good as the best weapon means youll win against people who guessed poorly and lose against people who guessed well. There is no best strategy here; its a random guess. 
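As a sketch of that rule (Python; the effectiveness scores and map probabilities are made up purely for illustration), you can watch the value of the generalist weapon converge with the specialists as uncertainty grows:

```python
# Sketch (illustrative numbers only): how the value of a versatile weapon
# changes with uncertainty about the upcoming situation. "Effectiveness" is
# an abstract benefit score, not data from any actual shooter.

CLOSE, OPEN = "close quarters", "open map"
effectiveness = {
    "knife":        {CLOSE: 10, OPEN: 2},
    "sniper rifle": {CLOSE: 2,  OPEN: 10},
    "machine gun":  {CLOSE: 6,  OPEN: 6},   # mediocre everywhere
}

def expected_value(weapon, p_close):
    """Expected effectiveness when the map is close-quarters with probability p_close."""
    e = effectiveness[weapon]
    return p_close * e[CLOSE] + (1 - p_close) * e[OPEN]

for p in (1.0, 0.75, 0.5):
    scores = {w: expected_value(w, p) for w in effectiveness}
    print(f"P(close map) = {p:.2f}: " +
          ", ".join(f"{w} = {s:.1f}" for w, s in scores.items()))
```

At a 50/50 map split all three weapons have the same expected effectiveness, but the machine gun never leaves you holding the wrong tool; the more you know about the map in advance, the further the correct specialist pulls ahead, and the less the versatility is worth.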
This kind of choice is actually not very interesting: the players must choose blindly ahead of time, and then most of the game comes down to who guessed right. Unless theyre given a mechanism for changing weapons during play in order to adjust to the map, or they can take multiple weapons with them, or something ah, so we come back to the fact that versatility comes in two flavors: The ability of an individual game object to be useful in multiple situations The ability of the player to swap out one game object for another.

The more easily a player can exchange game objects, the less valuable versatility in a single object becomes. Shadow Costs Now, before we move on with some in-depth examples, I want to write a little bit about different kinds of costs that a game object can have. Strictly speaking, I should have talked about this when we were originally talking about cost curves, but in practice these seem to come up more often in situational balancing than other areas, so Im bringing it up now. Broadly speaking, we can split the cost of an object into two categories: the resource cost, and Everything Else. If you remember when doing cost curves, I generally said that any kind of drawback or limitation is also a cost, so thats what Im talking about here. Economists call these shadow costs, that is, they are a cost thats hidden behind the dollar cost. If you buy a cheap clock radio for $10, there is an additional cost in time (and transportation) to go out and buy the thing, and if it doesnt go off one morning when you really need it to because the UI is poorly designed and you set it for PM instead of AM then missing an appointment because of the poor design costs you additional time and money. If it then breaks in a few months because of its cheap components and you have to go replace or return it then that is an extra time cost, and so on so it looks like it costs $10 but the actual cost is more because it has these shadow costs that a better-quality clock radio might not have. In games, there are two kinds of shadow costs that seem to come up a lot in situational balance: sunk costs and opportunity costs. Let me explain each. Sunk Costs By sunk costs, Im talking about some kind of setup cost that has to be paid first, before you gain access to the thing you want to buy in the first place. One place you commonly see these is in tech trees in RTSs, MMOs and RPGs. For example, in an RTS, in order to build certain kinds of units, you first typically need to build a structure that supports them. The structure may not do anything practical or useful for you, other than allowing you to build a special kind of unit. As an example, each Dragoon unit in StarCraft costs 125 minerals and 50 gas (that is its listed cost), but you had to build a Cybernetics Core to build Dragoons and that cost 200 minerals, and that cost is in addition to each Dragoon. Oh, and by the way, you cant build a Cybernetics Core without also building a Gateway for 150 minerals, so thats part of the cost as well. So if you build all these structures, use them for nothing else, and then create a single Dragoon, that one guy costs you a total of 475 minerals and 50 gas, which is a pretty huge cost compared to the listed cost of the unit itself! Of course, if you build ten Dragoons, then the cost of each is reduced to 160 minerals and 50 gas each, a lot closer to the listed cost, because you only have to pay the build cost for those buildings once (well, under most cases anyway). And if you get additional benefits from those buildings, like them letting you build other kinds of units or structures or upgrades that you take advantage of, then effectively part of the cost of those buildings goes to other things so you can consider it to not even be part of the Dragoons cost. But still, you can see that if you have to pay some kind of cost just for the privilege of paying an additional cost, you need to be careful to factor that into your analysis. 
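Here is a tiny sketch of that amortization, using the Dragoon numbers quoted above (Python; minerals only, since the 50 gas per Dragoon is the same either way, and it deliberately ignores any other value the Gateway and Cybernetics Core provide):

```python
# Sketch: amortizing prerequisite ("sunk") costs across the units they unlock,
# using the StarCraft Dragoon numbers quoted above. Gas and any other benefit
# of the buildings are ignored, purely to show the shape of the curve.

GATEWAY_COST   = 150   # minerals
CYBER_CORE     = 200   # minerals
DRAGOON_LISTED = 125   # minerals each (plus 50 gas, ignored here)

def effective_mineral_cost(num_dragoons):
    """Listed cost plus the prerequisite buildings spread across every Dragoon built."""
    setup = GATEWAY_COST + CYBER_CORE
    return DRAGOON_LISTED + setup / num_dragoons

for n in (1, 2, 5, 10, 20):
    print(f"{n:>2} Dragoons: {effective_mineral_cost(n):6.1f} minerals each")
```

One Dragoon effectively costs 475 minerals, ten cost about 160 each, and the effective cost keeps falling toward the listed 125 as the count grows.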
When the cost may be amortized (spread out) over multiple purchases, the original sunk cost has to be balanced based on its expected value: how many Dragoons do you expect to build in typical play? When costing Dragoons, you need to factor in the up-front costs as well. You can also look at this the other way, if youre costing the prerequisite (such as those structures you had to build in order to buy Dragoon units): not just what does this do for me now but also what kinds of options does this enable in the future? You tend to see this a lot in tech trees. For example, in some RPGs or MMOs with tech trees, you might see some special abilities you can purchase on level-up that arent particularly useful on their own, maybe theyre even completely
worthless but theyre prerequisites for some really powerful abilities you can get later. This can lead to interesting kinds of short-term/long-term decisions, where you could take a powerful ability now, or a less powerful ability now to get a really powerful ability later. You can see sunk costs in other kinds of games, too. Ive seen some RPGs where the player has a choice between paying for consumable or reusable items. The consumables are much cheaper of course, but you only get to use them once. So for example, you can either buy a Potion for 50 Gold, or a Potion Making Machine for 500 Gold, and in that case youd buy the machine if you expect to create more than ten Potions. Or you pay for a one-way ticket on a ferry for 10 Gold, or buy a lifetime pass for 50 Gold, and you have to ask yourself whether you expect to ride the ferry more than five times. Or you consider purchasing a Shop Discount Card which gives 10% off all future purchases, but it costs you 1000 Gold to purchase the discount in the first place, so you have to consider whether youll spend enough at that shop to make the discount card pay for itself (come to think of it, the choice to buy one of those discount cards at the real-world GameStop down the street requires a similar calculation). These kinds of choices arent always that interesting, because youre basically asking the player to estimate how many times theyll use something but without telling them how much longer the game is or how many times they can expect to use the reusable thing, so its a kind of blind decision. Still, as designers, we know the answer, and we can do our own expected-value calculation and balance accordingly. If we do it right, our players will trust that the cost is relative to the value by the time they have to make the buy-or-not decision in our games. Opportunity Costs The second type of hidden cost, which Im calling an opportunity cost here, is the cost of giving up something else, reducing your versatility. An example, also from games with tech trees, might be if you reach a point where you have to choose one branch of the tech tree, and if you take a certain feat or learn a certain tech or whatever, it prevents you from learning something else. If you learn Fire magic, youre immediately locked out of all the Ice spells, and vice versa. This happens in questing systems, too: if you dont blow up Megaton, you dont get the Tenpenny Tower quest. These can even happen in tabletop games: one CCG I worked on had mostly neutral cards, but a few that were good guy cards and a few that were bad guy cards, and if you played any good guy cards it prevented you from playing bad guy cards for the rest of the game (and vice versa), so any given deck basically had to only use good or bad but not both. Basically, any situation where taking an action in the game prevents you from taking certain other actions later on, is an opportunity cost. In this case, your action has a special kind of shadow cost: in addition to the cost of taking the action right now, you also pay a cost later in decreased versatility (not just resources). It adds a constraint to the player. How much is that constraint worth as a cost? Well, thats up to you to figure out for your particular game situation. But remember that its not zero, and be sure to factor this into your cost curve analysis. Versatility Example How do the numbers for versatility actually work in practice? That depends on the nature of the versatility and the cost and difficulty of switching. 
Heres a contrived example: youre going into a PvP arena, and you know that your next opponent either has an Ice attack or a Fire attack, but never both and never neither always one or the other. You can buy a Protection From Ice enchantment which gives you protection from Ice attacks, or a Protection From Fire enchantment which gives you protection from Fire attacks (or both, if you want to be sure, although thats kind of expensive). Lets say both enchantments cost 10 Gold each. Now, suppose we offer a new item, Protection From Elements, which gives you both enchantments as a package deal. How much should it cost? (It depends!) Okay, what does it depend on? If youve been paying attention, you know the answer: it depends on how much you know about
your next opponent up front, and it depends on the cost of switching from one to the other if you change your mind later. If you know ahead of time that they will be, say, a Fire attack, then the package should cost the same as Protection from Fire: 10 Gold. The versatility here offers no added value, because you already know the optimal choice. If you have no way of knowing your next opponents attack type until its too late to do anything about it, and you cant switch protections once you enter the arena, then Protection From Elements should cost 20 Gold, the same as buying both. Here, versatility offers you exactly the same added value as just buying both things individually. Theres no in-game difference between buying them separately or together. Ah, but what if you have the option to buy one before the combat, and then if the combat starts and you realize you guessed wrong, you can immediately call a time-out and buy the other one? In this case, you would normally spend 10 Gold right away, and theres a 50% chance youll guess right and only have to spend 10 Gold, and a 50% chance youll guess wrong and have to spend an additional 10 Gold (or 20 Gold total) to buy the other one. The expected value here is (50%*10) + (50%*20) = 15 Gold, so that is what the combined package should cost in this case. What if the game is partly predictable? Say you may have some idea of whether your opponent will use Fire or Ice attacks, but youre not completely sure. Then the optimal cost for the package will be somewhere between the extremes, depending on exactly how sure you are. Okay, so that last one sounds kind of strange as a design. What might be a situation in a real game where you have some idea but not a complete idea of what your opponent is bringing against you? As one example, in an RTS, I might see some parts of the army my opponent is fielding against me, so that gives me a partial (but not complete) sense of what Im up against, and I can choose what units to build accordingly. Here, a unit that is versatile offers some value (my opponent might have some tricks up their sleeve that I dont know about yet) but not complete value (I do know SOME of what the opponent has, so theres also value in building troops that are strong against their existing army). Case Studies in Situational Balance So, with all of that said, lets look at some common case studies. Single-target versus area-effect (AoE) For things that do damage to multiple targets instead of just one at a time, other things being equal, how much of a benefit is that splash damage? The answer is generally, take the expected number of things youll hit, and multiply. So, if enemies come in clusters from 1 to 3 in the game, evenly distributed, then on average youll hit 2 enemies per attack, doing twice the damage you would normally, so splash damage is twice the benefit. A word of warning: other things being equal is really tricky here, because generally other things arent equal in this case. For example, in most games, enemies dont lose offensive capability until theyre completely defeated, so just doing partial damage isnt as important as doing lethal damage. In this case, spreading out the damage slowly and evenly can be less efficient than using singletarget high-power shots to selectively take out one enemy at a time, since the latter reduces the offensive power of the enemy force with each shot, while the area-effect attack doesnt do that for awhile. 
Also, if the enemies youre shooting at have varying amounts of HP, an area-effect attack might kill some of them off but not all, reducing a cluster of enemies to a few corpses and a smaller group (or lone enemy), which then reduces the total damage output of your subsequent AoE attacks that is, AoE actually makes itself weaker over time as it starts working! So this is something you have to be careful of as well: looking at typical encounters, how often enemies will be clustered together, and also how long theyll stay that way throughout the encounter.
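As a rough sketch of the first-order estimate (Python; the cluster-size distributions are invented examples, not real encounter data), the splash-damage multiplier is just the expected number of targets hit:

```python
# Sketch (made-up encounter data): the benefit multiplier of splash damage is
# roughly the expected number of targets hit, so it depends entirely on how
# enemies are clustered in your actual encounters.

def aoe_multiplier(cluster_distribution):
    """Expected targets hit per attack, given {cluster_size: probability}."""
    return sum(size * prob for size, prob in cluster_distribution.items())

even_mix      = {1: 1/3, 2: 1/3, 3: 1/3}   # the 1-to-3 example from the text
swarmy_levels = {1: 0.1, 3: 0.3, 6: 0.6}   # mostly big clusters
boss_levels   = {1: 0.8, 2: 0.2}           # mostly lone, tough enemies

for name, dist in [("even 1-3 mix", even_mix),
                   ("swarm-heavy game", swarmy_levels),
                   ("boss-heavy game", boss_levels)]:
    print(f"{name}: splash damage is worth about {aoe_multiplier(dist):.2f}x single-target")
```

A real estimate would then discount for the overkill and self-weakening effects just described, since clusters shrink as the area attack starts working, so treat this multiplier as an upper bound rather than a final answer.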

Attacks that are strong (or weak) against a specific enemy type We did an example of this before, with dragons and trolls. Multiply the extra benefit (or liability) as if it were always in effect during all encounters, by the expected percentage of the time it actually will matter (that is, how often do you encounter the relevant type of enemy). The trick here, as we saw in that earlier example, is you have to be very careful of what the extra benefit or liability is really worth, because something like double damage or half damage is rarely double or half the actual value. Metagame objects that you can choose to use or ignore Sometimes you have an object thats sometimes useful and sometimes not, but its at the players discretion whether to use it so if the situation doesnt call for it, they simply dont spend the resources. Examples are situational weapons in an FPS that can be carried as an alternate weapon, specialized units in an RTS that can be build when needed and ignored otherwise, or situational cards in a CCG that can be sideboarded against relevant opponents. Note that in these cases, they are objects that depend on things outside of player control: what random map youre playing on, what units your opponent is building, what cards are in your opponents deck. In these cases, its tempting to cost them according to the likelihood that they will be useful. For example, if I have a card that does 10 damage against a player who is playing Red in Magic, and I know that most decks are 2 or 3 colors so maybe 40-50% of the time Ill play against Red in open play, then we would cost this the same as a card that did 4 or 5 damage against everyone. If the player must choose to use it or not before play begins, with no knowledge of whether the opponent is playing Red or not, this would be a good method. But in some of these cases, you do know what your opponent is doing. In a Magic tournament, after playing the first game to best-of-3, you are allowed to swap some cards into your deck from a Sideboard. You could put this 10-damage-to-Red card aside, not play with it your first game, and then bring it out on subsequent games only if your opponent is playing Red. Played this way, you are virtually assured that the card will work 100% of the time; the only cost to you is a discretionary card slot in your sideboard, which is a metagame cost. As we learned a few weeks ago, trying to cost something in the game based on the metagame is really tricky. So the best we can say is that it should cost a little less to compensate for the metagame cost, but it probably shouldnt be half off like it would be if the player always had to use it unless we want to really encourage its use as a sideboard card by intentionally undercosting it. Likewise with a specialized RTS unit, assuming it costs you nothing to earn the capability of building it. If its useless most of the time, you lose nothing by simply not exercising your option to build it. But when it is useful, you will build it, and youll know that it is useful in that case. So again, it should be costed with the assumption that whatever situation its built for, actually happens 100% of the time. (If you must pay extra for the versatility of being able to build the situational unit in the first place, that cost is what youd want to adjust based on a realistic percentage of the time that such a situation is encountered in real play.) With an alternate weapon in an FPS, a lot depends on exactly how the game is structured. 
If the weapons are all free (no resource cost) but you can only select one main and one alternate, then you need to make sure the alternates are balanced against each other, i.e. that each one is useful in equally likely situations, or at least that if you multiply the situational benefit by the expected probability of receiving that benefit, that should be the same across all weapons (so you might have a weapon thats the most powerful in the game but only in a really rare situation, versus a weapon thats mediocre but can be used just about anywhere, and you can call those balanced if the numbers come out right). Metagame combos

Now, we just talked about situations where the player has no control. But what if they do have control that is, if something isnt particularly useful on its own, but it forms a powerful combo with something else? An example would be dual-wielding in an FPS, support character classes in an MMO or multiplayer FPS, situational cards that you build your deck around in a CCG, support towers in a Tower Defense game that only improve the towers next to them, and so on. These are situational in a different way: they reward the player for playing the metagame in a certain way. To understand how to balance these, we first have to return to the concept of opportunity costs from earlier. In this case, we have a metagame opportunity cost: you have to take some other action in the game completely apart from the thing were trying to balance, in order to make that thing useful. There are a few ways we could go about balancing things like this, depending on the situation. One is to take the combo in aggregate and balance that, then try to divide that up among the individual components based on how useful they are outside of the combo. For example, Magic had two cards, Lich and Mirror Universe: Lich reduced you to zero life points, but added additional rules that effectively turned your cards into your life this card on its own was incredibly risky, because if it ever left play you would still have zero life, and thus lose the game immediately! Even without that risk, it was of questionable value, because it basically just helped you out if you were losing, and cards that are the most useful when youre losing mean that youre playing to lose, which isnt generally a winning strategy. Mirror Universe was a card that would swap life totals with your opponent not as risky as Lich since youre in control of when to use it, but still only useful when youre losing and not particularly easy to use effectively. But combined the two cards, if uncountered, immediately win you the game by reducing your life to zero and then swapping totals with your opponent: an instant win! How do you cost this? Now, this is a pretty extreme example, where two cards are individually all but useless, dont really work that well in any other context, but combined with each other are allpowerful. The best answer for a situation like this might be to err on the side of making their combined cost equal to a similarly powerful game-winning effect, perhaps marked down slightly because it requires a two-card combination (which is harder to draw than just playing a single card). How do you split the cost between them should one be cheap and the other expensive, or should they both be about the same? Weight them according to their relative usefulness. Lich does provide some other benefits (like drawing cards as an effect when you gain life) but with a pretty nasty drawback. Mirror Universe has no drawback, and a kind of psychological benefit that your opponent might hold off attacking you because they dont want to almost kill you, then have you use it and plink them to death. These are hard to balance against one another directly, but looking at what actually happened in the game, their costs are comparable. How about a slightly less extreme example? A support character class in an MMO can offer lots of healing and attribute bonuses that help the rest of their team. 
On their own they do have some nonzero value (they can always attack an enemy directly if they have to, if they can heal and buff themselves they might even be reasonably good at it, and in any case theyre still a warm body that can distract enemies by giving them something else to shoot at). But their true value shows up in a group, where they can take the best members of a group and make them better. How do you balance something like this? Lets take a simple example. Suppose your support character has a special ability that increases a single allys attack value by 10%, and that they can only have one of these active at a time, and this is part of their tech tree; you want to find the expected benefit of that ability so you can come up with an appropriate cost. To figure out what thats worth, we might assume a group of adventurers of similar level, and find the character class in that group with the highest attack value, and find our expected attack value for that character class. In a group, this attack buff support ability would be
worth about 10% of that value. Obviously it would be less useful if questing solo, or with a group that doesnt have any good attackers, so youd have to figure the percentage of time that you can expect this kind of support character to be traveling with a group where this attack boost is useful, and factor that into your numbers. In this case, the opportunity cost for including an attacker in your party is pretty low (most groups will have at least one of those anyway), so this support ability is almost always going to be operating at its highest level of effectiveness, and you can balance it accordingly. What do the Lich/Mirror Universe example and the support class example have in common? When dealing with situational effects that players have control over, a rule of thumb is to figure out the opportunity costs for them setting up that situation, and factoring that in as a cost to counteract the added situational benefit. Beyond that, the cost should be computed under best case situations, not some kind of average case: if the players are in control of whether they use each part of the combo, we can assume theyre going to use it under optimal conditions. Multi-class characters As long as were on the subject of character classes, how about multi-class characters that are found in many tabletop RPGs? The common pattern is that you gain versatility, in that you have access to the unique specialties of several character types but in exchange for that, you tend to be lower level and less powerful in all of those types than if you were dedicated to a single class. How much less powerful do you have to be so that multi-classing feels like a viable choice (not too weak), but not one thats so overpowered that theres no reason to single-class? This is a versatility problem. The player typically doesnt know what kinds of situations their character will be in ahead of time, so theyre trying to prepare for a little of everything. After all, if they knew exactly what to expect, they would pick and choose the most effective single character class and ignore the other! However, they do probably have some basic idea of what theyre going to encounter, or at least what capabilities their party is going to need that are not yet accounted for, so a Level 5 Fighter/Thief is probably not as good as a Level 10 Fighter or Level 10 Thief. Since the player must choose ahead of time what they want and they typically cant change their class in the middle of a quest, they are more constrained, so youll probably do well with a starting guess of making a single class 1.5x as powerful as multi-class and then adjusting downward from there as needed. That is, a Level 10 single-class is usually about as powerful as a Level 7 or 8 dual-class. Either-or choices from a single game object Sometimes you have a single object that can do one thing or another, players choice, but not both (the object is typically either used up or irreversibly converted as part of the choice). Maybe you have a card in a CCG that can bring a creature into play or make an existing one bigger. Or you have a lump of metal in an RPG that can be fashioned into a great suit of armor or a powerful weapon. Or youre given a choice to upgrade one of your weapons in an FPS. In these kinds of cases, assuming the player knows the value of the things theyll get (but they can only choose one), the actual benefit is probably going to be more than either option individually, but less than all choices combined, depending on the situation. What does it depend on? 
This is a versatility problem, so it depends on the raw benefit of each choice, the cost/difficulty of changing their strategy in mid-game, and the foreknowledge of the player regarding the challenges that are coming up later. The Difference Between PvE and PvP Designing PvE games (Player versus Environment, where its one or more players cooperating against the computer, the system, the AI or whatever) is different than PvP games (Player versus Player, where players are in direct conflict with each other) when it comes to situational balance. PvE games are much easier. As the games designer, youre designing the environment, youre designing the levels, youre designing the AI. You already know what is typical or expected in
terms of player encounters. Even in games with procedurally-generated content where you dont know exactly what the player will encounter, you know the algorithms that generate it (you designed them, after all) so you can figure out the expected probability that the content generator will spit out certain kinds of encounters, and within what range. Because of this, you can do expected-value calculations for PvE games pretty easily to come up with at least a good initial guess for your costs and benefits when youre dealing with the situational parts of your game. PvP is a little trickier, because players can vary their strategies. Expected value doesnt really have meaning when you dont know what to expect from your opponent. In these cases, playtesting and metrics are the best methods we have for determining typical use, and thats something well discuss in more detail over the next couple of weeks. If Youre Working on a Game Now Choose one object in your game thats been giving you trouble, something that seems like its always either too good or too weak, and which has some kind of conditional or situational nature to it. (Since situational effects are some of the trickiest to balance, if something has been giving you trouble, its probably in that category anyway.) First, do a thorough search for any shadow costs you may have. What opportunities or versatility do you have to give up in order to gain this objects capabilities? What other things do you have to acquire first before you even have the option of acquiring this object? Ask yourself what those additional costs are worth, and whether they are factored in to the objects resource cost. Next, consider the versatility of the object itself. Is it something thats useful in a wide variety of situations, or only rarely? How much control does the player have over their situation that is, if the object is only useful in certain situations, can the player do anything to make those situations more likely, thus increasing the objects expected value? How easy is it for the player to change their mind (the versatility of the player versus versatility of the object, since a more versatile player reduces the value of object-based versatility) if the player takes this object but then wants to replace it with something else, or use other objects or strategies when they need to, is that even possible and if so, is it easy, or is there a noticeable cost to doing so? How much of a liability is it if the player is stuck in a situation where the object isnt useful? Now, consider how the versatility of the games systems and the versatility of the individual objects should affect their benefits and costs. See if looking at that object in a new way has helped to explain why it felt too weak or too powerful. Does this give you more insight into other objects as well, or the games systems overall? Homework For your homework, were going to look at Desktop Tower Defense 1.5, which was one of the games that popularized the genre of tower defense games. (Ill suggest you dont actually play it unless you absolutely have to, because it is obnoxiously addicting and you can lose a lot of otherwise productive time just playing around with the thing.) DTD 1.5 is a great game for analysis of situational game balance, because nearly everything in the game is situational! You buy a tower and place it down on the map somewhere, and when enemies come into range the tower shoots at them. 
Buying or upgrading towers costs money, and you get money from killing the enemies with your towers. Since you have a limited amount of money at any time in the game, your goal is to maximize the total damage output of your towers per dollar spent, so from the players perspective this is an efficiency problem. The situational nature of DTD So, on the surface, all you have to do is figure out how much damage a single tower will do, divide by cost, and take the tower with the best damage-to-cost ratio. Simple, right?

Except that actually figuring out how much damage your towers do is completely situational! Each tower has a range; how long enemies stay within that range getting shot at depends entirely on where you've placed your towers. If you just place a tower in the middle of a bunch of open space, the enemies will walk right by it and not be in danger for long; if you build a huge maze that routes everyone back and forth in range of the tower in question, its total damage will be a lot higher. Furthermore, most towers can only shoot one enemy at a time, so if a cluster of enemies walks by, its total damage per enemy is a lot smaller (one enemy gets shot, the others don't). Other towers do area-effect or splash damage, which is great on clusters but pretty inefficient against individual enemies, particularly those that are spaced out because they move fast. One of the tower types doesn't do much damage at all, but slows down enemies that it shoots, which keeps them in range of other towers for longer, so the benefit depends on what else is out there shooting. Some towers only work against certain types of enemies, or don't work against certain enemy types, so there are some waves where some of your towers are totally useless to you, even if they have a higher-than-normal damage output at other times. And then there's one tower that does absolutely nothing on its own, but boosts the damage output of all adjacent towers, so this has a variable cost-to-benefit ratio depending on what other towers you place around it. Even more interesting, placing towers in a giant block (to maximize the effectiveness of this boost tower) has a hidden cost itself, in that it's slightly less efficient in terms of usage of space on the board, since there's this big obstacle that the enemies get to walk around rather than just having them march through a longer maze. So, trying to balance a game like this is really tough, because everything depends on everything else!

Your mission, should you choose to accept it
Since this is a surprisingly deep game to analyze, I'm going to constrain this to one very small part of the game. In particular, I want you to consider two towers: the Swarm tower (which only works against flying enemies but does a lot of damage to them) and the Boost tower (that's the one that increases the damage of the towers around it). Now, the prime spot to put these is right in the center of the map, in this little 4x3 rectangular block. Let's assume you've decided to dedicate that twelve-tower area to only Swarm and Boost towers, in order to totally destroy the flying enemies that come your way. Assuming that you're trying to minimize cost and maximize damage, what's the optimal placement of these towers? To give you some numbers, a fully-upgraded Swarm tower does a base of 480 damage per hit, and costs $640 in the game. A fully-upgraded Boost tower costs $500 and does no damage, but improves all adjacent towers (either touching at a side or a corner) by +50%, so in practical terms a Boost tower does 240 damage for each adjacent Swarm tower. Note that two Boost towers adjacent to each other do absolutely nothing for each other: they increase each other's damage of zero by +50%, which is still zero. Assume all towers will be fully upgraded; the most expensive versions of each tower have the most efficient damage-to-cost ratios.
The most certain way to solve this, if you know any scripting or programming, is to write a brute-force program that runs through all 3^12 possibilities (no tower, Swarm tower or Boost tower in each of the twelve slots). For each slot, count a damage of 480 if a Swarm tower, 240*(number of adjacent Swarm towers) for a Boost tower, or 0 for an empty slot; for cost, count 640 per Swarm tower, 500 for each Boost tower, and 0 for an empty slot. Add up the total damage and cost for each scenario, and keep track of the best damage-to-cost ratio (that is, divide total damage by total cost, and try to get that as high as possible). If you don't have the time or skills to write a brute-force program, an alternative is to create an Excel spreadsheet that calculates the damage and cost for a single scenario. Create a 4x3 block of cells that are either B (boost tower), S (swarm tower), or blank. Below that block, create a second block of cells to compute the individual costs of each cell. The formula might be something like:

=IF(B2="S",640,IF(B2="B",500,0))

Lastly, create a third block of cells to compute the damage of each cell:

=IF(B2="S",480,IF(B2="B",IF(A1="S",240,0)+IF(A2="S",240,0)+IF(A3="S",240,0)+IF(B1="S",240,0)+IF(B3="S",240,0)+IF(C1="S",240,0)+IF(C2="S",240,0)+IF(C3="S",240,0),0))

Then take the sum of all the damage cells, and divide by the sum of all the cost cells. Display that in a cell of its own. From there, all you need to do is play around with the original cells, changing them by hand from S to B and back again, to try to optimize that one final damage-to-cost value.

The final deliverable
Once you've determined what you think is the optimal damage-to-cost configuration of Swarm and Boost towers, figure out the actual cost and benefit from the Swarm towers only, and the cost and benefit contributed by the Boost towers. Assuming optimal play, and assuming only this one very limited situation, which one is more powerful? That is, on a dollars-for-damage basis, which of the two types of tower (Swarm or Boost) contributes more to your victory for each dollar spent? That's all you have to do, but if you want more, you can then take it to any level of analysis you want; as I said, this game is full of situational things to balance. Flying enemies only come every seventh round, so if you want to compute the actual damage efficiency of our Swarm/Boost complex, you'd have to divide by 7. Then, compare with other types of towers and figure out if some combination of ground towers (for the 6 out of 7 non-flying levels) and the anti-flying towers should give you better overall results than using towers that can attack both ground and air. And then, of course, you can test out your theories in the game itself, if you have the time. I look forward to seeing some of your names in the all-time high score list.
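For those who would rather script it than build the spreadsheet, here is a minimal sketch of the brute-force search described above (Python; the grid encoding and names are mine, and it uses only the simplified homework numbers, so it is a starting point rather than an official solution):

```python
# Sketch of the brute-force approach described above, in Python rather than a
# spreadsheet. It uses only the simplified numbers from the homework (480/640
# per Swarm, 500 per Boost, +240 per adjacent Swarm), not the real game data.

from itertools import product

ROWS, COLS = 3, 4
SWARM_DMG, SWARM_COST = 480, 640
BOOST_BONUS, BOOST_COST = 240, 500

def neighbors(r, c):
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) != (0, 0) and 0 <= r + dr < ROWS and 0 <= c + dc < COLS:
                yield r + dr, c + dc

def evaluate(grid):
    """Return (total damage, total cost) for a tuple-of-tuples grid of 'S', 'B', '.'."""
    damage = cost = 0
    for r in range(ROWS):
        for c in range(COLS):
            cell = grid[r][c]
            if cell == "S":
                damage += SWARM_DMG
                cost += SWARM_COST
            elif cell == "B":
                cost += BOOST_COST
                damage += BOOST_BONUS * sum(grid[nr][nc] == "S" for nr, nc in neighbors(r, c))
    return damage, cost

best = (0, None)
for cells in product("SB.", repeat=ROWS * COLS):
    grid = tuple(tuple(cells[r * COLS:(r + 1) * COLS]) for r in range(ROWS))
    damage, cost = evaluate(grid)
    if cost and damage / cost > best[0]:
        best = (damage / cost, grid)

print(f"Best damage-to-cost ratio found: {best[0]:.3f}")
for row in best[1]:
    print(" ".join(row))
```

Running through all 531,441 layouts should only take a few seconds; you can then pull the Swarm-only and Boost-only contributions out of the winning grid to answer the deliverable question.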

Level 7: Advancement, Progression and Pacing


Answers to Last Weeks Question If you want to check your answer from last week: Well, I must confess I dont know for sure if this is the right answer or not. In theory I could write a program to do a brute-force solution with all twelve towers if each tower is either Swarm, Boost or Nothing (simply dont build a tower in that location), then its only 3^12 possibilities but I dont have the time to do that at this moment. If someone finds a better solution, feel free to post here! By playing around by hand in a spreadsheet, the best I came up with was the top and bottom rows both consisting of four Swarm towers, with the center row holding four Boost towers, giving a damage/cost ratio of 1.21. The two Boost towers in the center give +50% damage to six Swarm towers surrounding them, thus providing a damage bonus of 1440 damage each, while the two Boost towers on the side support four Swarm towers for a damage bonus of 960 each. On average, then, Boost towers provide 1200 damage for a cost of 500, or a damage/cost ratio of 2.4. Each Swarm tower provides 480 damage (x8 = 3840 damage, total). Each tower costs 640, for a damage/cost ratio of 0.75 for each one. While this is much less efficient than the Boost towers, the Swarm towers are still worth having; deleting any of them makes the surrounding Boost towers less effective, so in combination the Swarm towers are still more cost-efficient than having nothing at all. However, the Boost towers are still much more cost-effective than Swarm towers (and if you look at the other tower types, Boost towers are the most cost-effective tower in the game, hands-down, when you assume many fully-upgraded towers surrounding them). The only thing that prevents Boost towers from being the dominant strategy at the top levels of play, I think, is that you dont have enough cash to make full use of them. A typical game that lasts 40 levels might only give you a few thousand dollars or so, which is just not enough to build a killer array of fully-upgraded towers. Or, maybe theres an opportunity for you to find new dominant strategies that have so far gone undiscovered This Week In the syllabus, this week is listed as advancement, progression and pacing for single-player games but Ive changed my mind. A lot of games feature some kind of advancement and pacing, even multiplayer games. Theres the multiplayer co-op games, like the tabletop RPG Dungeons & Dragons or the console action-RPG Baldurs Gate: Dark Alliance or the PC game Left 4 Dead. Even within multiplayer competitive games, some of them have the players progressing and getting more powerful during play: players get more lands and cast more powerful spells as a game of Magic: the Gathering progresses, while players field more powerful units in the late game of Starcraft. Then there are MMOs like World of Warcraft that clearly have progression built in as a core mechanic of the game, even on PvP servers. So in addition to single-player experiences like your typical Final Fantasy game, well be talking about these other things too: basically, how do you balance progression mechanics? Wait, Whats Balance Again? First, its worth a reminder of what balance even means in this context. As I said in the intro to this course, in terms of progression, there are three things to consider: Is the difficulty level appropriate for the audience, or is the game overall too hard or too easy? As the player progresses through the game, we expect the game to get harder to compensate
for the player's increasing skill level; does the difficulty increase at a good rate, or does it get too hard too fast (which leads to frustration), or does it get harder too slowly (leading to boredom while the player waits for the game to get challenging again)?
- If your avatar increases in power, whether that be from finding new game objects like better weapons or tools or other toys, gaining new special abilities, or just getting a raw boost in stats like Hit Points or Damage, are you gaining these at a good rate relative to the increase in enemy power? Or do you gain too much power too fast (making the rest of the game trivial after a certain point), or do you gain power too slowly (requiring a lot of mindless grinding to compensate, which artificially lengthens the game at the cost of forcing the player to re-play content that they've already mastered)?
We will consider each of these in turn.

Flow Theory
If you're not familiar with the concept of flow, then read up here from last summer's course. Basically, this says that if the game is too hard for your level of skill you get frustrated, if it's too easy you get bored, but if you're challenged at the peak of your ability then you find the game engaging and usually more fun, and one of our goals as game designers is to provide a suitable level of challenge to our players. There are two problems here. First, not every player comes to the game with the same skill level, so what's too easy for some players is too hard for others. How do you give all players the same experience but have it be balanced for all of them? Second, as a player progresses through the game, they get better at it, so even if the game's challenge level remains constant it will actually get easier for the player. How do we solve these problems? Well, that's most of what this week is about.

Why Progression Mechanics?
Before moving on, though, it's worth asking what the purpose is behind progression mechanics to begin with. If we're going to dedicate a full tenth of this course to progression through a game, progression mechanics should be a useful design tool that's worth talking about. What is it useful for?

Ending the game
In most cases, the purpose of progression is to bring the game to an end. For shorter games especially, the idea is that progression makes sure the game ends in a reasonable time frame. So whether you're making a game that's meant to last 3 minutes (like an early-80s arcade game) or 30 to 60 minutes (like a family board game) or 3 to 6 hours (like a strategic wargame) or 30 to 300 hours (like a console RPG), the idea is that some games have a desired game length, and if you know what that length is, forced progression keeps it moving along to guarantee that the game will actually end within the desired time range. We'll talk more about optimal game length later in this post.

Reward and training for the elder game
In a few specialized cases, the game has no end (MMOs, Sims, tabletop RPGs, or progression-based Facebook games), so progression is used as a reward structure and a training simulator in the early game, rather than a way to end the game. This has an obvious problem which can be seen with just about all of these games: at some point, more progression just isn't meaningful. The player has seen all the content in the game that they need to, they've reached the level cap, they've unlocked all of their special abilities in their skill tree, they've maxed their stats, or whatever.
In just about all cases, when the player reaches this point, they have to find something else to do, and there is a sharp transition into what's sometimes called the "elder game," where the objective changes from
progression to something else. For players who are used to progression as a goal, since that's what the game has been training them for, this transition can be jarring. The people who enjoy the early-game progression may not enjoy the elder game activities as much, since they're so different (and likewise, some people who would love the elder game never reach it, because they don't have the patience to go through the progression treadmill).

What happens in the elder game?
In Sim games and FarmVille, the elder game is artistic expression: making your farm pretty or interesting for your friends to look at, or setting up custom stories or skits with your sims. In MMOs, the elder game is high-level raids that require careful coordination between a large group, or PvP areas where you're fighting against other human players one-on-one or in teams, or exploring social aspects of the game like taking on a coordination or leadership role within a Guild. In tabletop RPGs, the elder game is usually finding an elegant way to retire your characters and end the story in a way that's sufficiently satisfying, which is interesting because in these games the elder game is actually a quest to end the game!

What happens with games that end?
In games where progression does end the game, there is also a problem: generally, if you're gaining power throughout the game and this serves as a reward to the player, the game ends right when you're reaching the peak of your power. This means you don't really get to enjoy being on top of the world for very long. If you're losing power throughout the game, which can happen in games like Chess, then at the end you just feel like you've been ground into the dirt for the entire experience, which isn't much better. Peter Molyneux has pointed out this flaw when talking about the upcoming Fable 3, where he insists you'll reach the peak of your power early on, succeed in ruling the world, and then have to spend the rest of the game making good on the promises you made to get there. That's a great tagline, but really all he's saying is that he's taking the standard Console RPG progression model, shortening it, and adding an elder game, which means that Fable 3 will either live or die on its ability to deliver a solid elder-game experience that still appeals to the same kinds of players who enjoyed reaching that point in the first place. Now, I'm not saying it can't be done, but he's got his work cut out for him. In the interview I saw, it sounded like he was treating this like a simple fix to an age-old problem, but as we can see here it's really just replacing one difficult design problem with another. I look forward to seeing if he solves it, because if he does, that will have major applications for MMOs and FarmVille and everything with an elder game in between.

Two Types of Progression
Progression tends to work differently in PvP games compared to PvE games. In PvP (this includes multi-player PvP like deathmatch, and also single-player games played against AI opponents), you're trying to win against another player, human or AI, so the meaning of your progression is relative to the progression of your opponents. In PvE games (this includes both single-player games and multi-player co-op) you are progressing through the game to try to overcome a challenge and reach some kind of end state, so for most of these games your progress is seen in absolute terms.
So that is really the core distinction Id like to make, games where the focus is either on relative power between parties, or absolute power with respect to the games core systems. Im just using PvP and PvE as shorthand here, and if I slip up and refer to PvP as multi-player and PvE as singleplayer that is just because those are the most common design patterns. Challenge Levels in PvE When youre progressing through a bunch of challenges within a game, how do you track the level of challenge that the player is feeling, so you know if its increasing too quickly or too slowly, and whether the total challenge level is just right?

This is actually a tricky question to answer, because the difficulty felt by the player is not made up of just one thing here; it's actually a combination of four things, but the player experiences it only as a single "am I being challenged?" feeling. If we're trying to measure the player's perception of how challenged they are, it's as if the dashboard of your car took the gas, current speed, and engine RPMs and multiplied them all together to get a single "happiness" rating, and you only had this one number to look at to try to figure out what was causing it to go up or down.

The four components of perceived difficulty
First of all, there's the level of the player's skill at the game. The more skilled the player is at the game, the easier the challenges will seem, regardless of anything else. Second, there's the player's power level in the game. Even if the player isn't very good at the game, doubling their Hit Points will still keep them alive longer, increasing their Attack stat will let them kill things more effectively, giving them a Hook Shot lets them reach new places they couldn't before, and so on. Third and fourth, there's the flip side of both of these, which is how the game creates challenges for the player. The game can create skill-based challenges, which require the player to gain a greater amount of skill in the game, for example by introducing new enemies with better AI that make them harder to hit. Or it can provide power-based challenges, by increasing the hit points or attack power or other stats of the enemies in the game (or just adding more enemies in an area) without actually making the enemies any more skilled.

Skill and power are interchangeable
You can substitute skill and power, to an extent, either on the player side or the challenge side. We do this all the time on the challenge side, adding extra hit points or resource generation or otherwise just using the same AI but inflating the numbers, and expecting that the player will need to either get better stats themselves or show a higher level of skill in order to compensate. Or a player who finds a game too easy can challenge themselves by not finding all of the power-ups in a game, giving themselves less power and relying on their high level of skill to make up for it (I'm sure at least some of you have tried beating the original Zelda with just the wooden sword, to see if it could be done). Creating a stronger AI to challenge the player is a lot harder and more expensive, so very few games do that (although the results tend to be spectacular when they do; I'm thinking of Gunstar Heroes as the prototypical example).

At any rate, we can think of the challenge level as the sum of the player's skill and power, subtracted from the game's skill challenges and power challenges. This difference gives us the player's perceived level of difficulty. So, when any one of these things changes, the player will feel the game get harder or easier. Written mathematically, we have this equation:

PerceivedDifficulty = (SkillChallenge + PowerChallenge) - (PlayerSkill + PlayerPower)

Example: perceived challenge decreases naturally
How do we use this information? Let's take the player's skill, which generally increases over time. That's significant, because it means that if everything else is equal, that is, if the player's power level and the overall challenge in the game stay the same, over time the player will feel like the game is getting easier, and eventually it'll be too easy.
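Just to make that equation concrete, here is a tiny Python sketch of it; all of the numbers are invented purely for illustration. Hold the game's side of the equation and the player's power constant, let the player's skill creep upward, and watch the perceived difficulty drain away:

def perceived_difficulty(skill_challenge, power_challenge, player_skill, player_power):
    # PerceivedDifficulty = (SkillChallenge + PowerChallenge) - (PlayerSkill + PlayerPower)
    return (skill_challenge + power_challenge) - (player_skill + player_power)

for hour in range(6):
    player_skill = 10 + 3 * hour   # the player slowly gets better at the game
    print(hour, perceived_difficulty(skill_challenge=20, power_challenge=20,
                                     player_skill=player_skill, player_power=15))
# The printed value shrinks every hour: with nothing changing on the game's side,
# the same content feels easier and easier.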
To keep the players attention once they get better, every game must get harder in some way. (Or at least, every game where the players skill can increase. There are some games with no skill component at all, and those are exempted here.) Changing player skill Now, you might think the player skill curve is not under our control. After all, players come to our game with different pre-existing skill levels, and they learn at different rates. However, as designers we actually do have some control over this, based on our mechanics: If we design deep mechanics that interact in a lot of ways with multiple layers of strategy, so
that mastering the basic game just opens up new ways to look at the game at a more abstract meta-level, the players skill curve will be increasing for a long time, probably with certain well-defined jumps when the player finally masters some new way of thinking, like when a Chess player first starts to learn book openings, or when they start understanding the tradeoffs of tempo versus board control versus total pieces on the board. If our game is more shallow, or has a large luck component, we will expect to see a short increase in skill as the player masters what little they can, and then a skill plateau. There are plenty of valid design reasons to do this intentionally. One common example is educational games, where part of the core vision is that you want the player to learn a new skill from the game, and then you want them to stop playing so they can go on to learn other things. Or this might simply be the tradeoff for making your game accessible: A minute to learn, a minute to master. You can also control how quickly the player learns, based on the number of tutorials and practice areas you provide. One common design pattern, popularized by Valve, is to give the player some new weapon or tool or toy in a safe area where they can just play around with it, then introduce them immediately to a relatively easy area where they are given a series of simple challenges that let them use their new toy and learn all the cool things it can do, and then you give them a harder challenge where they have to integrate the new toy into their existing play style and combine it with other toys. By designing your levels to teach the player specific skills in certain areas, you can ramp the player up more quickly so they can increase their skill faster. What if you dont want the player to increase in skill quickly, because you want the game to last longer? If you want the player to learn more slowly, you can instead use skill gating as Ive heard it called. That is, you dont necessarily teach the player how to play your game, or hold their hand through it. Instead, you simply offer a set of progressively harder challenges, so you are at least guaranteed that if a player completes one challenge, they are ready for the next: each challenge is essentially a signpost that says you must be at least THIS GOOD to pass. Measuring the components of perceived challenge Player skill is hard to measure mathematically on its own, because as I said earlier, it is combined with player power in any game that includes both. For now, I can say that the best way to get a handle on this is to use playtesting and metrics, for example looking at how often players die or are otherwise set back, where these failures happen, how long it takes players to get through a level the first time they encounter it, and so on. Well talk more about this next week. Player power and power-based challenges are much easier to balance mathematically: just compare the player power curve with the games opposition power curve. You have complete control over both of these; you control when the player is gaining power, and also when their enemies are presenting a larger amount of power to counter them. What do you want these curves to look like? Part of it depends on what you expect the skill curve to be, since you can use power as a compensatory mechanism in either direction. 
As a general guideline, the most common pattern I've seen looks something like this: within a single area like an individual dungeon or level, you start with a sudden jump in difficulty, since the player is entering a new space after mastering the old one. Over time, the player's power increases, either through level-ups or item drops, until they reach the end of the level, where there may be another sudden difficulty jump in the form of a boss, and then after that typically another sudden jump in player power when they get loot from the boss or reach a new area that lets them upgrade their character. Some dungeons split themselves into several parts, with an easier part at the beginning, then a mid-boss, then a harder part, and then a final boss, but really you can just think of this as the same pattern repeated several times without a change of graphical scenery. String a bunch of these together and that's the power progression in your game: the difficulty jumps initially in a new area,
stays constant awhile, has a sudden jump at the end for the boss, then returns; meanwhile the players power has sudden jumps at the end of an area, with incremental gains along the way as they find new stuff or level up. That said, this is not the only pattern of power progression, not even necessarily the best for your game! These will vary based on genre and intended audience. For Space Invaders, over the course of a single game, the games power challenges, player skill and player power are all constant; the only thing that increases is the games skill challenge (making the aliens start faster and lower to the ground in each successive wave) until eventually they present a hard enough challenge to overwhelm the player. Rewards in PvE In PvE games especially, progression is strongly related to what is sometimes called the reward schedule or risk/reward cycle. The idea is that you dont just want the player to progress, you want them to feel like they are being rewarded for playing well. In a sense, you can think of progression as a reward itself: as the player continues in the game and demonstrates mastery, the ability to progress through the game shows the player they are doing well and reinforces that theyre a good player. One corollary here is that you do need to make sure the player notices youre rewarding them (in practice, this is usually not much of a problem). Another corollary is that timing is important when handing out rewards: Giving too few rewards, or spacing them out for too long so that the player goes for long stretches without feeling any sense of progression, is usually a bad thing. The player is demoralized and may start to feel like if they arent making progress, theyre playing the game wrong (even if theyre really doing fine). Ironically, giving too many rewards can also be hazardous. One of the things weve learned from psychology is that happiness comes from experiencing some kind of gain or improvement, so many little gains produce a lot more happiness than one big gain, even if they add up to the same thing. Giving too many big rewards in a small space of time diminishes their impact. Another thing we know from psychology is that a random reward schedule is more powerful than a fixed schedule. This does not mean that the rewards themselves should be arbitrary; they should be linked to the players progress through the game, and they should happen as a direct result of what the player did, so that the player feels a sense of accomplishment. It is far more powerful to reward the player because of their deliberate action in the game, than to reward them for something they didnt know about and werent even trying for. Ill give a few examples: Have you ever started a new game on Facebook and been immediately given some kind of trophy or achievement unlocked bonus just for logging in the first time? I think this is a mistake a lot of Facebook games make: they give a reward that seems arbitrary, and it actually waters down the players actual achievements later. It gives the impression that the game is too easy. Now, for some games, you may want them to seem easy if they are aimed at an extremely casual audience, but the danger is reducing the actual, genuine feelings of accomplishment the player gets later. Hidden achievements in Xbox 360 games, or their equivalents on other platforms. If achievements are a reward for skill, how is the player to know what the achievement is if its hidden? 
Even more obnoxious, a lot of these achievements are for things that aren't really under player control and that seem kind of arbitrary, like "do exactly 123 damage in a single attack" where damage is computed randomly. What exactly is the player supposed to feel rewarded for here? A positive example would be random loot drops in a typical action game or action-RPG. While these are random, and occasionally the player gets a really cool item, this is still tied
to the deliberate player action of defeating enemies, so the player is rewarded but on a random schedule. (Note that you can play around with randomness here, for example by having your game track the time between rare loot drops, and having it deliberately give a big reward if the player hasnt seen one in awhile. Some games split the difference, with random drops and also fixed drops from specific quests/bosses, so that the player is at least getting some great items every now and then.) Another common example: the player is treated to a cut-scene once they reach a certain point in a dungeon. Now, at first you might say this isnt random it happens at exactly the same point in the game, every time, because the level designer scripted that event to happen at exactly that place! And on multiple playthroughs youd be correct but the first time a player experiences the game, they dont know where these rewards are, so from the players perspective it is not something that can be predicted; it may as well be random. Now Id like to talk about three kinds of rewards that all relate to progression: increasing player power, level transitions, and story progression. Rewarding the player with increased power Progression through getting a new toy/object/capability that actually increases player options is another special milestone. Like we said before, you want these spaced out, though a lot of times I see the player get all the cool toys in the first third or half of the game and then spend the rest of the game finding new and interesting ways to use them. This can be perfectly valid design; if the most fun toy in your game is only discovered 2/3rds of the way through, thats a lot of time the player doesnt get to have fun Valve made this example famous through the Gravity Gun in Half-Life 2: as the story goes, they initially had you get this gun near the end of the game, but players had so much fun with it that they restructured their levels to give it to the player much earlier. Still, if you give the player access to everything early on, you need to use other kinds of rewards to keep them engaged through the longer final parts of the game where they dont find any new toys. How can you do this? Heres a few ways: If your mechanics have a lot of depth, you can just present unique combinations of things to the player to keep them challenged and engaged. (This is really hard to do in practice.) Use other rewards more liberally after you shut off the new toys: more story, more stat increases, more frequent boss fights or level transitions. You can also offer upgrades to their toys, although its debatable whether you can think of an upgrade as just another way of saying new toy. Or you can, you know, make your game shorter. In this day and age, thankfully, theres no shame in this. Portal and Braid are both well-known for two things: being really great games, and being short. At the big-budget AAA level, Batman: Arkham Asylum was one of the best games of last year (both in critical reception and sales), even though I hear it only lasts about ten hours or so. Rewarding the player with level transitions Progression through level transitions that is, progression to a new area is a special kind of reward, because it makes the player feel like theyre moving ahead (and they are!). 
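As a quick aside before we talk about spacing these out: that parenthetical note above, about tracking the time since the last rare drop and nudging the odds upward (often called a "pity timer"), is easy to sketch. This is a minimal illustration with made-up numbers, not any particular game's actual loot system:

import random

BASE_RARE_CHANCE = 0.02   # 2% rare-drop chance per kill (number invented for illustration)
PITY_STEP = 0.02          # every miss nudges the next roll upward a little

def simulate_kills(n):
    drops, misses_in_a_row = [], 0
    for _ in range(n):
        chance = BASE_RARE_CHANCE + PITY_STEP * misses_in_a_row
        if random.random() < chance:
            drops.append('rare')
            misses_in_a_row = 0    # reset the counter: a dry spell can never last forever
        else:
            drops.append('common')
            misses_in_a_row += 1
    return drops

print(simulate_kills(200).count('rare'), 'rare drops in 200 kills')

The rewards still arrive on a random schedule, and still only as a result of the player's deliberate actions, but the worst-case drought is capped.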
You want these level transitions spaced out a bit, so the player isn't so overwhelmed by changes that they feel like the whole game is always moving ahead without them; a rule of thumb is to offer new levels or areas on a slightly increasing curve, where each level takes a little bit longer than the last. This makes the player feel like they are moving ahead more rapidly at the start of the game, when they haven't become as emotionally invested in the outcome; a player can tolerate slightly longer stretches between transitions near the end of the game, especially if they are being led up to a huge plot point. Strangely, a lot of this can be done with just the visual design of the level, which is admittedly crossing from game design into the territory of game art: for example, if you have a really long dungeon the players are traversing, you can add things to make it feel like each region of the
dungeon is different, maybe having the color or texture of the walls change as the player gets deeper inside, to give the player a sense that they are moving forward. Rewarding the player with story progression Progression through plot advancement is interesting to analyze, because in so many ways the story is separate from the gameplay: in most games, knowing the characters motivations or their feelings towards each other has absolutely no meaning when youre dealing with things like combat mechanics. And yet, in many games (originally this was restricted to RPGs, but were seeing story integrated into all kinds of games these days), story progression is one of the rewards built into the reward cycle. Additionally, the story itself has a difficulty of sorts (we call it dramatic tension), so another thing to consider in story-based games is whether the dramatic tension of the story overlaps well with the overall difficulty of the game. Many games do not: the story climax is at the end, but the hardest part of the game is in the middle somewhere, before you find an uber-powerful weapon that makes the rest of the game easy. In general, you want rising tension in your story while the difficulty curve is increasing, dramatic climaxes (climaxen? climaces?) at the hardest part, and so on; this makes the story feel more integrated with the mechanics, all thanks to game balance and math. Its really strange to write that you get a better story by using math, but there you are. (I guess another way of doing this would be to force the story writers to put their drama in other places to match the games difficulty curve, but in practice I think its easier to change a few numbers than to change the story.) Combining the types of rewards into a single reward schedule Note that a reward is a reward, so you dont just want to space each category of rewards out individually, but also interleave them. In other words, you dont need to have too many overlaps, where you have a level transition, plot advancement, and a power level increase all at once. Level transitions are fixed, so you tend to see the power rewards sprinkled throughout the levels as rewards between transitions. Strangely, in practice, a lot of plot advancement tends to happen at the same time as level transitions, which might be a missed opportunity. Some games take the chance to add some backstory in the middle of levels, in areas that are otherwise uninteresting although then the danger is that the player is getting a reward arbitrarily when they feel like they werent doing anything except walking around and exploring. A common design pattern I see in this case is to split the difference by scripting the plot advancement so it immediately follows a fight of some kind. Even if its a relatively easy fight, if its one thats scripted, the reward of revealing some additional story immediately after can make the player feel like they earned it. Challenge Levels in PvP If PvE games are all about progression and rewards, PvP games are about gains and losses relative to your opponents. Either directly or indirectly, the goal is to gain enough power to win the game, and there is some kind of tug-of-war between the players as each is trying to get there first. 
Ill remind you that when Im saying power in the context of progression, Im talking about the sum of all aspects of the players position in the game, so this includes having more pieces and cards put into play, more resources, better board position, taking more turns or actions, or really anything that affects the players standing (other than the players skill level at playing the game). The victory condition for the game is sometimes to reach a certain level of power directly; sometimes it is indirect, where the actual condition is something abstract like Victory Points, and it is the players power in the game that merely enables them to score those Victory Points. And in some cases the players dont gain power, they lose power, and the object of the game is to get the opponent(s) to run out first. In any case, gaining power relative to your opponents is usually an important player goal. Tracking player power as the game progresses (that is, seeing how power changes over time in a
real-time game, or how it changes each turn in a turn-based game) can follow a lot of different patterns in PvP games. In PvE you almost always see an increase in absolute player power level over time (even if their power level relative to the challenges around them may increase or decrease, depending on the game). In PvP there are more options to play with, since everything is relative to the opponents and not compared with some absolute you must be THIS GOOD to win the game yardstick. Positive-sum, negative-sum, and zero-sum games This seems as good a time as any to talk about an important distinction in power-based progression that we borrow from the field of Game Theory: whether the game is zero-sum, positive-sum, or negative-sum. If you havent heard these terms before: Positive-sum means that the overall power in the game increases over time. Settlers of Catan is an example of a positive-sum game: with each roll of the dice, resources are generated for the players, and all players can gain power simultaneously without any of their opponents losing power. Monopoly is another example of a positive-sum game, because on average every trip around the board will give the player $200 (and that money comes from the bank, not from other players). While there are a few spaces that remove wealth from the game and are therefore negative-sum (Income Tax, Luxury Tax, a few of the Chance and Community Chest cards, unmortgaging properties, and sometimes Jail), on average these losses add up to less than $200, so on average more wealth is created than removed over time. Some players use house rules that give jackpots on Free Parking or landing exactly on Go, which make the game even more positive-sum. While you can lose lots of money to other players by landing on their properties, that activity itself is zero-sum (one player is losing money, another player is gaining the exact same amount). This helps explain why Monopoly feels to most people like it takes forever: its a positive-sum game so the average wealth of players is increasing over time, but the object of the game is to bankrupt your opponents which can only be done through zero-sum methods. And the house rules most people play with just increase the positive-sum nature of the game, making the problem worse! Zero-sum means that the sum of all power in the game is a constant, and can neither be created nor destroyed by players. In other words, the only way for me to gain power is to take it from another player, and I gain exactly as much as they lose. Poker is an example of a zero-sum game, because the only way to win money is to take it from other players, and you win exactly as much as the total that everyone else loses. (If you play in a casino or online where the House takes a percentage of each pot, it actually becomes a negative-sum game for the players.) Negative-sum means that over time, players actually lose more power than they gain; player actions remove power from the game without replacing it. Chess is a good example of a negative-sum game; generally over time, your force is getting smaller. Capturing your opponents pieces does not give those pieces to you, it removes them from the board. Chess has no zero-sum elements, where capturing an enemy piece gives that piece to you (although the related game Shogi does work this way, and has extremely different play dynamics as a result). 
Chess does have one positive-sum element, pawn promotion, but that generally happens rarely and only in the end game, and serves the important purpose of adding a positive feedback loop to bring the game to a close something Ill talk about in just a second. An interesting property here is that changes in player power, whether zero-sum, positive-sum, or negative-sum, are the primary rewards in a PvP game. The player feels rewarded because they have gained power relative to their opponents, so they feel like they have a better chance of winning after making a particularly good move. Positive and negative feedback loops

Another thing I should mention here is how positive and negative feedback loops fit in with this, because you can have either kind of feedback loop with a zero-sum, positive-sum or negative-sum game, but they work differently. In case youre not familiar with these terms, positive feedback loop means that receiving a power reward makes it more likely that youll receive more, in other words it rewards you for doing well and punishes you for doing poorly; negative feedback loop is the opposite, where receiving a power reward makes it less likely youll receive more, so it punishes you for doing well and rewards you for doing poorly. I went into a fair amount of detail about these in last summers course, so I wont repeat that here. One interesting property of feedback loops is how they affect the players power curve. With negative feedback, the power curve of one player usually depends on their opponents power: they will increase more when behind, and decrease more when ahead, so a single players power curve can look very different depending on how theyre doing relative to their opponents, and this will look different from game to game. With positive feedback, you tend to have a curve that gets more sharply increasing or decreasing over time, with larger swings in the endgame; unlike negative feedback, a positive feedback curve doesnt always take the opponents standings into account it can just reward a players absolute power. Now, these arent hard-and-fast rules a negative feedback loop can be absolute, which basically forces everyone to slow down around the time they reach the end game; and a positive feedback loop can be relative, where you gain power when youre in the lead. However, if we understand the game design purpose that is served by feedback loops, well see why positive feedback is usually independent of the opponents, while negative feedback is usually dependent. The purpose of feedback loops in game design The primary purpose of positive feedback is to get the game to end quickly. Once a winner has been decided and a player is too far ahead, you dont want to drag it out because that wastes everyones time. Because of this, you want all players on an accelerating curve in the end game. It doesnt really matter who is ahead; the purpose is to get the game to end, and as long as everyone gets more power, it will end faster. By contrast, the primary purpose of negative feedback is to let players who are behind catch up, so that no one ever feels like they are in a position where they cant possibly win. If everyone is slowed down in exactly the same fashion in the endgame, that doesnt fulfill this purpose; someone who was behind at the beginning can still be behind at the end, and even though the gap appears to close, they are slowed down as much as anyone else. In order to truly allow those who are behind to catch up, the game has to be able to tell the difference between someone who is behind and someone who is ahead. Power curves So, what does a players power curve look like in a PvP game? Here are a few ways you might see a players power gain (or loss) over time: In a typical positive-sum game, each player is gaining power over time in some way. That might be on an increasing, linear, or decreasing curve. In a positive-sum game with positive feedback, the players are gaining power over time and the more power they gain, the more they have, so its an increasing curve (such as a triangular or exponential gain in power over time) for each player. 
If you subtract one players curve from another (which shows you who is ahead, who is behind, and how often the lead is changing), usually what happens is one player gets an early lead and then keeps riding the curve to victory, unless they make a mistake along the way. Such a game is usually not that interesting for the players who are not in the lead. In a positive-sum game with negative feedback, the players are still on an increasing curve,
but that curve is altered by the position of the other players, reducing the gains for the leader and increasing them further for whoevers behind, so if you look at all the players power curves simultaneously youll see a sort of tangled braid where the players are constantly overtaking one another. Subtracting one players power from another over time, youll see that the players relative power swings back and forth, which is pretty much how any negative feedback should work. In a typical zero-sum game, players take power from each other, and the sum of all player power is a constant. In a two-player game, that means you could derive either players power curve just by looking at the other one. In a zero-sum game with positive feedback, the game may end quickly as one player takes an early advantage and presses it to gain even more of an advantage, taking all of the power from their opponents quickly. Usually games that fall into this category also have some kind of early-game negative feedback built in to prevent the game from coming to an end too early, unless the game is very short. In a zero-sum game with negative feedback, we tend to see swings of power that pull the leader back to the center. This keeps the players close, but also makes it very hard for a single player to actually win; if the negative feedback is too strong, you can easily end in a stalemate where neither player can win, which tends to be unsatisfying. A typical design pattern for zero-sum games in particular is to have some strong negative feedback mechanisms in the early game that taper off towards the end, while positive feedback increases towards the end of the game. This can end up as a pretty exciting game of backand-forth where each player spends some time in the lead before one final, spectacular, irreversible triumph that brings the game to a close. In a typical negative-sum game, the idea is generally not for a player to acquire enough power to win, but rather for a player to lose the least power relative to their opponents. In negative-sum games, players are usually eliminated when they lose most or all of their power, and the object is either to be the last one eliminated, or to be in the best standing when the first opponent is eliminated. A players power curve might be increasing, decreasing or constant, sort of an inverse of the positive-sum game, and pretty much everything else looks like a positive-sum game turned upside down. In a negative-sum game with positive feedback, players who are losing will lose even faster. The more power a player has left over, the slower theyll tend to lose it, but once they start that slide into oblivion it happens more and more rapidly. In a negative-sum game with negative feedback, players who are losing will lose slower, and players who have more power tend to lose it faster, so again youll see this braid shape where the players will chase each other downward until they start crashing.
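If those descriptions are hard to visualize, a quick simulation will draw the shapes for you. Here is a minimal Python sketch of a two-player positive-sum game with the feedback style as a switch; every rate in it is invented just to show the shapes (a steady parallel climb, a runaway leader, or the gap closing as the trailing player catches up). Flip the sign of the base gain and you get the negative-sum versions:

def simulate(feedback, turns=20):
    # Two players in a positive-sum game; returns the pair of power values each turn.
    power = [12.0, 10.0]                                 # player 1 starts slightly ahead
    history = []
    for _ in range(turns):
        for i in (0, 1):
            gain = 2.0                                   # base positive-sum gain per turn
            if feedback == 'positive':
                gain += 0.10 * power[i]                  # the rich get richer
            elif feedback == 'negative':
                gain += 0.10 * (max(power) - power[i])   # whoever is behind catches up
            power[i] += gain
        history.append(tuple(power))
    return history

for style in ('none', 'positive', 'negative'):
    gaps = [round(a - b, 1) for a, b in simulate(style)]
    print(style, gaps)   # the leader's margin over time under each feedback style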

Applied power curves
Now, maybe you can visualize what a power curve looks like in theory, showing how a player's power goes up or down over time, but how do you actually make one for a real game? The easiest way to construct a power curve is through playtest data; the raw numbers make it easy to chart each player's power over time. The hardest part is coming up with some kind of numeric formula for power in the game: how well a player is actually doing, in absolute terms. This is easier with some games than others. In a miniatures game like HeroClix or Warhammer 40K, each figurine you control is worth a certain number of points, so it is not hard to add your points together on any given turn to get at least a rough idea of where each player stands. In a real-time strategy game like Starcraft, adding a player's current resources along with the resource costs of all of their units and structures would also give a reasonable approximation of their power over time. For a game like Chess, where you have to balance a player's remaining pieces, board position and tempo, this is a bit trickier. But once you have a power formula, you can simply track it for all players over time through repeated playtests to see what kinds of patterns emerge.
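Here is a minimal sketch of what such a power formula might look like for a Starcraft-style RTS, along the lines suggested above: current resources plus the resource cost of everything the player has built. The snapshot data here is invented; in practice you would read these values out of your playtest logs or replays, once per turn or once per minute:

# Hypothetical per-minute snapshots from one player's playtest log.
snapshots = [
    {'minerals': 50,  'unit_costs': [50, 50],            'building_costs': [400]},
    {'minerals': 125, 'unit_costs': [50, 50, 100, 100],  'building_costs': [400, 150]},
    {'minerals': 80,  'unit_costs': [50, 100, 100, 125], 'building_costs': [400, 150, 200]},
]

def power(snapshot):
    # One rough 'power' number per snapshot: everything converted back into resources.
    return (snapshot['minerals']
            + sum(snapshot['unit_costs'])
            + sum(snapshot['building_costs']))

power_curve = [power(s) for s in snapshots]
print(power_curve)   # chart this across many playtests and look for the patterns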

Ending the game One of the really important things to notice as you do this is the amount of time it takes to reach a certain point. You want to scale the game so that it ends in about as much time as you want it to. The most obvious way to do that is by hard-limiting time or turns which guarantees a specific game length (this game ends after 4 turns); sometimes this is necessary and even compelling, but a lot of times its just a lazy design solution that says we didnt playtest this enough to know how long the game would take if you actually played to a satisfying conclusion. An alternative is to balance your progression mechanics to cause the game to end within your desired range. You can do this by changing the nature of how positive or negative sum your game is (that is, the base rate of combined player power gain or loss), or by adding, removing, strengthening or weakening your feedback loops. This part is pretty straightforward, if you collect all of the numbers that you need to analyze it. For example, if you take an existing positive feedback loop and make the effect stronger, the game will probably end earlier, so that is one way to shorten the game. Game phases I should note that some PvP games have well-defined transitions between different game phases. The most common pattern here is a three-phase structure where you have an early game, a midgame and an endgame, as made famous by Chess which has many entire books devoted to just a single phase of the game. If you become aware of these transitions (or if you design them into the game explicitly), you dont just want to pay attention to the player power curve throughout the game, but also how it changes in each phase, and the relative length of each phase. For example, a common finding in game prototypes is that the endgame isnt very interesting and is mostly a matter of just going through the motions to reach the conclusion that you already arrived at in mid-game. To fix this, you might think of adding new mechanics that come into play in the endgame to make it more interesting. Or, you might try to find ways to either extend the mid-game or shorten the endgame by adjusting your feedback loops and the positive, negative or zero-sum nature of your game during different phases. Another common game design problem is a game thats great once the players ramp up in midgame, but the early game feels like it starts too slowly. One way to fix this is to add a temporary positive-sum nature to the early game in order to get the players gaining power and into the midgame quickly. In some games, the game is explicitly broken into phases as part of the core design. One example is the board game Shear Panic, where the scoring track is divided into four regions, and each region changes the rules for scoring which gives the game a very different feel in each of the games four phases. In this game, you transition between phases based on the number of turns taken in the game, so the length of each phase is dependent on how many turns each player has had. In a game like this, you could easily change the length of time spent in each phase by just increasing the length of that phase, so it lasts more turns. Other games have less sharp transitions between different phases, and those may not be immediately obvious or explicitly designed. Chess is one example Ive already mentioned. 
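Backing up for a moment to the point above about feedback loops and game length, here is a toy sketch of that relationship: a single player gains power each turn under a positive feedback loop, and the game ends when they hit a victory threshold. The numbers are invented; the only point is that strengthening the feedback shortens the game, which is exactly the knob I described:

def turns_to_win(feedback_strength, base_gain=5.0, goal=500.0):
    # Count turns until one player's power reaches the victory threshold.
    power, turns = 10.0, 0
    while power < goal:
        power += base_gain + feedback_strength * power   # positive feedback: gain scales with power
        turns += 1
    return turns

for strength in (0.0, 0.05, 0.10, 0.20):
    print('feedback', strength, '->', turns_to_win(strength), 'turns')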
Another example of a game with less sharp, emergent phase transitions is Netrunner, an asymmetric CCG where one player (the Corporation) is trying to put cards in play and then spend actions to score the points on those cards, and the other player (the Runner) is trying to steal the points before they're scored. After the game had been released, players at the tournament level realized that most games followed three distinct phases: the early game, when the Runner is relatively safe from harm and can try to steal as much as possible; then the mid-game, when the Corporation sets up its defenses and makes it prohibitively expensive (temporarily) for the Runner to steal anything; and finally the endgame, where the Runner puts together enough resources to break through the Corporation's defenses to steal the remaining points needed for the win.

Looked at in this way, the Corporation is trying to enter the second phase as early in the game as possible and is trying to stretch it out for as long as possible, while the Runner is trying to stay in the first phase as long as it can and then if it doesnt win early, the Runner tries to transition from the second phase into the endgame before the Corporation has scored enough to win. How do you balance the progression mechanics in something like this? One thing you can do, as was done with Netrunner, is to put the progression of the game under partial control of the players, so that it is the players collectively trying to push the game forward or hold it back. That creates an interesting meta-level of strategic tension. Another thing you can do is include some mechanics that actually have some method of detecting what phase youre in, or at the very least, that tend to work a lot better in some phases than others. Netrunner does this as well; for example, the Runner has some really expensive attack cards that arent so useful early on when they dont have a lot of resources, but that help it greatly to end the game in the final phase. In this way, as players use new strategies in each phase, it tends to give the game a very different feel and offers new dynamics as the game progresses. And then, of course, you can use some of these mechanics specifically to adjust the length of each phase in order to make the game progress at the rate you desire. In Netrunner, the Corporation has some cheap defenses it can throw up quickly to try to transition to the mid-game quickly, and it also has more expensive defenses it can use to put up a high bar that the Runner has to meet in order to transition to the endgame. By adjusting both the relative and absolute lengths of each phase in the game, you can make sure that the game takes about as long as you want it to, and also that it is broken up into phases that last a good amount of time relative to each other. Ideal game length All of this assumes you know how long the game (and each phase of the game) should take, but how do you know that? Part of it depends on target audience: young kids need short games to fit their attention span. Busy working adults want games that can be played in short, bite-sized chunks of time. Otherwise, I think it depends mainly on the level and depth of skill: more luck-based or casual games tend to be shorter, while deeper strategic games can be a bit longer. Another thing to consider is at what point a player is far enough ahead that theyve essentially won: you want this point to happen just about the time when the game actually ends so it doesnt drag on. For games that never end, like MMOs or Facebook games, you can think of the elder game as a final infinite-length phase of the game, and youll want to change the length of the progression portion of your game so that the transition happens at about the time you want it to. How long that is depends on how much you want to support the progression game versus the elder game. 
For example, if your progression game is very different from your elder game and you see a lot of churn (that is, a lot of players that leave the game) when they hit the elder game, and youre using a subscription-based model where you want players to keep their accounts for as long as possible, youll probably want to do two things: work on softening the transition to elder game so you lose fewer people, and also find ways of extending the early game (such as issuing expansion sets that raise the level cap, or letting players create multiple characters with different race/class combinations so they can play through the progression game multiple times). Another interesting case is story-based RPGs, where the story often outlasts the mechanics of the game. We see this all the time with console RPGs, where it says 100 hours of gameplay right on the box. And on the surface that sounds like the game is delivering more value, but in reality if youre just repeating the same tired old mechanics and mindlessly grinding for 95 of those hours, all the game is really doing is wasting your time. Ideally you want the player to feel like theyre progressing through learning new mechanics and progressing through the story at any given time; you dont want the gameplay to drag on any more than you want filler plot that makes the story feel like its dragging on. These kinds of games are challenging to design because you want to tune the length of the game to match both story and gameplay, and often that either means lengthening the story or adding more gameplay, both of which tend to be expensive in development. (You can also
shorten the story or remove depth from the gameplay, but when you've got a really brilliant plot or really inspired mechanics it can be hard to rip that out of the game just to save a few bucks; also, in this specific case there's often a consumer expectation that the game is pretty long to give it that epic feel, so the tendency is to just keep adding on one side or the other.)

Flow Theory, Revisited
With all that said, let's come back to flow. At the start of this blog post, I said there were two problems here that needed to be solved. One is that the player's skill is increasing throughout the game, which tends to shift them from being in the flow to being bored. This is mostly a problem for longer PvE games, where the player has enough time and experience in the game to genuinely get better. The solution, as we've seen when we talked about PvE games, is to have the game compensate by increasing its difficulty through play in order to make the game seem more challenging; this is the essence of what designers mean when they talk about a game's pacing. For PvP games, in most cases we want the better player to win, so this isn't seen as much of a problem; however, for games where we want the less-skilled player to have a chance and the highly-skilled player to still be challenged, we can implement negative feedback loops and randomness to give an extra edge to the player who is behind.

There was another problem with flow that I mentioned, which is that you can design your game at one level of difficulty, but players come to your game with a range of initial skill levels, and what's easy for one player is hard for another. With PvE games, as you might guess, the de facto standard is to implement a series of difficulty levels, with higher levels granting the AI power-based bonuses or giving the player fewer power-based bonuses, because that is relatively cheap and easy to design and implement. However, I have two cautions here:
1. If you keep using the same playtesters, they will become experts at the game, and thus unable to accurately judge the true difficulty of easy mode; easy should mean easy, and it's better to err on the side of making it too easy than making it challenging enough that some players will feel like they just can't play at all. The best approach is to use a steady stream of fresh playtesters throughout the playtest phase of development (these are sometimes referred to as "Kleenex testers" because you use them once and then throw them away). If you don't have access to that many testers, at least reserve a few of them for the very end of development, when you're tuning the difficulty level of easy.
2. Take care to set player expectations up front about higher difficulties, especially if the AI actually cheats. If the game pretends on the surface to be a fair opponent that just gets harder because it is more skilled, and then players find out that it's actually peeking at information that's supposed to be hidden, it can be frustrating. If you're clear that the AI is cheating and the player chooses that difficulty level anyway, there are fewer hurt feelings: the player is expecting an unfair challenge, and the whole point is to beat that challenge anyway. Sometimes this is as simple as choosing a creative name for your highest difficulty level, like "Insane."
There are, of course, other ways to deal with differing player skill levels. Higher difficulty levels can actually increase the skill challenge of the game instead of the power challenge.
Giving enemies a higher degree of AI, as I said before, is expensive but can be really impressive if pulled off correctly. A cheaper way to do this in some games is simply to modify the design of your levels by blocking off easier alternate paths, forcing the player to go through a harder path to get to the same end location when theyre playing at higher difficulty. Then theres Dynamic Difficulty Adjustment (DDA), which is a specialized type of negative feedback loop where the game tries to figure out how the player is doing and then adjusts the difficulty on the fly. You have to be very careful with this, as with all negative feedback loops,
because it does punish the player for doing well and some players will not appreciate that if it isnt set up as an expectation ahead of time. Another way to do this is to split the difference, by offering dynamic difficulty changes under player control. Like DDA, try to figure out how the player is doing but then, give the player the option of changing the difficulty level manually. One example of this is the game flOw, where the player can go to the next more challenging level or the previous easier level at just about any time, based on how confident they are in their skills. Another example, God of War did this and probably some other games as well, is if you die enough times on a level itll offer you the chance to drop the difficulty on the reload screen (which some players might find patronizing, but on the other hand it also gives the player no excuse if they die again anyway). Sid Meiers Pirates actually gives the player the chance to increase the difficulty when they come into port after a successful mission, and actually gives the player an incentive: a higher percentage of the booty on future missions if they succeed. The equivalent in PvP games is a handicapping system, where one player can start with more power or earn more power over the course of the game, to compensate for their lower level of skill. In most cases this should be voluntary, though; players entering a PvP contest typically expect the game to be fair by default. Case Studies With all of that said, lets look at a few examples to see how we can use this to analyze games in practice. Space Invaders (and other retro-arcade games) This game presents the same skill-based challenge to you, wave after wave, and increasing the skill by making the aliens move and shoot faster and start lower. The player has absolutely no way to gain power in the game; you start with three lives and thats all you get, and there are no powerups. On the other hand, you also dont really lose power in the game, in a sense: whether you have one life remaining or all three, your offensive and defensive capabilities are the same. The players goal is not to win, but to survive as long as possible before the increasing challenge curve overwhelms them. Interestingly, the challenge curve does change over the course of a wave; early on there are a lot of aliens and they move slowly, so its very easy to hit a target. Later on you have fewer targets and they move faster, which makes the game more challenging, and of course if they ever reach the ground you lose all of your lives which makes this a real threat. Then the next wave starts and its a little harder than the last one, but the difficulty is still decreased initially. (Youd think that there would also be a tradeoff in that fewer aliens would have less firepower to shoot at you, but in the actual game I think the overall rate of fire was constant, its just how spread out it is, so this didnt actually change much as each round progressed.) Chess (and other squad-based wargames) If I had to put Chess in a genre, yes, Id call a squad-based wargame which is kind of odd since we normally think of this as an entire army and not a squad. I mean this in the sense that you start with a set force and generally do not receive reinforcements, nor do you have any mechanics about resources, production, supply or logistics which you tend to see in more detailed army-level games. 
Here, we are dealing with a negative-sum game that actually has a mild positive feedback loop built into it: if you're ahead in pieces, trades tend to be beneficial to you (other things being equal), and once you reach the endgame, certain positions let you basically take an automatic win if you're far enough ahead. This can be pretty demoralizing for the player who is losing, especially if the two players are extremely unequal in skill, because the weaker player will tend to start losing early and just keep losing more and more as the game goes on. The only reason this works is that between two equally-skilled players, there does tend to be a bit of back-and-forth as players trade off pieces for board control or tempo, so a player who appears to be losing has a number of opportunities to turn that around later, before the endgame. Between well-matched opponents you will tend to see a variable rate of decrease as they trade pieces, based on how well they are playing, and if they play about equally well we'll see a game where the conclusion is uncertain until the endgame (and even then, if the players are really well-matched, we'll see a draw).

Settlers of Catan (and other progression-based board games)

Here is a game where progression is gaining power, so it is positive-sum. There are only very limited cases where players can actually lose their progress; mostly, when you build something, that gain is permanent. Catan contains a pretty powerful positive feedback loop, in that building more settlements and cities gives you more resources, which lets you build even more, and building is the primary victory condition. At first you'd think this means that the first player to get an early-game advantage automatically wins, and if players couldn't trade with each other that would almost certainly be the case. The ability to trade freely with other players balances this aspect of the game, since trading can be mutually beneficial to both players involved; if players who are behind trade fairly with each other and refuse to trade with those in the lead at all (or only at exorbitant rates of exchange), they can catch up fairly quickly. If I were to criticize this game at all, it would be that the early game doesn't see a lot of rapid progression, because the players aren't generating that many resources yet; in fact, other games in the series fix this by giving players extra resources in the early game.

Mario Kart (and other racing games)

Racing games are an interesting case, because players are always progressing towards the goal of the finish line. Most racing video games include a strong negative feedback loop that keeps everyone feeling like they still have a chance, right up to the end, usually through some kind of rubber-banding technique that causes the computer-controlled cars to speed up or slow down based on how the players are doing. Games like Mario Kart take this a step further, offering pickups that are weighted so that if you're behind, you're likely to get something that lets you catch up, while if you're ahead you'll get something less valuable, making it harder for you to fight for first. On the one hand, this provides an interesting tension: players in the lead know that they just have to keep the lead for a little bit longer, while players who are behind realize that time is running out and they have to close the gap quickly. On the other hand, the way most racing games do this feels artificial to a lot of players, because it feels like a player's standing in the race is always being modified by factors outside their control. Since the game's length is essentially capped by the number of laps, the players are trying to exchange positions before the race ends, so you get an interesting progression curve where players are all moving towards the end at about the same rate. Notice that this is actually the same progression pattern as Catan: both games are positive-sum with negative feedback. And yet, it feels very different as a player. I think this is mostly because in Catan, the negative feedback is under player control, while in Mario Kart a lot of it is under computer control. Interestingly, this is also the same pattern in stock car racing in real life.
In auto racing, theres also a negative feedback loop, but it feels a lot more fair: the person in the lead is running into a bunch of air resistance so theyre burning extra fuel to maintain their high speed, which means they need more pit stops; meanwhile, the people drafting behind them are much more fuel-efficient and can take over the lead later. This isnt arbitrary, its a mechanic that affects all players equally, and its up to each driver how much of a risk they want to take by breaking away from the pack. So again, this is something that feels more fair because the negative feedback is under player control. Final Fantasy (and other computer/console RPGs) In these games, the player is mostly progressing through the game by increasing their power level more than their skill level. Older games on consoles like NES tended to be even more based in stats and less in skill than todays games (i.e. they required a lot more grinding than players will put up
with today). Most of these games made today do give the player more abilities as they progress through the experience levels, giving them more options and letting them increase their tactical/strategic skills. Progression and rewards also come from plot advancement and reaching new areas. Usually these games are paced on a slightly increasing curve, where each area takes a little more time than the last. As we discussed in an earlier week, there's usually a positive feedback loop in that winning enough combats lets you level up, which in turn makes it easier for you to win even more combats; that is counteracted by the negative feedback loop that your enemies also get stronger, and that you need more and more victories to level up again if you stay in the same area too long, which means the actual gains are close to linear.

World of Warcraft (and other MMORPGs)

Massively multiplayer online games have a highly similar progression to CRPGs, except they then transition at the end to the elder game state, and at that point the concept of progression loses a lot of meaning. So our analysis looks much the same as it does with the more traditional computer RPGs, up until that point.

Nethack (and other roguelikes)

Then there are the so-called rogue-like games, which are a weird fusion of the leveling-up and stat-based progression of an RPG with the mercilessness of a retro arcade game. A single successful playthrough in Nethack looks similar to that of an RPG, with the player gaining power to meet the increasing level of challenge in the game, but actually reaching the level of skill needed to complete the game takes much, much longer. If you've never played these games, one thing you should know is that most of them have absolutely no problem killing you dead if you make the slightest mistake. And when I say dead, I mean they will literally delete your save file, permanently, and then you have to start over from scratch with a new character. So, like an arcade game, the player's goal is to stay alive as long as possible and progress as far as possible, so progress is both a reward and a measure of skill. While there is a win condition, a lot of players simply never make it that far; keep in mind that taking a character all the way from the start to the end of the game may take dozens of hours, similar to a modern RPG, but with a ton of hard resets that collectively raise the player's skill as they die (and then learn how not to die that way in the future). Therefore, Nethack looks like a curve of increasing player power and challenge over the course of a single playthrough, but over a player's lifetime, you see these repeated increases punctuated by total restarts, with a slowly increasing player skill curve over time that lets the player survive for longer in each successive game.

FarmVille (and other social media games)

If roguelikes are the harsh sadists of the game design world, then the cute fluffy bunnies of the game world would be progression-based social media games like FarmVille. These are positive-sum games where you basically click to progress; it's nearly impossible to lose any progress at all, and you're always gaining something. More player skill simply means you progress at a faster rate. Eventually, you transition to the elder game, but in most of the games I've played this is a more subtle transition than with MMOs.
FarmVille doesn't have a level cap that I know of (if it does have one, it's so ridiculously high that most people will never see it); it'll happily let you keep earning experience and leveling up, although after a certain point you don't really get any interesting rewards for doing so. But after a while, the reward loop starts rewarding you less and less frequently, and you finish earning all the ribbons or trophies or achievements or whatever. It's not that you can't progress any further, but that the game doesn't really reward you for progression as much, so at some point the player decides that further progression just isn't worth it to them, and they either stop playing or they start playing in a different way. If they start playing differently, that's where the elder game comes in. Interestingly, the player's actions in the elder game still do cause progression.

If You're Working On a Game Now

If you're working on a game right now, and that game has progression mechanics, I want you to ask yourself some design questions about the nature of that progression:

What is the desired play length of this game? Why? Really challenge yourself here: could you justify a reason to make it twice as long, or half as long? If your publisher (or whoever) demanded that you do so anyway, what else would need to change about the nature of the game in order to compensate?

Does the game actually play that long? How do you know?

If the game is split up into phases, areas, locations or whatever, how long are they? Do they tend to get longer over time, or are there some that are a lot longer (or shorter) than those that come immediately before or after? Is this intentional? Is it justifiable?

Is your game positive-sum, negative-sum, or zero-sum? Do any phases of the game have positive or negative feedback loops? How do these affect total play time?

Homework

Back in week 2, your homework was to analyze a progression mechanic in a game. In particular, you analyzed the progression of player power against the power level of the game's challenges, over time, in order to identify weak spots where the player would either go through an area too quickly because they're too powerful by the time they get there, or get stuck grinding because they're behind the curve and have to catch up. It's time to revisit that analysis with what we now know about pacing. This week, I'd like you to analyze the reward structure in the game. Consider all kinds of rewards: power-based, level progression, plot advancement, and anything else you can identify as it applies to your game:

Which of these is truly random (such as randomized loot drops), and which of them only seem random to the player on their first playthrough (they're placed in a specific location by the level designer, but the player has no way of knowing ahead of time how quickly they'll find these things)? And are there any rewards that happen on a fixed schedule that's known to the player?

How often do rewards happen? Do they happen more frequently at the beginning? Are there any places where the player goes a long time with relatively few rewards? Do certain kinds of rewards seem to happen more frequently than others, or at certain times?
Now, look back again at your original predictions, where you felt that the game either went too quickly, or more likely seemed to drag on forever, at certain points (based on your gut reaction and memory). Do these points coincide with more or fewer rewards in the game? Now ask yourself if the problem could have been solved by just adding a new power-up at a certain place, instead of fixing the leveling or progression curve.

Level 8: Metrics and Statistics


This Week One of the reasons I love game balance is that different aspects of balance touch all these other areas of game development. When we were talking about pseudorandom numbers, thats an area where you get dangerously close to programming. Last week we saw how the visual design of a level can be used as a game reward or to express progression to the player, which is game design but just this side of art. This week, we walk right up to the line where game design intersects business. This week Im covering two topics: statistics, and metrics. For anyone who isnt familiar with what these mean, metrics just means measurements, so it means youre actually measuring or tracking something about your game; leaderboards and high score lists are probably the best-known metrics because they are exposed to the players, but we also can use a lot of metrics behind the scenes to help design our games better. Once we collect a lot of metrics, once we take these measurements, they dont do anything on their own until we actually look at them and analyze them to learn something. Statistics is just one set of tools we can use to get useful information from our metrics. Even though we collect metrics first and then use statistics to analyze them, Im actually going to talk about statistics first because its useful to know how your tools work before you decide what data to capture. Statistics People who have never done statistics before think of it as an exact science. Its math, math is pure, and therefore you should be able to get all of the right answers all the time. In reality, its a lot messier, and youll see that game designers (and statisticians) disagree about the core principles of statistics even more than they disagree about the core principles of systems design, if such a thing is possible. What is statistics, and how is it different from probability? In probability, youre given a set of random things, and told exactly how random they are and what the nature of that randomness is, and your goal is to try to predict what the data will look like when you set those random things in motion. Statistics is kind of the opposite: here youre given the data up front, and youre trying to figure out the nature of the randomness that caused that data. Probability and statistics share one important thing in common: neither one is guaranteed. Probability can tell you theres a 1/6 chance of rolling a given number on 1d6, but it does not tell you what the actual number will be when you roll the die for real. Likewise, statistics can tell you from a bunch of die rolls that there is probably a uniform distribution, and that youre 95% sure, but theres a 5% chance that youre wrong. That chance never goes to zero. Statistical Tools This isnt a graduate-level course in statistical analysis, so all Ill say is that there are a lot more tools than this that are outside the scope of this course. What Im going to put down here is the bare minimum I think every game designer should know to be useful when analyzing metrics in their games. Mean: when someone asks for the average of something, theyre probably talking about the mean average (there are two other kinds of average that I know of, and probably a few more that I dont). To get the mean of a bunch of values, you add them all up and then divide by the number of values. This is sort of like expected value in probability, except that youre computing it based on realworld die-rolls and not a theoretically balanced set of die-rolls. 
Calculating the mean is incredibly useful; it tells you what the ballpark expected value is of something in your game. You can think of the mean as a Monte Carlo calculation of expected value, except youre using real-world playtest
data rather than a computer simulation. Median: this is another kind of average. To calculate it, take all your values and sort them from smallest to largest, then pick the one in the center. So, if you have five values, the third one is the median. (If you have an even number of values so that there are two in the middle rather than one, youre supposed to take the mean of those, in case youre curious.) On its own, the median isnt all that useful, but it tells you a lot when you compare it with the mean, about whether your values are all weighted to one side, or if theyre basically symmetric. For example, in the US, the median household income is a lot lower than the mean, which basically means weve got a lot of people making a little, and a few people making these ridiculously huge incomes that push up the mean. In a classroom, if the median is lower than the mean, it means most of the students are struggling and one or two brainiacs are wrecking the curve (although more often its the other way around, where most students are clustered around 75 or 80 and then youve got some lazy kid whos getting a zero which pulls down the mean a lot). If youre making a game with a scoreboard of some kind and you see a median thats a lot lower than the mean, it probably means youve got a small minority of players that are just obscenely good at the game and getting these massive scores, while everyone else who is just a mere mortal is closer to the median. Standard deviation: this is just geeky enough to make you sound like youre good at math if you use it in normal conversation. You calculate it by taking each of your data points, subtracting it from the mean, squaring the result (that is, multiply the result by itself), add all of those squares together, divide by the total number of data points, then take the square root of the whole thing. For reasons that you dont really need to know, going through this process gives you a number that represents how spread out the data is. Basically, about two-thirds of your data is within a single standard deviation from the mean, and nearly all of your data is within two standard deviations, so how big your SD is ends up being relative to how big your mean is. A mean of 50, SD of 25 looks a lot more spread out than a mean of 5000, SD of 25. A relatively large SD means your data is all over the place, while a really small SD means your data is all clustered together. Examples To give you an example, lets consider two random variables: 2d6, and 1d11+1. Like we talked about in the week on probability, both of these will give you a number from 2 to 12. But they have a very different nature; the 2d6 clusters around the center, while the 1d11+1 is spread out among all outcomes evenly. Now, statistics doesnt have anything to say about this, but lets just assume that I happen to roll the 2d6 thirty-six times and get one of each result, and I roll 1d11 eleven times and get one of each result which is wildly unlikely, but it does allow us to use statistical tools to analyze probability. The mean of both of these is 7, which means if youre trying to balance either of these numbers in your game, you can use 7 as the expected value. What about the range? The median is also 7 for both, which means youre just as likely to be above or below the mean, which makes sense because both of these are symmetric. 
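If you want to double-check these numbers yourself, here is a minimal Python sketch that builds the two idealized sets of rolls described above (one of each possible result) and prints their mean, median, and standard deviation. The data is the hypothetical "one of each" scenario, not real rolls, and statistics.stdev uses the same sample formula as the Excel function discussed later.

from statistics import mean, median, stdev

# 2d6 "rolled" 36 times, getting each of the 36 ordered die combinations exactly once
rolls_2d6 = [a + b for a in range(1, 7) for b in range(1, 7)]

# 1d11+1 "rolled" 11 times, getting each result from 2 to 12 exactly once
rolls_1d11 = [n + 1 for n in range(1, 12)]

for name, rolls in (("2d6", rolls_2d6), ("1d11+1", rolls_1d11)):
    print(name, "mean =", mean(rolls),
          "median =", median(rolls),
          "stdev =", round(stdev(rolls), 2))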
However, youll see the standard deviations are a lot different: for 2d6, the SD is about two-and-a-half, meaning that most of the time youll get a result in between 5 and 9; for 1d11+1, the SD is about three-and-a-half, so youll get about as many rolls in the 4 to 10 range here, as you did in the 5 to 9 range for 2d6. Which doesnt actually sound like that big a deal, until you start rolling. As a different example, maybe youre looking at the time it takes playtesters to get through your first tutorial level in a video game youre designing. Your target is that it should take about 5 minutes. You measure the mean at 5 minutes, median at 6 minutes, standard deviation at 2 minutes. What does that tell us? Most people take between 3 and 7 minutes, which might be good or bad depending on just how much of the level is under player control, but in a lot of games the tutorial is meant to be a pretty standardized, linear experience so this would actually feel like a pretty huge range. The other cause for concern is the high median, which suggests most people actually take longer than 5 minutes, you just have a few people who get through the level really fast and they
bring down the mean. This is good news in that you know youre not having anyone taking four hours to complete it or whatever (otherwise the mean would be a lot higher than the median instead!), but its potentially bad news in that some players might have found an unintentional shortcut or exploit, or else theyre just skipping through all your intro dialogue or something which is going to get them stuck and frustrated in level 2, or something else. This suggests another lesson: statistics can tell us that something is happening, but it cant tell us why, and sometimes there are multiple explanations for the why. This is one area where statistics is often misused or flat out abused, by finding one logical explanation for the numbers and ignoring that there could be other explanations as well. In this case, we have no way of knowing why the median is shorter than the mean, or its implications of game design but we could spend some time thinking about all the possible answers, and then we could collect more data that would help us differentiate between them. For example, if one fear is that players are skipping through the intro dialogue, we could actually measure the time spent reading dialogues in addition to the total level time. Well come back to this concept of metrics design later today. Theres also a third lesson here: I didnt tell you how many playtesters it took to get this data! The more tests you have, the more accurate your final analysis will be. If you only had three tests, these numbers are pretty meaningless if youre trying to predict general trends. If there were a few thousand tests, thats a lot better. (How many tests are required to make sure your analysis is good enough? Depends what good enough means to you. The more you have, the more sure you can be, but its never actually 100% no matter how many tests you do. People who do this for a living have confidence intervals where theyll tell you a range of values and then say something like theyre 95% sure that the actual mean in reality is within such-and-such a range. This is a lot more detail than most of us need for our day-to-day design work.) Outliers When you have a set of data with some small number of points that are way above or below the mean, the name for those is outliers (pronounced like the words out and liars). Since these tend to throw off your mean a lot more than the median, if you see the mean and median differing by a lot its probably because of an outlier. When youre doing a statistical analysis, you might wonder what to do with the outliers. Do you include them? Do you ignore them? Do you put them in their own special group? As with most things, it depends. If youre just looking for normal, usual play patterns, it is generally better to discard the outliers because by definition, those are not arising from normal play. If youre looking for edge cases then you want to leave them in and pay close attention; for example, if youre trying to analyze the scores people get so you know how to display them on the leaderboards, realize that your top-score list is going to be dominated by outliers at the top. In either case, if you have any outliers, it is usually worth investigating further to figure out what happened. 
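As a rough sketch of how you might run these numbers on playtest data, the following uses a hypothetical list of tutorial completion times and flags suspiciously fast or slow sessions for follow-up; the cutoffs are arbitrary and purely for illustration.

from statistics import mean, median, stdev

# Hypothetical tutorial completion times in minutes, one per playtester
times = [4.5, 5.0, 5.5, 6.0, 6.0, 6.5, 6.5, 7.0, 1.5, 30.0]

m, med, sd = mean(times), median(times), stdev(times)
print(f"mean={m:.1f}  median={med:.1f}  stdev={sd:.1f}")

# Arbitrary cutoffs: anything under half the median is suspiciously fast
# (possible shortcut or exploit); anything over double the median is
# suspiciously slow (paused, walked away, or genuinely stuck).
fast = [t for t in times if t < med / 2]
slow = [t for t in times if t > med * 2]
print("suspiciously fast:", fast, " suspiciously slow:", slow)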
Going back to our earlier example of level play times, if most players take 5 to 7 minutes to complete your tutorial but you notice a small minority of players that get through in 1 or 2 minutes, that suggests those players may have found some kind of shortcut or exploit, and you want to figure out what happened. If most players take 5 to 7 minutes and you have one player that took 30 minutes, that is probably because the player put it on pause or had to walk away for awhile, or they were just having so much fun playing around in the sandbox that they didnt care about advancing to the next level or whatever, and you can probably ignore that if its just one person. But if its three or four people (still in the vast minority) who did that, you might investigate further, because there might be some small number of people who are running into problems or, players who find one aspect of your tutorial really fun, which is good to know as youre designing the other levels. Population samples

Here's another way statistics can go horribly wrong: it all comes down to what, and who, you're sampling. I already mentioned one frequent problem, which is not having a large enough sample. The more data points you have, the better. I'll give you an example: back when I played Magic: the Gathering regularly, I once put together a tournament deck for a friend, for a tournament that I couldn't play in but they could. To tell if I had the right ratio of land to spells, I shuffled and dealt an opening hand and played a few mock turns to see if I was drawing enough land. I'd do this a bunch of times, going through most of the deck, then I'd take some land out or put some in depending on how many times I had too much or too little, and then I'd reshuffle and do it again. At the time I figured this was a pretty good quick-and-dirty way to figure out how much land I needed. What I failed to notice was that the land happened to be very evenly distributed through the deck rather than clustered, so most of the time it seemed like I was doing okay, and I never actually stopped to count. After the tournament, which my friend lost badly, they reported to me that they were consistently not drawing enough land, and when we actually went through the deck and counted, there were only 16 lands in a deck of 60 cards! I took a lot of flak from my friend for that, and rightly so. The real problem here was that I was trying to analyze the number of lands through statistical methods, but my sample size was way too small to draw any meaningful conclusions. Here's another example: suppose you're making a game aimed at the casual market. You have everyone on the development team play through the game to get some baseline data on how long it takes to play through each level and how challenging each level is. Problem: the people playing the game are probably not casual gamers, so this is not really a representative sample of your target market. I'm sure this has happened before at some point in the past. A more recent example: in True Crime: Hong Kong, publisher Activision allegedly demanded that the developers change the main character from female to male, because their focus group said they preferred a male protagonist. The problem: the focus group was made up entirely of males, or else the questions were inherently biased by the person setting it up as a deliberate attempt to further their agenda rather than to actually find out the real-world truth. Activision denies all of this, of course, but that hasn't stopped it from being the subject of many industry conversations, not just about the role of women in games, but about the use of focus groups and statistics in game design. You also see things like this happening in the rest of the world, particularly in governmental politics, where a lot of people have their own personal agendas and are willing to warp a study and use statistics as a way of proving their point. Basically, when you're collecting playtest data, you want to do your best to recruit playtesters who are as similar as possible to your target market, and you want to have as many playtests as possible so that the random noise gets filtered out. Your analysis is only as good as your data! Even if you use statistics honestly, there are still problems every game designer runs into, depending on the type of game. For video games, you are at the mercy of your programmers, and there's nothing you can do about that. The programmers are the ones who need to spend time coding the metrics you ask for.
Programming time is always limited, so at some point youll have to make the call between having your programming team implement metrics collection or having them implement, you know, the actual game mechanics youve designed. And thats if the decision isnt made for you by your producer or your publisher. This is easier in some companies than others, but in some places metrics falls into the same category as audio, and localization, and playtesting: tasks that are pushed off towards the end of the development cycle until its too late to do anything useful. For tabletop games, you are at the mercy of your playtesters. The more data points you collect, the better, of course. But in reality, a video game company can release an early beta and get hundreds or thousands of plays, while you might realistically be able to do a fraction
of that with in-person tabletop tests. With a smaller sample, your playtest data is a lot more suspect. For any kind of game, you need to be very clear ahead of time what it is you need measured, and in what level of detail. If you run a few hundred playtests and only find out afterwards that you need to actually collect certain data from the game state that you werent collecting before, youll have to do those tests over again. The only thing to do about this is to recognize that just like design itself, playtesting with metrics is an iterative process, and you need to build that into your schedule. Also for any kind of game, you need to remember that its very easy to mess things up accidentally and get the wrong answer, just like probability. Unlike probability, there arent as many sanity checks to make the wrong numbers look wrong, since by definition you dont always know exactly what youre looking for or what you expect the answer to be. So you need to proceed with caution, and use every method you can find of independently verifying your numbers. It also helps if you try to envision in advance what the likely outcomes of your analysis might be, and what theyll look like. Correlation and causality Finally, one of the most common errors with statistics is when you notice some kind of correlation between two things. Correlation just means that when one thing goes up, another thing always seems to go up (which is a positive correlation) or down (a negative correlation) at the same time. Recognizing correlations is useful, but a lot of times people assume that just because two things are correlated, that one causes the other, and that is something that you cannot tell from statistics alone. Lets take an example. Say you notice when playing Puerto Rico that theres a strong positive correlation between winning, and buying the Factory building; say, out of 100 games, in 95 of them the winner bought a Factory. The natural assumption is that the Factory must be overpowered, and that its causing you to win. But you cant draw this conclusion by default, without additional information. Here are some other equally valid conclusions, based only on this data: Maybe its the other way around, that winning causes the player to buy a Factory. That sounds odd, but maybe the idea is that a Factory helps the player who is already winning, so its not that the Factory is causing the win, its that being strongly in the lead causes the player to buy a Factory for some reason. Or, it could be that something else is causing a player both to win and to buy a Factory. Maybe some early-game purchase sets the player up for buying the Factory, and that earlygame purchase also helps the player to win, so the Factory is just a symptom and not the root cause. Or, the two could actually be uncorrelated, and your sample size just isnt large enough for the Law of Large Numbers to really kick in. We actually see this all the time in popular culture, where two things that obviously have no relation are found to be correlated anyway, like the Redskins football game predicting the next Presidential election in the US, or an octopus that predicts the World Cup winner, or a groundhog seeing its shadow supposedly predicting the remaining length of Winter. 
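To make this concrete, here is a minimal sketch that computes a correlation coefficient between "bought a Factory" and "won the game" from hypothetical match records (1 = yes, 0 = no). The number tells you whether the two move together; it cannot tell you which of the explanations above is the right one.

from statistics import correlation  # Python 3.10+

# One entry per (player, game) from hypothetical Puerto Rico session logs
bought_factory = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
won_the_game   = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1]

r = correlation(bought_factory, won_the_game)
print(f"correlation = {r:+.2f}")  # +1 = perfect positive, 0 = none, -1 = perfect negative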
As we learned when looking at probability, if you take a lot of random things youll be able to see patterns; one thing is that you can expect to see unlikely-looking streaks, but another is that if you take a bunch of sets of data, some of them will probably be randomly correlated. If you dont believe me, try rolling two separate dice a few times and then computing the correlation between those numbers; I bet its not zero! Statistics in Excel Heres the good news: while there are a lot of math formulas here, you dont actually need to know any of them. Excel will do this for you, it has all these formulas already. Here are a few useful ones: 1. AVERAGE: given a range of cells, this calculates the mean. You could also take the SUM of
the cells and then divide by the number of cells, but AVERAGE is easier. 2. MEDIAN: given a range of cells, this calculates the median, as you might guess. 3. STDEV: given a range of cells, this gives you the standard deviation. 4. CORREL: you give this two ranges of cells, not one, and it gives you the correlation between the two sets of data. For example, you could have one column with a list of final game scores, and another column with a list of scores at the end of the first turn, to see if early-game performance is any kind of indicator of the final game result (if so, this might suggest a positive feedback loop in the game somewhere). The number Excel gives you from the CORREL function ranges between -1 (perfect negative correlation) to 0 (uncorrelated) to +1 (perfect positive correlation). Is there any good news? At this point Ive spent so much time talking about how statistics are misused, that you might be wondering if theyre actually useful for anything. And the answer is, yes. If you have a question that cant be answered with intuition alone, and it cant be answered just through the math of your cost or progression curves, statistics let you draw useful conclusions if you ask the right questions, and if you collect the right data. Heres an example of a time when statistics really helped a game I was working on. I worked for a company that made this online game, and we found that our online population was falling and people werent playing as many games, because we hadnt released an update in awhile. (That part was expected. With no updates, Ive found that an online game loses about half of its core population every 6 months or so, at least that was my experience.) But what we didnt expect, was one of our programmers got bored one day and made a trivia bot, just this little script that would log into our server with its own player account, send a trivia question every couple of minutes, and then parse the incoming public chat to see if anyone said the right answer. And it was popular, as goofy and stupid and simple as it was, because it was such a short, immediate casual experience. Now, the big question is: what happened to the player population, and what happened to the actual, real game that players were supposed to be playing (you know, the one where they would log in to the chat room to find someone to challenge, before they got distracted by the trivia bot)? Some players loved the trivia bot. It gave them something to do in between games. Others hated the trivia bot; they claimed that it was harder to find a game, because everyone who was logged in was too busy answering dumb trivia questions to actually play a real game. Who was right? Intuition failed, because everyones intuition was different. Listening to the players failed, because the vocal minority of the player base was polarized, and there was no way to poll those who werent in the vocal minority. Math failed, because the trivia bot wasnt part of the game, let alone part of the cost curve. Could we answer this with statistics? We sure could, and we did! This was simple enough that it didnt even require much analysis. Measure the total number of logins per day. Measure total number of actual games played. Since our server tracked every player login, logout and game start already, we had this data, all we had to do was some very simple analysis, tracking how these things changed over time. 
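That kind of analysis can be as simple as tallying events per day. Here is a minimal sketch, assuming a hypothetical log of (date, player, event) records like the ones a server might already keep:

from collections import Counter
from datetime import date

# Hypothetical event log: (date, player_id, event_type)
events = [
    (date(2010, 7, 1), "alice", "login"),
    (date(2010, 7, 1), "alice", "game_start"),
    (date(2010, 7, 1), "bob",   "login"),
    (date(2010, 7, 2), "bob",   "login"),
    (date(2010, 7, 2), "bob",   "game_start"),
    # ... and so on, for every login and game start the server records
]

logins_per_day = Counter(day for day, _, kind in events if kind == "login")
games_per_day  = Counter(day for day, _, kind in events if kind == "game_start")

# Print the daily totals so you can watch how they change over time
for day in sorted(set(logins_per_day) | set(games_per_day)):
    print(day, "logins:", logins_per_day[day], "games:", games_per_day[day])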
As expected, the numbers were all falling gradually since the time of the last real release, but the trivia bot actually caused a noticeable increase in both total logins and number of games played. It turned out that players were logging in and playing with the trivia bot, but as long as they were there, they were also playing games with each other! That was a conclusion that would have been impossible to reach in any kind of definitive way, without analysis of the hard data. And it taught us something really important about online games: more players online, interacting with each other, is better even if theyre interacting in nonstandard ways. Metrics

Heres a common pattern in artistic and creative fields, particularly things like archaeology or art preservation or psychology or medicine where it requires a certain amount of intuition but at the same time there is still a right answer or best way to do things. The progression goes something like this: Practitioners see their field as a soft science; they dont know a whole lot about best principles or practices. They do learn how things work, eventually, but its mostly through trial and error. Someone creates a technology that seems to solve a lot of these problems algorithmically. Practitioners rejoice. Finally, were a hard science! No more guesswork! Most younger practitioners abandon the old ways and embrace science as a way to solve all their fields problems. The old guard, meanwhile, sees it as a threat to how theyve always done things, and eyes it skeptically. The limitations of the technology become apparent after much use. Practitioners realize that there is still a mysterious, touchy-feely element to what they do, and that while some day the tech might answer everything, that day is a lot farther off than it first appeared. Widespread disillusionment occurs as people no longer want to trust their instincts because theoretically technology can do it better, but people dont want to trust the current technology because it doesnt work that great yet. The young turks acknowledge that this wasnt the panacea they thought; the old guard acknowledge that its still a lot more useful than they assumed at first. Everyone kisses and makes up. Eventually, people settle into a pattern where they learn what parts can be done by computer algorithms, and what parts need an actual creative human thinking, and the field becomes stronger as the best parts of each get combined. But learning which parts go best with humans and which parts are best left to computers is a learning process that takes awhile. Currently, game design seems to be just starting Step 2. Were hearing more and more people anecdotally saying why metrics and statistical analysis saved their company. We hear about MMOs that are able to solve their game balance problems by looking at player patterns, before the players themselves learn enough to exploit them. We hear of Zynga changing the font color from red to pink which generates exponentially more click-throughs from players to try out other games. We have entire companies that have sprung up solely to help game developers capture and analyze their metrics. The industry is falling in love with metrics, and Ill go on record predicting that at least one company that relies entirely on metrics-driven design will fail, badly, by the time this whole thing shakes out, because they will be looking so hard at the numbers that theyll forget that there are actually human players out there who are trying to have fun in a way that cant really be measured directly. Or maybe not. Ive been wrong before. At any rate, right now there seems to be three schools of thought on the use of metrics: The Zynga model: design almost exclusively by metrics. Love it or hate it, 60 Million monthly active unique players laugh at your feeble intuition-based design. Rebellion against the Zynga model: metrics are easy to misunderstand, easy to manipulate, and are therefore dangerous and do more harm than good. 
If you measure player activity and find out that more players use the login screen than any other in-game action, that doesnt mean you should add more login screens to your game out of some preconceived notion that if a player does it, its fun. If you design using metrics, you push yourself into designing the kinds of games that can be designed solely by metrics, which pushes you away from a lot of really interesting video game genres. The moderate road: metrics have their uses, they help you tune your game to find local peaks of joy. They help you take a good game and make it just a little bit better, by helping you explore the nearby design space. However, intuition also has its uses; sometimes you need to take broad leaps in unexplored territory to find the global peaks, and metrics alone will not get you there, because sometimes you have to make a game a little worse in one
way before it gets a lot better in another, and metrics wont ever let you do that. Think about it for a bit and decide where you stand, personally, as a designer. What about the people you work with on a team (if you work with others on a team)? How much to measure? Suppose you want to take some metrics in your game so you can go back and do statistical analysis to improve your game balance. What metrics do you actually take that is, what exactly do you measure? There are two schools of thought that Ive seen. One is to record anything and everything you can think of, log it all, mine it later. The idea is that youd rather collect too much information and not use it, than to not collect a piece of critical info and then have to re-do all your tests. Another school of thought is that record everything is fine in theory, but in practice you either have this overwhelming amount of extraneous information from which youre supposed to find this needle in a haystack of something useful, or potentially worse, you mine the heck out of this data mountain to the point where youre finding all kinds of correlations and relationships that dont actually exist. By this way of thinking, instead you should figure out ahead of time what youre going to need for your next playtest, measure that and only that, and that way you dont get confused when you look at the wrong stuff in the wrong way later on. Again, think about where you stand on the issue. Personally, I think a lot depends on what resources you have. If its you and a few friends making a small commercial game in Flash, you probably dont have time to do much in the way of intensive data mining, so youre better off just figuring out the useful information you need ahead of time, and add more metrics later if a new question occurs to you that requires some data you arent tracking yet. If youre at a large company with an army of actuarial statisticians with nothing better to do than find data correlations all day, then sure, go nuts with data collection and youll probably find all kinds of interesting things youd never have thought of otherwise. What specific things do you measure? Thats all fine and good, but whether you say just get what we need or collect everything we can, neither of those is an actual design. At some point you need to specify what, exactly, you need to measure. Like game design itself, metrics is a second-order problem. Most of the things that you want to know about your game, you cant actually measure directly, so instead you have to figure out some kind of thing that you can measure that correlates strongly with what youre actually trying to learn. Example: measuring fun Lets take an example. In a single-player Flash game, you might want to know if the game is fun or not, but theres no way to measure fun. What correlates with fun, that you can measure? One thing might be if players continue to play for a long time, or if they spend enough time playing to finish the game and unlock all the achievements, or if they come back to play multiple sessions (especially if they replay even after theyve won), and these are all things you can measure. Now, keep in mind this isnt a perfect correlation; players might be coming back to your game for some other reason, like if youve put in a crop-withering mechanic that punishes them if they dont return, or something. But at least we can assume that if a player keeps playing, theres probably at least some reason, and that is useful information. 
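As a small illustration, here is a sketch that computes one such measurable stand-in for fun, how many players come back for more than one day of play, from a hypothetical list of play sessions:

from collections import defaultdict
from datetime import date

# Hypothetical play sessions: (player_id, date_of_session)
sessions = [
    ("alice", date(2010, 7, 1)), ("alice", date(2010, 7, 2)), ("alice", date(2010, 7, 5)),
    ("bob",   date(2010, 7, 1)),
    ("carol", date(2010, 7, 3)), ("carol", date(2010, 7, 4)),
]

days_played = defaultdict(set)
for player, day in sessions:
    days_played[player].add(day)

returning = sum(1 for days in days_played.values() if len(days) > 1)
print(f"{returning} of {len(days_played)} players came back for at least a second day")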
More to the point, if lots of players stop playing your game at a certain point and dont come back, that tells us that point in the game is probably not enjoyable and may be driving players away. (Or if the point where they stopped playing was the end, maybe they found it incredibly enjoyable but they beat the game and now theyre done, and you didnt give a reason to continue playing after that. So it all depends on when.) Player usage patterns are a big deal, because whether people play, how often they play, and how
long they play are (hopefully) correlated with how much they like the game. For games that require players to come back on a regular basis (like your typical Facebook game), the two buzzwords you hear a lot are Monthly Active Uniques and Daily Active Uniques (MAU and DAU). The Active part of that is important, because it makes sure you dont overinflate your numbers by counting a bunch of old, dormant accounts belonging to people who stopped playing. The Unique part is also important, since one obsessive guy who checks FarmVille ten times a day doesnt mean he counts as ten users. Now, normally youd think Monthly and Daily should be equivalent, just multiply Daily by 30 or so to get Monthly, but in reality the two will be different based on how quickly your players burn out (that is, how much overlap there is between different sets of daily users). So if you divide MAU/DAU, that tells you something about how many of your players are new and how many are repeat customers. For example, suppose you have a really sticky game with a small player base, so you only have 100 players, but those players all log in at least once per day. Here your MAU is going to be 100, and your average DAU is also going to be 100, so your MAU/DAU is 1. Now, suppose instead that you have a game that people play once and never again, but your marketing is good, so you get 100 new players every day but they never come back. Here your average DAU is still going to be 100, but your MAU is around 3000, so your MAU/DAU is about 30 in this case. So thats the range, MAU/DAU goes between 1 (for a game where every player is extremely loyal) to 28, 30 or 31 depending on the month (representing a game where no one ever plays more than once). A word of warning: a lot of metrics, like the ones Facebook provides, might use different ways of computing these numbers so that one set of numbers isnt comparable to another. For example, I saw one website that listed the worst MAU/DAU ratio in the top 100 applications as 33-pointsomething, which should be flatly impossible, so clearly the numbers somewhere are being messed with (maybe they took the Dailies from a different range of dates than the Monthlies or something). And then some people compute this as a %, meaning on average, what percentage of your player pool logs in on a given day, which should range from a minimum of about 3.33% (1/30 of your monthly players logging in each day) to 100% (all of your monthly players log in every single day). This is computed by taking DAU/MAU (instead of MAU/DAU) and multiplying by 100 to get a percentage. So if you see any numbers like this from analytics websites, make sure youre clear on how theyre computing the numbers so youre not comparing apples to oranges. Why is it important to know this number? For one thing, if a lot of your players keep coming back, it probably means youve got a good game. For another, it means youre more likely to make money on the game, because youve got the same people stopping by every day sort of like how if you operate a brick-and-mortar storefront, an individual who just drops in to window-shop may not buy anything, but if that same individual comes in and is just looking every single day, theyre probably going to buy something from you eventually. Another metric thats used a lot, particularly on Flash game portals, is to go ahead and ask the players themselves to rate the game (often in the form of a 5-star rating system). In theory, we would hope that higher ratings mean a better game. 
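Here is a minimal sketch of how average DAU, MAU, and the MAU/DAU ratio might be computed from a month of login records, using made-up data that mixes one loyal daily player with a stream of one-time visitors; the ratio should land somewhere between 1 and roughly the number of days in the month.

from collections import defaultdict
from datetime import date

# Hypothetical login log for a 30-day month: set of (player_id, date) pairs, so the
# same player logging in twice on one day only counts once ("unique").
logins = {("alice", date(2010, 7, d)) for d in range(1, 31)}           # logs in every day
logins |= {(f"tourist{d}", date(2010, 7, d)) for d in range(1, 31)}    # 30 one-time players

daily_uniques = defaultdict(set)
monthly_uniques = set()
for player, day in logins:
    daily_uniques[day].add(player)
    monthly_uniques.add(player)

avg_dau = sum(len(players) for players in daily_uniques.values()) / len(daily_uniques)
mau = len(monthly_uniques)
print(f"average DAU = {avg_dau:.1f}, MAU = {mau}, MAU/DAU = {mau / avg_dau:.1f}")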
In theory, wed also expect that a game with high player ratings would also have a good MAU/DAU ratio, that is, that the two would be correlated. I dont know of any actual studies that have checked this, though Id be interested to see the results, but if I had to guess Id assume that there is some correlation but not a lot. Users that give ratings are not a representative sample; for one thing, they tend to have strong opinions or else they wouldnt bother rating (seriously, I always had to wonder about those opinion polls that would say something like 2% of poll respondents said they had no opinion like, who calls up a paid opinion poll phone line just to say they have no opinion?), so while actual quality probably falls along a bell curve you tend to have more 5-star and 1-star ratings than 3-star, which is not what youd expect if everyone rated the game fairly. Also, theres the question of whether player opinion is more or less meaningful than actual play patterns; if a player logs into a game every day for months on end but rates it 1 out of 5 stars, what does that mean? Or if a player admits they havent even played the game, but theyre still giving it 4 out of 5 stars based on I dont know its
reputation or something? Also, players tend to not rate a game while theyre actively playing, only (usually) after theyre done, which probably skews the ratings a bit (depending on why they stopped playing). So its probably better to pay attention to usage patterns than player reporting, especially if that reporting isnt done during the game from within the game in a way that you can track. Now, Ive been talking about video games, in fact most of this is specific to online games. The equivalent in tabletop games is a little fuzzier, but as the designer you basically want to be watching peoples facial expressions and posture to see where in the game theyre engaged and where theyre bored or frustrated. You can track how these correlate to certain game events or board positions. Again, you can try to rely on interviews with players, but thats dangerous because player memory of these things is not good (and even if it is, not every playtester will be completely honest with you). For video games that are not online, you can still capture metrics based on player usage patterns, but actually uploading them anywhere is something you want to be very clear to your players about, because of privacy concerns. Another example: measuring difficulty Player difficulty, like fun, is another thing thats basically impossible to measure directly, but what you can measure is progression, and failure to progress. Measures of progression are going to be different depending on your game. For a game that presents skill-based challenges like a retro arcade game, you can measure things like how long it takes the player to clear each level, how many times they lose a life on each level, and importantly, where and how they lose a life. Collecting this information makes it really easy to see where your hardest points are, and if there are any unintentional spikes in your difficulty curve. I understand that Valve does this for their FPS games, and that they actually have a visualizer tool that will not only display all of this information, but actually plot it overlaid on a map of the level, so you can see where player deaths are clustered. Interestingly, starting with Half-Life 2 Episode 2 they actually have live reporting and uploading from players to their servers, and they have displayed their metrics on a public page (which probably helps with the aforementioned privacy concerns, because players can see for themselves exactly what is being uploaded and how its being used). Yet another example: measuring game balance What if instead you want to know if your game is fair and balanced? Thats not something you can measure directly either. However, you can track just about any number attached to any player, action or object in the game, and this can tell you a lot about both normal play patterns, and also the relative balance of strategies, objects, and anything else. For example, suppose you have a strategy game where each player can take one of four different actions each turn, and you have a way of numerically tracking each players standing. You could record each turn, what action each player takes, and how it affects their respective standing in the game. Or, suppose you have a CCG where players build their own decks, or a Fighting game where each player chooses a fighter, or an RTS where players choose a faction, or an MMO or tabletop RPG where players choose a race/class combination. 
Two things you can track here are which choices seem to be the most and least popular, and also which choices seem to have the highest correlation with actually winning. Note that this is not always the same thing; sometimes the big, flashy, coollooking thing that everyone likes because its impressive and easy to use is still easily defeated by a sufficiently skilled player who uses a less well-known strategy. Sometimes, dominant strategies take months or even years to emerge through tens of thousands of games played; the Necropotence card in Magic: the Gathering saw almost no play for six months or so after release, until some top players figured out how to use it, because it had this really complicated and obscure set of effects but once people started experimenting with it, they found it to be one of the most powerful cards ever made. So, both popularity and correlation with winning are two useful metrics here.
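A quick sketch of tracking both at once, assuming hypothetical match records of which choice each player picked and whether they won, since popularity and win rate often disagree:

from collections import Counter

# Hypothetical match records: (choice_picked, won) -- one entry per player per game
matches = [
    ("Knight", True), ("Knight", False), ("Knight", False), ("Knight", True),
    ("Necromancer", True), ("Necromancer", True),
    ("Ranger", False), ("Ranger", True), ("Ranger", False),
]

picks = Counter(choice for choice, _ in matches)
wins  = Counter(choice for choice, won in matches if won)

# Most-picked first; win rate shows whether popularity lines up with strength
for choice, count in picks.most_common():
    win_rate = wins[choice] / count
    print(f"{choice}: picked {count} times, win rate {win_rate:.0%}")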

If a particular game object sees a lot more use than you expected, that can certainly signal a potential game balance issue. It may also mean that this one thing is just a lot more compelling to your target audience for whatever reason for example, in a high fantasy game, you might be surprised to find more players creating Elves than Humans, regardless of balance issues or maybe you wouldnt be that surprised. Popularity can be a sign in some games that a certain play style is really fun compared to the others, and you can sometimes migrate that into other characters or classes or cards or what have you in order to make the game overall more fun. If a game object sees less use than expected, again that can mean its underpowered or overcosted. It might also mean that its just not very fun to use, even if its effective. Or it might mean it is too complicated to use, it has a high learning curve relative to the rest of the game, and so players arent experimenting with it right away (which can be really dangerous if youre relying on playtesters to actually, you know, playtest, if they leave some of your things alone and dont play with them). Metrics have other applications besides game objects. For example, one really useful area is in measuring beginning asymmetries, a common one being the first-player advantage (or disadvantage). Collect a bunch of data on seating arrangements versus end results. This happens a lot with professional games and sports; for example, I think statisticians have calculated the homefield advantage in American Football to be about 2.3 points, and depending on where you play the first-move advantage in Go is 6.5 or 7.5 points (in this latter case, the half point is used to prevent tie games). Statistics from Settlers of Catan tournaments have shown a very slight advantage to playing second in a four-player game, on the order of a few hundredths of a percent; normally we could discard that as random variation, but the sheer number of games that have been played gives the numbers some weight. One last example: measuring money If youre actually trying to make money by selling your game, in whole or part, then at the end of the day this is one of your most important considerations. For some people its the most important consideration: theyd rather have a game that makes lots of money but isnt fun or interesting at all, than a game thats brilliant and innovative and fun and wonderful but is a sleeper hit which is just a nice way of saying it bombed in the market but didnt deserve to. Other game designers would rather make the game fun first, so one thing for each of you to consider is, personally, which side of the fence youre on because if you dont know that about yourself, someone else is going to make the call for you some day. At any rate, money is something that just about every commercial game should care about in some capacity, so its something thats worth tracking. Those sales tell you something related to how good a job you did with the game design, along with a ton of other factors like market conditions, marketing success, viral spread, and so on. With traditional games sold online or through retail, this is a pretty standard curve: big release-day sales that fall off over time on an exponentially decreasing curve, until they get to the point where the sales are small enough that its not worth it to sell anymore. 
With online games you dont have to worry about inventory or shelf space so you can hold onto it a bit longer, which is where this whole long tail thing came from, because I guess the idea is that this curve looks like it has a tail on the right-hand side. In this case the thing to watch for is sudden spikes, when those are, and what caused them, because they dont usually happen on their own. Unfortunately, that means sales metrics for traditional sales models arent all that useful to game designers. We see a single curve that combines lots of variables, and we only get the feedback after the game is released. If its one game in a series its more useful because we can see how the sales changed from game to game and what game mechanics changed, so if the game took a major step in a new direction and that drastically increased or reduced sales, that gives you some information there. If instead your game is online, such as an MMO, or a game in a Flash portal or on Facebook, the

pattern can be a bit different: sales start slow (higher if you do some marketing up front), then if the game is good it ramps up over time as word-of-mouth spreads, so its basically the same curve but stretched out a lot longer. The wonderful thing about this kind of release schedule is that you can manage the sales curve in real-time: make a change to your game today, measure the difference in sales for the rest of the week, and keep modifying as you go. Since you have regular incremental releases that each have an effect on sales, youre getting constant feedback on the effects that minor changes have on the money your game brings in. However, remember that your game doesnt operate in a vacuum; there are often other outside factors that will affect your sales. For example, I bet if theres a major natural disaster thats making international headlines, that most Facebook games will see a temporary drop in usage because people are busy watching the news instead. So if a game company made a minor game change the day before the Gulf oil spill and they noticed a sudden decrease in usage from that geographical area, the designers might mistakenly think their game change was a really bad one if they werent paying attention to the real world. Ideally, youd like to eliminate these factors, so you know what youre measuring, controlling for outside factors. One way of doing this, which works in some special cases, is to actually have two separate versions of your game that you roll out simultaneously to different players, and then you compare the two groups. One important thing about this is that you do need to select the players randomly (and not, say, giving one version to the earliest accounts created on your system and the other version to the most recent adopters). Of course, if the actual gameplay itself is different between the two groups, thats hard to do without some players getting angry about it, especially if one of the two groups ends up with an unbalanced design that can be exploited. So its better to do this with things that dont affect balance: banner ads, informational popup dialog text, splash screens, the color or appearance of the artwork in your game, and other things like that. Or, if you do this with gameplay, do it in a way that is honest and up front with the players; I could imagine assigning players randomly to a faction (like World of Warcrafts Alliance/Horde split, except randomly chosen when an account is created) and having the warring factions as part of the backstory of the game, so it would make sense that each faction would have some things that are a little bit different. I dont know of any game thats actually done this, but it would be interesting to see in action. For games where players can either play for free or pay this includes shareware, microtransactions, subscriptions, and most other kinds of payment models for online games you can look at not just how many users you have, or how much money youre getting total, but also where that money is coming from on a per-user basis. This is very powerful, but there are also a lot of variables to consider. First, what counts as a player? If some players have multiple accounts (with or without your permission) or if old accounts stay around while dormant, the choice of whether to count these things will change your calculations. 
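As a concrete illustration of those counting decisions, here is a small sketch (again with a hypothetical log format) that computes revenue per unique active player and revenue per paying player, the two averages the next paragraphs will give names to:

    def revenue_metrics(purchases, active_accounts):
        # purchases: list of (account_id, amount); active_accounts: set of account ids
        # that count as real, unique, active players for this reporting period.
        revenue = {}
        for account, amount in purchases:
            if account in active_accounts:
                revenue[account] = revenue.get(account, 0.0) + amount
        total = sum(revenue.values())
        payers = [v for v in revenue.values() if v > 0]
        per_user = total / len(active_accounts) if active_accounts else 0.0
        per_paying_user = total / len(payers) if payers else 0.0
        return per_user, per_paying_user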
Typically companies are interested in looking at revenue from unique, active users, because dormant accounts tend to not be spending money, and a single player with several accounts should really be thought of as one entity (even if theyre spending money on each account). Second, theres a difference between players who are playing for free and have absolutely no intention of paying for your game ever, versus players who spend regularly. Consider a game where you make a huge amount of money from a tiny minority of players; this suggests you have a great game that attracts and retains free players really well, and that once players can be convinced to spend any money at all theyll spend a lot, but it also says that you have trouble with conversion that is, convincing players to take that leap and spend their first dollar with you. In this case, youd want to think of ways to give players incentive to spend just a little bit. Now consider a different game, where most people that play spend something but that something is a really small amount. Thats a different problem, suggesting that your payment process itself is driving away players, or at least that its giving your players less incentive to spend more, like youre hitting a spending ceiling somewhere. You might be getting the same total cash across your user base in both of these

scenarios, but the solutions are different. Typically, the difference between them is shown with two buzzwords, ARPU (Average Revenue Per User) and ARPPU (Average Revenue Per Paying User). I wish we called them players rather than users, but it wasnt my call. At any rate, in the first example with a minority of players paying a lot when most people play for free, ARPPU will be really high; in the second case, ARPPU will be really low, even if ARPU is the same for both games. Of course, total number of players is also a consideration, not just the average. If your ARPU and ARPPU are both great but youve got a player base of a few thousand when you should have a few million, then thats probably more of a marketing problem than a game design problem. It depends on whats happening to your player base over time, and where you are in the tail of your sales curve. So these three things, sales, ARPU and ARPPU, can give you a lot of information about whether your problem is with acquisition (that is, getting people to try your game the first time), conversion (getting them to pay you money the first time), or retention (getting players to keep coming back for more). And when you overlap these with the changes you make in your game and the updates you offer, a lot of times you can get some really useful correlations between certain game mechanics and increased sales. Another interesting metric to look at is the graph of time-vs-money for the average user. How much do people give you on the day they start their account? What about the day after that, and the day after that? Do you see a large wad of cash up front and then nothing else? A decreasing curve where players try for free for awhile, then spend a lot, then spend incrementally smaller amounts until they hit zero? An increasing curve where players spend a little, then a bit more, then a bit more, until a sudden flameout where they drop your game entirely? Regular small payments on a traditional long tail model? What does this tell you about the value youre delivering to players in your early game mid-game late game elder game progression? While youre looking at revenue, dont forget to take your costs into account. There are two kinds of costs: up-front development, and ongoing costs. The up-front costs are things like development of new features, including both the good ones that increase revenue and also the bad ones that you try out and then discard; keep in mind that your ratio of good-to-bad features will not be perfect, so you have to count some portion of the bad ideas as part of the cost in developing the good ones (this is a type of sunk cost like we discussed in Week 6 when we talked about situational balance). Ongoing costs are things like bandwidth and server costs and customer support, which tend to scale with the number of players. Since a business usually wants to maximize its profits (that is, the money it takes in minus the money it spends) and not its revenue (which is just the money it takes in), youll want to factor these in if youre trying to optimize your development resources. A word of warning (gosh, I seem to be giving a lot of warnings this week): statistics are great at analyzing the past, but theyre a lot trickier if you try to use them to predict the future. For example, a really hot game that just launched might have what initially looks like an exponentially-increasing curve. Its tempting to assume, especially if its a really tight fit with an exponential function, that the trend will continue. 
But common sense tells us this cant continue indefinitely: the human population is finite, so if your exponential growth is faster than human population growth it has to level off eventually. Business growth curves are usually not exponential, but instead what is called S-shaped where it starts as an exponentially increasing curve and eventually transitions to a logarithmically (that is, slowly) increasing curve, and then eventually levels off or starts decreasing. A lot of investors get really burned when they mistake an S curve for an exponential increase, as we saw (more or less) with the dot-com crash about 10 years ago. Illegal pyramid schemes also tend to go through this kind of growth curve, with the exception that once they reach the peak of the S theres usually a very sudden crash. A Note on Ethics

This is the second time this Summer when talking about game balance that Ive brought up an issue of professional ethics. Its weird how this comes up in discussions of applied mathematics, isnt it? Anyway The ethical consideration here is that a lot of these metrics look at player behavior but they dont actually look at the value added (or removed) from the players lives. Some games, particularly those on Facebook which have evolved to make some of the most efficient use of metrics of any games ever made, have also been accused (by some people) of being blatantly manipulative, exploiting known flaws in human psychology to keep their players playing (and giving money) against their will. Now, this sounds silly when taken to the extreme, because we think of games as something inherently voluntary, so the idea of a game holding us prisoner seems strange. On the other hand, any game youve played for an extended period of time is a game you are emotionally invested in, and that emotional investment does have cash value. If it seems silly to you that Id say a game makes you spend money, consider this: suppose I found all of your saved games and put them in one place. Maybe some of these are on console memory cards or hard disks. Maybe some of them are on your PC hard drive. For online games, your saved game is on some companys server somewhere. And then suppose I threatened to destroy all of them but not to worry, Id replace the hardware. So you get free replacements of your hard drive and console memory cards, a fresh account on every online game you subscribe to, and so on. And then suppose I asked you, how much would you pay me to not do that. And I bet when you think about it, the answer is more than zero, and the reason is that those saved games have value to you! And more to the point, if one of these games threatened to delete all your saves unless you bought some extra downloadable content, you would at least consider it not because you wanted to gain the content, but because you wanted to not lose your save. To be fair, all games involve some kind of psychological manipulation, just like movies and books and all other media (theres that whole thing about suspending our disbelief, for example). And most people dont really have a problem with this; they still see the game experience itself as a net value-add to their life, by letting them live more in the hours they spend playing than they would have lived had they done other activities. But just like difficulty curves, the difference between value added and taken away is not constant; its different from person to person. This is why we have things like MMOs that enhance the lives of millions of subscribers, while also causing horrendous bad events in the lives of a small minority that lose their marriage and family to their game obsession, or that play for so long without attending to basic bodily needs that they keel over and die at the keyboard. So there is a question of how far we can push our players to give us money, or just to play our game at all, before we cross an ethical line especially in the case where our game design is being driven primarily by money-based metrics. As before, I invite you to think about where you stand on this, because if you dont know, the decision will be made for you by someone else who does. 
If Youre Working on a Game Now If youre working on a game now, as you might guess, my suggestion for any game youre working on is to ask yourself what game design questions could be best answered through metrics: What aspects of your design (especially relating to game balance) do you not know the answers to, at this point in time? Make a list. Of those open questions, which ones could be solved through playtesting, taking metrics, and analyzing them? Choose one question from the remaining list that is, in your opinion, the most vital to your gameplay. Figure out what metrics you want to use, and how you will use statistics to draw conclusions. What are the different things you might see? What would they mean? Make sure you know how youll interpret the data in advance. If youre doing a video game, make sure the game has some way of logging the information

you want. If its a board game, run some playtests and start measuring! Homework This is going to be mostly a thought experiment, more than practical experience, because I couldnt think of any way to force you to actually collect metrics on a game that isnt yours. Choose your favorite genre of game. Maybe an FPS, or RTS, CCG, tabletop RPG, Euro board game, or whatever. Now choose what you consider to be an archetypal example of such a game, one that youre familiar with and preferably that you own. Pretend that you were given the rights to do a remake of this game (not a sequel), that is, your intention was to keep the core mechanics basically the same but just to possibly make some minor changes for the purpose of game balance. Think of it as a version 2.0 of the original. You might have some areas where you already suspect, from your designers instinct, that the game is unbalanced but lets assume you want to actually prove it. Come up with a metrics plan. Assume that you have a ready supply of playtesters, or else existing play data from the initial release, and its just a matter of asking for the data and then analyzing it. Generate a list: What game balance questions would you want answers to, that could be answered with statistical analysis? What metrics would you use for each question? (Its okay if there is some overlap here, where several questions use some of the same metrics.) What analysis would you perform on your metrics to get the answers to each question? That is, what would you do to the data (such as taking means, medians and standard deviations, or looking for correlations)? If your questions are yes or no, what would a yes or no answer look like once you analyzed the data? Additional Resources Here are a few links, in case you didnt get enough reading this week. Much of what I wrote was influenced by these: http://chrishecker.com/Achievements_Considered_Harmful%3F and http://chrishecker.com/Metrics_Fetishism Game designer Chris Hecker gave a wonderful GDC talk this year called Achievements Considered Harmful which talks about a different kind of metric the Achievements we use to measure and reward player performance within a game and why this might or might not be such a good idea. In the second article, he talks about what he calls Metrics Fetishism, basically going into the dangers of relying too much on metrics and not enough on common sense. http://www.gamasutra.com/view/news/29916/GDC_Europe_Playfishs_Valadares_on_Intuition_Ver sus_Metrics_Make_Your_Own_Decisions.php This is a Gamasutra article quoting Playfish studio director Jeferson Valadares at GDC Europe, suggesting when to use metrics and when to use your actual game design skills. http://www.lostgarden.com/2009/08/flash-love-letter-2009-part-2.html Game designer Dan Cook writes on the many benefits of metrics when developing a Flash game. http://www.gamasutra.com/features/20070124/sigman_01.shtml
Written by the same guy who did the Orc Nostril Hair probability article, this time giving a basic primer on statistics rather than probability.

Level 9: Intransitive Mechanics


This Week Welcome back! Today were going to learn about how to balance intransitive mechanics. As a reminder, intransitive is just a geeky way of saying games like Rock-Paper-Scissors that is, games where there is no single dominant strategy, because everything can be beaten by something else. We see intransitive mechanics in games all the time. In fighting games, a typical pattern is that normal attacks are defeated by blocks, blocks are defeated by throws, and throws are defeated by attacks. In real-time strategy games, a typical pattern is that you have fliers that can destroy infantry, infantry that works well against archers, and archers are great at bringing down fliers. Turn-based strategy games often have some units that work well against others, an example pattern being that heavy tanks lose to anti-tank infantry which loses to normal infantry which lose to heavy tanks. First-person shooters sometimes have an intransitive relationship between different weapons or vehicles, like rocket launchers being good against tanks (since theyre slow and easy to hit) which are good against light vehicles (which are destroyed by the tanks fast rate of fire once they get in range) which in turn are good against rocket launchers (since they can dodge and weave around the slow incoming rockets). MMOs and tabletop RPGs often have some character classes that are particularly good at fighting against other classes, as well. So you can see that intransitive mechanics are in all kinds of places. Some of these relationships might not be immediately obvious. For example, consider a game where one kind of unit has long-range attacks, which is defeated by a short-range attacker who can turn invisible; this in turn is defeated by a medium-range attacker with radar that reveals invisible units; and the medium-range attacker is of course weak against the long-range attacker. Sometimes its purely mathematical; in Magic: the Gathering, a 1/3 creature will lose in combat to a 3/2 creature, which loses to a 2/1 First Strike creature, which in turn loses to the original 1/3 creature. Within the metagame of a CCG you often have three or four dominant decks, each one designed to beat one or more of the other ones. These kinds of things arent even necessarily designed with the intention of being intransitive, but that is what ends up happening. Solutions to intransitive mechanics Today were going to get our hands pretty dirty with some of the mathiest math weve done so far, borrowing when needed from the tools of algebra, linear algebra, and game theory. In the process well learn how to solve intransitive mechanics, so that we can learn more about how these work within our game and what we can expect from player behavior at the expert level. What does a solution look like here? It cant be a cost curve, because each choice wins sometimes and loses sometimes. Instead its a ratio of how often you choose each available option, and how often you expect your opponent to choose each of their options. For example, building an army of 30% archers, 50% infantry, 20% fliers (or 3:5:2) might be a solution to an intransitive game featuring those units, under certain conditions. As a game designer, you might desire certain game objects to be used more or less frequently than others, and by changing the relative costs and availability of each object you can change the optimal mix of objects that players will use in play. 
By designing your game specifically to have one or more optimal strategies of your choosing, you will know ahead of time how the game is likely to develop. For example, you might want certain things to only happen rarely during normal play but be spectacular when they do, and if you understand how your costs affect relative frequencies, you can design a game to be like that intentionally. (Or, if it seems like in playtesting, your players are using one thing a lot more than another, this kind of analysis may be able to shed light on why that is.)

Who Cares?
It may be worth asking, if all intransitive mechanics are just glorified versions of Rock-Paper-Scissors, what's the appeal? Few people play Rock-Paper-Scissors for fun, so why should they enjoy a game that just uses the same mechanics and dresses them differently? For one thing, an intransitive game is at least more interesting than one with a single dominant strategy (Rock-Rock-Rock) because you will see more variety in play. For another, an intransitive mechanic embedded in a larger game may still allow players to change or modify their strategies in mid-game. Players may make certain choices in light of what they observe other players doing now (in real-time), particularly in action-based games where you must react to your opponent's reaction to your reaction to their action in the space of a few milliseconds. In games with bluffing mechanics, players may make choices based on what they've observed other players doing in the past and trying to use that to infer their future moves, which is particularly interesting in games of partial but incomplete information (like Poker). So, hopefully you can see that just because a game has an intransitive mechanic, does not mean it's as dull as Rock-Paper-Scissors. Additionally, intransitive mechanics serve as a kind of emergency brake on runaway dominant strategies. Even if you don't know exactly what the best strategy in your game is, if all strategies have an intransitive relationship, you can at least know that there will not be a single dominant strategy that invalidates all of the others, because it will be weak against at least one other counter-strategy. Even if the game itself is unbalanced, intransitive mechanics allow for a metagame correction; not an ideal thing to rely on exclusively (such a thing would be very lazy design), but better to have a safety net than not if you're releasing a game where major game balance changes can't be easily made after the fact. So, if I've managed to convince you that intransitive mechanics are worth including for at least some kinds of games, get ready and let's learn how to solve them!

Solving the basic RPS game
Let's start by solving the basic game of Rock-Paper-Scissors to see how this works. Since each throw is theoretically as good as any other, we would expect the ratio to be 1:1:1, meaning you choose each throw equally often. And that is what we'll find, but it's important to understand how to get there so that we can solve more complex problems. First, let's look at the outcomes. Let's call our opponent's throws r, p and s, and our throws R, P and S (we get the capital letters because we're awesome). Since winning and losing are equal and opposite (that is, one win + one loss balances out) and draws are right in the middle, let's call a win +1 point, a loss -1 point, and a draw 0 points. The math here would actually work for any point values really, but these numbers make it easiest. We now construct a table of results:

          r    p    s
    R     0   -1   +1
    P    +1    0   -1
    S    -1   +1    0

Of course, this is from our perspective for example, if we throw (R)ock and opponent throws (s)cissors, we win, for a net +1 to our score. Our opponents table would be the reverse. Lets re-frame this a little bit, by calling r, p and s probabilities that the opponent will make each respective throw. For example, suppose you know ahead of time that your opponent is using a strategy of r=0.5, p=s=0.25 (that is, they throw 2 rock for every paper or scissors). Whats the best counter-strategy? To answer that question, we can construct a set of three equations that tells you your payoffs for

each throw: Payoff for R = 0r + (-1)p + 1s = s-p Payoff for P = 1r + 0p + (-1)s = r-s Payoff for S = (-1)r + 1p + 0s = p-r So based on the probabilities, you can calculate the payoffs. In the case of our rock-heavy opponent, the payoffs are R=0, P=0.25, S=-0.25. Since P has the best payoff of all three throws, assuming the opponent doesnt vary their strategy at all, our best counter-strategy is to throw Paper every time, and we expect that we will gain 0.25 per throw that is, out of every four throws, well win one more game than we lose. In fact, well find that if our opponent merely throws rock the tiniest, slightest bit more often than the others, the net payoff for P will be better than the others, and our best strategy is still to throw Paper 100% of the time, until our opponent modifies their strategy. This is significant; it tells us that an intransitive mechanic is very fragile, and that even a slight imbalance on the players part can lead to a completely dominant strategy on the part of the opponent. Of course, against a human opponent who notices were always throwing P, their counter-strategy would be to throw a greater proportion of s, which then forces us to throw some R, which then causes them to throw p, which makes us throw S, which makes them throw r, and around and around we go. If were both constantly adjusting our strategies to counter each other, do we ever reach any point where both of us are doing the best we can? Over time, do we tend towards a stable state of some kind? Some Math Theorems Before answering that question, there are a couple of things Im going to ask you to trust me on; people smarter than me have actually proved these mathematically, but this isnt a course in math proofs so Im handwaving over that part of things. I hope youll forgive me for that. First is that if the game mechanics are symmetric (that is, both players have exactly the same set of options and they work the same way), the solution will end up being the same for both players; the opponents probability of choosing Rock is the same as our probability. Second is that each payoff must be the same as the other payoffs; that is, R = P = S; if any strategy is worth choosing at all, it will provide the same payoff as all other valid strategies, because if the payoff were instead any less than the others it would no longer be worth choosing (youd just take something else with a higher payoff), and if it were any higher than the others youd choose it exclusively and ignore the others. Thus, all potential moves that are worth taking have the same payoff. Lastly, in symmetric zero-sum games specifically, the payoff for everything must be zero (because the payoffs are going to be the same for both players due to symmetry, and the only way for the payoffs to sum to zero and still be equal is if theyre both zero). To summarize: All payoffs that are worth taking at all, give an equal payoff to each other. Symmetric zero-sum games have all payoffs equal to zero. Symmetric games have the same solution for all players. Finishing the RPS Solution Lets go back to our equations. Rock-Paper-Scissors is a symmetric zero-sum game, so: 1. R = P = S = 0. Since the opponent must select exactly one throw, we also know the probabilities of their throw add up to 100%:

r + p + s = 1
From here we can solve the system of equations by substitution:
R = 0 = s - p, therefore p = s
P = 0 = r - s, therefore r = s
S = 0 = p - r, therefore p = r
r + p + s = r + r + r = 1, therefore r = 1/3
Since r = p = s, p = 1/3, s = 1/3
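Before interpreting this, here is a quick sketch of both calculations in plain Python: the payoff of each throw against the rock-heavy opponent from a few paragraphs back, and against the 1:1:1 mix we just derived. The dictionary is simply the payoff table from above.

    # Payoff table from above: our throw -> payoffs against the opponent's r, p, s.
    payoff = {
        "R": {"r": 0, "p": -1, "s": +1},
        "P": {"r": +1, "p": 0, "s": -1},
        "S": {"r": -1, "p": +1, "s": 0},
    }

    def expected_payoffs(opponent_mix):
        # Weight each outcome by how often the opponent makes that throw.
        return {ours: sum(prob * payoff[ours][theirs]
                          for theirs, prob in opponent_mix.items())
                for ours in payoff}

    print(expected_payoffs({"r": 0.5, "p": 0.25, "s": 0.25}))  # R: 0, P: +0.25, S: -0.25
    print(expected_payoffs({"r": 1/3, "p": 1/3, "s": 1/3}))    # all zero (up to rounding)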

So our solution is that the opponent should throw r, p and s each with probabilities of 1/3. This suggests that against a completely random opponent it doesn't matter what we choose, our odds of winning are the same no matter what. Of course, the opponent knows this too, so if we choose an unbalanced strategy they can alter their throw ratio to beat us; our best strategy is also to choose each throw with 1/3 probability. Note that in actual play, this does not mean that the best strategy is to actually play randomly (say, by rolling a die secretly before each throw)! As I've said before, when humans try to play randomly, they tend to not do a very good job of it, so in the real world the best strategy is still to play each throw about as often as any other, but at the same time which throw you choose depends on your ability to detect and exploit patterns in your opponent's play, while at the same time masking any apparent patterns in your own play. So our solution of 1:1:1 does not say which throw you must choose at any given time (that is in fact where the skill of the game comes in), but just that over time we expect the optimal strategy to be a 1:1:1 ratio (because any deviation from that hands your opponent a strategy that wins more often over you until you readjust your strategy back to 1:1:1).

Solving RPS with Unequal Scoring
The previous example is all fine and good for Rock-Paper-Scissors, but how can we apply this to something a little more interesting? As our next step, let's change the scoring mechanism. For example, in fighting games there's a common intransitive system that attacks beat throws, throws beat blocks, and blocks beat attacks, but each of these does a different amount of damage, so they tend to have different results in the sense that each choice puts a different amount at risk. How does Rock-Paper-Scissors change when we mess with the costs? Here's an example. Suppose I make a new rule: every win using Rock counts double. You could just as easily frame it like this: in a fighting game, attacks do normal damage, and blocks do the same amount of damage as an attack (let's say that a successful block allows for a counterattack), but that throws do twice as much damage as an attack or block. But let's just say every win with Rock counts double for simplicity here. How does that affect our probabilities? Again we start with a payoff table:

          r    p    s
    R     0   -1   +2
    P    +1    0   -1
    S    -2   +1    0

We then use this to construct our three payoff equations:
R = 2s - p
P = r - s
S = p - 2r
Again, the game is zero-sum and symmetric, and both us and our opponent must choose exactly one throw, so we still have:

R = P = S = 0
r + p + s = 1
Again we solve:
R = 0 = 2s - p, therefore 2s = p
P = 0 = r - s, therefore r = s
S = 0 = p - 2r, therefore 2r = p
r + p + s = r + 2r + r = 1, therefore r = 1/4
r = s, therefore s = 1/4
2r = p, therefore p = 1/2
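If you would rather not push the algebra around by hand, the same system (every payoff equal to some common value, and the probabilities summing to 1) can be handed to a linear solver. A minimal sketch, assuming numpy is available; the function here is just an illustration, not anything from the original text:

    import numpy as np

    def solve_equal_payoffs(payoff_matrix):
        # Rows = our choices, columns = opponent's choices (square matrix).
        # Unknowns: the opponent's probabilities plus the common payoff v.
        m = np.asarray(payoff_matrix, dtype=float)
        n = m.shape[0]
        a = np.zeros((n + 1, n + 1))
        a[:n, :n] = m        # payoff of each of our choices...
        a[:n, n] = -1.0      # ...minus the common payoff v, equals zero
        a[n, :n] = 1.0       # probabilities sum to 1
        b = np.zeros(n + 1)
        b[n] = 1.0
        solution = np.linalg.solve(a, b)
        return solution[:n], solution[n]

    rock_wins_double = [[0, -1, +2],
                        [+1, 0, -1],
                        [-2, +1, 0]]
    print(solve_equal_payoffs(rock_wins_double))
    # -> probabilities [0.25, 0.5, 0.25] for r, p, s, and a payoff of 0

Note that this only makes sense when every choice is actually worth using; if one option is dominated (as with the Dynamite example later in this level), the equal-payoff assumption is invalid and the solve will fail or return probabilities outside 0 to 1, which is itself a useful warning sign.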

So here we get a surprising result: if we double the wins for Rock, the end result is that Paper gets chosen half of the time, while Rock and Scissors each get chosen a quarter of the time! This is an answer you'd be unlikely to come up with on your own without doing the math, but in retrospect it makes sense: since Scissors is such a risky play, players are less likely to choose it. If you know your opponent is not likely to play Scissors, Paper is more likely to either draw or win, so it is actually Paper (and not Rock) that is played more frequently. So if you had a fighting game where a successful throw does twice as much damage as a successful attack or a successful block, but you do as much damage with a block or an attack, then you'd actually expect to see twice as many attack attempts as throws or blocks!

Solving RPS with Incomplete Wins
Suppose we factor resource costs into this. Fighting games typically don't have a cost associated with performing a move (other than time, perhaps), but RTS games usually have actual resource costs to produce units. Let's take a simple RTS game where you have knights that beat archers, archers beat fliers, and fliers beat knights. Let's say further that if you send one type of unit against the same type, they kill each other mutually so there is no net gain or loss on either side, but that it's a little different with winners. Let's say that when knights attack archers, they win, but they still lose 20% of their health to the initial arrow volley before they close the ranks. And let's say against fliers, archers lose 40% of their health to counterattacks. But against knights, fliers take no damage at all, because the knights can't do anything other than stand there and take it (their swords don't work too well against enemies a hundred feet above them, dropping rocks down on them from above). Finally, let's say that knights cost 50 gold, archers cost 75, and fliers cost 100. Now how does this work? We start with the payoff table:

          k                        a                        f
    K     50-50 = 0                (-50*0.2)+75 = +65       -50
    A     -75+(0.2*50) = -65       75-75 = 0                (-75*0.4)+100 = +70
    F     +50                      -100+(75*0.4) = -70      100-100 = 0

To explain: if we both take the same unit it ends up being zero, that's just common sense, but really what's going on is that we're both paying the same amount and both lose the unit. So we both actually have a net loss, but relative to each other it's still zero-sum (for example, with Knight vs Knight, we gain +50 Gold relative to the opponent by defeating their Knight, but also lose -50 Gold because our own Knight dies as well, and adding those results together we end up with a net gain of zero). What about when our Knight meets an enemy Archer? We kill their Archer, which is worth a 75-gold advantage, but they also reduced our Knight's HP by 20%, so you could say we lost 20% of our Knight cost of 50, which means we lost an equivalent of 10 gold in the process. So the actual outcome is we're up by 65 gold.

When our Knight meets an enemy Flier, we lose the Knight so we're down 50 gold. It didn't hurt the opponent at all. Where does the Flier cost of 100 come in? In this case it doesn't, really; the opponent still has a Flier after the exchange, so they still have 100 gold worth of Flier in play, they've lost nothing (at least, not yet!). So in the case of different costs or incomplete victories, the hard part is just altering your payoff table. From there, the process is the same:

K = 0k + 65a + (-50)f = 65a - 50f
A = (-65)k + 0a + 70f = 70f - 65k
F = 50k + (-70)a + 0f = 50k - 70a
K = A = F = 0
k + a + f = 1

Solving, we find:

K = 0 = 65a - 50f, therefore 65a = 50f
A = 0 = 70f - 65k, therefore 70f = 65k, therefore f = (13/14)k
F = 0 = 50k - 70a, therefore 50k = 70a, therefore a = (10/14)k
k + a + f = k + (10/14)k + (13/14)k = (37/14)k = 1, therefore k = 14/37
f = (13/14)k = (13/14)(14/37), therefore f = 13/37
a = (10/14)k = (10/14)(14/37), therefore a = 10/37

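The same equal-payoff setup, fed to a linear solver, confirms this mix. A small self-contained sketch (numpy assumed):

    import numpy as np

    m = np.array([[0, 65, -50],    # K row: payoff vs k, a, f
                  [-65, 0, 70],    # A row
                  [50, -70, 0]],   # F row
                 dtype=float)
    a = np.block([[m, -np.ones((3, 1))],                   # each payoff minus v = 0
                  [np.ones((1, 3)), np.zeros((1, 1))]])    # k + a + f = 1
    b = np.array([0.0, 0.0, 0.0, 1.0])
    mix = np.linalg.solve(a, b)[:3]
    print(mix, mix * 37)   # about [0.38, 0.27, 0.35], i.e. 14/37, 10/37, 13/37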

In this case you'd actually see a pretty even mix of units, with knights being a little more common and archers a little less. If you wanted fliers to be more rare you could play around with their costs, or allow knights to do a little bit of damage to them, or something.

Solving RPS with Asymmetric Scoring
So far we've assumed a game that's symmetric: we both have the exact same set of throws, and we both win or lose the same amount according to the same set of rules. But not all intransitive games are perfectly symmetric. For example, suppose I made a Rock-Paper-Scissors variant where each round, I flip up a new card that alters the win rewards. This round, my card says that my opponent gets two points for a win with Rock, but I don't (I would just score normally). How does this change things? It actually complicates the situation a great deal, because now both players must figure out the probabilities of their opponent's throws, and those probabilities may not be the same anymore! Let's say that Player A has the double-Rock-win bonus, and Player B does not. What's the optimal strategy for both players? And how much of an advantage does this give to Player A, if any? Let's find out by constructing two payoff tables. Player A's payoff table looks like this:

          rB   pB   sB
    RA     0   -1   +2
    PA    +1    0   -1
    SA    -1   +1    0

Player B's payoff table looks like this:

          rA   pA   sA
    RB     0   -1   +1
    PB    +1    0   -1
    SB    -2   +1    0

Here we can assume that RA=PA=SA and RB=PB=SB, and also that rA+pA+sA = rB+pB+sB = 1. However, we cannot assume that RA=PA=SA=RB=PB=SB=0, because we don't actually know that the payoffs for players A and B are equal; in fact, intuition tells us they probably aren't! We now have this intimidating set of equations:

RA = 2sB - pB
PA = rB - sB
SA = pB - rB
RB = sA - pA
PB = rA - sA
SB = pA - 2rA
RA = PA = SA
RB = PB = SB
rA + pA + sA = 1
rB + pB + sB = 1

We could do this the hard way through substitution, but an easier way is to use matrices. Here's how it works: we rewrite the payoff tables as matrices. Here's the first one:

    [ RA    0   -1   +2 ]
    [ SA   -1   +1    0 ]
    [ PA   +1    0   -1 ]

Here, the left column represents the left side of the first three equations above, the second column is rB, the third column is pB, and the fourth column is sB. Two changes for clarity: first, let's move the leftmost column to the right instead, which will make it easier to work with; and second, since RA=PA=SA, let's just replace them all with a single variable X, which represents the net payoff for Player A:

    [  0   -1   +2   X ]
    [ -1   +1    0   X ]
    [ +1    0   -1   X ]

This is just a shorthand way of writing down these three equations, omitting the variable names but keeping them all lined up in the same order so that each column represents a different variable:

    0rB - 1pB + 2sB = X
    -1rB + 1pB + 0sB = X
    +1rB + 0pB - 1sB = X

Algebra tells us we can multiply everything in an equation by a constant and it's still true (which means we could multiply any row of the matrix by any value and it's still valid, as long as we multiply all four entries in the row by the same amount). Algebra also tells us that we can add both sides of two equations together and the result is still true, meaning we could add each entry of two rows together and the resulting row is still a valid entry (which we could use to add to the rows already there, or even replace an existing row with the new result). And we can also rearrange the rows, because all of them are still true no matter what order we put them in. What we want to do here is put this matrix in what's called triangular form, that is, of the form where everything under the diagonal is zeros, and the diagonals themselves (marked here with an asterisk) have to be nonzero:

    [ *   ?   ?   ? ]
    [ 0   *   ?   ? ]
    [ 0   0   *   ? ]

So, first we reorder them by swapping the top and middle rows:

    [ -1   +1    0   X ]
    [  0   -1   +2   X ]
    [ +1    0   -1   X ]

To eliminate the +1 in the bottom row, we add the top and bottom rows together and replace the bottom row with that:

      -1   +1    0    X
    + +1    0   -1    X
    ---------------------
       0   +1   -1   2*X

Our matrix is now:

    [ -1   +1    0    X  ]
    [  0   -1   +2    X  ]
    [  0   +1   -1   2*X ]

Now we want to eliminate the +1 on the bottom row, so we add the middle and bottom rows together and replace the bottom row with the result:

    [ -1   +1    0    X  ]
    [  0   -1   +2    X  ]
    [  0    0   +1   3*X ]

Now we can write these in the standard equation forms and solve, going from the bottom up, using substitution:
+1(sB) = 3*X, therefore sB = 3*X
-1(pB) + 2(sB) = X, therefore -1(pB) + 2(3*X) = X, therefore pB = 5*X
-1(rB) + 1(pB) = X, therefore rB = 4*X
At this point we don't really need to know what X is, but we do know that the ratio for Player B is 3 Scissors to 5 Paper to 4 Rock. Since sB+pB+rB = 1, this means:
rB = 4/12
pB = 5/12
sB = 3/12
We can use the same technique with the second set of equations to figure out the optimal ratio for Player A. Again, the payoff table is:

          rA   pA   sA
    RB     0   -1   +1
    PB    +1    0   -1
    SB    -2   +1    0

This becomes the following matrix:

    [ RB    0   -1   +1 ]
    [ PB   +1    0   -1 ]
    [ SB   -2   +1    0 ]

Again we reorganize, and since RB=PB=SB, lets call these all a new variable Y (we dont use X to avoid confusion with the previous X; remember that the payoff for one player may be different from

the other here). Let's swap the bottom and top this time, along with replacing the payoffs by Y:

    [ -2   +1    0   Y ]
    [ +1    0   -1   Y ]
    [  0   -1   +1   Y ]

To eliminate the +1 in the center row, we have to multiply the center row by 2 before adding it to the top row (or, multiply the top row by 1/2, but I find it easier to multiply by whole numbers than fractions).

      -2     +1     0      Y
    + +1*2   0*2   -1*2    Y*2
    ----------------------------
       0     +1    -2     Y*3

Our matrix is now:

    [ -2   +1    0    Y  ]
    [  0   +1   -2   Y*3 ]
    [  0   -1   +1    Y  ]

Adding second and third rows to eliminate the -1 in the bottom row we get:

    [ -2   +1    0    Y  ]
    [  0   +1   -2   Y*3 ]
    [  0    0   -1   Y*4 ]

Again working backwards and substituting:
sA = -Y*4
pA - 2sA = Y*3, therefore pA = -Y*5
-2rA + pA = Y, therefore -2rA = 6Y, therefore rA = -Y*3
Now, it might seem kind of strange that we get a bunch of negative numbers here when we got positive ones before. This is probably just a side effect of the fact that the average payoff for Player A is probably positive while Player B's is probably negative, but in either case it all factors out because we just care about the relative ratio of Rock to Paper to Scissors. For Player A, this is 3 Rock to 4 Scissors to 5 Paper:
rA = 3/12
pA = 5/12
sA = 4/12
This is slightly different from Player B's optimal mix:
rB = 4/12
pB = 5/12
sB = 3/12
Now, we can use this to figure out the actual advantage for Player A. We could do this through actually making a 12x12 chart and doing all 144 combinations and counting them up using probability, or we could do a Monte Carlo simulation, or we could just plug these values into our existing equations. For me that last one is the easiest, because we already have a couple of equations from earlier that directly relate these together:
sA = -Y*4, therefore Y = -1/12
rB = X*4, therefore X = +1/12
We know that RA=PA=SA and RB=PB=SB, so this means the payoff for Player A is +1/12 and for Player B it's -1/12. This makes a lot of sense and acts as a sanity check: since this is still a zero-sum game, we know that the payoff for A must be equal to the negative payoff for B. In a symmetric game both would have to be zero, but this is not symmetric.

That said, it turns out that if both players play optimally, the advantage is surprisingly small: only one extra win out of every 12 games!

Solving Extended RPS
So far all of the relationships we've analyzed have had only three choices. Can we use the same technique with more? Yes, it just means we do the same thing but more of it. Let's analyze the game Rock-Paper-Scissors-Lizard-Spock. In this game, Rock beats Scissors and Lizard; Paper beats Rock and Spock; Scissors beats Paper and Lizard; and Lizard beats Spock (and Lizard beats Paper, Spock beats Scissors and Rock). Our payoff table is (with k for Spock since there's already an s for Scissors, and z for Lizard so it doesn't look like the number one):

          r    p    s    z    k
    R     0   -1   +1   +1   -1
    P    +1    0   -1   -1   +1
    S    -1   +1    0   +1   -1
    Z    -1   +1   -1    0   +1
    K    +1   -1   +1   -1    0
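Exactly as in the earlier examples, the solution comes from requiring all five payoffs to be equal and the five probabilities to sum to 1. Here is a sketch of handing that system to a linear solver (numpy assumed), as an alternative to the spreadsheet approach described next; because the payoff and the probabilities are solved together, this particular table has a unique solution and it sidesteps the singular-matrix error mentioned below:

    import numpy as np

    rpsls = np.array([[0, -1, +1, +1, -1],   # R vs r, p, s, z, k
                      [+1, 0, -1, -1, +1],   # P
                      [-1, +1, 0, +1, -1],   # S
                      [-1, +1, -1, 0, +1],   # Z
                      [+1, -1, +1, -1, 0]],  # K
                     dtype=float)
    n = 5
    a = np.block([[rpsls, -np.ones((n, 1))],                 # each payoff minus v = 0
                  [np.ones((1, n)), np.zeros((1, 1))]])      # probabilities sum to 1
    b = np.zeros(n + 1)
    b[-1] = 1.0
    solution = np.linalg.solve(a, b)
    print(solution[:n], solution[n])   # 0.2 for every throw, and a payoff of 0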

We also know r+p+s+z+k=1, and R=P=S=Z=K=0. We could solve this by hand as well, but there's another way to do this using Excel which makes things slightly easier sometimes. First, you would enter in the above matrix in a 5x5 grid of cells somewhere. You'd also need to add another 5x1 column of all 1s (or any non-zero number, really) to represent the variable X (the payoff) to the right of your 5x5 grid. Then, select a new 5x1 column that's blank (just click and drag), and then enter this formula in the formula bar:
=MMULT(MINVERSE(A1:E5),F1:F5)
For the MINVERSE parameter, put the top left and lower right cells of your 5x5 grid (I use A1:E5 if the grid is in the extreme top left corner of your worksheet). For the final parameter (I use F1:F5 here), give the 5x1 column of all 1s. Finally, and this is important, press Ctrl+Shift+Enter when you're done typing in the formula (not just Enter). This propagates the formula to all five cells that you've highlighted and treats them as a unified array, which is necessary. One warning is that this method does not always work; in particular, if there are no solutions or infinite solutions, it will give you #NUM! as the result instead of an actual number. In fact, if you enter in the payoff table above, it will give you this error; by setting one of the entries to something very slightly different (say, changing one of the +1s to +0.999999), you will generate a unique solution that is only off by a tiny fraction, so round it to the nearest few decimal places for the real answer. Another warning is that anyone who actually knows a lot about math will wince when you do this, because it's kind of cheating and you're really not supposed to solve a matrix like that. Excel gives us a solution of 0.2 for each of the five variables, meaning that it is equally likely that the opponent will choose any of the five throws. We can then verify that yes, in fact, R=P=S=Z=K=0, so it doesn't matter which throw we choose, any will do just as well as any other if the opponent plays randomly with equal chances of each throw.

Solving Extended RPS with Unequal Relationships
Not all intransitive mechanics are equally balanced. In some cases, even without weighted costs, some throws are just better than other throws. For example, let's consider the unbalanced game of Rock-Paper-Scissors-Dynamite. The idea is that with this fourth throw, Dynamite beats Rock (by explosion), and Scissors beats Dynamite (by cutting the wick). People will argue which should win in a contest between Paper and Dynamite, but for our purposes let's say Dynamite beats Paper.

In theory this makes Dynamite and Scissors seem like really good choices, because they both beat two of the three other throws. It also makes Rock and Paper seem like poor choices, because they both lose to two of the other three throws. What does the actual math say? Our payoff table looks like this:

          r    p    s    d
    R     0   -1   +1   -1
    P    +1    0   -1   -1
    S    -1   +1    0   +1
    D    +1   +1   -1    0

Before we go any further, we run into a problem: if you look closely, you'll see that Dynamite is better than or equal to Paper in every situation. That is, for every entry in the P row, it is either equal or less than the corresponding entry in the D row (and likewise, every entry in the p column is worse or equal to the d column). Both Paper and Dynamite lose to Scissors, both beat Rock, but against each other Dynamite wins. In other words, there is no logical reason to ever take Paper because whenever you'd think about it, you would take Dynamite instead! In game theory terms, we say that Paper is dominated by Dynamite. If we tried to solve this matrix mathematically like we did earlier, we would end up with some very strange answers and we'd quickly find it was unsolvable (or that the answers made no sense, like a probability for r, p, s or d that was less than zero or greater than one). The reason it wouldn't work is that at some point we would make the assumption that R=P=S=D, but in this case that isn't true; the payoff for Paper must be less than the payoff for Dynamite, so it is an invalid assumption. To fix this, before proceeding, we must systematically eliminate all choices that are dominated. In other words, remove Paper as a choice. The new payoff table becomes:

          r    s    d
    R     0   +1   -1
    S    -1    0   +1
    D    +1   -1    0
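This crossing-out can be automated. A sketch (numpy assumed) that removes any row the player would never pick and any column the opponent would never pick, and, as the next paragraph explains, keeps re-checking until nothing more can be removed:

    import numpy as np

    def eliminate_dominated(m, rows, cols):
        # m: our payoff matrix (rows = our choices, columns = opponent's).
        # Remove a choice if some other choice is at least as good everywhere
        # and strictly better somewhere; repeat until nothing changes.
        m = np.asarray(m, dtype=float)
        changed = True
        while changed:
            changed = False
            for i in range(len(rows)):          # our dominated rows
                for j in range(len(rows)):
                    if i != j and np.all(m[j] >= m[i]) and np.any(m[j] > m[i]):
                        m = np.delete(m, i, axis=0)
                        rows = rows[:i] + rows[i + 1:]
                        changed = True
                        break
                if changed:
                    break
            if changed:
                continue
            for i in range(len(cols)):          # opponent's dominated columns
                for j in range(len(cols)):      # (the opponent prefers low numbers)
                    if i != j and np.all(m[:, j] <= m[:, i]) and np.any(m[:, j] < m[:, i]):
                        m = np.delete(m, i, axis=1)
                        cols = cols[:i] + cols[i + 1:]
                        changed = True
                        break
                if changed:
                    break
        return m, rows, cols

    rpsd = [[0, -1, +1, -1],
            [+1, 0, -1, -1],
            [-1, +1, 0, +1],
            [+1, +1, -1, 0]]
    print(eliminate_dominated(rpsd, list("RPSD"), list("rpsd")))
    # Paper drops out on both sides, leaving the same 3x3 table shown above.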

We check again to see if, after the first set of eliminations, any other strategies are now dominated (sometimes a row or column isn't strictly dominated by another, until you cross out some other dominated choices, so you do have to perform this procedure repeatedly until you eliminate everything). Again, to check for dominated strategies, you must compare every pair of rows to see if one dominates another, and then every pair of columns in the same way. Yes, this means a lot of comparisons if you give each player ten or twelve choices! In this case eliminating Paper was all that was necessary, and in fact we're back to the same exact payoff table as with the original Rock-Paper-Scissors, but with Paper being renamed to Dynamite. And now you know, mathematically, why it never made sense to add Dynamite as a fourth throw.

Another Unequal Relationship
What if instead we created a new throw that wasn't weakly dominated, but that worked a little different than normal? For example, something that was equivalent to Scissors except it worked in reverse order, beating Rock but losing to Paper? Let's say Construction Vehicle (C), which bulldozes (wins against) Rock, is given a citation by (loses against) Paper, and draws with Scissors because neither of the two can really interact much. Now our payoff table looks like this:

          r    p    s    c
    R     0   -1   +1   -1
    P    +1    0   -1   +1
    S    -1   +1    0    0
    C    +1   -1    0    0

Here, no single throw is strictly better than any other, so we start solving. We know r+p+s+c=1, and the payoffs R=P=S=C=0. Our matrix becomes:

    [  0   -1   +1   -1   0 ]
    [ +1    0   -1   +1   0 ]
    [ -1   +1    0    0   0 ]
    [ +1   -1    0    0   0 ]

Rearranging the rows to get non-zeros along the diagonal, we get this by reversing the order from top to bottom:

    [ +1   -1    0    0   0 ]
    [ -1   +1    0    0   0 ]
    [ +1    0   -1   +1   0 ]
    [  0   -1   +1   -1   0 ]

Zeroing the first column by adding the first two rows, and subtracting the third from the first, we get:

    [ +1   -1    0    0   0 ]
    [  0    0    0    0   0 ]
    [  0   -1   +1   -1   0 ]
    [  0   -1   +1   -1   0 ]

Curious! The second row is all zeros (which gives us absolutely no useful information, as it's just telling us that zero equals zero), and the bottom two rows are exactly the same as one another (which means the last row is redundant and again tells us nothing extra). We are left with only two rows of useful information. In other words, we have two equations (three if you count r+p+s+c=1) and four unknowns. What this means is that there is actually more than one valid solution here, potentially an infinite number of solutions. We figure out the solutions by hand:
r - p = 0, therefore r = p
-p + s - c = 0, therefore c = s - p
Substituting into r+p+s+c=1, we get:
p + p + s + (s-p) = 1, therefore p + 2s = 1, therefore p = 1-2s (and therefore, r = 1-2s).
Substituting back into c = s-p, we get c = s-1+2s, therefore c = 3s-1.
We have thus managed to put all three other variables in terms of s:
p = 1-2s
r = 1-2s
c = 3s-1
So it would seem at first that there are in fact an infinite number of solutions: choose any value for s, then that will give you the corresponding values for p, r, and c. But we can narrow down the

ranges even further. How? By remembering that all of these variables are probabilities, meaning they must all be in the range of 0 (if they never happen) to 1 (if they always happen). Probabilities can never be less than zero or greater than 1. This lets us limit the range of s. For one thing, we know it must be between 0 and 1. From the equation c=3s-1, we know that s must be at least 1/3 (otherwise c would be negative) and s can be at most 2/3 (otherwise c would be greater than 100%). Looking instead at p and r, we know s can range from 0 up to 1/2. Combining the two ranges, s must be between 1/3 and 1/2. This is interesting: it shows us that no matter what, Scissors is an indispensible part of all ideal strategies, being used somewhere between a third and half of the time. At the lower boundary condition (s=1/3), we find that p=1/3, r=1/3, c=0, which is a valid strategy. At the upper boundary (s=1/2), we find p=0, r=0, c=1/2. And we could also opt for any strategy in between, say s=2/5, p=1/5, r=1/5, c=1/5. Are any of these strategies better than the others, such that a single one would win more than the others? That unfortunately requires a bit more game theory than I wanted to get into today, but I can tell you the answer is it depends based on certain assumptions about how rational your opponents are, whether the players are capable of making occasional mistakes when implementing their strategy, and how much the players know about how their opponents play, among other things. For our purposes, we can say that any of these is as good as any other, although Im sure professional game theorists could philosophically argue the case for certain values over others. Also, for our purposes, we could say that Construction Vehicle is probably not a good addition to the core game of Rock-Paper-Scissors, as it allows one winning strategy where the throw of C can be completely ignored, and another winning strategy where both P and R are ignored, making us wonder why were wasting development resources on implementing two or three throws that may never even see play once the players are sufficiently skilled! Solving the Game of Malkav So far weve systematically done away with each of our basic assumptions: that a game has a symmetric payoff, that its zero-sum, that there are exactly three choices. Theres one other thing that we havent covered in the two-player case, and thats what happens if the players have a different selection of choices not just an asymmetric payoff, but an asymmetric game. If we rely on there being exactly as many throws for one player as the other, what happens when one player has, say, six different throws when their opponent has only five? It would seem such a problem would be unsolvable for a unique solution (there are six unknowns and only five equations, right?) but in fact it turns out we can use a more powerful technique to solve such a game uniquely, in some cases. Let us consider a card called Game of Malkav from an obscure CCG that most of you have probably never heard of. It works like this: all players secretly and simultaneously choose a number. The player who played this card chooses between 1 and 6, while all other players choose between 1 and 5. Each player gains as much life as the number they choose unless another player chose a number exactly one less, in which case they lose that much life instead. So for example if you choose 5, you gain 5 life, unless any other player chose 4. 
If anyone else chose 4, you lose 5 life and they gain 4, unless someone else also chose 3, and so on. This can get pretty complicated with more players, so let's simply consider the two-player case. Let's also make the simplifying assumption that the game is zero-sum, and that you gaining 1 life is equivalent in value to your opponent losing 1 life (I realize this is not necessarily valid, and this will vary based on relative life totals, but at least it's a starting point for understanding what this card is actually worth). We might wonder, what is the expected payoff of playing this card, overall? Does the additional option of playing 6, when your opponent can only play up to 5, actually give you an advantage?

What is the best strategy, and what is the expected end result? In short, is the card worth playing and if so, when you play it, how do you decide what to choose? As usual, we start with a payoff table. Let's call the choices P1-P6 (for the Player who played the card), and O1-O5 (for the Opponent):

          O1   O2   O3   O4   O5
    P1     0   +3   -2   -3   -4
    P2    -3    0   +5   -2   -3
    P3    +2   -5    0   +7   -2
    P4    +3   +2   -7    0   +9
    P5    +4   +3   +2   -9    0
    P6    +5   +4   +3   +2  -11

We could try to solve this, and there do not appear to be any dominated picks for either player, but we will quickly find that the numbers get very hairy very fast and also that it ends up being unsolvable for reasons that you will find if you try. Basically, with 6 equations and 5 unknowns, there is redundancy except in this case, no rows cancel, and instead you end up with at least two equations that contradict each other. So there must actually be some dominated strategies here its just that they arent immediately obvious, because there are a set of rows or columns that are collectively dominated by another set, which is much harder to just find by looking. How do we find them? We start by finding the best move for each player, if they knew what the opponent was doing ahead of time. For example, if the opponent knows we will throw P1, their best move is O5 (giving them a net +4 and us a net -4). But then we continue by reacting to their reaction: if the player knows the opponent will choose O5, their best move is P4. But against P4, the best move is O3. Against O3, the best move is P2. Against P2, there are two equally good moves: O1 and O5, so we consider both options: Against O5, the best response is P4, as before (and we continue around in the intransitive sequence O5->P4->O3->P2->O5 indefinitely). Against O1, the best response is P6. Against P6, the best response is O5, which again brings us into the intransitive sequence O5->P4->O3->P2->O1->P6->O5. What if we start at a different place, say by initially throwing P3? Then the opponents best counter is O2, our best answer to that is P6, which then leads us into the O5->P4->O3->P2->O1->P6->O5 loop. If we start with P5, best response is O4, which gets the response P3, which we just covered. What if we start with O1, O2, O3, O4, O5, P2, P4 or P6? All of them are already accounted for in earlier sequences, so theres nothing more to analyze. Thus, we see that no matter what we start with, eventually after repeated play only a small subset of moves actually end up being part of the intransitive nature of this game because they form two intransitive loops (O5/P4/O3/P2, and O5/P4/O3/P2/O1/P6). Looking at these sequences, the only choices ever used by either player are O1, O3, O5 and P2, P4, P6. Any other choice ends up being strictly inferior: for example, at any point where it is advantageous to play P6 (that is, you are expecting a positive payoff), there is no reason you would prefer P5 instead (even if you expect your opponent to play O5, your best response is not P5, but rather P4). By using this technique to find intransitive loops, you can often reduce a larger number of choices to a smaller set of viable ones or at worst, you can prove that all of the larger set are in fact viable. Occasionally you will find a game (Prisoners Dilemma being a famous example, if youve heard of that) where there are one or more locations in the table that are equally advantageous for both players, so that after repeated play we expect all players to be drawn to those locations; game

So in this case, we can reduce the table to the set of meaningful values:

       O1    O3    O5
P2     -3    +5    -3
P4     +3    -7    +9
P6     +5    +3   -11

From there we solve, being aware that this game is not symmetric. We still know that, at the optimum, the expected payoffs of O1, O3 and O5 are all equal to each other, and likewise the payoffs of P2, P4 and P6 are all equal to each other; but we do not know that they are equal to zero, as they would be in a symmetric game. Since the game is zero-sum, the P payoff and the O payoff will be negatives of each other. (Presumably the P payoff is positive and the O payoff is negative, since we would expect the person playing this card to have an advantage, but we'll see.) We construct a matrix from the payoff equations for P2, P4 and P6, using X to stand for their common payoff; the columns are the opponent's probabilities of choosing O1, O3 and O5:

[ -3   +5    -3  | X ]
[ +3   -7    +9  | X ]
[ +5   +3   -11  | X ]

This can be reduced to triangular form and then solved, the same as earlier problems. Feel free to try it yourself! I give the answer below. Now, solving that matrix gets you the probabilities of O1, O3 and O5, but in order to learn the probabilities of choosing P2, P4 and P6 you have to flip the matrix across the diagonal so that the Os are on the left and the Ps are on the top (this is called a transpose). In this case we'd also need to make all the numbers negative, since such a matrix is from Player O's perspective and therefore has the opposite payoffs (here Y stands for the common payoff of O1, O3 and O5):

[ +3   -3    -5  | Y ]
[ -5   +7    -3  | Y ]
[ +3   -9   +11  | Y ]
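If reducing matrices to triangular form is not your idea of fun, you can also hand both of them to a linear solver. Here is a minimal sketch using NumPy (my own helper, assuming you have NumPy available; it just encodes "every row's payoff equals the same common value, and the probabilities sum to 1," which is the same system as above):

```python
import numpy as np

# Reduced payoff table to Player P (rows P2, P4, P6; columns O1, O3, O5).
A = np.array([[-3,  5,  -3],
              [ 3, -7,   9],
              [ 5,  3, -11]], dtype=float)

def solve_mixed(payoff):
    """Column player's mix that makes every row's expected payoff equal.

    Unknowns: the column probabilities plus the common payoff. Equations:
    each row's expected payoff minus the common payoff is zero, and the
    probabilities sum to 1.
    """
    n = payoff.shape[0]
    lhs = np.zeros((n + 1, n + 1))
    lhs[:n, :n] = payoff
    lhs[:n, n] = -1            # ... minus the common payoff = 0
    lhs[n, :n] = 1             # probabilities sum to 1
    rhs = np.zeros(n + 1)
    rhs[n] = 1
    *probs, value = np.linalg.solve(lhs, rhs)
    return np.array(probs), value

o_mix, X = solve_mixed(A)       # O's probabilities of O1, O3, O5, plus P's payoff X
p_mix, Y = solve_mixed(-A.T)    # P's probabilities of P2, P4, P6, plus O's payoff Y
print("O1:O3:O5 =", np.round(o_mix, 3), "  X =", round(X, 3))
print("P2:P4:P6 =", np.round(p_mix, 3), "  Y =", round(Y, 3))
```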

This, too, can be solved normally. If you're curious, the final answers are roughly:

P2 : P4 : P6 = 49% : 37% : 14%
O1 : O3 : O5 = 35% : 41% : 24%

The expected payoff to Player P (shown as X above) is about 0.31, and the payoff for Player O (shown as Y) is the negative of X: about -0.31. In other words, in the two-player case of this game, when both players play optimally, the player who initiated this card gets ahead by an average of less than one-third of a life point; so while we confirm that playing the card and having the extra option of choosing 6 is in fact an advantage, it turns out to be a pretty small one. On the other hand, the possibility of sudden large swings may make it worthwhile in actual play (or maybe not), depending on the deck you're playing. And of course, the game gets much more complicated in multi-player situations that we haven't considered here.

Solving Three-Player RPS
So far we've covered just about every possible case for a two-player game, and you can combine the different methods as needed for just about any application, for any kind of two-player game. Can we extend this kind of analysis to multiple players? After all, a lot of these games involve more than just a single head-to-head; they may involve teams or free-for-all environments.

Teams are straightforward, if there are only two teams: just treat each team as a single player for analysis purposes. Free-for-all is a little harder, because you have to manage multiple opponents, and as we'll see, the complexity tends to explode with each successive player. Three-player games are obnoxious but still quite possible to solve; four-player games are probably the upper limit of what I'd ever attempt by hand using any of the methods I've mentioned today. If you have a six-player free-for-all intransitive game where each player has a different set of options and a massive payoff matrix that gives payoffs to each player for each combination... well, let's just say it can be done, probably requiring the aid of a computer and a professional game theorist, but at this point you wouldn't want to. One thing that game theorists have learned is that the more complex the game, the longer it tends to take human players in a lab to converge on the optimal strategies, which means that for a highly complex game, playtesting will give you a better idea of how the game actually plays in the field than doing the math to prove optimal solutions, because the players probably won't find the optimal solutions anyway. Thus, for a complicated system like that, you're better off playtesting. Or, more likely, you're better off simplifying your mechanics!

Let's take a simple multi-player case: three-player Rock-Paper-Scissors. We define the rules like this: if all players make the same throw, or if all players each choose different throws, we call it a draw. If two players make the same throw and the third player chooses a different one (odd man out), then whoever throws the winning throw gets a point from each loser. So if two players throw Rock and the third throws Scissors, each of the Rock players gets +1 point and the unfortunate Scissors player loses 2 points. Or, if it's reversed, one player throwing Rock while two throw Scissors, the one Rock player gets +2 points while the other two players lose 1 point each. (The idea behind these numbers is to keep the game zero-sum, for simplicity, but you could use this method to solve for any other scoring mechanism.) Of course, we know because of symmetry that the answer to this is 1:1:1, just like the two-player version. So let's throw in the same wrinkle as before: wins with Rock count double (which also means, since this is zero-sum, that losses with Scissors count double). In the two-player case we found the solution of Rock=Scissors=1/4, Paper=1/2. Does this change at all in the three-player case, since there are now two opponents, which makes it even more dangerous to throw Scissors (and possibly even more profitable to throw Rock)?

The trick we need to use here to make this solvable is to look at the problem from a single player's perspective, and treat all the opponents collectively as a single opponent. In this case, we end up with a payoff table that looks like this (rows are my throw, columns are the two opponents' throws taken together, and each entry is my score for the round):

      rr    rp    rs    pp    ps    ss
R      0    -1    +2    -2     0    +4
P     +2    +1     0     0    -1    -2
S     -4     0    -2    +2    +1     0
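If you want to double-check that table, or rebuild it for a different scoring rule, here is a short sketch that generates it straight from the rules above (the function and constant names are my own, and the Rock-doubling rule is hard-coded):

```python
from itertools import combinations_with_replacement

BEATS = {"R": "S", "P": "R", "S": "P"}    # what each throw defeats
DOUBLE = "R"                               # wins with Rock count double

def payoff(me, opp1, opp2):
    """My points for one round of three-player RPS with doubled Rock wins."""
    throws = [me, opp1, opp2]
    kinds = set(throws)
    if len(kinds) != 2:
        return 0                           # all the same, or all different: draw
    a, b = kinds
    winner = a if BEATS[a] == b else b
    mult = 2 if winner == DOUBLE else 1
    winners = throws.count(winner)
    losers = 3 - winners
    return mult * losers if me == winner else -mult * winners

for me in "RPS":
    row = {"".join(pair): payoff(me, *pair)
           for pair in combinations_with_replacement("RPS", 2)}
    print(me, row)
```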

You might say: wait a minute, there are six unknowns here (an r, a p and an s for each of the two opponents) and not nearly enough equations, which means this isn't uniquely solvable. But the good news is that this game is symmetric, so both opponents should end up using the same probabilities, and we actually can solve it, because the two opponents' probabilities are taken together and multiplied (recall that we multiply probabilities when we need two independent things to happen at the same time). One thing to be careful of: there are actually nine possibilities for the opponents, not six, but some of them are duplicated. The actual table is like this:

      rr    rp    pr    rs    sr    pp    ps    sp    ss
R      0    -1    -1    +2    +2    -2     0     0    +4
P     +2    +1    +1     0     0     0    -1    -1    -2
S     -4     0     0    -2    -2    +2    +1    +1     0

All this means is that when using the original matrix and writing it out in longhand form, we have to remember to multiply rp, rs and ps by 2 each, since there are two ways to get each of them (rp and pr, for example). Note that I haven't mentioned which of the two opponents is which; as I said earlier, it doesn't matter, because this game is symmetric, so the probability of any one player throwing Rock, Paper or Scissors is the same as that of the other players.

This payoff table doesn't present so well in matrix form, since we're dealing with products of two variables rather than one. One way to handle it would be to actually split this into three mini-matrices, one for each of the first opponent's choices, compare each of those to the second opponent's choice, solve each matrix individually, and combine the three solutions into one at the end. That's a lot of work, so let's try to solve it algebraically instead, writing it out in longhand form and seeing if we can isolate anything by combining like terms (here rr means r times r, rp means r times p, and so on):

Payoff for R = -2rp + 4rs - 2pp + 4ss = 0
Payoff for P = 2rr + 2rp - 2ps - 2ss = 0
Payoff for S = -4rr - 4rs + 2pp + 2ps = 0
r + p + s = 1 (as usual)
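Before we grind through the algebra, here is a quick brute-force sanity check of those equations (a rough sketch with an arbitrary grid resolution; fine for checking our work, not a general solver). It walks a grid over the probability simplex and looks for the mix where every throw's expected payoff is as close to zero as possible. Keep its output in mind as we solve by hand below.

```python
from itertools import product

def payoffs(r, p, s):
    """Expected payoff of each throw against two opponents who each
    independently throw Rock/Paper/Scissors with probabilities r, p, s.
    These are exactly the longhand equations above."""
    return {
        "R": -2*r*p + 4*r*s - 2*p*p + 4*s*s,
        "P":  2*r*r + 2*r*p - 2*p*s - 2*s*s,
        "S": -4*r*r - 4*r*s + 2*p*p + 2*p*s,
    }

best, best_err = None, float("inf")
steps = 200                                # arbitrary grid resolution
for i, j in product(range(steps + 1), repeat=2):
    r, p = i / steps, j / steps
    s = 1 - r - p
    if s < 0:
        continue
    err = max(abs(v) for v in payoffs(r, p, s).values())
    if err < best_err:
        best, best_err = (r, p, s), err

print(best, best_err)   # roughly (0.28, 0.44, 0.28), with err close to zero
```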

The =0 at the end of each payoff is because we know this game is symmetric and zero-sum. Where do you start with something like this? A useful starting place is usually to use r+p+s=1 to eliminate one of the variables by putting it in terms of the others, then substituting into the three Payoff equations above. Eliminating Rock (r = 1-s-p) and substituting, after multiplying everything out and combining terms, we get:

-2ps - 2p + 4s = 0
-2p - 4s + 2 = 0
-2pp - 2ps + 8p + 4s - 4 = 0

We could try to isolate p or s directly in the first or last of these (the last one needs the dreaded Quadratic Formula: you know, minus b, plus or minus the square root of b squared minus 4ac, all divided by 2a), and in most cases you'll find you can eliminate one of the two resulting solutions as it strays outside the bounds of 0 to 1 (which r, p and s must all lie within, as they are all probabilities). However, the middle equation above makes our lives much easier, as we can solve for p in terms of s immediately: p = 1 - 2s. Combined with r+p+s=1, that also tells us r = s. Substituting p = 1 - 2s into the other two equations gives us the same result either way, which lets us know we're probably on the right track, since the equations don't contradict each other:

2ss + 3s - 1 = 0

Here we do have to use the Quadratic Formula after all. Working it out, we find s = (-3 +/- sqrt(17)) / 4; that is, s is either about 28% or a negative number (about -178%). Are both of these valid solutions? Clearly not: a probability can't be negative, so only the first root is valid, leaving us with a single solution. That gives s of about 28%, p = 1-2s of about 44%, and r = s of about 28%; exactly, r = s = (sqrt(17)-3)/4 and p = (5-sqrt(17))/2, so r:p:s is roughly 28:44:28.

It turns out that having multiple players does have an effect on the rock-wins-count-double problem, but it might not be the result we expected: with three players, the optimal mix of roughly 28:44:28 is actually closer to 1:1:1 than the two-player answer of 25:50:25 was! Perhaps that's because the likelihood of drawing, with one player choosing Rock, one choosing Paper and one choosing Scissors, makes Scissors less risky than it would be in a two-player game: even if one opponent chooses Rock, the other might choose Paper and turn your double-loss into a draw.

Summary

This week we looked at how to evaluate intransitive mechanics using math. It's probably the most complicated thing we've done, as it brings together the cost curves of transitive mechanics, probability, and statistics, which is why I'm doing it at the very end of the course, only after covering those! To solve these, you go through this process:

1. Make a payoff table.
2. Eliminate all dominated choices from both players (by comparing all combinations of rows and columns and seeing if any pair contains one row or column that is strictly better than or equal to another). Keep doing that until all remaining choices are viable.
3. Find all intransitive loops by finding the best opposing response to each player's initial choice.
4. Calculate the payoffs of each choice for one of the players, setting the payoffs equal to the same variable X. In a zero-sum game, X for one player will be the negative of the X for the other player. In a symmetric game, X is zero, so just set all the payoffs to zero instead.
5. Add one more equation: the probabilities of all choices sum to 1.
6. Using algebraic substitution, triangular-form matrices, Excel, or any other means you have at your disposal, solve for as many variables as you can. If you manage to learn the value of X, it tells you the expected gain (or loss) for that player. Summing all players' X values tells you if the game is zero-sum (X1+X2+...=0), positive-sum (>0) or negative-sum (<0), and by how much overall. If you can find a unique value for each choice that is between 0 and 1, those are the optimal probabilities with which you should choose each throw. For asymmetric games, you'll need to do this individually for each player. This is your solution.
7. For games with more than two players each making a simultaneous choice, choose one player's payoffs as your point of reference, and treat all other players as a single combined opponent.

The math gets much harder for each player you add over two. After all, with two players all equations are strictly linear; with three players you have to solve quadratic equations, with four players there are cubic equations, with five players you see quartic equations, and so on. I should also point out that the field of game theory is huge, and it covers a wide variety of other games we haven't covered here. In particular, it's also possible to analyze games where players choose sequentially rather than simultaneously, and also games where players are able to negotiate ahead of time, making pleas or threats, coordinating their movements and so on (as might be found in positive-sum games where two players can trade or otherwise cooperate to get ahead of their opponents). These are beyond the scope of this course, but if you're interested, I'll give a couple of references at the end.
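One more "any other means at your disposal" that I didn't cover above: you can treat the whole table as a linear program and let a solver handle it. Here is a sketch using SciPy (assuming you have scipy installed; the helper name is my own), applied to the reduced card table from earlier. The nice thing about this route is that the solver gives zero probability to dominated or unused choices on its own, so you can also feed it a full, unreduced table.

```python
import numpy as np
from scipy.optimize import linprog

def solve_zero_sum(payoff):
    """Optimal mix for the row player of a zero-sum payoff matrix.

    Maximize the guaranteed value v such that, against every opposing
    column, the row player's expected payoff is at least v.
    """
    payoff = np.asarray(payoff, dtype=float)
    rows, cols = payoff.shape
    c = np.zeros(rows + 1)
    c[-1] = -1                                         # variables: [p_1..p_rows, v]; maximize v
    A_ub = np.hstack([-payoff.T, np.ones((cols, 1))])  # v - sum_i p_i * A[i, j] <= 0
    b_ub = np.zeros(cols)
    A_eq = np.append(np.ones(rows), 0).reshape(1, -1)  # probabilities sum to 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1],
                  bounds=[(0, 1)] * rows + [(None, None)])
    return res.x[:rows], res.x[-1]

mix, value = solve_zero_sum([[-3,  5,  -3],
                             [ 3, -7,   9],
                             [ 5,  3, -11]])
print(np.round(mix, 3), round(value, 3))   # about [0.494 0.368 0.138] and 0.31
```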
If You're Working on a Game Now
Think about your game and whether it features any intransitive mechanics. If not, ask yourself if there are any opportunities or reasons to take some transitive mechanics and convert them to intransitive (for example, if you're working on an RPG, maybe instead of just having a sequence of weapons where each is strictly better than the previous, perhaps there's an opportunity at one point in the game to offer the player a choice of several weapons that are all equally good overall, but each is better than another in different situations). If you do have any intransitive mechanics in your game, find the one that is the most prominent, and analyze it as we did today. Of the choices you offer the player, are any of them dominant or dominated choices? What is the expected ratio of how frequently the player should choose each of the available options, assuming optimal play? Is it what you expected? Is it what you want?

Homework
For practice, feel free to start by doing the math by hand to confirm all of the problems I've solved here today, to get the hang of it. When you're comfortable, here's a game derived from a mini-game
I once saw in one of the Suikoden series of RPGs (I forget which one). The actual game there used 13 cards, but for simplicity I'm going to use a 5-card deck for this problem. Here are the rules:

Players: 2.
Setup: Each player takes five cards, numbered 1 through 5. A third stack of cards numbered 1 through 5 is shuffled and placed face-down as a draw pile.
Progression of play: At the beginning of each round, a card from the draw pile is flipped face up; the round is worth a number of points equal to the face value of that card. Both players then choose one of their own cards, and play simultaneously. Whoever played the higher card gets the points for that round; in case of a tie, no one gets the points. Both players set aside the cards they chose to play in that round; those cards may not be used again.
Resolution: The game ends after all five rounds have been played, or after one player reaches 8 points. Whoever has the most points wins.

It is easy to see that there is no single dominant strategy. If the opponent plays completely randomly (20% chance of playing each card), you come out far ahead by simply playing the number in your hand that matches the points each round is worth (so play your 3 if a 3 is flipped, play your 4 on a 4, etc.). You can demonstrate this in Excel by shuffling the opponent's hand so that they are playing randomly, comparing that strategy to the point-matching strategy I've described here, and you'll quickly find that point matching wins the vast majority of the time. (You could also compute the odds exhaustively for this, as there are only 120 ways to rearrange 5 cards, if you wanted; see the sketch just below the table.)

Does that mean that matching points is the dominant strategy? Certainly not. If I know my opponent is playing this strategy, I can trounce them by playing one higher than matching on all cards, and playing my 1 card on the 5-point round. I'll lose 5 points, but I'll capture the other 10 points for the win. Does the one-higher strategy dominate? No: playing two higher beats one higher, three higher beats two higher, four higher beats three higher, and matching points beats four higher. An intransitive relationship. Essentially, the goal of this game is to guess what your opponent will play, and then play one higher than that (or, if you think your opponent is playing their 5, play your 1 on that).

Since each strategy is just as good as any other, if choosing between those five, you might think that means you can do no worse than choosing one of those strategies at random; except that, as we saw, if you play randomly, matching points beats you! So it is probably true that the optimal strategy is not 1:1:1:1:1, but rather some other ratio. Figure out what it is. If you're not sure where to start, think of it this way: for any given play there are only five strategies: matching, one higher, two higher, three higher, or four higher. Figure out the payoff table for following each strategy across all five cards. You may shift strategies from round to round, like rock-paper-scissors, but with no other information on the first round you only have five choices, and each of those choices may help or hurt you depending on what your opponent does. Therefore, for the first play at least, you would start with this payoff table (after all, for the first round there are only five strategies you can follow, since you only have five cards each); each entry is the net point difference for the row strategy against the column strategy:

       matching  match+1  match+2  match+3  match+4
M           0       -5       +3       +9      +13
M+1        +5        0       -7       -1       +3
M+2        -3       +7        0       -9      -10
M+3        -9       +1       +9        0      -11
M+4       -13       -3      +10      +11        0
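If you'd rather script the "random opponent versus point matching" comparison than build it in Excel, here's one way to do it exhaustively in Python (my own sketch; it ignores the 8-point early stop, which can't change who wins, since 8 of the 15 available points is already a majority):

```python
from itertools import permutations

CARDS = (1, 2, 3, 4, 5)

def score(flips, my_plays, opp_plays):
    """Points for me and for the opponent over one full five-round game."""
    mine = theirs = 0
    for value, m, o in zip(flips, my_plays, opp_plays):
        if m > o:
            mine += value
        elif o > m:
            theirs += value
    return mine, theirs

# "Point matching" (play the card equal to the round's value) against an
# opponent who plays their hand in a random order: check every combination.
wins = losses = ties = 0
for flips in permutations(CARDS):            # 120 possible flip orders
    for opp in permutations(CARDS):          # 120 possible opponent orderings
        mine, theirs = score(flips, flips, opp)   # my plays match the flips
        if mine > theirs:
            wins += 1
        elif theirs > mine:
            losses += 1
        else:
            ties += 1

total = wins + losses + ties
print(f"point matching: {wins/total:.0%} wins, {losses/total:.0%} losses, "
      f"{ties/total:.0%} ties")
```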

References

Here are a pair of references that I found helpful when putting together today's presentation.

Game Architecture and Design (Rollings & Morris), Chapters 3 and 5. This is where I first heard of the idea of using systems of equations to solve intransitive games. I've tried to take things a little farther today than the authors did in this book, but of course that means the book is a bit simpler and probably more accessible than what I've done here. And there is, you know, the whole rest of the book dealing with all sorts of other topics. I'm still in the middle of reading it, so I can't give it a definite stamp of personal approval at this time, but neither can I say anything bad about it, so take a look and decide for yourself.

Game Theory: A Critical Text (Heap & Varoufakis). I found this to be a useful and fairly accessible introduction to game theory. My one warning would be that, in the interest of brevity, the authors tend to define acronyms and then use them liberally in the remainder of the text. This makes it difficult to skip ahead, as you're likely to skip over a few key definitions, and then run into sentences which have more unrecognizable acronyms than actual words!

Level 10: Final Boss


This Week
Welcome to the final week of the season. This week I didn't know ahead of time what to do, so I intentionally left it as an unknown on the syllabus, as a catch-all for anything interesting that might have come up over the summer. As it turns out, there are four main topics I wanted to cover today, making this the longest post of all of them, so if you have limited time I suggest bookmarking this and coming back later. First I'd like to talk a bit about economic systems in games, and how to balance a system where the players are the ones in control of it through manual wealth generation and trading. Then, I'll talk about some common multiplayer game balance problems that just didn't fit anywhere else in the previous nine weeks. Third, I'll get a bit technical and share a few tips and tricks in Excel. Lastly, I'll return to last summer with this whole concept of fun, and how this whole topic of game balance fits into the bigger picture of game design, because for all of the depth that we've gone into, it still feels like a pretty narrow topic sometimes.

Economic Systems
What is an economic system? First, we use the word economy a lot, even in everyday life, so I should define it to be clear. In games, I'll use the word economy to describe any in-game resource. That's a pretty broad term, as you could take it to mean that the pieces in Chess are a piece economy, and I'd argue that yes, they could be thought of that way; it's just not a particularly interesting economy, because the resources can't really be created, destroyed or transferred in any meaningful way. Most economies that we think of as such, though, have one or more of these mechanics:

Resource generation, where players craft or receive resources over time.
Resource destruction, where players either burn resources for some use in the game, or convert one type of resource to another.
Resource trading, where players can transfer resources among themselves, usually involving some kind of negotiation or haggling.
Limited zero-sum resources, so that one player generating a resource for themselves reduces the available pool of resources for everyone else.

Still, any of those elements individually might be missing and we'd still think of it as an economy. Like the creation of level design tools, tabletop RPGs, or metrics, creating an economic system in your game is a third-order design activity, which can make it pretty challenging. You're not just creating a system that your players experience. You're creating a system that influences player behavior, but then the players themselves are creating another social system within your economic system, and it is the combination of the two that the players actually experience. For example, in Settlers of Catan, players are regularly trading resources between them, but the relative prices of each resource are always fluctuating based on what each individual player needs at any given time (and usually, all the players need different things, with different levels of desperation and different ability to pay higher prices). The good news is that with economies, at least, a lot of human behavior can be predicted. The other good news is that in-game economies have a lot of little design knobs for us designers to change to modify the game experience, so we have a lot of options. I'll be going over those options today.

Supply and Demand
First, a brief lesson from Economics 101 that we need to be aware of is the law of supply and demand, which some of you have probably heard of.
We assume the simplest case possible: an economy with one resource, a really large population of people who produce the resource and want to sell it, and another really large population of people who consume the resource and want to buy
it. Well also assume for our purposes that any single unit of the resource is identical to any other, so consumers dont have to worry about choosing between different brands or anything. The sellers each have a minimum price at which theyre willing to part with their goods. Maybe some of them have lower production costs or lower costs of living than others, so they can accept less of a price and still stay in business. Maybe others have a more expensive storefront, or theyre just greedy, so they demand a higher minimum price. At any rate, we can draw a supply curve on a graph that says that for any given price (on the x-axis), a certain number or percentage of sellers are willing to sell at that price (on the y-axis). So, maybe at $1, only two sellers in the world can part with their goods at that price, but at $5 maybe there are ten sellers, and at $20 youve got a thousand sellers, and eventually if you go up to $100 every single seller would be willing to sell at that price. Basically, the only thing you need to know about supply curves is that as the price increases, the supply increases; if ten people would sell their good at $5, then at $5.01 you know that at least those ten sellers would still accept (if they sold at $5 then they would clearly sell at $5.01), and you might have some more sellers that finally break down and say, okay, for that extra penny were in. Now, on the other side, it works the same but in reverse. The consumers all have a maximum price that theyre willing (or able) to pay, for whatever reason. And we can draw a demand curve on the same graph that shows for any given price, how many people are willing to buy at that price. And unlike the supply curve, the demand curve is always decreasing; if ten people would buy a good at $5, then at $5.01 you might keep all ten people if youre lucky, or some of them might drop out and say thats too rich for their blood, but you certainly arent going to find anyone who wouldnt buy at a lower price but would buy at a more expensive price. Now of course, in the real world, these assumptions arent always true. More teenagers would rather buy $50 shoes than $20 shoes, because price has social cred. And some sellers might not be willing to sell at exorbitantly high prices because theyd consider that unethical, and theyd rather sell for less (or go out of business) than bleed their customers dry. But for our purposes, we can assume that most of the time in our games, supply curves will increase and demand curves will decrease as the price gets more expensive. And heres the cool part: wherever the two curves cross, will generally turn out to be the actual market price that the players all somehow collectively agree to. Even if the players dont know the curves, the market price will go there as if by magic. It wont happen instantly if the players have incomplete information, but it does happen pretty fast, because players who sell at below the current market price will start seeing other people selling at higher prices (because they have to) and say, hey, if they can sell for more than I should be able to also! And likewise, if a consumer pays a lot for something and then sees the guy sitting next to them who paid half what they did for the same resource, theyre going to demand to pay a whole lot less next time. Now, this can be interesting in online games that have resource markets. 
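To make the curve-crossing idea concrete, here is a toy sketch (every number in it is invented purely for illustration): each seller has a minimum price, each buyer has a maximum, and we look for the price where the number of willing sellers meets the number of willing buyers.

```python
# Toy market: all of these numbers are made up for illustration.
sellers = [3, 4, 4, 5, 7, 8, 10, 12]    # each seller's minimum acceptable price
buyers  = [2, 5, 6, 6, 8, 9, 11, 15]    # each buyer's maximum willingness to pay

def clearing_price(sellers, buyers):
    """Lowest price at which willing supply meets (or exceeds) willing demand,
    i.e. roughly where the supply and demand curves cross."""
    for price in sorted(set(sellers) | set(buyers)):
        supply = sum(1 for s in sellers if s <= price)   # sellers willing at this price
        demand = sum(1 for b in buyers if b >= price)    # buyers willing at this price
        if supply >= demand:
            return price, supply, demand
    return None

print(clearing_price(sellers, buyers))    # (7, 5, 4) for these made-up numbers
```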
If you play an online game where players can sell or trade in-game items for in-game money, see if either the developer or a fansite maintains a historical list of selling prices (not unlike a ticker symbol in the stock market). If so, youll notice that the prices change over time slightly. So you might wonder what the deal is: why do prices fluctuate? And the answer is that the supply and demand are changing slightly over time. Supply changes as players craft items and put them on sale, and that supply is constantly changing; and demand also changes, because at any given time a different set of players is going to be online shopping for any given item. You can see this with other games that have any kind of resource buying and selling. And because the player population isnt infinite, these things arent perfectly efficient, so you get unequal amounts of each item being produced and consumed over time. Now, that points us to another interesting thing about economies: the fewer the players, the more well tend to see prices fluctuate, because a single player controls more and more of the production or consumption. This is why the prices youll see for one Clay in the Catan games change a lot
from game to game (or even within a single game) relative to the price of some piece of epic loot in World of Warcraft. Now, this isnt really something you can control directly as the game designer, but at least you can predict it. It also means if youre designing a trading board game for 3 to 6 players, you can expect to see more drastic price fluctuations with fewer players, and you might decide to add some extra rules with less players to account for that if a stable market is important to the functioning of your game. Multiple resources Things get more interesting when we have multiple goods, because the demand curves can affect one another. For example, suppose you have two resources, but one can be substituted for another maybe one gives you +50 health, and the other gives you +5 mana which you can use as a healing spell to get +50 health, so theres two different items (with similar uses) so if one is really expensive and one is really cheap you can just buy the cheap one. Even if the two arent perfect substitutes, players may be willing to accept an imperfect substitute if the price is sufficiently lower than the market value for the thing they actually want, and the price difference between what people will pay for the good and what theyll pay for the substitute tells you how efficient that substitute is (that is, how perfect it substitutes for the original). On the flip side, you can also have multiple goods where the demand for one increases demand for the other, because they work better if you buy them all as a set (this is sort of the opposite of substitutes). For example, in games where collecting a complete set of matching gear gives your character a stat bonus, or where you can turn in one of each resource for a bonus, a greater demand for one resource pulls up the demand for all of the others and once a player has some of the resources in the set, their demand for the others will increase even more because theyre already part of the way there. By creating resources that are meant to be perfect or imperfect substitutes, or several resources that naturally go together with each other, you can change the demand (and therefore the market price) of each of them. Marginal pricing As we discussed a long time ago with numeric systems, sometimes demand is a function of how much of a good you already have. If you have none of a particular resource, the first one might be a big deal for you, but if you have a thousand of that resource then one more isnt as meaningful to you, so demand may actually be on a decreasing curve based on how many of the thing you already have. Or maybe if you can use lots of resources more efficiently to get larger bonuses, it might be that collecting one resource means your demand for more of that resource increases. The same is true on the supply side, where producing lots of a given resource might be more or less expensive per-unit than producing smaller amounts. So you can add these kinds of mechanics in order to influence the price; for example, if you give increasing returns for each additional good that a player possesses, youll tend to see the game quickly organize into players going for monopolies of individual goods, as once a player has the majority of a good in the game theyre going to want to buy the rest of them. As an example, if you have decreasing returns for players if they collect a lot of one good, then adding a decreasing cost for producing the good might make a lot of sense if you want the price of that good to be a little more stable. 
Scarcity
I probably don't need to tell you this, but if the total goods are limited, that increases demand. You see this exploited all the time in marketing, when a company wants you to believe that they only have limited quantities of something, so that you'll buy now (even at a higher price) because you don't want to miss your chance. So you can really change the feeling of a game just by changing whether a given resource is limited or infinite.

As an example, consider a first-person-view shooting video game where you have limited ammunition. First, imagine it is strictly limited: you get what you find, but thats it. A game like that feels more like a survival-horror game, where the player only uses their ammo cautiously, because they never know when theyll find extra or when theyll run out. Compare to a game where you have enemies that respawn in each area, random item drops, and stores where you can sell the random drops and buy as much extra ammo as you need. In a game like that, a player is going to be a lot more willing to experiment with different weapons, because they know theyll get all of their ammo back when they reach the next ammo shop, which makes the game feel more like a typical FPS. Now compare that with a game where you have completely unlimited ammo so its not even a resource or an economy anymore, where you can expect the player to be shooting more or less constantly, like some of the more recent action-oriented FPSs. None of these methods is right or wrong but they all give very different player experiences, so my point is just that you increase demand for a good (and decrease the desire to actually consume it now because you might need it later) the more limited it is. If the resources of your game drive players towards a victory condition, making the resource limited is a great way to control game length. For example, in most RTS games, the board has a limited number of places where players can mine a limited amount of resources, which they then use to create units and structures on the map. Since the core resources that are required to produce units are themselves limited, eventually players will run out, and once it runs out the players will be unable to produce any more units, giving the game a natural time limit of sorts. By adjusting the amounts of resources on the map, you can control when this happens; if the players drain the board dry of resources in the first 5 minutes, youre going to have pretty short games but if it takes an hour to deplete even the starting resources near your start location, then the players will probably come to a resolution through military force before resource depletion forces the issue, and the fact that theyre limited at all is just there to avoid an infinite stalemate, essentially to place an upper limit on game length. With multiplayer games in a closed economy, you also want to be very careful with strictly limited goods, because there is sometimes the possibility that a single player will collect all of a good, essentially preventing anyone else from using it, and you should decide as a designer if that should be possible, if its desirable, and if not what you can do to prevent it. For example, if resources do no good until a player actually uses them (and using them puts them back in the public supply), then this is probably not going to be a problem, because the player who gets a monopoly on the good has incentive to spend them, which in turn removes the monopoly. Open and closed economies In systems, we say a system is open if it can be influenced by things from outside the system itself, and it is closed if the system is completely self-contained. Economies are systems, and an open economy has different design considerations than a closed economy. 
Most game economies are closed systems; you can generate or spend money within the game, but thats it, and in fact some people get very uncomfortable if you try to change it to an open system: next time you play Monopoly, try offering one of your opponents a real-world cash dollar in exchange for 500 of their Monopoly dollars as a trade, and see what happens at least one other player will probably become very upset! Closed systems are a lot easier to manage from a design standpoint, because we have complete control as designers over the system, we know how the system works, and we can predict how changes in the system will affect the game. Open economies are a lot harder, because we dont necessarily have control over the system anymore. A simple example of an open economy in a game is in Poker if additional player buy-ins are allowed. If players can bring as much money as they want to the table, a sufficiently rich player could have an unfair advantage; if skill is equal, they could just keep buying more chips until the
luck in the game turns their way. To solve this balance problem, usually additional buy-ins are restricted or disallowed in tournament play. Another place where this can be a problem is CCGs, where a player spending more money can buy more cards and have a greater collection. Ideally, for the game to be balanced, we would want larger collections to give players more options but not more power, which is why I think rarity shouldnt be a factor in the cost curve of such a game, at least if you want to maximize your player base. If more money always wins, you set up an in-game economy that essentially has a minimum bar for money to spend if you want to be competitive, and in the real world we also have supply and demand curves, and the higher your initial required buy-in is, the fewer people who will be willing to pay (and thus the smaller your player base). There are other games where you can buy in-game stuff with real-world cash; this is a typical pattern for free-to-play MMOs and Facebook games, and developers have to be careful with exactly what the player can and cant buy; if the player can purchase an advantage over their opponents, especially in games that are competitive by nature, that can make the game very unbalanced very quickly. (Its less of an issue in games like FarmVille where there isnt really much competition anyway.) Some designers intentionally unbalance their game in this way, assuming that if they create a financial incentive to gain a gameplay advantage, players will pay huge amounts; and to be fair, some of them do, and if the game itself isnt very compelling in its core mechanics then this might be the only thing you can fall back on to make money, but if you set out to do this from the start I would call it lazy design. A better method is to create a game thats actually worth playing for anyone, and then offer to trade money for time (so, maybe you get your next unlock after another couple of hours of gameplay, and the gameplay is fun enough that you can do that without feeling like the game is arbitrarily forcing you to grind but if you want to skip ahead by paying a couple bucks, well let you do that). In this way, money doesnt give any automatic gameplay advantage, it just speeds up the progression thats already there. One final example of an open economy is in any game, most commonly MMOs, where players can trade or gift resources within the game, because in any of those cases you can be sure a secondary economy will emerge where players will exchange real-world money for virtual stuff. Just google World of Warcraft Gold and youll probably find a few hundred websites where you can either purchase Gold for real-world cash, or sell Gold to them and get paid in cash. There are a few options you can consider if youre designing an online game like this with any kind of trade mechanic: You could just say that trading items for cash is against your Terms of Service, and that any player found to have done so will have their account terminated. This is mostly a problem because its a huge support headache: you get all kinds of players complaining to you that their account was banned, and just sending them a form email with the TOS still takes time. In some cases, like Diablo where there isnt really an in-game trading mechanism and instead the players just drop stuff on the ground and then go pick it up, it can also be really hard to track this. 
If its easy to track (because trades are centralized somewhere), if you really dont want people to buy in-game goods for cash, you should ask yourself why your trading system that you designed and built even allows it. You could say that an open economy is okay, but you dont support it, so if someone takes your money and doesnt give you the goods, its the players problem and not the developers. Unfortunately, it is still the developers problem, because you will receive all kinds of customer support emails from players claiming they were scammed, and whether you fix it or not you still have to deal with the email volume. If you dont fix it, then you have to accept youre going to lose some customers and generate some community badwill. If you do fix it, then accept that players will think of you as a safety net which actually makes them more likely to get scammed, since theyll trust other people by assuming that if
the other person isnt honest, theyll just send an email to support to get it fixed. Trying to enforce sanctions against scammers is an unwinnable game of whack-a-mole. You can formalize trading within your game, including the ability to accept cash payments. The good news for this is that players have no excuses; my understanding is that when Sony Online did this for some of their games, the huge win for them was something like a 40% reduction in customer support costs, which can be significant for a large game. The bad news is that you will want to contact a lawyer on this, to make sure you dont accidentally run afoul of any national banking laws since you are now storing players money. Youll also want to consider whether players are allowed to sell their entire character, password and all. For Facebook games this is less of an issue because a Facebook account links to all the games and its not so easy for a player to give that away. For an MMO where each player has an individual account on your server that isnt linked to anything else, this is something that will happen, so you need to decide how to deal with that. (On the bright side, selling a whole character doesnt unbalance the game.) In any case, you again want to make sure that whatever players can trade in game does not unbalance the game if a single player uses cash to buy lots of in-game stuff. One common pattern to avoid this is to place restrictions on items, for example maybe you can purchase a really cool suit of armor but you have to be at least Level 25 to wear it. Inflation Now, remember from before that the demand curve is based on each players maximum willingness to pay for some resource. Normally wed like to think of the demand curve as this fixed thing, maybe fluctuating slightly if a different set of players or situations happen to be online, but over time it should balance out. But there are a few situations that can permanently shift the demand curve in one direction or another, and the most important for our purpose is when each players maximum willingness to pay increases. Why would you change the amount youre willing to pay? Mostly, if you have more purchasing power. If I doubled your income overnight but Starbucks raised the price of its coffee from $5 to $6, if you liked their coffee before you would probably be willing to pay the new price, because you can afford it. How does this work in games? Consider a game with a positive-sum economy: that is, it is possible for me to generate wealth and goods without someone else losing them. The cash economy in the board game Monopoly is like this, as weve discussed before; so is the commodity economy in Catan, as is the gold economy in most MMOs. This means that over time, players get richer. With more total money in the economy (and especially, more total money per player on average), we see what is called inflation: the demand curve shifts to the right as more people are willing to pay higher prices, which then increases the market price of each good to compensate. In Catan, this doesnt affect the balance of the game; by the time youre in the late game and willing to trade vast quantities of stuff for what you need, youre at the point where youre so close to winning that no one else is willing to trade with you anyway. 
In Monopoly the main problem, as I mentioned earlier, is that the economy is positive-sum but the object of the game is to bankrupt your opponents; here we see that one possible solution to this is to change the victory condition to be the first player to get $2500 or something like that. In MMOs, inflation isnt a problem for the existing players, because after all they are willing to pay more; however, it is a major problem for new players, who enter the game to find that theyre earning one gold piece for every five hours of play, and anything worth having in the game costs millions of gold, and they can never really catch up because even once they start earning more money, inflation will just continue. So if youre running an MMO where players can enter and exit the game freely, inflation is a big long-term problem you need to think about. There are two ways to fix this: reduce the positive-sum nature of the economy, or add negative-sum elements to counteract the positive-sum ones.
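One way to see the problem (and the fix) is to track the metric I mention below, average gold per player over time, in a toy simulation. All of these numbers are invented and not modeled on any real game; the point is just that a faucet with no sink grows without bound, while a faucet plus a proportional sink settles at a stable level.

```python
import random

def simulate(players=100, days=365, faucet=50, sink_rate=0.0, seed=1):
    """Toy economy: each player earns about `faucet` gold per day (the faucet)
    and loses `sink_rate` of their current wealth per day (the sink).
    Returns the average gold per player on each day."""
    rng = random.Random(seed)
    gold = [0.0] * players
    history = []
    for _ in range(days):
        for i in range(players):
            gold[i] += faucet * rng.uniform(0.5, 1.5)   # quest rewards, drops, ...
            gold[i] -= gold[i] * sink_rate              # repairs, fees, taxes, ...
        history.append(sum(gold) / players)
    return history

no_sink   = simulate(sink_rate=0.00)
with_sink = simulate(sink_rate=0.05)
print("average gold after a year:", round(no_sink[-1]), "with no sink,",
      round(with_sink[-1]), "with a 5% daily sink")
```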

Negative-sum elements are sometimes called money sinks, that is, some kind of mechanism that permanently removes money from the player economy. The trick is balancing the two so that on average, it cancels out; a good way to know this is to actually take metrics on the total sum of money in the game, and the average money per person, and track that over time to see if its increasing or decreasing. Money sinks take many forms: 1. Any money paid to NPC shopkeepers for anything, especially if that something is a consumable item that the player uses and then its gone for good. 2. Any money players have to pay as maintenance and upkeep; for example, having to pay gold to repair your weapon and armor periodically. 3. Losing some of your money (or items or stats or other things that cost money to replace) when you die in the game. 4. Offering limited quantities of high-status items, especially if those items are purely cosmetic in nature and not something that gives a gameplay advantage, which can remove large amounts of cash from the economy when a few players buy them. 5. While I dont know of any games that do this, it works in real life: have an adventurers tax that all players have to pay as a periodic percentage of their wealth. This not only gives an incentive to spend, but it also penalizes the players who are most at fault for the inflation. Another alternative would be to actually redistribute the wealth, so instead of just removing money from the economy, you could transfer some money from the richest players and distribute it among the poorest; that on its own would be zero-sum and wouldnt necessarily fix the inflation problem, but it would at least give the newer players a chance to catch up and increase their wealth over time. To reduce the positive-sum nature of the economy is a bit harder, because players are used to going out there, killing monsters and getting treasure drops. If you make the monsters limited (so they dont respawn) then the world will become depopulated of monsters very quickly. If you give no rewards, players will wonder why theyre bothering to kill monsters at all. In theory you could do something like this: Monsters drop treasure but not gold, and players cant sell or trade the treasure thats dropped; so it might make their current equipment a little better, but thats about it. Players receive gold from completing quests, but only the first time each quest, so the gold they have at any given point in the game is limited. Players cant trade gold between themselves. Players can use the gold to buy special items in shops, so essentially it is like the players have a choice of what special advantages to buy for their character. One other, final solution here is the occasional server reset, when everyone loses everything and has to start over. This doesnt solve the inflation problem a new player coming in at the end of a cycle has no chance of catching up but it does at least mean that if they wait for everything to reset theyll have as good a chance as anyone else after the reset. Trading Some games use trading and bartering mechanics extensively within their economies. Trading can be a very interesting mechanic if players have a reason to trade; usually that reason is that you have multiple goods, and each good is more valuable to some players than others. In Settlers of Catan, sheep are nearly useless to you if you want to build cities, but theyre great for settlements. 
In Monopoly, a single color property isnt that valuable, but it becomes much more powerful if you own the other matching ones. In World of Warcraft, a piece of gear that cant be equipped by your character class isnt very useful to you, no matter how big the stat bonuses are for someone else. By giving each player an assortment of things that are better for someone else than for them, you give players a reason to trade resources. Trading mechanics usually serve as a negative-feedback loop, especially within a closed economy.

Players are generally more willing to offer favorable trades to those who are behind, while they expect to get a better deal from someone who is ahead (or else they wont trade at all). There are a lot of ways to include trading in your game; it isnt as simple as just saying players can trade but this is a good thing, because it gives you a lot of design control over the player experience. Here are a few options to consider: Can players make deals for future actions as part of the trade (Ill give you X now for Y now and Z later)? If so, are future deals binding, or can players renege? Disallowing future deals makes trades simpler; with future considerations, players can buy on credit so to speak, which tends to complicate trades. On the other hand, it also gives the players a lot more power to strike interesting deals. If future deals are non-binding, players will tend to be a lot more cautious and paranoid about making them. Think about whether you want players to be inherently mistrustful and suspicious of each other, or whether you want to give players every incentive to find ways of cooperating. Can players only trade certain resources but not others? For example, in Catan you can trade resource cards but not victory points or progress cards; in Monopoly you can trade anything except developed properties; in some other games you can trade anything and everything. Resource that are tradable are of course a lot more fluid than others. Some resources may be so powerful (like Victory Points) that no one in their right mind would want to trade them, so simply making them untradeable stops the players from ruining the game by making bad trades. Can players only trade at certain times? In Catan you can only trade with the active player before they build; in Bohnanza there is a trading phase as part of each players turn; in Monopoly you can trade with anyone at any time. If players can trade at any time, consider if they can trade at instant speed in response to a game event, because sometimes the ability to react to game events can become unbalanced. For example, in Monopoly you could theoretically avoid Income Tax by trading all of your stuff to another player, then taking it all back after landing on the space, and you could offer the other player a lesser amount (say, 5%) in exchange for the service of providing a tax shelter. In short, be very clear about exactly when players can and cant trade. If trading events in the game are infrequent (say, you can only trade every few turns or something), expect trading phases to take longer, as players have had more time to amass tradable resources so they will probably have a lot of deals to make. If this is a problem, consider adding a timer where players only have so much time to make deals within the trading phase. Does the game require all trades to be even (e.g. one card for one card) or are uneven trades allowed (Catan can have uneven numbers of cards, but at least one per side; other games might allow a complete gift of a trade)? Requiring even trades places restrictions on what players can do and will reduce the number of trades made, but it will also cause trading to move along faster because theres less room for haggling, and theres also less opportunity for one weak player to make a bad trade that hands the game to someone else. I could even imagine a game where uneven trades are enforced: if you trade at all, someone must get the shaft. Are trades limited in quantity, or unlimited? 
A specific problem here is the potential for the kingmaker problem, where one player realizes they cant win, but the top two players are in a close game, and the losing player can choose to gift all of their stuff to one of the top two players to allow one of them to win. Sometimes social pressure prevents people from doing something like this, but you want to be very careful in tournament situations and other
official games where the economic incentive of prizes might trump good sportsmanship (I actually played in a tournament game once where top prize was $20, second prize was $10, and I was in a position to decide who got what, so I auctioned off my pieces to the highest bidder.) Are trades direct, or indirect? Usually a trade just happens, I give you X and you give me Y, but its also possible to have some kind of trade tax where maybe 10% of a gift or trade is removed and given to the bank for example, to limit trades. This seems strange why offer trading as a mechanic at all, if youre then going to disincentivize it? But in some games trading may be so powerful (if two players form a trading coalition for their mutual benefit, allowing them both to pull ahead of all other players, for example) to the point where you might need to apply some restrictions just to prevent trades from dominating the rest of the game. Is there a way for any player to force a trade with another player against their will? Trades usually require both players to agree on the terms, but you can include mechanisms to allow one player to force a trade on another player under certain conditions. For instance, in a set collection game, you might allow a player to able to force-trade a more-valuable single item for a less-valuable one from an opponent, once per game. Auctions Auction mechanics are a special case of trading, when one player auctions off their stuff to the other players, or where the bank creates an item out of thin air and it is auctioned to the highest bidding player. Auctions often serve as a self-balancing mechanic in that the players are ultimately deciding on the price of how much something is worth, so if you dont know what to cost something you can put it up for auction and let the players decide. (However, this is lazy design; auctions work the best when the actual cost is variable, different between players, and situational, so that figuring out how much its worth is something that changes from game to game, so that the players are actually making interesting choices each time. With an effect that is always worth the same amount, auction is meaningless once players figure out how much its worth; theyll just bid what its worth and be done with it.) Auctions are a very pure form of willingness to pay because each player has to decide what theyre actually willing to pay so that they can make an appropriate bid. A lot of times there are meta-considerations: not just how much do I want this for myself but also how much do I not want one of my opponents to get it because it would give them too much power or even I dont want this, but I want an opponent to pay more for it, so Ill bid up the price and take the chance that I wont get stuck with it at the end. An interesting point with auctions is that normally if the auction goes to the highest bidder, that the item up for auction sells for the highest willingness to pay among all of the bidders thats certainly what the person auctioning the item wants, is to get the highest price. But in reality, the actual auction price is usually somewhere between the highest and second highest willingness to pay, and in fact its usually closer to the second-highest, although that depends on the auction type: sometimes you end up selling for much lower. Just as there are many kinds of trading, there are also many kinds of auctions. Heres a few examples: Open auction. 
This is the type most people think of when they think of auctions, where any player can call a higher bid at any time, and when no other bids happen someone says going once, going twice, sold. If everyone refuses to bid beyond their own maximum willingness to pay, the person with the highest willingness will purchase the item for one unit more than the second-highest willingness, making this auction inefficient (in the sense that youd ideally want the item to go for the highest price), but as well see that is a problem with most auctions.

Fixed price auction. In turn order, each player is offered the option to purchase the item or decline. It goes around until someone accepts, or everyone declines. This gives an advantage to the first player, who gets the option to buy (or not) before anyone else and if its offered at less than the first players willingness to pay, they get to keep the extra in their own pocket, so how efficient this auction is depends on how well the fixed price is chosen. Circle auction. In turn order, each player can either make a bid (higher than the previous one) or pass. It goes around once, with the final player deciding whether to bid one unit higher than the current highest bid, or let the other player take it. This gives an advantage to the last player, since it is a fixed-price auction for them and they dont have to worry about being outbid, so they may be able to offer less than their top willingness to pay. Silent auction. Here, everyone secretly and simultaneously chooses their bid, all reveal at once, and highest bid wins. You need to include some mechanism of resolving ties, since sometimes two or more players will choose the same highest bid. This can often have some intransitive qualities to it, as players are not only trying to figure out their own maximum willingness to pay, but also other players willingness. If the item for auction is more valuable for you than the other players, you may bid lower than your maximum willingness to pay because you expect other players bids to be lower, so you expect to bid low and still win. Dutch auction. These are rare in the States as it requires some kind of special equipment. You have an item that starts for bid at high price, and theres some kind of timer that counts down the price at a fixed rate (say, dropping by $1 per second, or something). The first player to accept at the current price wins. In theory this means as soon as the price hits the top players maximum willingness to pay, they should accept, but there may be some interesting tension if theyre willing to wait (and possibly lose out on the item) in an attempt to get a better price. If players can read each others faces in real-time to try to figure out who is interested and who isnt, there may be some bluffing involved here. Even once you decide on an auction format, there are a number of ways to auction items: The most common is that theres a single item up for auction at a time; the top bidder receives the item, and no one else gets anything. Sometimes an entire set of items are auctioned off at the same time in draft form: top bid simply gets first pick, then second-highest bidder, and so on. The lowest bidder gets the one thing no one else wanted or sometimes they get nothing at all, if you want to give players some incentive to bid higher. In other words, even if a player doesnt particularly want any given item, they may be willing to pay a small amount in order to avoid getting stuck with nothing. Conversely, if you want to give players an incentive to save their money by bidding zero, giving the last-place bidder a free item is a good way to do that but of course if multiple players bid zero, youll need some way of breaking the tie. If its important to have negative feedback on auction wins so that a single player shouldnt win too many auctions in a row, giving a bonus to everyone who didnt win (or even just the bottom bidder) for winning the next auction is a way to do that. 
Some auctions are what are called negative auctions because they work in reverse: instead of something good happening to the highest bidder, something bad happens to the lowest bidder. In this case players are bidding for the right to not have something bad happen to them. This can be combined with other auctions: if the top bidder takes something from the bottom bidder, that gives players an incentive to bid high even if they don't want anything. The auction game Fist of Dragonstones had a really interesting variant of this, where the top bidder takes something from the second-highest bidder, meaning that if you bid in the auction at all you wanted to be sure you won and didn't come in second place! On the other hand, if only one person bids, then everyone else is in second place, and the bidder can choose to take from any of their opponents, so sometimes it can be dangerous to not bid as well.

Even once you decide who gets what, there are several ways to define who pays their bid:

The most common for a single-item auction is that the top bidder pays their bid, and all other players spend nothing.

If the top two players pay their bid (but the top bidder gets the item and the second-highest bidder gets nothing), making low or medium bids suddenly becomes very dangerous, turning the auction into an intransitive mechanic where you either want to bid higher than anyone else, or low enough that you don't lose anything. This is most common in silent auctions, where players are never sure of exactly what their opponents are bidding. If you do something like this with an open auction, things can get out of hand very quickly, as each of the top two bidders is better off paying the marginal cost of outbidding their opponent than losing their current stake. For example, if you auction off a dollar bill in this way and the top bid is 99 cents and the next highest is 98 cents, the second-highest bidder has an incentive to bid a dollar (which lets them break even rather than lose 98 cents), which then gives the 99-cent bidder an incentive to paradoxically bid $1.01 (because in that situation they only lose one cent rather than 99 cents), and if both players follow this logic they could keep outbidding each other indefinitely!

The top bidder can win the auction but only pay an amount equal to the second-highest bid. Game theory tells us, through some math I won't repeat here, that in a silent auction with these rules, the best strategy is to bid your maximum willingness to pay (a short brute-force sketch a little further down illustrates why).

In some cases, particularly when every player gets something and the highest bid just chooses first, you may want to have all players pay their bid. If only the top player gets anything, that makes it dangerous to bid if you're not hoping to win, although in a series of such auctions a player may choose to either go all in on one or two auctions to guarantee winning them, or spread their bids out and try to take a lot of things for cheap when no one else bids.

If only the top and bottom bidders pay, players may again have an incentive to bid higher than normal, because they'd be happy to win but even happier if they don't have to lose anything. You may want to force players to bid a certain minimum, so there is always at least something at stake to lose (otherwise a player could bid zero and pay no penalty), although if zero bids are possible, that makes low bids appear safer, as there's always the chance someone will protect you from losing your bid by bidding zero themselves.

If everyone pays their bid except the lowest bidder, that actually gives players an incentive to bid really high or really low.

Even if you know who has to pay what bid, you have to decide what happens to the money from the auction:

Usually it's just paid to the bank, that is, removed from the economy, leading to deflation.

It could be paid to some kind of holding block that collects auction money, and is then redistributed to one or more players later in the game when some condition is met.

The winning bid may also be paid to one or more other players, making the auction partially or completely zero-sum, in a number of ways. Maybe the bid is divided and evenly split among all other players.
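That claim about bidding your true value is easy to verify by brute force. Below is a small Python check (again my own sketch, not from the original article): for every combination of two opponents' bids, it compares the payoff of an honest bid of 10 against every alternative bid, under the rule that the winner pays the highest opposing bid. Ties are scored as losses here just to keep the check conservative.

# Brute-force check of the "winner pays the second-highest bid" rule:
# against any fixed opponent bids, bidding your true value never does
# worse than bidding above or below it.

from itertools import product

def payoff(my_value, my_bid, opponent_bids):
    top_other = max(opponent_bids)
    if my_bid > top_other:
        return my_value - top_other   # we win and pay the highest losing bid
    return 0                          # we lose and pay nothing

my_value = 10
deviations_that_win = 0
for opp in product(range(21), repeat=2):      # two opponents, bids from 0 to 20
    honest = payoff(my_value, my_value, opp)
    for alt_bid in range(21):                 # every alternative bid we could try
        if payoff(my_value, alt_bid, opp) > honest:
            deviations_that_win += 1
print(deviations_that_win)   # prints 0: no case where deviating beats honesty

The intuition behind the math I skipped is the same thing the loop finds: overbidding can only ever win you items at a price above your value, and underbidding can only lose you items you would have been happy to buy.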
The board game Lascaux has an interesting bid mechanic: each player pays a chip to stay in the auction, in turn order, and on their turn a player can choose instead to drop out of the auction and take all of the chips the players have collectively paid so far. The auction continues with the remaining players, so it is up to each player whether it's a good enough time to drop out (and gain enough auction chips to win more auctions later), or whether it's worth staying in for one more go-round (hoping everyone else will stay in, thus increasing your take when you drop out), or even continuing to stay in with the hopes of winning the auction.

Lastly, no matter what kind of auction there is, you have to decide what happens if no one bids:

It could be that one of the players gets the item for free (or at minimal cost). If that player is known in advance to everyone, it gives the other players an incentive to bid just to prevent someone else from getting an item for free; when the players know that the default state is one of their opponents getting a bonus, it's often an incentive to open the bidding. If players don't know who it is (say, a random player gets the item for free), then players may be more likely to not bid, as they have just as good a chance as anyone else.

As an alternative, the auction could have additional incentives added, and then be repeated. If one resource is being auctioned off and no one wants it, a second resource could be added and the set auctioned; if no one wants that, add a third resource, and so on until someone finally thinks it's worth it.

Or, what usually happens is the item is thrown out, no one gets it, and the game continues from there as if the auction never happened.

Needless to say, there are a lot of considerations when setting up an in-game economy! Like most things, there are no right or wrong answers here, but hopefully I've at least given you a few different options to consider, and the implications of those.

Solving common problems in multiplayer

This next section didn't seem to fit anywhere else in the course, so I'm mentioning it here; if it seems out of place, that's why. In multiplayer free-for-all games where there can only be one winner, there are a few problems that come up pretty frequently. They can be considered a balance or an imbalance depending on the game, but they are things that usually aren't much fun, so you want to be very careful of them.

Turtling

One problem, especially in war games or other games where players attack each other directly, is that if you get in a fight with another player, even if you win, it still weakens both of you relative to everyone else. The wise player reacts to this by doing their best to not get in any fights, instead building up their defenses to make themselves a less tempting target, and then when all the other players get into fights with each other they swoop in and mop up the pieces when everyone else is in a weakened state. The problem here is that the system is essentially rewarding the players for not interacting with each other, and if the interaction is the fun part of the game then you can hopefully see why this is something that needs fixing. The game balance problem is that attacking (you know, actually playing the game) is not the optimal strategy.

The most direct solution is to reward or incentivize aggression. A simple example is the board game RISK, where attackers and defenders both lose armies, so you'd normally want to just not attack; the game goes to great lengths to avoid turtling by giving incentives to attack: more territories controlled means you get more armies next turn if you hold onto them, the same goes for continent bonuses, and let's not forget the cards you can turn in for armies, which you only get if you attack.

Another solution is to force the issue by making it essentially impossible to not attack. As an example, Plague and Pestilence and Family Business are both light card games where you draw 1 then play 1. A few cards are defensive in nature, but most cards hurt opponents, and you must play one each turn (choosing a target opponent, even), so before too long you're going to be forced to attack someone else; it's simply not possible to avoid making enemies.
Kill the leader and Sandbagging

One common problem in games where players can directly attack each other, especially when it's very clear who is in the lead, is that everyone by default will gang up on the leader. On the one hand, this can serve as a useful negative feedback loop in your game, making sure no one gets too far ahead. On the other hand, players tend to overshoot (so the leader isn't just kept in check, they're totally destroyed), and it ends up feeling like a punishment to be doing well.

As a response to this problem, a new dynamic emerges, which I've seen called sandbagging. The idea is that if it's dangerous to be the leader, then you want to be in second place. If a player is doing well enough that they're in danger of taking the lead, they will intentionally play suboptimally in order to not make themselves a target. As with turtling, the problem here is that players aren't really playing the game you designed, they're working around it.

The good news is that a lot of things have to happen in combination for this to be a problem, and you can break the chain of events anywhere to fix it.

Players need a mechanism to join forces and gang up on a single player; if you make it difficult or impossible for players to form coalitions or to coordinate strategies, attacking the leader is impossible. In a foot race, players can't really attack each other, so you don't see any kill-the-leader strategies in marathons. In an FPS multiplayer deathmatch, players can attack each other, but the action is moving so fast that it's hard for players to work together (or really, to do anything other than shoot at whoever's nearby).

Or, even if players can coordinate, they need to be able to figure out who the leader is. If your game uses hidden scoring, or if the end goal can be reached a lot of different ways so that it's unclear who is closest, players won't know who to go after. Lots of Eurogames have players keep their Victory Points secret for this reason.

Or, even if players can coordinate and they know who to attack, they don't need to if the game already has built-in opportunities for players to catch up. Some Eurogames have just a few defined times in the game where players score points, with each successive scoring opportunity worth more than the last, so in the middle of a round it's not always clear who's in the lead, and players know that even the person who got the most points in the first scoring round has only a minor advantage at best going into the final scoring round.

Or, even if players can coordinate and they know who to attack, the game's systems can make this an unfavorable strategy, or can offer other strategies. For example, in RISK it is certainly arguable that having everyone attack the leader is a good strategy in some ways, but on the other hand the game also gives you an incentive to attack weaker players, because if you eliminate a player from the game you get their cards, which gives you a big army bonus.

Or, since kill-the-leader is a negative feedback loop, the textbook solution is to add a compensating positive feedback loop that helps the leader to defend against attacks. If you want a dynamic where the game starts equal but eventually turns into one-against-many, this might be the way to go.

If you choose to remove the negative feedback of kill-the-leader, one thing to be aware of is that if you were relying on that negative feedback to keep the game balanced, the game might now have an unbalanced positive feedback loop that naturally helps the leader, so consider adding another form of negative feedback to compensate for the removal of this one.

Kingmaking

A related problem is when one player is too far behind to win, but they are in a position to decide which of two other players wins. Sometimes this happens directly: in a game with trading and negotiation, the player who's behind might just make favorable trades to one of the leading players in order to hand them the game.
Sometimes it happens indirectly, where the player who's behind has to make one of two moves as part of the game, and it is clear to everyone that one move causes one player to win and the other move causes another player to win. This is undesirable because it's anticlimactic: the winner didn't actually win because of superior skill, but instead because one of the losing players liked them better. Now, in a game with heavy diplomacy (like the board game Diplomacy) this might be tolerable; after all, the game is all about convincing other people to do what you want. But in most games, the winner and the losers both feel like the win wasn't really deserved, so the game designer generally wants to avoid this situation.

As with kill-the-leader, there are a lot of things that have to happen for kingmaking to be a problem, and you can eliminate any of them.

The players have to know their standing. If no player knows who is winning, who is losing, and what actions will cause one player to win over another, then players have no incentive to help out a specific opponent.

The player in last place has to know that they can't win, and that all they can do is help someone else to win. If every player believes they have a chance to win, there's no reason to give away the game to someone else.

Or, you can reduce or eliminate ways for players to affect each other. If the person in last place has no mechanism to help anyone else, then kingmaking is impossible.

Player elimination

A lot of two-player games are all about eliminating your opponent's forces, so it makes sense that multi-player games follow this pattern as well. The problem is that when one player is eliminated and everyone else is still playing, that losing player has to sit and wait for the game to end, and sitting around not playing the game is not very fun.

With games of very short length, this is not a problem. If the entire game lasts two minutes and you're eliminated with 60 seconds left to go, who cares? Sit around and wait for the next game to start. Likewise, if player elimination doesn't happen until late in the game, this is not usually a problem. If players in a two-hour game start dropping around the 1-hour-50-minute mark, relatively speaking it won't feel like a long time to wait until the game ends and the next one can begin. It's when players can be eliminated early and then have to sit around and wait forever that you run into problems. There are a few mechanics that can deal with this:

You can change the nature of your player elimination, perhaps disincentivizing players from eliminating their opponents, so that the only time a player will actually do this is when they feel they're strong enough to eliminate everyone and win the game. The board game Twilight Imperium makes it exceedingly dangerous to attack your opponents because a war can leave you exposed, so players tend to not attack until they feel confident that they can come out ahead, which doesn't necessarily happen until the late game.

You can also change the victory condition, removing elimination entirely; if the goal is to earn 10 Victory Points, instead of eliminating your opponents, then players can be so busy collecting VP that they aren't as concerned with eliminating the opposition. The card game Illuminati has a mechanism for players to be eliminated, but the victory condition is to collect enough cards (not to eliminate your opponents), so players are not eliminated all that often.

One interesting solution is to force the game to end when the first player is eliminated; thus, instead of victory being decided as last player standing, victory goes to the player in best standing (by some criteria) when the first player drops out. If players can help each other, this creates some tense alliances as one player nears elimination: the player in the lead wants that player eliminated, while everyone else actually wants to help that losing player stay in the game! The card game Hearts works this way, for example. The video game Gauntlet IV (for Sega Genesis) also did something like this in its multiplayer battle mode, where as soon as one player was eliminated a 60-second countdown timer started, and the round would end even if several players were still alive.
You can also give the eliminated players something to do in the game after they're gone. Perhaps there are some NPCs in the game that are normally moved according to certain rules or algorithms, but you can give control of those to the eliminated players (my game group added this as a house rule in the board game Wiz-War, where an eliminated player would take control of all monsters on the board). Cosmic Encounter included rules for a seventh player beyond the six that the game normally supports, by adding kibitzing mechanics: the seventh player can wander around, look at people's hands, and give them information, and they have a secret goal to try to get a specific other player to win, so while they are giving away information they may also be lying. In Mafia/Werewolf and other variants, eliminated players can watch the drama unfold, so even though they can't interact the game is fun to observe, and most players don't mind taking on the spectator role.

Excel

Every game designer really needs to learn Excel at some point. Some of you probably already use it regularly, but if you don't, you should learn your way around it, so consider this a brief introduction to how Excel works and how to use it in game design, with a few tricks from my own experience thrown in. For those of you who are already Excel experts, I beg your patience, and hope I can show you at least one or two little features that you didn't know before. Note: I'm assuming Excel 2003 for PC; the exact key combinations I list below may vary for you if you're using a different version or platform.

Excel is a spreadsheet program, which means absolutely nothing to you if you aren't a financial analyst, so an easier way of thinking about Excel is that it's a program that lets you store data in a list or a grid. At its most basic, you can use it to keep things like a grocery list or to-do list in a column, and if you want to include a separate column to keep track of whether each item is done or not, then that's a perfectly valid use (I've worked with plenty of spreadsheets that are nothing more than that, for example a list of art or sound assets in a video game and a list of their current status). Data in Excel is stored in a series of rows and columns, where each row has a number, each column has a letter, and a single entry is in a given row and column. Any single entry location is called a cell (as in "cell phone" or "terrorist cell"), and is referred to by its row and column (like A1 or B19). You can navigate between cells with arrow keys or by clicking with the mouse.

Entering data into cells

In general, each cell can hold one of three things: a number, written text, or a computed formula. Numbers are pretty simple: just type a number in the formula bar at the top and then hit Enter, or click the little green checkmark if you prefer. Text is also simple: just type the text you want in the same way. What if you want to include text that looks like a number or formula, but you want Excel to treat it as text? Start the entry with a single apostrophe (') and then you can type anything you want, and Excel will get the message that you want it to treat that cell as text. For a formula, start the entry with an equal sign (=) and follow with whatever you want computed. Most of the time you just want simple arithmetic, which you can do with the +, -, * and / characters. For example, typing =1+2 and hitting Enter will display 3 in the cell. You can also reference other cells: =A1*2 will take the contents of cell A1, multiply by 2, and display the result in whatever cell holds that formula. And the really awesome part about this is that if you change the value in A1, any formulas that reference it will change automatically, which is the main thing Excel does that saves you so much time. In fact, even if you insert new rows that change the actual name of the cell you're referencing, Excel will change your formulas to update what they're referencing.
Adding comments

Suppose you want to leave a note to yourself about something in one of the cells. One way to do this is just to put another text field next to it, although as you'll see, getting a lot of text to display in one of those tiny cells isn't trivial: you can type it all in, of course, but there are times when that's not practical. For those cases you can instead use the Comment feature to add a comment to the cell, which shows up as a little red triangle in the corner of the cell. Mousing over the cell reveals the comment.

Moving data around

Cut, copy and paste work pretty much as you'd expect them to. You can click and drag, or hold Shift while moving around with the arrow keys, to select a rectangular block of cells; or click on one of the row or column headings to select everything in that row or column; or click on the corner between the row and column headings (or hit Ctrl-A) to select everything. By holding Ctrl down and clicking on individual cells, you can select several cells that aren't even next to each other.

Now, if you paste a cell containing a formula a whole bunch of times, a funny thing happens: you'll notice that any cells that are referenced in the formula keep changing. For example, if you've got a formula in cell B1 that references A1, and you copy B1 and paste into D5, you'll notice the new formula references C5 instead. That's because by default, all of these cell references are relative in position to the original. So when you reference A1 in your formula in B1, Excel isn't actually thinking "the cell named A1", it's thinking "the cell just to the left of me in the same row". So when you copy and paste the formula somewhere else, it'll start referencing the cell just to the left in the same row. As you might guess in this example, if you paste this into a cell in column A (where there's nothing to the left because you're already all the way on the left), you'll see a result that's an error: #REF!, which means you're referencing a cell that doesn't exist.

If you want to force Excel to treat a reference as absolute, so that it references a specific cell no matter where you copy or paste to, there are two ways to do it. The first is to use a dollar sign ($) before the column letter or row number or both in the formula, which tells Excel to treat either the row or column (or both) as an absolute position. For our earlier example, if you wanted every copy-and-paste to look at A1, you could use $A$1 instead. Why do you need to type the dollar sign twice in this example? Because it means you can treat the row as a relative reference while the column is absolute, or vice versa. There are times when you might want to do this, which I'm sure you will discover as you use Excel, if you haven't already.

There's another way to reference a specific, named cell in an absolute way, which is mostly useful if you're using several cells in a bunch of complicated formulas and it's hard to keep straight in your head which cell is which value when you're writing the formulas. You can give any cell a name; the name is just the letter and number of the cell by default, and it's displayed in the top left part of the Excel window. To change it, just click on that name and then type in whatever you want. And then you can reference that name anywhere else in the spreadsheet and it'll be an absolute reference to that named cell.

Sorting

Sometimes you'll want to sort the data in a worksheet. A common use of Excel is to keep track of a bunch of objects, one per row, with each attribute listed in a separate column. For example, maybe on a large game project you'll have an Excel file that lists all of the enemies in a game, with the name in one column, hit points in another, damage in another, and so on. And maybe you want to sort by name just so you have an easy-to-look-up master list, or maybe you want to sort by hit points to see the largest or smallest values, or whatever. This is pretty easy to do. First, select all the cells you want to sort. Go to the Data menu, and choose Sort. Next, tell it which column to sort by, and whether to sort ascending or descending.
If you've got two entries in that column that are the same, you can give it a second column as a tiebreaker, and a third column as a second tiebreaker if you want (otherwise it'll just preserve the existing order when it sorts). There's also an option for ignoring the header row, so if you have a header with column descriptions at the very top and you don't want that sorted; well, you could just not select it when sorting, of course, but sometimes it's easier to select the whole spreadsheet and click the button to ignore the header row. If you accidentally screw up when sorting, don't panic; just hit Undo.

Sometimes you realize you need to insert a few rows or columns somewhere, in between others. The nice thing about this is that Excel updates all absolute and relative references just the way you'd want it to, so you should never have to change a value or formula or anything just because you inserted a row. To insert, right-click on the row or column heading, and insert row or insert column is one of the menu choices. You can also insert them from the Insert menu. If you need to remove a row or column it works similarly: right-click on the row or column heading and select delete. You might think you could just hit the Delete key on the keyboard too, but that works differently: it just clears the values of the cells but doesn't actually shift everything else up or left.

Using your data

Sometimes you've got a formula you want copied and pasted into a lot of cells all at once. My checkbook, for example, has formulas on each row to compute my current balance after adding or subtracting the current transaction from the previous one, and I want that computation on every line. All I had to do was write the formula once; if I had to manually copy and then paste into each individual cell, I'd cry. Luckily, there's an easy way to do this: Fill. Just select the one cell you want to propagate, and a whole bunch of other cells below or to the right of it, then hit Ctrl+D to take the top value and propagate it down to all the others (Fill Down), or Ctrl+R to propagate the leftmost value to the right. If you want to fill down and right, you can just select your cell and a rectangular block of cells below and to the right of it, then hit Ctrl+D and Ctrl+R in any order. You can also Fill Up or Fill Left if you want, but those don't have hotkeys; you'll have to select those from the Edit menu under Fill. As with copying and pasting, Fill respects absolute and relative references to other cells in your formulas.

There's a related command to Fill, which is useful in a situation like the following example: suppose you're making a list of game objects and you want to assign each one a unique ID number, and that number is going to start at 1 and then count upwards from there, and let's say you have 200 game objects. So in one column, you want to place the numbers 1 through 200, each number in its own cell. Entering each number manually is tedious and error-prone. You could use a formula: say, putting the number 1 in the first cell, let's say it's cell A2, and then in A3 put the formula =A2+1 (which computes to 2), then Fill Down that formula to create the numbers all the way down to 200. And that will work at first, but whenever you start reordering or sorting rows, all of these cells referencing each other might get out of whack; it might work or it might not, but it'll be a big mess regardless. Besides, you don't really want a formula in those cells anyway, you want a number. You could create the 200 numbers by formula on a scratch area somewhere, then copy, then Paste Special (under the Edit menu), and select Values, which just takes the computed values and pastes them in as numbers without copying the formulas. And then you just delete the formulas that you don't need anymore. That would work, and Paste Special / Values is an awesome tool for a lot of things, but it's overkill here. Here's a neat little trick: just create two or three cells and put the numbers 1, 2, 3 in them. Now, select the three cells, and you'll notice there's a little black square dot in the lower right corner of the selection. Click on that, and drag down a couple hundred rows. When you release the mouse button, Excel takes its best guess at what you were doing, and fills it all in. For something simple like counting up by 1, Excel can figure that out, and it'll do it for you.
For something more complicated you probably won't get the result you're looking for, but you can at least have fun trying and seeing what Excel thinks you're thinking.

Functions

Excel comes with a lot of built-in functions that you can use in your formulas. Functions are always written in capital letters, followed by an open-parenthesis, then any parameters the function might take in (this varies by function), then a close-parenthesis. If there are several parameters, they are separated by commas (,). You can embed functions inside other ones, so one of the parameters of a function might actually be the result of another function; Excel is perfectly okay with that.

Probably the single function I use more than any other is SUM(), which takes any number of parameters and adds them together. So if you wanted to sum all of the cells from A5 to A8, you could say =A5+A6+A7+A8, or you could say =SUM(A5,A6,A7,A8), or you could say =SUM(A5:A8). The last one is the most useful; use a colon between two cells to tell Excel that you want the range of all cells in between those. You can even do this with a rectangular block of cells by giving the top-left and bottom-right corners: =SUM(A5:C8) will add up all twelve cells in that 3x4 block.

The second most useful function for me is IF, which takes in three parameters. The first is a condition that's evaluated to either a true or false value. The second parameter is evaluated and returned if the condition is true. The third parameter is evaluated and returned if the condition is false. The third parameter is optional; if you leave it out and the condition is false, the cell will just appear blank instead. For example, you could say =IF(A1>0,1,5), which means that if A1 is greater than zero, this cell's value is 1, otherwise it's 5. One of the common things I use with IF is the function ISBLANK(), which takes a cell and returns true if the cell is blank, or false if it isn't. So you can use this, for example, if you're using one column as a checklist and you want to set a column to a certain value if something hasn't been checked. If you're making a checklist and want to know how many items have (or haven't) been checked off, by the way, there's also the function COUNTBLANK(), which takes a range of cells as its one parameter and returns the number of cells that are blank.

For random mechanics, look back at the week where we talked about pseudorandomness to see my favorite functions for that: RAND() takes no parameters at all and returns a pseudorandom number from 0 to 1 (it might possibly be zero, but never one). Changing any cell or pressing F9 causes Excel to reroll all randoms. FLOOR() and CEILING() will take a number and round it down or up to the nearest whole number value, or you can use ROUND() which will round it normally. FLOOR() and CEILING() both require a second parameter, the multiple to round to; for most cases you want this to be 1, since you want it rounding to the nearest whole number, but if you want it to round up or down to the nearest 5, or the nearest 0.1, or whatever, then use that as your second parameter instead. Just to be confusing, ROUND() also takes a second parameter, but it works a little differently. For ROUND(), if the second parameter is zero (which you normally want) then it will round to the nearest whole number. If the second parameter is 1, it rounds to the nearest tenth; if the second parameter is 2, it rounds to the nearest hundredth; if the second parameter is 3, it rounds to the nearest thousandth; and so on. In other words, the second parameter for ROUND() tells you the number of digits after the decimal point to include. RANK() and VLOOKUP() I already mentioned back in Week 6; they're useful for when you need to take a list and shuffle it randomly.
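To tie a few of these together, here are some example formulas of the kind I might drop into a balance spreadsheet. The cell references are placeholders I made up for illustration, and depending on your Excel version and locale, the argument separator may be a semicolon instead of a comma.

=FLOOR(RAND()*6,1)+1 gives a whole number from 1 to 6, like a single die roll (press F9 to reroll).
=IF(ISBLANK(B2),"TODO","done") flags whether the asset on row 2 has had its status filled in yet.
=COUNTBLANK(B2:B51) counts how many of those fifty status cells are still empty.
=CEILING(B2/3,1) divides the value in B2 by three and rounds up to the next whole number.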
Multiple worksheets

By default, a new Excel file has three worksheet tabs, shown at the bottom left. You can rename these to something more interesting than Sheet1 by just double-clicking the tab, typing a name, and hitting Enter. You can also reorder them by clicking and dragging, if you want a different sheet to be on the left or in the middle. You can add new worksheets or delete them by right-clicking on the worksheet tab, or from the Insert menu. Ctrl+PgUp and Ctrl+PgDn provide a convenient way to switch between tabs without clicking down there all the time, if you find yourself going back and forth between two tabs a lot. The reason to create multiple worksheets is mostly organizational; if you've got a bunch of different but related systems, it's sometimes easier to put each one in its own worksheet rather than having to scroll around all over the place to find what you're looking for on a single worksheet.

You can actually reference cells in other worksheets in a formula, if you want. The easiest way to do it is to type in the formula until you get to the place where you'd type in the cell name, then use your mouse to click on the other worksheet, then actually click on the cell or cells you want to reference. One thing to point out here is that it's easy to do this by accident, where you're entering something in a cell and don't realize you haven't finished, and then you click on another cell to see what's there and instead it starts adding that cell to your formula; if that happens, or you otherwise feel like you're lost and not sure how to get out of entering half of a formula, just hit the red X button to the left of the formula bar and it'll undo any typing you did just now.

Graphing

One last thing that I find really useful is the ability to create a graph, great for looking graphically at how your game objects relate to each other. Select two or more rows or columns (each one is just a separate curve), then select the Insert menu, then Chart. Select XY (Scatter) and then the subtype that shows curvy lines. From there, go through the wizard to select whatever options you want, then click Finish and you'll have your chart. One thing you'll often want to do with graphs is to add a trendline; right-click on any single data point on the graph, and select Add Trendline. You'll have to tell it whether the trendline should be linear, or exponential, or polynomial, or what. On the Options tab of the trendline wizard, you can also have it display the equation on the chart so you can actually see what the best-fit curve is, and also display the R-squared value (which is just a measure of how close the fitted curve is to the actual data; an R-squared of 1 means the curve is a perfect fit, and an R-squared of 0 means it may as well be random, although in practice even random data will have an R-squared value of more than zero, sometimes significantly more). If you're trying to fit a curve, as happens a lot when analyzing metrics, you'll probably want to add these right away.

Another thing you should know is that by default, the charts Excel makes are, umm, really ugly. Just about everything you can imagine to make the display better, you can do: adding vertical and not just horizontal lines on the graph, changing the background and foreground colors of everything, adding labels on the X and Y axes, changing the ranges of the axes and labeling them; it's all there somewhere. Every element of the graph is clickable and selectable on its own, and generally if you want to change something, just right-click on it and select Format, or else double-click it. Just be aware that each individual element (the gridlines, the graphed lines, the legend, the background, the axes) is treated separately, so if you don't see a display option it probably just means you have the wrong thing selected. Play around with the formatting options and you'll see what I mean.

Making things look pretty

Lastly, there are a few things you can do to make your worksheets look a little bit nicer, even without the graphs, even if it's just cells. Aside from making things look more professional, it also makes it look more like you know what you're doing. The most obvious thing you can do is mess with the color scheme. You can change the text color and background color of any cell; the buttons are on a toolbar in the upper right (at least they are on my machine; maybe they aren't for you, and if not, just add the Formatting toolbar and it's all on there).
You can also make a cell Bolded or Italicized, left or right justified, and all the other things you're used to doing in Word. Personally, I find it useful to use background color to differentiate between cells that are just text headings (no color), cells where the user is supposed to change values around to see what effect they have on the rest of the game (yellow), and cells that are computed values or formulas that should not be changed (gray), and then Bolding anything really important.

You also have a huge range of possible ways to display numbers and text. If you select a single cell, a block of cells, an entire row or column, or even the whole worksheet, then right-click (or go to the Format menu) and select Format Cells, you'll have a ton of options at your disposal. The very first tab lets you say whether this is text or a number, and what kind. For example, you can display a number as currency (with or without a currency symbol like a dollar sign or something else), or as a decimal (to any number of places).

On the Alignment tab are three important features. Orientation lets you display the text at an angle, even sideways, which can make your column headings readable if you want the columns themselves to be narrow. (Speaking of which, you can adjust the widths of columns and heights of rows just by clicking and dragging between two rows or columns, or right-clicking and selecting Column Width or Row Height.) Word Wrap does exactly what you think it does, so that the text is actually readable. Excel will gleefully expand the row height to fit all of the text, so if the column is narrow and you've got a paragraph in there, it'll probably be part of a word per line and the whole mess will be unreadable, so you'll want to adjust column width before doing that. Then there's a curious little option called Merge Cells, which lets you convert Excel's pure grid form into something else. To use it, select multiple cells, then Format Cells, then click the Merge Cells option and click OK. You'll see that all the cells you selected are now a single giant uber-cell. I usually use this for cosmetic reasons, like if you've got a list of game objects and each column is some attribute, and you've got a ton of attributes but you want to group them together: say, you have some offensive attributes and some defensive ones, or whatever. You could create a second header row above the individual headers, merge the cells over each group, and have a single cell that says (for example) "defensive attributes". Excel actually has a way to do this automatically in certain circumstances, called Pivot Tables, but that's a pretty advanced thing that I'll leave to you to learn on your own through Google if you reach a point where you need it.

One thing you'll find sometimes is that you have a set of computed cells, and while you need them to be around because you're referencing them, you don't actually need to look at the cells themselves; they're all just intermediate values. One way to take care of this is to stick them in their own scratch worksheet, but an easier way is to stick them in their own row or column and then Hide the row or column. To do that, just right-click on the row or column, and there's a Hide option. Select that and the row or column will disappear, but you'll see a thick line between the previous and next columns, a little visual signal to you that something else is still in there that you just can't see. To display it again, select the rows or columns on either side, right-click, and select Unhide.

If you want to draw a square around certain blocks of data in order to group them together visually, another button on the Formatting toolbar lets you select a border. Just select a rectangle of cells, then click on that and select the border that looks like a square, and it'll put edges around it. The only thing I'll warn you is that when you copy and paste cells with borders, those are copied and pasted too under normal conditions, so don't add borders until you're done messing around (or if you have to, just remove the borders, move the cells around, then add them back in, or use Paste Special to only paste formulas and not formatting).

Another thing that I sometimes find useful, particularly when using a spreadsheet to balance game objects, is conditional formatting. First select one or more cells, rows or columns, then go to the Format menu and select Conditional Formatting.
You first give it a condition which is either true or false. If it's true, you can give it a format: using a different font or text color, font effects like bold or italic, adding borders or background colors, that sort of thing. If the condition isn't true, then the formatting isn't changed. In the conditional formatting dialog, there's an Add button at the bottom where you can add up to two other conditions, each with its own individual formatting. These are not cumulative; the first condition is always evaluated first (and its formatting is used if the condition is satisfied). If not, then it'll try the second condition, and if not that, it'll try the third condition.

As an example of how I use this: if I'm making a game with a cost curve, I might have a single column that adds up the numeric benefits minus costs of each game object. Since I want the benefits and costs to be equivalent, this should be zero for a balanced object (according to my cost curve), positive if it's overpowered and negative if it's underpowered. In that column, I might use conditional formatting to turn the background of a cell a bright green color if benefits minus costs is greater than zero, or red if it's less than zero, so I can immediately get a visual status of how many objects are still not balanced right.
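As a concrete sketch of that setup (the column layout here is invented for the example): suppose each game object is a row, its numeric benefits sit in columns B through D, and its costs sit in columns E and F. Column G on row 2 could then hold

=SUM(B2:D2)-SUM(E2:F2)

filled down for every object, and the conditional format on column G would use "Cell Value Is / greater than / 0" with a green background, plus a second condition "Cell Value Is / less than / 0" with a red background, leaving anything that computes to exactly zero (balanced, per the cost curve) unformatted.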
Lastly, in a lot of basic spreadsheets you just want to display a single thing, and you've got a header row along the top and a header column on the left, and the entire rest of the worksheet is data. Sometimes that data doesn't fit on a single page, and as you scroll down you forget which column is which. Suppose you want the top row or two to stay put, always displayed no matter how far you scroll down, so you can always see the headings. To do that, select the first row below your header row, then go to the Window menu and select Freeze Panes. You'll see a little line appear just about where you'd selected, and you'll see it stay even if you scroll down. To undo this, for example if you selected the wrong row by accident, go to the Window menu and select Unfreeze Panes (and then try again). If you want instead to keep the left columns in place, select the column just to the right, then Freeze Panes again. If you want to keep the leftmost columns and topmost rows in place, select a single cell and Freeze Panes, and everything above or to the left of that cell is now locked in place.

About that whole "Fun" thing

At this point we've covered just about every topic I can think of that relates to game balance, so I want to take some time to reflect on where balance fits into the larger field of game design. Admittedly, the ultimate goal of game design depends on the specific game, but in what I'd say is the majority of cases, the goal of the game designer is to create a fun experience for the players. How does balance fit in with this?

When I was a younger designer, I wanted to believe the two were synonymous: a fun game is a balanced game, and a balanced game is a fun game. I'm not too proud to say that I was very wrong about this. I encountered two games in particular that were fun in spite of being unbalanced, and these counterexamples changed my mind.

The first was a card game that I learned as Landlord (although it has many other names, some more vulgar than others); the best known is probably a variant called The Great Dalmuti. This is a deliberately unbalanced game. Each player sits in a different position, with the positions forming a definite progression from best to worst. Players in the best position give their worst cards to those in the worst position at the start of the round, and the worst-position players give their best cards to those in the best position, so the odds are strongly in favor of the people at the top. At the end of each hand, players reorder themselves based on how they did in the round, so the top player takes the top seat next round. This is a natural positive feedback loop: the people at the top have so many advantages that they're likely to stay there, while the people at the bottom have so many disadvantages that they're likely to stay there as well. As I learned it, the game never ends; you just keep playing hand after hand until you're tired of it. In college my friends and I would sometimes play this for hours at a time, so we were clearly having a good time in spite of the game being obviously unbalanced. What's going on here? I think there are two reasons.

One is that as soon as you learn the rules, it is immediately obvious to you that the game is not fair, and in fact that the unfairness is the whole point. It's not that fairness and balance are always desirable; it's that when players expect a fair and balanced game and then get one that isn't, the game doesn't meet their expectations. Since this game sets the expectation of unfairness up front, by choosing to play at all you have already decided that you are willing to explore an unbalanced system.

Another reason why this game doesn't fail is that it has a strong roleplaying dynamic, which sounds strange because this isn't an RPG; but players in different seats do have different levels of power, so some aspect of roleplaying happens naturally in most groups. The players at the top are having fun because it's good to be king. The players at the bottom are also having fun because there's a thrill in fighting against the odds, striking a blow for the Little Guy in an unfair system, and every now and then one of the players at the bottom ends up doing really well and suddenly toppling the throne, and that's exciting (or one of the guys on top crashes and falls to the bottom, offering schadenfreude for the rest of the players). For me it's equally exciting to dig my way out from the bottom, slowly and patiently, over many hands, and eventually reach the top (sort of like a metaphor for hard work and retirement, I guess). Since the game replicates a system that we recognize in everyday life, where we see the haves and have-nots, being able to play in and explore this system from the magic circle of a game has a strong appeal.

The second game that convinced me there's more to life than balance is the (grammatically incorrectly titled) board game Betrayal at House on the Hill. This game is highly unbalanced. Each time you play you get a random scenario which has a different set of victory conditions, but most of them strongly favor some players over others, and most don't scale very well with number of players (that is, most scenarios are much easier or harder to win with 3 players than with 6, so the game is often decided as a function of how many players there are and which random scenario you get). The game has a strong random element that makes it likely one or more players will have a very strong advantage or disadvantage, and in most scenarios it's even possible to have early player elimination. Not that it has to do with balance, but the first edition of this game also has a ton of printing errors, making it seem like it wasn't playtested nearly enough. (In fact, I understand that it was playtested extensively, but the playtesters were having such a fun time playing that they didn't bother to notice or report the errors they encountered.)

In spite of the imbalances, the randomness and the printing errors, the game itself is pretty fun if you play it in the right group. The reason is that no matter what happens, in nearly every game, some kind of crazy thing happens that's fun to talk about after the fact. The game is very good at creating a story of the experience, and the stories are interesting. Partly this has to do with all of the flavor text in the game, on the cards and in the scenarios, but that just sets the haunted-house environment to put the players in the right frame of mind. Mostly it's that, because of the random nature of the game, somewhere along the line you'll probably see something that feels highly unlikely, like one player finding a whole bunch of useful items all at once, or a player rolling uncharacteristically well or poorly on dice at a key point, or drawing just the right card you need at just the right time, or a player finding out the hard way what that new mysterious token on the board does. And so, players are willing to overlook the flaws because the core gameplay is about working together as a team to explore an unfamiliar and dangerous place, then having one of your kind betray the others and shifting to a one-against-many situation, and winning or losing as a coordinated team. And as a general game structure, that turns out to be unique enough to be interesting. Now, player expectation is another thing that is a huge factor in Betrayal.
I've seen some players who didn't know anything about the game and were just told "oh, this is a fun game," and they couldn't get over the fact that there were so many problems with it. When I introduce people to the game, I always say up front that the game is not remotely balanced, because it helps people to enjoy the experience more. And incidentally, I do think it would be a better game if it were more balanced, but my point is that it is possible for a game design to succeed without it.

So, at the end of the day, I think that what game balance does is make your game fair. In games where players expect a fair contest, balance is very important; for example, one of the reasons a lot of players hate the rubber-banding negative feedback in racing games, where an AI-controlled car suddenly gets an impossible burst of speed when it's too far behind you, is that it feels unfair because real-life racing doesn't work that way. But in a game like The Great Dalmuti, which is patently unfair, players expect it to be unbalanced so they accept it easily. This is also why completely unbalanced, overpowered cards in a Trading Card Game (especially if they're rare) are seen as a bad thing, but in a single-player card-battle game using the same mechanics they can be a lot of fun: for a head-to-head tabletop card game, players expect the game to provide a fair match, so they want the cards to be balanced; in the case of the single-player game, the core of the game is about character growth and progression, so getting more powerful cards as the game progresses is part of the expectation. Just like everything in game design, it's all about understanding the design goals, what it is you want the player to experience. But if you want them to experience a fair game, which is at least true in most games, then that is the function of balance. In fact, the only games I can think of where you don't want balance are those where the core gameplay is specifically built around playing with the concept of fairness and unfairness.

If You're Working on a Game Now

Well, you're probably already using Excel in that case, so there's not much I can have you do to exercise those skills that you're not already doing.

If your game has a free-for-all multiplayer structure, ask yourself if any of the problems mentioned in this post (turtling, kill-the-leader, sandbagging, kingmaking, early elimination) might be present, and then decide what (if anything) to do about them.

If your game has an economic system, analyze it. Can players trade? Are there auctions? What effect would it have on the game if you added or removed these? Are there alternatives to the way your economic system works now that you hadn't considered?

Homework

Since it's the end of the course, there are two ways to approach this. One is to say no homework at all, because hey, the course is over! But that would be lazy design on my part. Instead, let me set you a longer challenge that brings together everything we've talked about here. Make a game, and then apply all the lessons of game balance that you can. Spend a month on it, maybe more if it ends up being interesting, and then you'll have something you can add to your game design portfolio. It's up to you whether you want to do this, of course. Some suggestions:

Design the base set for an original trading-card game. I usually tell students to stay away from projects like this, because TCGs have a huge amount of content, so let's keep it limited here: design an expansion set to an existing TCG instead. First, use your knowledge of the existing game to derive a cost curve. Then, put that into Excel, and use it to create and balance a new set. Create one or two new mechanics that you need to figure out a cost for on the curve, and playtest on your own to figure out how much the new mechanic is actually worth. Make a small set, maybe 50 to 80 cards. Or, if you've got a bit more time and want to do a little bit of systems design work as well as balance, make the game self-contained, and 100 cards or less. Consider games like Dominion or Roma or Ascension, which behave like TCGs but require no collecting, and have players build their deck or hand during play instead. As soon as you're done with the core mechanics, make a cost curve for the game. Put the curve into an Excel spreadsheet and use it to create and balance the individual cards. Playtest the game with friends and challenge them to break the game by finding exploits and optimal strategies. Adjust your cost curve (and cards) accordingly, and repeat the process.

Or, find a turn-based or real-time strategy game on computer that includes some kind of mod tools, and work on the balance. First, play the game a bit and use your intuition to analyze the balance. Are certain units or strategies or objects too good or too weak? Look around for online message boards to see if other players feel the same, or if you were just using different strategies.

Once you've identified one or more imbalances, analyze the game mathematically using every tool at your disposal, to figure out exactly what numbers need to change, and by how much. Mod the game, and playtest to see if the problem is fixed. For a more intense project, use the mod tools to wipe out an entire part of the gameplay, and start over designing a new one from scratch. For example, maybe you can design a brand-new set of technology upgrades for Civilization, or a new set of units for Starcraft. Use the existing art if you want, but change the nature of the gameplay. Then, work on balancing your new system.

If you like RTS games but prefer something on tabletop, instead design a miniatures game. Most miniatures games are expensive (you have to buy and paint a lot of miniatures, after all), so challenge yourself to keep it cheap: use piles of cardboard squares that you assemble and cut out on your own. With such cheap components, you could even add economic and production elements of the RTS genre if you'd like. First, look at the mechanics of some existing miniatures games, which tend to be fairly complicated. Where can you simplify the combat mechanics, just to keep your workload manageable? Try to reduce the game down to a simple set of movement and attack mechanics with perhaps a small handful of special abilities. (For a smaller challenge you can, of course, just create an expansion set to an existing miniatures game that you already play.) As with the other projects, create a cost curve for all attributes and abilities of the playing pieces, and use Excel to create and balance a set of unit types. Keep this small, maybe 5 to 10 different units; you'll find it difficult enough to balance them even with just that few. Consider adding some intransitive relationships between the units, to make sure that no single strategy is strictly better than another. If you end up really liking the game, you can make another set of units of the same size for a new faction and try to balance the second set with the first set. Print out a set of cheap components, and playtest and iterate on your design.

Or, if you prefer tabletop RPGs, analyze the combat (or conflict resolution) system of your favorite game to find imbalances, and propose rules changes to fix it. For a longer project, design your own original combat system, either for an existing RPG as a replacement, or for an original RPG set in an original game world. As a challenge to yourself and to keep the scope of this under control, set a page limit: a maximum of ten pages of rules descriptions, and it should all fit on a one-page summary. Playtest the system with your regular tabletop RPG group if you have one (if you don't have one, you might consider selecting a different project instead).

References

I found the following two blog posts useful to reference when writing about auctions and multiplayer mechanics, respectively: http://jergames.blogspot.com/2006/10/learn-to-love-board-games-again100.html#auctions and
http://pulsiphergamedesign.blogspot.com/2007/11/design-problems-to-watch-for-in-multi.html
