Model Use from the Design through Retail Phases
In each stage of development, the benefits of models change. In the design, or pre-alpha, phase, the use of models accomplishes two goals:
In the alpha and beta phases of development, the theoretical results produced by models can be compared to actual player data. Variable distributions such as players' primary statistics, skill point allocations by level, and randomly generated loot distributions can be refined so that a model's input factors are appropriate. Additionally, during this phase of development, assumptions within the model regarding player tendencies will be validated or invalidated. By comparing the model to actual in-game results, a model can be refined and made much more accurate.
When testing a retail candidate, results generated by models can be compared to in-game logs. The greatest benefit comes in answering the ever-present player complaints about overpowered classes. By comparing in-game logs with simulated model results, designers can state with far more confidence whether a character class is functioning as intended. When in-game results differ from simulated results, models can help find bugs within the code.
Who Designs the Models?
There are two logical ways for a team to assign the task of modeling game interactions. The first option is to have the various system designers on the game development team create their own models. The second option is to have one team member responsible for developing a comprehensive model for the entire game. In the latter case, the model will be based on information that the various system designers provide. Both options have their pros and cons:
Benefits/Drawbacks
To System Designers Creating Their Own Models:
+ The modeler has a comprehensive understanding of the system modeled
+ The ability to change and adjust inputs at will in order to observe
the outcome
+ Greater understanding of how input variables shift the curves
- Designer may be less familiar with Excel and probability distributions
- Parallel and interrelated systems may not be contained within
the same model
- Design leads are required to review the output from numerous sources
Benefits/Drawbacks
To Assigning One Person To Model All Systems:
+ It generates a cohesive model containing all the game systems
+ The ability for systems designers to view the outputs for all
systems at once
+ The ability for changes affecting the entire game to be analyzed
from one place
+ Designers need only speak to one person to review parallel systems
designs
- Necessity for full and open communication between the modeler
and the designer
- Modeler may end up with more input on balance than a system designer
can accept
- Only one person intimately understands how game systems are modeled
It's up to the lead designer to decide how the team should be structured. As both setups have their positive and negative aspects, the design lead should choose the one that will best fit the team's leadership style and corporate culture.
Limitations of Computer Models
Unfortunately, the use of spreadsheet models and the @Risk add-in does not guarantee balance in a game. Player interaction models are simply a method by which real game results can be predicted. It's up to the designer to analyze the simulation results and determine whether they are acceptable. As previously mentioned, a simulation's results are only as good as the model that produced them. The two main drawbacks of spreadsheet models are:
In the case of the first drawback, it is imperative that the person designing a model be very familiar with not only the systems being designed, but also the ways in which players utilize their character's skills and abilities. Models attempt to predict the outcome of a given scenario. If a player, or group of players, approaches that situation in a completely different fashion than the developer originally anticipated, the model would be invalid.
A good example of this exists in Mythic's Dark Age of Camelot. The traditional Player vs. Environment (PvE) leveling group was composed of tanks, mages and healers. (A "leveling group" is a term applied to any number of players who actively play together with the objective of killing monsters in order to gain experience and thereby increase their characters' levels.) Advancement rates for a traditional group could easily be modeled and determined if one expected players to utilize groups similar to this in PvE combat. However, the rise in popularity of the enchanter class and its pet-pulling techniques changed the leveling dynamic significantly.
(In Dark Age of Camelot, the enchanter class within the realm of Hibernia is a pet class. The player character is able to summon an NPC pet that can be loosely controlled by the player. Enchanters possess a spell line called "Damage Shield (Focus)." The "focus shield" can be cast and maintained on the enchanter's pet provided that the enchanter does not move, take damage, or attempt to cast any other spells. While the focus shield is active on the enchanter's pet, any PCs or NPCs damaging the pet will receive damage from the focus shield. Additionally, the focus shield generates massive amounts of "agro" as it damages NPC monsters. "Agro" is a term used to describe the relative aggression that NPC monsters feel towards different players. In general, agro values are created and increased as damage is dealt to the NPC as well as when healing spells are cast upon PCs. The game's AI generally compels NPCs to attack the player character with the greatest agro value. In the case of enchanters and their "pet pulls" and focus shields, the enchanter's NPC pet constantly maintains agro.)
Instead of a traditional group composed of several melee characters that could absorb damage and generate agro (thereby maintaining the NPC monster's focus upon the melee characters), the typical group composition changed to include a greater number of mages. Given that mages generate the majority of the damage dealt in PvE combat, monsters were dying significantly faster than expected, and the rate of player level advancement increased. It would have been unlikely for the original model to include a simulation such as this. Therefore any encounters that were designed based on "traditional" group models were invalidated, as the enchanter method of leveling is completely different.
When player tendencies change like this, models should be revised to account for the new play patterns. In this way, adjustments can be made to game systems to ensure that the original design objectives are maintained.
Incorrect inputs can occur in several situations. When inserting variables into a model, the modeler attempts to generate results for an average player. If the modeler's notions of "average" are inaccurate, then the results generated by the model will be invalid. It is therefore important to obtain real player character data in the alpha and beta phases and compare that data to a model's input distributions. If the real data differs from the model's inputs, then those inputs should be revised and the simulation rerun to check for any changes in the expected outcome.
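The comparison between real player data and a model's input distribution can be sketched as a simple check on the mean. This is only an illustration: the function name `input_mismatch`, the sample values, and the choice of a standard-error measure are assumptions, not something the article prescribes.

```python
import statistics

def input_mismatch(observed, model_mean):
    """How many standard errors the observed mean lies from the model's mean.

    Requires at least two observations that are not all identical.
    """
    mean = statistics.fmean(observed)
    std_err = statistics.stdev(observed) / len(observed) ** 0.5
    return (mean - model_mean) / std_err

# Hypothetical beta-test data: bonus points players put into one statistic.
observed_points = [12, 18, 15, 21, 14, 16]
mismatch = input_mismatch(observed_points, model_mean=15.0)
```

A value far from zero suggests the model's notion of "average" should be revisited before the simulation is rerun. How far is too far is a judgment call for the modeler.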
Practical Model Design
When developing RPG systems, the models should be as complex as the game systems themselves. Irrespective of a system's complexity, the fundamentals of developing and balancing a game system remain the same. In order to demonstrate the methodology by which risk analysis principles can be used in games, I've devised an example. It's an extremely simplistic combat system, which will be modeled and then balanced. Two example scenarios will be reviewed with the first scenario serving as the model's base case.
The base case for this system is a one-handed sword/shield melee versus a two-handed sword-wielding melee. The objective is to model the system and determine how to adjust the system such that when this kind of duel occurs, either character has an equal chance of winning. The second scenario pits each of the two melee types (from scenario one) against an NPC monster of the same level. The objective is to balance each of the two player characters in relation to the monster without changing either of their design templates. The overall goal is to design the characters such that they possess a 75% chance of winning the fight against the NPC monster.
The sample combat system is exceptionally simple due to the spatial constraints of this article. Nevertheless, a more complex system using multiple factors affecting "to hit" rolls, positional attacks, variable speed weapons, and critical strikes would be no more difficult to balance.
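Hitting a target such as the 75% win chance amounts to searching for a tuning value that produces it. The sketch below uses bisection and assumes the value being tuned belongs to the monster (here, its hit points), since the character templates are to stay fixed. The names `tune_for_target` and `toy_win_rate` are hypothetical, and the toy curve merely stands in for a full simulation run with a fixed random seed.

```python
def tune_for_target(win_rate, low, high, target=0.75, tolerance=1e-3, max_steps=60):
    """Bisect a tuning value until win_rate(value) is within tolerance of target.

    Assumes win_rate decreases as the tuning value increases (for example,
    a tougher monster means fewer player wins).
    """
    for _ in range(max_steps):
        middle = (low + high) / 2.0
        rate = win_rate(middle)
        if abs(rate - target) <= tolerance:
            return middle
        if rate > target:
            low = middle   # players win too often: make the monster tougher
        else:
            high = middle  # players win too rarely: make the monster weaker
    return (low + high) / 2.0

# Stand-in for a simulation: a made-up, deterministic win-rate curve.
def toy_win_rate(monster_hit_points):
    return max(0.0, 1.0 - monster_hit_points / 1000.0)

tuned = tune_for_target(toy_win_rate, low=0.0, high=1000.0)
```

With a real Monte Carlo estimate in place of `toy_win_rate`, the tolerance should be no tighter than the sampling noise of the simulation.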
Design Parameters
In this basic system, the following constants, variables and equations were used.
Constants:
Variables:
Equations
Character level was arbitrarily defined as 50 in the model. The skills assigned to each of the two characters were limited to their primary weapon type and primary defense mechanism. Armor factor was set to 200. Initially the weapon damage and variance were set to 24.3 and 1.5, respectively, for the one-handed melee. For the two-handed melee weapon, damage and variance were set to 31.3 and 1.5, respectively.
In the basic character template, the only variables defined were primary character statistics. A system was adopted where each of the four primary statistics had a base value of 90 points. Players could allocate an additional 60 points between the four stats with a maximum of 25 to any one statistic. In order to simulate player choice, a truncated chi-squared function was used. The mean value was set at 15 points per statistic, with a median of 14.86 and a mode of 14.04. The distribution was truncated on the low side at 1 point per statistic, and on the high side at 25 points per statistic. In order to ensure that exactly 60 points were distributed among the four primary statistics, distributions were applied to only three. Quickness was left as a simple summation function that equalized the overall statistic point expenditure at 60.
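The allocation scheme described above can be sketched with Python's standard library. This is a rough approximation under stated assumptions: the 15 degrees of freedom are a guess chosen to give a mean near 15, and will not reproduce the quoted median and mode exactly; the function names are hypothetical; and the article does not say what happens when the quickness remainder falls outside the 1 to 25 range, so the sketch leaves it unclamped.

```python
import random

def truncated_chi_squared(rng, dof, low, high):
    """Draw from a chi-squared distribution, rejecting values outside [low, high]."""
    while True:
        # A chi-squared variate with `dof` degrees of freedom is Gamma(dof / 2, scale 2).
        value = rng.gammavariate(dof / 2.0, 2.0)
        if low <= value <= high:
            return value

def allocate_bonus_points(rng, total=60.0, dof=15, low=1.0, high=25.0):
    """Return bonus points for the four primary statistics, summing to `total`."""
    drawn = {name: truncated_chi_squared(rng, dof, low, high)
             for name in ("strength", "dexterity", "constitution")}
    # Quickness is not sampled: it absorbs whatever is left of the 60 points.
    drawn["quickness"] = total - sum(drawn.values())
    return drawn

rng = random.Random(42)  # fixed seed so a run can be repeated
bonus = allocate_bonus_points(rng)
```

Each simulated character would then have a final statistic of 90 plus the corresponding bonus.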
The equations used in the simulation were mainly related to the character's primary statistics and skills. Hit points were based on a function of character level and constitution. Damage absorption was directly related to armor factor (a constant) plus the character's variable constitution amount.
The "to hit" roll was calculated as follows. Dodge was directly related to a character's quickness, such that the average chance to dodge was 10.5%. The blocking chance was related to 80% of shield skill, while parry was set to 60% of the base skill level. The dodge, blocking and parry chances were added together and then subtracted from 1.0 in order to determine the percentage chance of each character taking damage.
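The final step of that calculation, summing the avoidance chances and subtracting from 1.0, is shown in the short sketch below. The block and parry values are hypothetical, because the article's skill-to-percentage formulas are not given here; only the 10.5% dodge figure comes from the text. The clamp at zero is an added safeguard, not something the article mentions.

```python
def chance_to_be_hit(dodge, block=0.0, parry=0.0):
    """Chance that an incoming attack lands, given the defender's avoidance chances."""
    avoided = dodge + block + parry
    # Clamp so stacked defenses can never push the result below zero.
    return max(0.0, 1.0 - avoided)

# Hypothetical defenders: both dodge 10.5% of the time (the article's average).
sword_and_shield = chance_to_be_hit(dodge=0.105, block=0.20)
two_hander = chance_to_be_hit(dodge=0.105, parry=0.15)
```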
The last two equations deal with weapon damage and damage variance. A primary statistic modifier was developed that would affect both weapon damage and variance. The modifier was based on the root-mean-square of strength and dexterity. In turn, the actual weapon damage was a function of the primary statistic modifier, the weapon's base damage, and also the character's weapon skill. Weapon variance was related to only base variance and the primary statistic modifier.
When modeling the combat rounds, a uniform distribution between 0 and 1 was used. This provided the information needed to determine if a hit was successful. A uniform distribution was also used to determine the damage dealt each round. This distribution ranged from the average weapon damage minus the average weapon variance to the average weapon damage plus the average weapon variance. IF statements were used to track the hit/no-hit rates in each round and therefore determine whether damage was taken. Finally, the hit points of each character were tracked on a per-round basis.
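For reference, the root-mean-square of two values is the square root of the mean of their squares. The sketch below computes it and then scales it into a modifier. The scaling step is an assumption: the `reference` value of 105 (base 90 plus the mean bonus of 15) and the name `stat_modifier` are illustrative, as the article's exact modifier equation is not reproduced here.

```python
import math

def root_mean_square(strength, dexterity):
    """Root-mean-square of the two statistics."""
    return math.sqrt((strength ** 2 + dexterity ** 2) / 2.0)

def stat_modifier(strength, dexterity, reference=105.0):
    """Scale the RMS so a character with both statistics at `reference` gets 1.0."""
    return root_mean_square(strength, dexterity) / reference

average_character = stat_modifier(105.0, 105.0)   # 1.0 by construction
strong_character = stat_modifier(115.0, 105.0)    # slightly above 1.0
```

Because the squares are averaged, the RMS rewards a lopsided allocation slightly more than a plain average of the two statistics would.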
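The round-by-round logic just described translates into a short loop. This is a simplified sketch, not the article's spreadsheet: the hit points and hit chances are hypothetical, damage absorption is left out, and both swings are assumed to resolve before deaths are checked. The damage and variance figures are the initial values quoted earlier.

```python
import random

def simulate_duel(rng, a, b, max_rounds=1000):
    """Fight one duel; return (winner, rounds) where winner is 'a', 'b', or 'draw'."""
    hp = {"a": a["hit_points"], "b": b["hit_points"]}
    for round_number in range(1, max_rounds + 1):
        for defender, stats in (("b", a), ("a", b)):
            # Uniform 0-1 roll decides whether this attacker's swing lands.
            if rng.random() < stats["hit_chance"]:
                low = stats["damage"] - stats["variance"]
                high = stats["damage"] + stats["variance"]
                hp[defender] -= rng.uniform(low, high)
        # Both swings resolve before deaths are checked, so a double kill is possible.
        a_dead, b_dead = hp["a"] <= 0, hp["b"] <= 0
        if a_dead and b_dead:
            return "draw", round_number
        if a_dead or b_dead:
            return ("b" if a_dead else "a"), round_number
    return "draw", max_rounds

# Hypothetical hit points and hit chances; damage and variance from the article.
one_handed = {"hit_points": 500.0, "hit_chance": 0.70, "damage": 24.3, "variance": 1.5}
two_handed = {"hit_points": 500.0, "hit_chance": 0.70, "damage": 31.3, "variance": 1.5}
winner, rounds = simulate_duel(random.Random(7), one_handed, two_handed)
```

Running this function thousands of times and tallying the winners and round counts gives the raw material for the probability curves.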
As weapon damage and variance were the only factors adjusted when balancing this model, their initial values were determined by multiplying weapon damage by evasion chance and absorption factor, and then dividing into the character's hit points. The average number of combat rounds desired for combat resolution was 30.
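One reading of that starting-value calculation is that the expected fight length equals hit points divided by the expected damage taken per round. The sketch below inverts that relationship to solve for weapon damage. The interpretation, the function names, and the sample numbers are all assumptions: `hit_chance` stands for the chance that a swing lands, and `damage_through` for the fraction of damage left after absorption.

```python
def damage_for_target_rounds(hit_points, hit_chance, damage_through, rounds=30):
    """Average weapon damage that depletes `hit_points` in `rounds` rounds on average."""
    return hit_points / (rounds * hit_chance * damage_through)

def expected_rounds(hit_points, damage, hit_chance, damage_through):
    """Inverse check: hit points divided by the expected damage taken per round."""
    return hit_points / (damage * hit_chance * damage_through)

# Hypothetical defender: 500 hit points, hit 70% of the time, 80% of damage gets through.
starting_damage = damage_for_target_rounds(500.0, 0.70, 0.80)
check = expected_rounds(500.0, starting_damage, 0.70, 0.80)  # back to about 30 rounds
```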


Figure 3: Inconsistent Results Observed with 2,000 Iterations.
Analyzing the Results
Initially, 2,000 iterations of the model were performed, but a sample this small produced unacceptable variance in the simulation results. The variations observed in the low-iteration trials are shown in Figure 3. In order to generate more consistent and natural results with Monte Carlo sampling, 5,000 to 10,000 iterations were required. Consequently, all further simulations were run with 10,000 iterations.
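The effect of iteration count on noise can be estimated with the standard error of a proportion, which shrinks with the square root of the number of trials. The sketch below assumes independent iterations and uses a probability of 0.5, where the noise is largest. The function name is illustrative.

```python
import math

def standard_error(probability, iterations):
    """Standard error of a proportion estimated from independent trials."""
    return math.sqrt(probability * (1.0 - probability) / iterations)

for n in (2_000, 5_000, 10_000):
    print(n, round(standard_error(0.5, n), 4))
```

At 2,000 iterations the standard error near the 50% mark is about 1.1 percentage points, and at 10,000 it drops to 0.5, which is consistent with the smoother curves seen at higher iteration counts.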


Figure 4a: One-Handed Melee vs. Two-Handed Melee Expected Results.


Figure 4b: One-Handed Melee vs. Two-Handed Melee Expected Results (Closeup).
By increasing the number of iterations performed for each simulation, variations in the probability curves disappeared and simulation results returned consistent values. The results of the one-handed vs. two-handed melee trials are shown in Figure 4a and in Figure 4b. Information presented in these two figures represents the percentage chance of either melee winning the combat in any given round. As Figure 4a shows, prior to round 22, there is almost no chance that either character will die. In the 23rd round, the one-handed melee has a 5% chance of death while the two-handed melee's remains at 0%. During round 24, the chance of death for the one-handed melee has increased to 10% and the two-handed melee is observed to have a 5% chance of dying. In the 30th round, the 50% mark is reached. The point where the two lines cross is the balance point. From this point, the expected outcome of any given fight is obtained. If the junction of the two lines occurred at 60%, then the two-handed melee would have a 60% chance of winning any given battle, while the one-handed melee's chance of success would be 40%.
It is important to observe the similarities exhibited between the two curves. In the base case involving the melee combat between a one-handed weapon and a two-handed weapon, the slope of each curve is the negative of the other's. This is not always the case. By adjusting the inputs into the model, it is possible to change the slope. An example of this is shown in Figure 5. In that graph, the junction of the two lines continues to occur at 50% in round 30. However, the slope of each curve is no longer the negative of the other's. The two-handed melee weapon has a 100% chance of killing the one-handed melee weapon by the 40th round. However, the one-handed melee weapon only attains a 75% maximum before death. On the low end of the scale, the one-handed melee weapon is observed to have a chance of killing the two-handed melee weapon earlier. However, these two do not balance out. Calculating the area under each of the two curves between rounds 15 and 40, the areas add up to 9.4 and 8.5 for the one-handed and two-handed melee weapons, respectively. This indicates that the one-handed melee weapon is more likely to win in single combat.
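One way to build a curve of this kind from simulation output is to count, for each round, the fraction of iterations in which a character has died by that round. This sketch assumes the plotted values are cumulative, which the round-by-round percentages suggest but the text does not state outright. The function name and the tiny data set are hypothetical.

```python
def cumulative_death_chance(death_rounds, iterations, last_round):
    """Fraction of all iterations in which the character has died by each round."""
    counts = [0] * (last_round + 1)
    for r in death_rounds:
        if 0 <= r <= last_round:
            counts[r] += 1
    curve, running = [], 0
    for r in range(last_round + 1):
        running += counts[r]
        curve.append(running / iterations)
    return curve  # curve[r] is the chance of having died by round r

# Hypothetical output of 8 iterations: the character died in 4 of them.
curve = cumulative_death_chance([23, 24, 24, 30], iterations=8, last_round=40)
```

Computing one such curve per combatant and plotting them together allows the crossing point to be read off directly.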
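The area comparison can be reproduced on any per-round probability curve with the trapezoid rule. The sketch below is a generic illustration: the function name and the straight-line test curve are made up, and it does not attempt to recreate the 9.4 and 8.5 figures, which depend on the article's own data.

```python
def area_under_curve(curve, first_round, last_round):
    """Trapezoid-rule area under curve[r] for first_round <= r <= last_round."""
    area = 0.0
    for r in range(first_round, last_round):
        area += (curve[r] + curve[r + 1]) / 2.0
    return area

# Made-up curve rising in a straight line from 0% at round 0 to 100% at round 10.
ramp = [r / 10 for r in range(11)]
ramp_area = area_under_curve(ramp, 0, 10)
```

Applied to two combatants' curves over the same span of rounds, the larger area identifies the side that is more likely to win overall, even when the curves cross at 50%.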


Figure 5: Inequalities Resulting from Unequal Slopes.
The ability to modify the slopes, and therefore the overall probability of winning, is an advantage when designing skills and abilities for classes of different types. When comparing a melee character to a mage, archer or other ranged character, variables and equations can be adjusted to balance range versus damage and differing absorption.