Multiplayer
support in console games has traditionally been a secondary feature,
appealing to the hardcore market and generating relatively few extra
sales. This has been changing as online gaming goes mainstream,
especially since consoles began supporting broadband.
Many
developers find themselves required to add multiplayer for the first
time, and are in the position of developing multiplayer systems
from scratch. The straightforward solution is to implement a reliable
packet system and use this to encode and transmit game-level messages.
This results in network messages being hardcoded at a low level
of abstraction, requires more development time over the life of
the project, and may limit your game's feature set. A game-level
network architecture can abstract away many common tasks, saving
time, improving flexibility, and addressing cheating issues that
can't be solved easily elsewhere. While architecture is game specific,
this article will present high level ideas that can be used in whole
or in part when designing your architecture, along with optimization
ideas that can benefit existing systems.
Choosing Network Topology
Any
networking architecture deals with three major issues: security,
flexibility, and performance. The first thing to consider is what
topology to use, as this is probably the single most important issue
to resolve. Traditionally, games take one of two approaches: peer-to-peer
or client-server. In a peer-to-peer system data is stored on the
client and authorized by a consensus system whereas with client-server
the server is trusted and stores most to all game data. Peer-to-peer
avoids putting network load on a single system. This can be advantageous
for data that doesn't need to be routed through a server, such as
directed chat messages. Another situation where peer-to-peer becomes
beneficial is when you don't have a single system that would be
a suitable as a server - such as when all systems are consoles,
and no one system has enough bandwidth to handle all combined traffic.
Peer to peer communications can provide better performance but a
client-server system is more secure and flexible.
An
architecture commonly used is a client-server object replication
system. In this system, specific objects and data members are transmitted
on creation or changes. This can happen explicitly, with a call
to serialize and replicate, or implicitly, where data is polled
every update cycle and compared to the value it had the last update
cycle. This is a good approach. It is straightforward and easy to
understand. It fits into existing single player systems. With implementations
that replicate objects automatically, scripts can modify data without
knowledge of the underlying system and still achieve network synchronization.
However, it is less efficient than a hand-tweaked system, sending
data independent of context and often redundantly. It is also less
secure, creating a strong coupling between game-code and network
activity. As game-code changes are made, especially by programmers
that aren't familiar with network security, vulnerabilities arise.
We're going to expand on this to address these problems efficiently.
The
architecture suggestions described herein are intended to provide
the following qualities:
1.
Cheat-resistant. When applicable, data and some game-code is maintained
server side, optionally supplemented with peer-to-peer communications
to avoid unnecessary server traffic.
2. Flexible and extensible. Intentional abstraction achieves loose
coupling of game code with network code.
3. Low bandwidth utilization. Low level context related optimizations
are exposed to higher level routines so scripters can define algorithms
and rules by which to send data.
4. High performance. Optimizations and features described should
be fast, require low memory utilization, and be optional improvements
to existing proven algorithms.
Security
and cheat-resistance
Security
is important for any game, but especially if you plan to publish
for the mass market. The more popular your game is, the more people
will try to hack it. As soon as one hack is found it will spread
through the ranks of would-be cheaters like wildfire with negative
press likely to follow. This article doesn't cover cheating in its
entirety, which deserves a whole book of its own (Applied Cryptology
by Bruce Schneier is excellent), but we can stop some cheats by
making wise architectural decisions.
Some
of the most common methods used to cheat:
Method
System
Level
Example
Game
Description
Bug/Design
Exploits
Game
Ultima
Online
Attacking
other players with firefields so as to not gain notoriety.
Network
Attacks
Game-level
network system
Counter-Strike
Intentional
latency to cause client authoritative damage.
Modified
binaries
Binary
protection, Game-level network system
America's
Army
Modifying
memory to see through teargas and bushes
Modified
packets
Low
level network system
Quake
Modifying
direction parameter in a packet - aimbot
Falsified
connections
Low
level network system
Numerous
games
DoS
by filling server with client bots
Cheating
can be addressed through logging, robust implementation, validation,
and architecture. Logging involves recording the last n packets,
chat messages, connections and disconnections, invalid packet data,
unusual game behavior, and per-client statistical data. Your low
level networking system should include packet tampering detection,
encryption, and security against falsified connections. Executable
and system level protection involves data encryption, obfuscation,
and validation. Game network-system issues will be covered throughout
this article, beginning with network topologies.
Sensitive
data on the server:
The
traditional game topology is client-server, with most to all sensitive
data kept server-side. This works well in most cases. I'm going
to go one step further, and also suggest that game algorithms insofar
as practical be kept server-side as well. In this situation, the
client doesn't know game related algorithms in advance; it only
knows how to process generic commands.
As
an example, a traditional implementation of capture the flag has
capture the flag implementation code on both the client and the
server. The client is usually trusted to handle mundane events such
as playing a sequence of animations and position the camera when
the 'flag capture' message arrives. In my proposed implementation,
no capture the flag code would exist on the client. In its place
would be an extended command interface used for all game modes,
enabling a high level of control from the server over the client's
interface and in-game actions. All algorithms related to capture
the flag exist only on the server.
The benefit of the second system is twofold. First, any changes
to a game mode or new game modes require only a server-side patch.
Second, potential cheaters have no foundation to work from, at least
for CTF, because there is no CTF related code on the client. Any
hacks would either have to deal with generics, or be discovered
and implemented in real-time. The downside of this approach is that
it requires more bandwidth - however this can be minimized through
techniques discussed later in this article.
Abstract
programming is much more difficult to design and implement than
concrete programming. In practice what you implement will depend
largely on the existing system, advance requirements knowledge,
time, and reuse goals. The intent one should strive for is to keep
game-specific data out of the client. For example, the game code,
on both the server and client, should never assume the type of an
object sent over the network as an ID. Data must be validated upon
reception, by verifying ID types, and bounding numeric and other
data.
Often
you don't want one client to know the IP address of another client.
Allowing players to know one another's IP's can lead to denial of
service attacks and spoofed connections. However, in certain contexts,
such as setting up voice communication or with small games, it can
be reasonably safe for one client to know the IP of another client.
In that case it's worth considering supporting a peer-to-peer system
to transmit non-sensitive data directly, rather than through the
server. Even with client-server architecture, in some cases it would
still be necessary to perform server migration, for example in a
4 player game one player who was acting as the server might disconnect,
forcing a different player to assume the role of server.
Sensitive
data on the client:
When
data is on the client, such as with a peer to peer topology, one
method to detect cheating is a trust metric based on certain assumptions:
1.
Cheaters are likely to cheat right away, if for no other reason
than to check if the cheat works.
2. Cheaters are likely to perform the same cheat many times in a
row. A speed hack will speed up your character every frame, rather
than one frame.
3. A cheater's data is likely to be widely divergent from the data
on the other systems - For example, gain 100 HP instead of gain
1 HP.
4. It's likely that only one or two systems will be doing the same
cheat at the same time.
The
process is straightforward. A new peer/client is assigned an initial
trust rating. This value is continuously increased over time. Every
time there is suspicious data from that client, subtract from the
trust rating. For a client-server based system, you are done. If
x <= 0, or if x drops too quickly, kick that client. With a peer
to peer system, if x <= 0, or if x drops too quickly, then compare
this value to that of other systems. If a majority of systems all
report similar changes to the trust rating, kick the offending system
out of the group. Otherwise, set the trust rating back to some average
of the systems. These values are all contextual, and I have no specific
numbers to recommend. A good rule of thumb is to make unintentional
latency cause decreases in the trust rating that are marginally
smaller than what is actually considered a cheat.
Efficient
Network Messaging
With
cheating behind us, it's time to address messaging. Broadly speaking,
messaging is used for:
Object
replication involves sending a message when an object is created
or destroyed by an authoritative system so that this same object
is created or destroyed on other systems. Data members of that object
may be synchronized as well and inherit the update rules of the
object. Remote Method Invocation involves parsing commands and data
out of messages, ensuring correct encoding, and mapping the command
to a function on the remote machine with the data in the message
passed as arguments to that function. Data synchronization involves
determining which memory address to update given a system independent
identifier, ensuring that this memory address is valid to write
to in the given context, and changing the value. As this article
is about architecture, I won't go into specific implementations
but will instead explore improvements to these operations.
Relevance
Based Messaging
Relevance
based messaging is the practice of sending only relevant state-change
data to relevant systems in order to reduce network traffic. In
this case I am referring to object creation/destruction and updates
to variable values. Update culling can be performed in both these
cases. The most common culling method for objects is distance based,
where objects will be created at range X, and destroyed at range
X+Y from the target system's field of interest. The most common
culling method for variables is to not resend if the data is identical.
Obviously, if an object is culled so are its member variables.
The
reason more advanced culling methods are not usually used is that
they are contextual and we don't know much about the context at
the network system level. For example, it may not be necessary to
send most position updates if another player is on the opposite
side of a impenetrable wall, even if they are nearby. The solution
I propose is simple: the common solutions can remain as default
behaviors, and expose the culling method or method properties to
the higher level in the form of callbacks, overridable functions,
a parameter list, or some other method. Architecturally, this means
that each system which broadcasts updates needs to know the prior
value per-system for each synchronized object or variable. If memory
is a concern, we can limit saving prior values to only variables
that change frequently.
Encoding
Network Commands, Text, And Data
1.
Encoding network commands
Games often send the same command frequently. A common operation
is to use the first byte of a packet to represent a command identifier.
This works but we can shave off a few bits by creating a frequency
table of network commands which indicate likely or previously recorded
session averages. For example:
Per
system:
Movement: Called 84 times
Press trigger: Called 22 times
New player joined: Called 5 times.
We
can use this frequency table as input for arithmetic or Huffman
encoding, which will then give us an optimal bit-wise representation
of each command. Table generation can be performed beforehand so
as to not slow down the game at runtime. Clients and servers have
different output sets and frequencies so this will need to be done
separately for each system.
2.
Encoding text commands
We may also apply this technique to commonly sent strings. On connection,
a server can send a string table to the client of all strings that
are mapped to a unique identifier. More complex implementations
may encode special characters in the string for printf style format
specifiers. From that point on, one bit can be sent, indicating
a string lookup (or not), followed by a string identifier. If desired,
this can also be based off some optimal encoding method as was done
with commands. Alternatively, the first time a string is sent the
server can send the string identifier followed by the string. From
that point on the client is expected to use the string identifier.
This can work unidirectional, or you can ack string identifiers
and use them both ways.
3.
Encoding data
Memory and processing permitting, we may also wish to perform global
compression. The technique is the same. Generate a frequency table
based on previously recorded server output and client output sent
from the parsing system. Adaptive run-time compression is possible
for data sent reliably and ordered.
A
more useful and practical approach than global compression is to
write data to bit streams that perform simple run-time contextual
compression. The approach used by the network library RakNet
is to provide WriteCompressed and ReadCompressed calls that recursively
write a <1> if the high half of the data is all 0's Otherwise
it writes a 0, then the rest of the data. The approach used by the
Torque network engine is to specify the min and max values
of a data element by which the minimum number of bits will be used
to write that data. The first technique requires less maintenance
and is more flexible, while the second technique provides better
compression. If you decide it is worthwhile to use the second technique,
this can be extended by a further parameter: granularity. Sometimes
we don't care whether a value is 503.135224 or 503. By specifying
the bits of granularity, we can perform this reduction automatically.
In
either case, the option to specify compression parameters should
be exposed to the higher level so that scripters or implementers
can take advantage of it. In the case of specifying bits, it would
be a very good idea to use a utility class to specify parameters
and share this class between systems to reduce maintenance and the
chance of mistakes.
Packet
Content
One
way to reduce the number of client network commands as well as reduce
the likelihood of cheating is to have the client send inputs, rather
than the results of inputs. This is straightforward and can make
programming easier in some ways.
Send:
" Movement commands
" Use key command
" Jump key command
" Switch to weapon #5 command.
Don't
send:
" Pick up red team's flag
" Capture flag
" Kill player with ID 381
" Switch to weapon 'Sniper Rifle'
The
difference is that the set of actions has a one-to-many relationship
with the set of input commands. Once you implement the input commands
you gain the actions for free, and actions are a moving target while
commands tend to stay fixed. This can reduce required maintenance
and code-size. From the context of cheating, it limits the amount
of code you have to deal with and in the worst-case scenario the
client won't be able to do anything they couldn't have done within
the normal context of the game. In practice it's not possible to
achieve this level of abstraction and still have good game play
because entropy will cause clients to get out of synch.
We
can address entropy by including relevant data along with client
commands. An example would be including position and orientation
along with a movement or shoot command. This layer should be exposed
to scripters or gameplay programmers because the information to
determine what to send is only available at that level. This adds
some complexity because we are sending client specific data, which
is not trusted, along with commands, which are trusted. This client
originated data will have to be parsed at the destination and compared
against previous and likely values.
The
practice of sending relevant data along with commands is also necessary
for the server, for the same reasons. The difference is that clients
can trust server data, so aside from value interpolation no additional
processing needs to be done. As before, the interface for determining
which variables are relevant should be exposed to scripters and
gameplay programmers.
Network
data summary
Throughout
this article I have referred to applying properties to commands,
variables, objects, systems, and strings. It may be clearer in certain
implementations to aggregate these properties into a single system,
whereby you can define all these properties up-front. This table
summaries the properties and extra data I've covered:
Per
command:
A unique identifier that can be used to find the command in a
look-up table.
A
bitwise encoding representing this command.
A list of variables and objects that need to be synchronized when
this command is used.
Per
variable:
A
unique network transmittable identifier that is two way mappable.
From the variable we should be able to get the identifier, and
from the identifier we should be able to get the variable, given
the context of the enclosing struct or class, if any.
What
properties or method to use to interpolate this variable, if it
is interpolated at all.
If
this variable is written with compression, and if so the number
of bits, min, and max values
What culling function to call or properties to use, if any, to
determine if a variable has changed enough to be worth sending.
What object this variable is a member of, if any. Used to determine
if a send is not necessary because the object itself is culled.
Which
system(s) are allowed to update this variable.
Per
object:
A
unique network transmittable identifier that is two way mappable.
From the object we should be able to get the identifier and from
the identifier we should be able to get the object.
What
culling algorithm to call or properties to use, if any, to determine
if an object is in the area of interest (to send object creation
and updates). Also what algorithm to process, if any, to determine
if the object is in the area of disinterest (to send object destruction).
The list of synchronized member variables.
Which
system(s) are allowed to create and destroy this class.
Per
remote system, on a server or peer:
Which
objects this system has instantiated
The
field of interest for this system (usually the camera position)
The
last values for the variables that were sent to this system.
Per
encoded string:
A unique identifier by which programmers or scripters can refer
to this string.
A bitwise encoding of this identifier that can be transmitted
over the network
Optionally, what fields each string expects to take, if format
specification is used.
Synchronizing
non-critical data
Some
data isn't critical but would be beneficial to have synchronized
among systems. This includes graphical effects such as randomized
particles or playing the same animation key frame at the same time.
While it is possible to synchronize this data with messages, it
is not necessary because they don't affect gameplay. A better and
somewhat well-known approach is to synchronize one or more integers.
The integers can change value over time according to some predetermined,
low bandwidth method. For example, sending the current and next
value, where the current value will change to the next value at
a predetermined time. These integers can be used as key frame indices,
random number generator seeds, and other things.
Low
level network system requirements
It's
important to point out that I didn't cover the low level network
system that this system relies on. At a minimum the low level network
layer should support, in order of most important to least:
Required
features:
Reliable
and unreliable messaging
Ordering and sequencing with multiple channel support
Secure connections with strong packet encryption and tampering
detection
A
wide range of statistical reports
Automatic
message to packet aggregation and splitting
Remote
method invocation
Automatic
flow control
Message
priority level support
Native
internet simulation capabilities (packetloss, spikes, lag, out
of order packets)
Path
MTU discovery with the ability to set this explicitly
Native
bitstream support
Optional
but helpful features:
Multithreading
support
Packet
level compression
IO
completion ports (for Windows based games)
Cross
platform
Conclusion
This
article is intended to provide architectural level ideas that can
be used to make your game-level network system programming easier
and less prone to cheats. It doesn't provide specific implementations,
which are game specific. For further information, I recommend searching
the websites Gamasutra, Gamedev.net, Google, Flipcode. Excellent
mailing lists include the GameProgrammer.com list, and GD-Algorithms-list.
Low level network libraries that have most to all of the capabilities
specified here include my own creation RakNet (free) and the Torque
game engine (low cost).