Scaling the System
Dealing with failure leads to issues of scalability. Scalability refers both to the capacity of the system to handle increased demand and to its ability to handle the failure of individual components. In our example, being able to add and remove a game master server will probably mean that each game master server should also be able to serve out a list of all of the other master servers that can be checked. In other words, users have to be able to find out the list of active servers in some fashion.
Another aspect of scalability is how comprehensive your databases are across the various backend servers. When your backend services include multiple game master servers, you have to determine how you are going to make sure that basically the same list of game servers exists on each of the master servers. This particular process is generally referred to as peering of the game server databases. There are a couple of basic ways to accomplish peering.
In the first method, the burden of making sure all master servers know about a game server is placed on each game server. In this scenario, the game server sends keepalive messages to all known game master servers. The downside is that each game server must duplicate the keepalive message and must somehow track the addresses of all of the master servers.
Probably the better way to do this is to have a peering protocol in the actual game master server. With this type of protocol, the game master servers can inform all of the other master servers of keepalive and termination messages. Upon receiving a keepalive request, for example, the master acts more or less as a conduit and passes the keepalive message (and the actual IP address of the game server) on to the other master servers. Those servers then simply add the underlying game server to their active lists (as if they had themselves received the keepalive message directly) and life goes on. The only caveat is to make sure that the passthrough packets are labeled in such a way as to prevent them from being peered again to the other masters, which would cause a never-ending, ever-expanding message loop.
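To make the loop-prevention caveat concrete, here is a minimal in-memory sketch of master-to-master peering. The class name, the from_peer flag, and the forwarded counter are all invented for illustration and say nothing about the actual Half-Life protocol; a real implementation would carry the flag as a field in the packet.

```python
class MasterServer:
    """Hypothetical game master server that peers messages to other masters."""

    def __init__(self, name):
        self.name = name
        self.peers = []            # other MasterServer objects
        self.active_games = set()  # (ip, port) of live game servers
        self.forwarded = 0         # number of passthrough messages sent

    def on_keepalive(self, game_addr, from_peer=False):
        # Record the game server exactly as if it had contacted us directly.
        self.active_games.add(game_addr)
        # Only messages that came straight from a game server are passed on.
        # Passthrough messages arrive with from_peer=True and stop here,
        # which is what prevents the endless message loop.
        if not from_peer:
            for peer in self.peers:
                self.forwarded += 1
                peer.on_keepalive(game_addr, from_peer=True)

    def on_terminate(self, game_addr, from_peer=False):
        self.active_games.discard(game_addr)
        if not from_peer:
            for peer in self.peers:
                self.forwarded += 1
                peer.on_terminate(game_addr, from_peer=True)


def link_all(masters):
    """Make every master a peer of every other master (full mesh)."""
    for m in masters:
        m.peers = [p for p in masters if p is not m]
```

With three linked masters, a keepalive delivered to one of them results in exactly two passthrough messages, and none of the receivers forwards it any further.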
If, rather than having multiple separate addresses for your master servers, you are going to house multiple game master servers behind a single IP address, then you will probably want to use a load balancing system.
Even though the game master server is a fairly straightforward server to implement, there are a lot of things to consider. With this in mind, we'll turn to a different type of server that we used during the deployment of Half-Life, the authentication server. This server has a different usage characteristic and requirements than the game master server and will help to demonstrate several other considerations for backend services.
The main purposes of the authentication server we deployed in Half-Life were to validate a user's CD key and to check to see if the user's executable was out-of-date (which would then invoke an auto-update mechanism using yet another backend server).
In order to never send a plain text (i.e., unencrypted) CD key over the Internet, we designed the authentication protocols to use public/private key cryptographic techniques for transmission of the back and forth dialog between the authentication server and the end-user software.
The CD keys we used were algorithmically generated so as to be very difficult to guess randomly. Because authenticating takes several seconds, and the odds of guessing a valid CD key are low, there is a large barrier to repetitive key guessing.
In addition to sending the CD key to the server, the client also sends encrypted version information to the authentication server so that the user can be told about updated versions of the software.
One thing we found out is that a lot of users have virus problems on their systems. In particular, the CIH virus turned out to be the main culprit behind version mismatch errors and was apparently infectious enough to affect thousands of our users. It caused our versioning system to tell clients that they needed an upgrade when, in fact, they did not. As a result, we implemented routines in the client to self-CRC check the executable at startup.
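A startup self-check of this kind can be sketched in a few lines. This is only an illustration using Python's zlib.crc32; the function names are hypothetical, and how the expected CRC is recorded and shipped with the game is an assumption the text does not address.

```python
import zlib


def file_crc32(path, chunk_size=65536):
    """Return the CRC-32 of a file, read in chunks to keep memory use small."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Feed the running CRC back in so chunks combine correctly.
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def executable_is_intact(path, expected_crc):
    """True if the file on disk still matches the CRC recorded at build time."""
    return file_crc32(path) == expected_crc
```

If the check fails, the client can warn the user about a damaged or infected executable instead of reporting a misleading version mismatch.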
Similar to the game master server, the authentication server can be quite resource intensive. This is especially true considering it must not only check versioning data, but also validate CD keys and perform all of the necessary cryptographic functions. Therefore, it is important to be able to bring on-line additional authentication servers as needed and to make sure that the end-user software can fail over to the other authentication servers when there are problems reaching a particular server.
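Client-side failover might look something like the following sketch. The server list, the try_server callable, and the shuffle option are all assumptions made for the example; the text does not describe how the Half-Life client actually chooses among servers.

```python
import random


def authenticate_with_failover(servers, try_server, shuffle=True):
    """Try each authentication server in turn until one answers.

    servers    -- list of (host, port) tuples (hypothetical addresses)
    try_server -- callable that performs one authentication attempt and
                  raises OSError when the server cannot be reached
    """
    candidates = list(servers)
    if shuffle:
        # Randomizing the order keeps one server from taking every first attempt.
        random.shuffle(candidates)
    last_error = None
    for address in candidates:
        try:
            return address, try_server(address)
        except OSError as err:
            last_error = err  # unreachable or timed out; move on to the next one
    raise ConnectionError("no authentication server reachable") from last_error
```

Because the list is just data, bringing another authentication server on-line only requires getting its address into the list the client receives.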
Making the Protocol Choice
Based on the need for a multi-part conversation, it made sense for us to consider using TCP/IP as the transport mechanism for authentication. Using TCP/IP, as noted, requires significant OS overhead in setting up and dedicating a socket to a particular conversation. Thus, you should probably consider setting up the backend server using a thread for each listening socket. To prevent a malicious user from occupying all available server sockets, you should disconnect the TCP connection as soon as the data received looks wrong or the socket times out.
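The "disconnect quickly" advice can be sketched as a per-connection handler. The size limit, the timeout value, and the is_valid callable are assumptions made for this example; the real authentication dialog involves several messages and is not shown.

```python
import socket

MAX_REQUEST = 512        # assumed upper bound on one message, in bytes
TIMEOUT_SECONDS = 5.0    # assumed patience per connection


def serve_one_connection(conn, is_valid, timeout=TIMEOUT_SECONDS):
    """Handle a single TCP conversation and always release the socket.

    is_valid is a hypothetical callable that checks one received message.
    Returns "ok", "bad-data", or "timeout".
    """
    conn.settimeout(timeout)
    try:
        data = conn.recv(MAX_REQUEST)
        if not data or not is_valid(data):
            return "bad-data"   # drop troublesome clients right away
        conn.sendall(b"OK")
        return "ok"
    except socket.timeout:
        return "timeout"        # idle clients do not get to hold the socket
    finally:
        conn.close()            # runs on every path, so the socket is never leaked
```

In a threaded server, each accepted connection would be handed to a worker that calls a function like this one, so a stalled or hostile client ties up a socket for no longer than the timeout.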
Server load is the biggest issue for the authentication server. The following is a bit of information about the load we've seen on one of the multiple authentication servers we use for Half-Life.
Averaged over days: 384,745
Averaged over hours: 16,031
Averaged over minutes: 267
Averaged over seconds: 4.45
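The four figures are the same daily average expressed at different time scales, which a quick calculation confirms. Only the daily figure comes from the list above; the variable names are ours.

```python
PER_DAY = 384_745   # daily average reported above

per_hour = PER_DAY / 24
per_minute = per_hour / 60
per_second = per_minute / 60

print(f"per hour:   {per_hour:,.0f}")
print(f"per minute: {per_minute:,.0f}")
print(f"per second: {per_second:.2f}")
```

Keep in mind that these are averages. Capacity has to be planned around the peaks, which can sit well above the mean.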
A typical usage graph shows how the data load (outgoing bandwidth needed) varies and peaks throughout the week:
Figure 1. A typical usage graph demonstrating how data load varies during a week.
Additional Backend Services
In addition to game master servers and authentication / CD key checking services, there are various other backend services that you might choose to provision. For instance, after we released Half-Life, we soon realized that we needed to make the process of finding, downloading, and installing custom games or MODs (game MODifications) easier than it had been in the past. We chose to solve this problem by creating a new master server to handle serving out information about existing mods. The master just provides our clients with a list of MODs, a bit of information about each one, and the ftp site from which the MOD could be downloaded. The clients would then handle downloading and installing the MOD to the right spot.
The engineers at WON.net have developed a robust set of backend services that you might consider using for your games if you don't have the bandwidth, courage, or expertise to develop and deploy your own systems. Please feel free to e-mail me for further contact information for WON.
On-line Chat Service: Another interesting backend service that you might choose to deploy for your game platform is chat. One method of delivering 'chat' to your users is simply to code IRC client support into your game. While this is certainly functional, you should be aware that IRC servers are subject to a whole host of interesting attacks and user behaviors that might not be desirable.
If, instead, you determine that you will be creating a custom chat service, then there are a couple of ways you can handle design and implementation. The main issue will be whether to use a client / server or a peer-to-peer model. The other consideration is how many connections you want to support and whether you want to maintain any control over the creation and participation in chat rooms.
Using a client / server model can be a bit simpler, but does require that you (or one of your users) set up a host server. Will the host server handle multiple chat rooms, or will it service just one? If it services multiple chat rooms, then you need to consider the load this could create on the server. On the other hand, if each server will only handle one chat room, then the main issue is making sure that users find out the address of the server so they can initiate a connection. You could create a chat master server similar to the game master server to accomplish this.
For chat, the underlying protocols are pretty easy. First, each client initiates a connection to the server. If you use UDP, you need to build in a way to make sure, in a reliable way, that the connection succeeds. If you use TCP, this concern is obviated. However, using TCP could limit the number of simultaneous users you can handle in your chat rooms. The server then notifies all of the listeners of the new user joining (again handling all reliability issues). Finally, when users talk, the server simply echoes the text to all other users.
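The join-then-echo flow described above fits in a small class. This sketch leaves the transport abstract: each user is represented by a hypothetical deliver callable, so the reliability handling that UDP would require is outside its scope, and the message formats are invented for illustration.

```python
class ChatRoom:
    """Hypothetical single-room chat server core; transport is left abstract."""

    def __init__(self):
        self.users = {}  # name -> callable that delivers one line of text

    def join(self, name, deliver):
        # Tell the current listeners about the newcomer, then add the newcomer.
        for send in self.users.values():
            send(f"* {name} has joined")
        self.users[name] = deliver

    def leave(self, name):
        # Announce the departure only if the user was actually in the room.
        if self.users.pop(name, None) is not None:
            for send in self.users.values():
                send(f"* {name} has left")

    def say(self, name, text):
        # Echo the text to everyone except the speaker.
        for other, send in self.users.items():
            if other != name:
                send(f"<{name}> {text}")
```

Note that every line spoken costs the server one outgoing message per listener, which is why the number of simultaneous users matters so much for the load on a chat server.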
Using a peer-to-peer approach is a bit more complicated since each peer must be able to keep up-to-date on all of the other peers. You can accomplish this by having one of the peers act like the "server" and handle join / leave and text message retransmits to everyone else. Of course, this also means that you have to handle that peer dropping out of the chat (do you kill the chat or appoint a new "server" on the fly?). Otherwise, each client must be able to handle join / leave messages and to retransmit text to all other users. You still have the issue of how other people find out the addresses of participants so they can join the chat. In addition, synchronization of the peers becomes an issue.
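If you choose to appoint a new "server" on the fly, every peer has to arrive at the same choice without further negotiation. One simple possibility, sketched here, is a deterministic rule applied to the shared membership list. The smallest-address rule and the function names are assumptions made for the example, not something the text prescribes.

```python
def elect_host(peers):
    """Pick the acting "server" from the known peers.

    peers is a set of (ip, port) tuples. Choosing the smallest address is an
    arbitrary rule used for illustration; it works only because every peer
    applies the same rule to the same membership list.
    """
    return min(peers) if peers else None


def on_peer_dropped(peers, host, dropped):
    """Remove a peer from the set and return the (possibly new) host."""
    peers.discard(dropped)
    if dropped == host:
        return elect_host(peers)  # appoint a new "server" on the fly
    return host
```

The rule is only as good as the membership lists it runs on, which is where the synchronization issue mentioned above comes in: peers that disagree about who is present will disagree about who the host is.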
Auto-Update: Another type of service you might wish to provision is an auto-update service. For us, this service was a natural extension of, and justification for, our authentication service. We believe that fragmentation of our user base caused by "voluntary" upgrades is generally a really bad idea. Therefore, we implemented the authentication system as a way to ensure that all of our on-line players are always up-to-date and compatible. When authentication fails because the version data appears out of date, we invoke a separate auto-update executable. This executable is nothing more than a fancied-up FTP client that knows where to search for updates and how to download, decompress, and run the installers for them.
PowerPlay: Most action game experiences on the Internet can be characterized as real-time, latency-sensitive applications. The current state of the Internet infrastructure is not tuned well for this kind of gaming. To make the Internet the future of entertainment, improving the infrastructure will be critical. We are currently getting started on an industry initiative to create an open standard to address various infrastructure issues on the Internet. This initiative is called PowerPlay. For more up-to-date information about PowerPlay, please check http://www.powerplayinfo.com/.
Provisioning backend services for your game platform is critical to the success and longevity of your game. There are a variety of such services that you might provision, but they generally fall within just a few classifications. In general, backend services are there to make your users' lives easier and, therefore, designing them with typical usage patterns in mind is important. Understanding how your backend services can be attacked or overloaded is also important. For almost all backend services, you will have to take into consideration a similar set of design decisions, and you will need to handle scalability and failure cases elegantly in order to keep your user base happy.
You can contact Yahn Bernier at [email protected].