It's free to join Gamasutra!|Have a question? Want to know who runs this site? Here you go.|Targeting the game development market with your product or service? Get info on advertising here.||For altering your contact information or changing email subscription preferences.
Registered members can log in here.Back to the home page.    

Search articles, jobs, buyers guide, and more.

by Pete Hallenberg
Gamasutra
[Author's Bio]
September 16, 2002

Do You Need SQL?

Database Design

How Big is Big?

Printer Friendly Version

[Back To] Online Games Resource Guide

Sponsored by:

This feature originally appeared in the Proceedings of the 2002 Game Develoers Conference.

 

 


Resource Guide

Of Internet Servers and SQL Databases:
Designing the Backend for Power and Performance

As online games proliferate, a growing number of Internet servers are being created and deployed by game developers. These servers must deal with massive quantities of data generated by the games that connect to them. Off-the-shelf SQL database software seems like the ideal way to handle all this data, but which RDBMS system is right for your project, how will it impact server performance, and how should the data be structured and used? This session will answer these questions and many others. It is intended to give programmers a clear and practical understanding of the things databases do well and the things they do horribly. Using a database without this knowledge is a virtual guarantee of lousy server performance. Topics include server architecture, ODBC vs. proprietary connection APIs, keeping the query optimizer from killing you, and everything you never wanted to know about designing tables, indexes, and stored procedures to make your servers fly.

Background: the ITNet Network

The paper that follows grew out of my experiences of the last three years during which I served as architect, lead programmer, and project manager for Incredible Technologies Internet gaming network ITNet. ITNet is a gaming network unlike any other: it provides Internet gaming services to the company's flagship arcade product – the blockbuster coin-op golf game Golden Tee Fore! Instead of interconnecting individual players, ITNet links public gaming devices that are themselves accessed by hundreds of individual players. This along with the phenomenal success of Golden Tee Fore means that the ITNet system handles a truly staggering amount of data: over 20,000 online machines worldwide dump 150,000 individual transactions per day into the system. ITNet provides service around the clock to games in 7 foreign countries and the United States, as well as tracking individual game play statistics for over 700,000 registered players worldwide. In addition to player statistics, ITNet manages the exchange of complete financial and game play statistics, multiple national and international tournament leader boards, user-defined option settings, collection and diagnostic data, recorded “shadow games” for multi-machine play, operator registration information, card purchase data, detailed course statistics, code patches, advertising graphics and promotional contests, as well as allowing for the sending, receiving, and deleting of any file on the remote machine's hard drive. In addition, plans are already underway to add two more arcade products to the ITNet network. Since I will be constantly referring to ITNet throughout the remainder of this paper, this description is intended to give you some idea of the capabilities of the system I will be discussing.

Do You Need a SQL Database?

If you are in the process of designing or building an Internet game server of any size, you have probably wondered whether you need a database. Considering the amount of data that a modern Internet server must deal with, integrating an enterprise-grade SQL database into your server backend becomes a more and more attractive option all the time. Before you go rushing off to buy that shiny copy of Oracle 9i, though, let's pause a moment and clarify when a SQL database is really necessary.

There are basically three options you can take with respect to databases. You can skip the database entirely and create a “stateless” server, you can roll your own database solution, or you can buy database software off the shelf and integrate it with your server. For the most part, I recommend you go with the first or third options. If your project is particularly small and simple, you may be able to get away without any database at all. This type of architecture can work well for a simple matchmaking and chat server that won't be tracking information about individual players. If you can get away with building a stateless server, I highly recommend you give it a try. The Internet server I created for Incredible Technologies Golden Tee PC Golf game used this architecture, and it's one of the most trouble-free server systems I've ever seen. It quite literally requires attention once or twice a year.

If, on the other hand, your game is going to require some amount of state data, then I recommend you go with an off the shelf SQL database rather than a home-brewed solution. Although it might seem like the home-brewed solution could be made cheaper and faster, you will rapidly discover that this approach is more trouble than it's worth. The minute you need to do anything halfway interesting with your data – even something as simple as accessing it from an external viewing or reporting tool, you'll wish you'd gone with SQL. In addition to remarkably fast performance, an enterprise-grade SQL database will store your data in a format that can be easily accessed and used by a wealth of external tools and systems – from accounting packages to web servers. Even if you never do anything other than issue queries interactively, you'll rapidly grow to love the flexibility and power of SQL. In addition, you'll be able to hire experienced help from a huge pool of potential applicants thanks to the ubiquitous use of the SQL standard in business.

Physical Infrastructure

Once you've decided you need a SQL database, setting the system up is pretty straightforward. Your SQL database will run on its own box and will be accessed via a fast LAN link by your Internet servers, web server, development tools, etc. Since you are going to want extremely fast network speeds between your database and critical Internet server components, you should carefully choose the physical networking infrastructure that will connect these machines. A dedicated, high-speed Ethernet backplane is a good choice here (we use 100 Mbps Ethernet for ITNet).

When purchasing computer hardware, I recommend that you buy the biggest and buffest box you can afford for your database server. We discovered after running ITNet for 2 years that the database is almost always the speed bottleneck when your Internet servers slow down. Our main database server is currently running on a duel Pentium IV machine with 1 Gig. of RAM, and 5 of the largest and fastest SCSI hard drives (connected and mirrored via a RAID array) that we could find.

Internet Server Design

Your servers will be the conduit between your game clients connecting over the Internet and your database, so your first design questions are: how should your clients connect to the servers and how should the servers connect to the database?

Connecting to The Database
There are two basic ways of connecting to a database. You can either use the Open Database Connectivity (ODBC) API, or connect using your database vendor's proprietary access API. ODBC is generally considered to be slightly slower, but has the advantage of being a universal standard on all major databases. The major advantage of a native API over ODBC is it's purported speed improvement, and a richer set of database access calls.

We used ODBC for ITNet and our query performance has exceeded all our expectations. I strongly recommend you use it. In practice, you'll find that you have other much more terrifying performance problems to worry about than latency introduced by ODBC.

Connecting to The Games
The first and most important choice you will face when architecting your servers is whether you should use TCP/IP or UDP as the underlying communications protocol.

If you've attended any server design lectures in the past, you may have heard it said that you should always use UDP for games. This is true as long as fastest possible delivery of data is the overriding design consideration for your project. In games where players are competing head-to-head over the Internet in real time, this is clearly the case. In turn-based games or other types of server-to-game interactions (like sending patch files), however, things are not so clear cut. I'm not suggesting you make your next multiplayer quake-killer using TCP/IP, but we need to be careful not to throw the baby out with the bath water here. For applications where it's appropriate, TCP/IP can cut months off a development schedule by providing a foundation protocol with world-class reliable delivery, sliding window packet delivery, flow control, retransmission characteristics, and countless esoteric refinements and optimizations. After all, the last thing you want to be doing while staring down the muzzle of a milestone is trying to work out your own solution for send-side silly window avoidance.

For all the reasons noted above, I decided to use TCP/IP as the underlying protocol when architecting my Internet server for Incredible Technologies Golden Tee PC Golf game. PC Golf was a turn based game, so it was a perfect fit for TCP/IP. This worked great except for one problem: with TCP you're much farther from the metal than you are with UDP, so you don't have as much information about what's going on when the network starts acting up. With PC Golf, my strategy had been to open TCP connections between the host and server and keep these connections open throughout a multiplayer match. What I discovered, though, was that it's really difficult to recover a lost TCP connection and figure out how to pick up where the last connection left off.

When it came time to design ITNet we had another great fit for TCP/IP on our hands, but I was wary of the problems I'd seen with TCP while creating the PC Golf multiplayer code. Again I faced a difficult choice: should we use UDP and reinvent all those expensive wheels, or should we somehow try to work around the problems of TCP? The solution we came up with turned out to one of the best architectural decisions we made while creating ITNet. After several grueling design meetings, we decided to use TCP/IP with a twist: instead of holding a TCP connection open during an entire communication session, we would adopt the same brief connection model used by web servers. Our design specified that ITNet clients would be required to open a new TCP/IP connection for every message they needed to send, and that each connection would stay open only until the server had sent one or more messages in reply.

The design worked great. The brief connection strategy neatly solved the connection recovery problem with TCP/IP: if a connection went south, it was very easy for the client to recover by simply throwing away the connection in progress, and then opening a new connection and resending the last message. All those thorny issues of determining how much data had already been sent and building servers that could resume processing at any point in the message handling pipeline simply went away.

The only thing that bothered us about brief connection was the fear that all those connects and disconnects would kill our data transfer efficiency. Happily, this was a non-issue. As it turns out, TCP/IP connection setup and teardown are actually pretty speedy affairs: opening a connection requires only 3 IP message sends, and closing requires only 4. The fast connection setup and teardown allowed us to make 50 to 60 brief connections per call and still blow away all our targets for call speed.

Brief connection was one of the very best choices we made when designing the ITNet servers. All told, it probably cut 3 or 4 months off our development schedule, and has continued to work exceptionally well even after 2 years of nonstop growth. Based on my experience, TCP/IP with brief connection has earned a permanent place in my network programmers bag of tricks. If it fits your project's parameters, I strongly urge you to give it a try.

The Standard Server Architecture
Now that you know how to establish basic connections to the database and your games, the next step is to settle on an overall architecture for the server itself.

Important Safety Tip: At Incredible Technologies, we write all our servers in C and C++ to run under Windows NT. Since servers don't need much of a user interface, I recommend avoiding MFC so that you don't add unnecessary overhead. Also, any serious server application will need to be written as a Windows NT service. A service is a successor to the venerable DOS TSR. One of the most useful properties of a service is that it can be set to run every time your NT server boots up - even when no user is logged in. This is very important for robustness: if anything should cause your server PC to reboot, you can sleep well knowing your server software will be restarted right along with it.

When I originally sat down to design the ITNet servers, I had planned to use a variant of the classical multithreaded server architecture. In this architecture, a dedicated thread waits for incoming connections and spawns a new thread for each client that connects to the server. Each spawned thread is responsible for handling all communication with a single client and is shut down when that client disconnects.

The variant on this design uses a small pool of pre-allocated “worker” threads that can handle any request/reply message transaction with any client. When a new message arrives from a game, it's handed off to an idle worker thread for processing. This architecture improves on the classical design by reducing the amount of context switching on a busy server, as well as eliminating the need to constantly create and destroy threads. Another nice feature of this architecture is that Windows NT has native support for it through an object called an I/O Completion Port. A full discussion of I/O Completion Ports is beyond the scope of this paper, but you can find a good overview of the topic in the MSDN document “Writing Windows NT Server Applications in MFC Using I/O Completion Ports” by Ruediger Asche (it should be included in your Visual Studio 6.0 help).

This multithreaded pooling architecture was the design I had used for the PC Golf server, and although it was very fast, it did have some problems. In particular, it took a ton of work to resolve all the thread synchronization bugs that cropped up. Also, having all the server's code in a single process made the server extremely fragile: any bug in any part of the code could bring the entire server crashing down. This made evolving the existing code or adding new features exceptionally hazardous… sort of like performing open heart surgery on a patient who's awake and walking around.

A New Hope
It was while mulling over these problems that we hit upon an alternative scheme. The idea was to unbundle all the pieces of a standard multithreaded server into a family of single-threaded servers that could each handle a single, specialized task. Figuring out how to chop up one big server into a lot of little servers was actually pretty easy. Most of the servers would be dedicated to processing specific messages or groups of messages. For example, we could have one server that handled incoming game data, and another that maintained and distributed tournament leader boards. The glue binding all these servers together would be a “meta” server, which would be responsible for telling clients where to send individual messages to get them processed. Clients would know only the IP address of the meta server, but would begin each communication session by requesting a “server resolution map” to get the addresses of the other servers. A server resolution map is just a look-up table that matches every client message type with an IP address and port number of the server that processes that message.

In the end, we decided to use this second “server family” architecture for ITNet. In addition to solving the multithreading and fault-tolerance problems, the server family approach had a number of very cool properties:

  • A single server could only ever be as powerful as the PC it was running on, but a server family could be spread across multiple machines for greater performance.
  • The meta server could dynamically create server resolution maps from data in the database. This meant that new servers could be added dynamically to a family, and also that we could override server assignments on a client-by-client basis (useful for sending specific messages from just one client to a debug server, for example).
  • By designing the meta to randomly route messages between two or more identical servers in the same family we could build in automatic load-balancing for free.

This was our reasoning, at any rate, when we embarked on the creation of ITNet. After running the ITNet server family under heavy load for two years, though, my opinion of the server family architecture is mixed. It did many of the things that we had hoped – and even surprised us with some unexpected benefits – but it also had its share of problems. The following sections explain what worked and what didn't work.

The Good
One of the biggest wins with the server family approach was being able to selectively move parts of the “virtual server” around to different computers on the network. This made debugging in the production environment incredibly easy: we could run a problematic server right from our development machines under the debugger when hunting bugs. The beauty of this approach was that the bulk of the “virtual server” continued to function exactly as it always did, which was a huge help in tracking down code that misbehaved only when the servers were running under load.

Being able to add new servers at will was also hugely beneficial. This capability came in especially handy when adding new products to ITNet. The architecture allowed us to easily add a new product-specific server to the server family which contained all the code under development for the new product. Even when this code was incredibly fragile and buggy, it never effected the stable parts of ITNet: since it was running as a separate process under a protected memory OS, it could crash all day long without impacting the other servers.

Another big boon was the meta server. Having a dedicated “traffic cop” managing the other servers let us add a lot of advanced behavior very easily. It also gave us a natural place to put code that handled global connection properties. The meta server eventually evolved into a session and call management server as well as a place to get server resolution maps.

There was one last benefit of the server family approach that really took me by surprise. Having many small servers broken down by project and function gave us an unexpectedly powerful tool for managing the production process. It turned out that making individual programmers responsible for specific servers neatly solved many of the problems that come with shared programming projects. In particular, when an ITNet server crashed, it was immediately obvious who's code had failed and who should be looking into it. Up until this time, I hadn't realized how much energy could be spent wrangling over who should do the work of finding certain tricky bugs. This ambiguity is unavoidable in a typical programming project, but it completely disappeared with ITNet. The best part about this was the way it protected careful programmers from mistakes made by other members of the team. Programmers who caught most of their bugs during coding just naturally ended up doing less nasty debugging work. The efficiencies this added to the server creation process are hard to overstate. Indeed, this is such a shockingly valuable feature that it almost justifies using the server family architecture all by itself!

The Bad
Although the idea of having a server that could “gracefully degrade” sounded good on paper, in practice it didn't buy us much. It didn't take long for us to realize that the problems caused by allowing partial calls to complete more than outweighed the benefit of having a system that could function with one or more failed servers. In the end, it was much less trouble to just detect and deal with downed servers than to try to build a system that could run robustly with “partially successful” communication sessions.

Managing the dozen or so servers that eventually formed part of the ITNet server family required more work than expected. Even checking the status of the system meant looking at and digesting 12 separate status values instead of just one. While this problem wasn't a deal-breaker, it was nonetheless a pain in the neck.

Data sharing between the servers turned out to be a bigger problem than originally anticipated. In particular, we found that the database was a very unsuitable channel for transferring shared state between different servers.

By far the biggest problem, though, was the single-threaded design of the individual servers. Without multiple threads to spread out the load, individual servers could easily build up a backlog of unprocessed messages when sudden spikes in server activity occurred. The only solution to this was essentially to hand-tune the servers: we ended up shifting message processing responsibilities around among servers until each one was carrying a roughly equal share of the processing load. This enabled the server family to pipeline data efficiently, but it involved a lot of tedious hand tweaking. Automatic load balancing might have helped alleviate the problem, which brings us to our next section…

The Ugly
Somehow or other, the automatic load balancing feature never quite got finished. Although it worked early on, something broke about six months after rollout and I never got around to fixing it. By that time, we were in full-blown crisis mode, and only the most urgent projects were being worked on. Since we never actually started using the load balancing feature, it went straight to the bottom of the project list and has stayed there ever since.

________________________________________________________

Database Design


join | contact us | advertise | write | my profile
news | features | companies | jobs | resumes | education | product guide | projects | store



Copyright © 2002 CMP Media LLC. All rights reserved.
privacy policy | terms of service