
On The Practical Application of Multiplayer Matchmaking
by Nick Halme on 12/21/09 03:16:00 pm
The following blog was, unless otherwise noted, independently written by a member of Gamasutra's game development community. The thoughts and opinions expressed here are not necessarily those of Gamasutra or its parent company.


Elo is a rating system built around competitive professionals.  We can safely assume that while videogames have a smattering of intelligent, highly rational and competitively skilled players, playerbases are largely composed of amateurs either aspiring to competitive play or simply trying to exist in an amateur bracket.

What Trueskill, Microsoft's matchmaking system for Xbox 360 and Windows PC, actually does varies from game to game.  It is often combined with a game-specific matchmaking system, and once integrated into a videogame it may have to deal with variables that are not present in Elo (which is not hooked up to anything, but rather runs separately as a sort of tracking database in the real world).  But it operates in more or less the same way.

This presents some problems, some of which I've seen here at Relic.  I want to make it clear that this is not a complaint or a rant against Trueskill, but an attempt to point out what it's doing to players.  My understanding of the system is a surface understanding, one of resulting effects more than a technical one -- please feel free to correct and add to this conversation as you see fit (my flame retardant suit is always zipped up tight, so don't be afraid to poke holes).

1) It accounts for player skill based on matchups -- win against an opponent of the same Trueskill and you will gain a little skill, and probably remain in your bracket.  Lose against someone worse than yourself and you will lose a little skill and probably remain in your bracket.  However, if you win or lose against someone above or below your bracket, your Trueskill will change at a much higher rate.

This assumes a competitive consistency in play that many players do not practice.  That assumption is at home in Major League Baseball or chess, but not within a pool of thousands of players, ranging from teenagers to adults, who practice this game in their spare time.
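To make the point in 1) concrete, here is a minimal sketch of the classic Elo update. It is not Trueskill itself, which is more involved, and the K value of 32 and the sample ratings are assumptions chosen purely for illustration. In plain Elo, at least, the size of the change depends on how surprising the result was: an upset moves both ratings much further than an expected win does.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability-like score Elo expects player A to earn against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k is an assumed constant; real ladders tune it.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses: classic Elo is zero sum.
    return rating_a + delta, rating_b - delta


even_match = elo_update(1500, 1500, 1.0)     # winner beats an equal opponent
upset = elo_update(1500, 1700, 1.0)          # underdog beats a stronger opponent
favourite_wins = elo_update(1700, 1500, 1.0) # stronger player wins as expected

for label, (new_a, new_b) in [("even", even_match), ("upset", upset),
                              ("favourite wins", favourite_wins)]:
    print(f"{label}: winner -> {new_a:.1f}, loser -> {new_b:.1f}")
```

Under these assumptions the even match moves each player by 16 points, the upset moves them by noticeably more, and the favourite winning the same pairing moves them by noticeably less.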

2) As the only way to check for skill is to look at win/loss ratio and the only way to improve is to match against a higher skill bracket, what you get is a matchmaking system that at times can simulate the absence of a matchmaking system.

What this means is that a player of base skill with zero games played will be matched against more advanced players to check their skill.  The system wants to make sure it can judge the player's skill -- but the player assumes he will be matched against another base level opponent.  Instead he accrues loss after loss until Trueskill has judged: this is a poor player.

There is a sweet spot where Trueskill shines.  Currently in Warhammer 40,000: Dawn of War II, I reside in the 25 Trueskill bracket, which is of mid-range skill (I believe it is exactly "average" if the max Trueskill value is 50).  For almost a year I have only fluctuated by a couple of points up and down, i.e. down to 24, then up to 27, then down to 25.  This has led to games against players of equal skill (enjoyable games), players of considerable skill (tense games) and still sometimes players of almost no skill (games in which I feel like a jerk).

What is the solution then?  First we have to ask what most players want, and I think that is usually "I want to play a fair game against someone my 'own size'".  This means defocusing the competitive matching and making things more user controlled.

Perhaps a player wants to play against players of a low bracket of skill; let him select the low skill bracket.  If he feels he has improved, let him select medium and dream of selecting high.  This does open up the potential for griefing (high skill players joining low skill matchmaking to cream unskilled players for fun), but that's a separate issue.

Of course, this is exactly like choosing difficulties in a single player game.  As I write this it occurs to me that I heard a coworker discussing such a thing this morning, and it has most definitely seeped into my thoughts -- it's a good idea, and a step towards more standardization.

What would such a thing mean for actual competitive play?  Nothing at all.  When I played Soldier of Fortune 2 and Call of Duty competitively, our clan used community ladder sites and their rules, such as OGL and Team Warfare League (more pedestrian variants of CAL).  

We were ranked accordingly and able to judge skill accurately by ladder position, able to work our way up to the number one spot and jostle between first and second place, fighting off challenging teams that were not good enough and losing our spot to superior teams.

There is no math behind this; rather, social organization allows players to judge skill heuristically by playing against one another.

 
 
Comments

Rik Newman
This is all excellent stuff, and great to hear someone else has had similar thoughts to me. I've been on about this for years, and wrote up a mini-series of articles about it too, from a specific slant, but came to many of the same conclusions you did.

http://agoners.wordpress.com/2009/11/11/balancing-match/

:)

Jonathon Walsh
I'm really against the Trueskill rating system; it just seems so flawed.



1. It seems unintuitive for people who don't know how it works. In Trueskill it's possible to win a match and lose rating, or lose a match and still gain rating. This is because of how your uncertainty is affected. I had both happen to me in DoW2.



2. It becomes stubborn. After your uncertainty drops by a significant degree, it takes much more work to adjust your ranking. In Elo, winning 10 games in a row after playing 100 games has the same effect as winning 10 games in a row as your first 10 games. In Trueskill, when your uncertainty value is very low you don't move much per win.



Trueskill is good at first placing people but really bad or slow at rewarding people for improving their skill. This seems to me to be especially problematic for players who are new to the game but veterans of the genre. These players may perform poorly at first as they learn the quirks of the specific game, but once they are comfortable in the game they will excel. This is a problem when the ranking system becomes less willing to move the players around after their first run of games.
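The "stubbornness" described above can be sketched with a simplified version of the published two-player Trueskill update. This is only an illustration under stated assumptions: no draws, no teams, no dynamics factor that re-inflates uncertainty between games, and the commonly cited defaults of mu = 25, sigma = 25/3 and beta = 25/6. It is not what any particular game ships, and the function name trueskill_1v1 is made up for this example.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()   # standard normal, used for pdf and cdf
MU0 = 25.0                  # assumed default mean skill
SIGMA0 = MU0 / 3            # assumed default uncertainty
BETA = MU0 / 6              # assumed per-game performance noise


def trueskill_1v1(winner, loser, beta=BETA):
    """Simplified two-player update; each player is a (mu, sigma) pair."""
    (mu_w, sig_w), (mu_l, sig_l) = winner, loser
    c = sqrt(2 * beta ** 2 + sig_w ** 2 + sig_l ** 2)
    t = (mu_w - mu_l) / c
    v = STD_NORMAL.pdf(t) / STD_NORMAL.cdf(t)   # how surprising the win was
    w = v * (v + t)                             # how much uncertainty to remove
    # The step in mu is scaled by the player's own variance:
    # a small sigma means a small step, whatever the result.
    new_winner = (mu_w + sig_w ** 2 / c * v,
                  sig_w * sqrt(1 - sig_w ** 2 / c ** 2 * w))
    new_loser = (mu_l - sig_l ** 2 / c * v,
                 sig_l * sqrt(1 - sig_l ** 2 / c ** 2 * w))
    return new_winner, new_loser


opponent = (MU0, SIGMA0)
fresh = (MU0, SIGMA0)       # brand-new player, large uncertainty
settled = (MU0, 1.0)        # same mean skill, but the system is "sure" of it

fresh_after, _ = trueskill_1v1(fresh, opponent)
settled_after, _ = trueskill_1v1(settled, opponent)

gain_fresh = fresh_after[0] - fresh[0]
gain_settled = settled_after[0] - settled[0]
print(f"fresh player gains {gain_fresh:.2f}, settled player gains {gain_settled:.2f}")
```

With these assumed numbers the same win is worth several points to the fresh player and only a small fraction of a point to the settled one, which is the behaviour being complained about: once the uncertainty has collapsed, a run of wins barely registers.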



Also, the 'select your skill level' system was used for Age of Mythology with mixed success. The problem was that you'd have a 1600 Elo player boosting themselves to 1800, then playing against another 1800 player. But since Elo is zero sum and the player's actual rating is 1600, when the self-rated player won they got a ton of points and the 1800 player lost a bunch. As the 1800 player this was really frustrating, as it could really damage your rating.

Ron OKeefe
Elo can be and will be just as useful for amateur gaming as for more professional environments. The beauty of the Elo system is that variables inside the system can be manipulated easily to adjust for different game environments. The K values used for chess may not be the best for a multiplayer FPS game. Over time, and given enough data, the rating mechanism can be adjusted slightly to better fit the real curve to the desired theoretical curve. In addition, the K values can be scaled to allow not only for stepped adjustment based on a player's current rating but also for margin of victory.
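As a rough sketch of what that tuning could look like, the snippet below steps K down as a player's rating rises and scales it up with margin of victory. Every threshold and multiplier here is a hypothetical placeholder, not a value from chess or from any shipped game; in practice they would be fitted to real match data.

```python
def k_factor(rating: float, base_k: float = 32.0) -> float:
    """Hypothetical stepped schedule: established players move more slowly."""
    if rating < 2100:
        return base_k
    if rating < 2400:
        return base_k * 0.75
    return base_k * 0.5


def margin_multiplier(margin: float, max_margin: float) -> float:
    """Hypothetical scale from 1.0 (narrowest result) up to 1.5 (a blowout)."""
    return 1.0 + 0.5 * (margin / max_margin)


def rating_change(rating: float, opponent: float, score: float,
                  margin: float = 0.0, max_margin: float = 1.0) -> float:
    """Elo-style change for one player; score is 1.0 win, 0.5 draw, 0.0 loss."""
    expected = 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))
    return k_factor(rating) * margin_multiplier(margin, max_margin) * (score - expected)


narrow_win = rating_change(1500, 1500, 1.0)                               # 16.0
blowout_win = rating_change(1500, 1500, 1.0, margin=10, max_margin=10)    # 24.0
veteran_win = rating_change(2500, 2500, 1.0)                              # 8.0
print(narrow_win, blowout_win, veteran_win)
```

One design consequence worth noting: once the two players in a match can have different K values, the points one gains no longer equal the points the other loses, so the system stops being strictly zero sum.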

