GDC – Wednesday – ArenaNet scaling

Create great products through iteration. Continuous delivery to an online audience. Get other people’s stuff into the game.

Anyone can push a build live.

Any version to any version patching. All files in all versions have diffs to all other versions’ files.
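A rough sketch of what any-to-any patching implies: one diff per ordered pair of versions for each file, so a client on any version can jump straight to any other. The names `make_diff` and `build_patch_index` are hypothetical, and the text diff from `difflib` is only a stand-in; the talk notes do not say what diff format or storage ArenaNet actually uses.

```python
import difflib
from itertools import permutations

def make_diff(old: str, new: str) -> list[str]:
    """Stand-in diff: a unified text diff (a real patcher would likely use binary diffs)."""
    return list(difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm=""))

def build_patch_index(versions: dict[int, str]) -> dict[tuple[int, int], list[str]]:
    """Map every ordered (from_version, to_version) pair to a diff for one file."""
    return {
        (a, b): make_diff(versions[a], versions[b])
        for a, b in permutations(versions, 2)
    }

# Hypothetical contents of one file across three versions.
file_versions = {1: "hp=100", 2: "hp=120", 3: "hp=120\nspeed=5"}
index = build_patch_index(file_versions)

# A client on version 1 can go directly to version 3 without passing through 2.
patch_1_to_3 = index[(1, 3)]
```

With N versions this stores N × (N − 1) diffs per file, so the storage cost grows quadratically with the number of versions kept patchable.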

Moved from one single team working on the whole game to 20 feature teams each working on features for the game.

In the old setup there was one build server and a build took 40 minutes, so people would kill each other’s builds to get their own going.

Delivering new content to players once or twice a month.

Agility at scale. 200 developers, 20 teams each with their own build server, 20 feature branches with alpha, staging and live branch.

In the old days, changes were tested live before being checked in. Now they go through the staging branch before going live.

Goal is no downtime for servers. The rule is that a server can’t be down for more than 20 seconds, because that’s how long it takes before users are affected.

Guild Wars 2 builds in 6 minutes. Server changes compile and run in about 40 seconds. Client takes 90 seconds to link.

Everything can run on a single local dev machine.

Want same behavior on dev machine as in data center.

Use I/O completion ports to avoid as many context switches as possible. Threads are good and bad: you want just enough of them, because context switches cost (cache draining, etc.).
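To illustrate the "just enough threads" idea, here is a minimal sketch that sizes a worker pool to the number of CPU cores. I/O completion ports are a Windows API and are not used here; this swaps in a plain fixed-size thread pool, and `handle_request` is a hypothetical placeholder for real work. Python threads are used only to show the shape of the approach, not its performance.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def handle_request(n: int) -> int:
    """Hypothetical unit of work; stands in for handling one completed I/O."""
    return n * n

# One worker per core: fewer runnable threads than this leaves cores idle,
# many more than this means threads compete for cores and get switched out.
worker_count = os.cpu_count() or 1

with ThreadPoolExecutor(max_workers=worker_count) as pool:
    # All 100 requests are queued; only worker_count run at any moment.
    results = list(pool.map(handle_request, range(100)))
```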

Stress tests on live servers. Performance measurement before game goes live. Measured thread concurrency over time.
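One way thread concurrency over time could be derived, assuming each thread's busy periods are logged as (start, end) intervals: sweep through the start and end events in time order and keep a running count. The function name and the sample intervals are made up for illustration; the notes do not say how ArenaNet collected this data.

```python
def concurrency_over_time(intervals):
    """Return [(time, active_count)] at every point where the count changes."""
    events = []
    for start, end in intervals:
        events.append((start, 1))   # a thread becomes busy
        events.append((end, -1))    # a thread goes idle
    # Tuples sort by time first; at equal times -1 sorts before +1,
    # so an interval ending at t is closed before one starting at t is opened.
    events.sort()

    timeline = []
    active = 0
    for t, delta in events:
        active += delta
        if timeline and timeline[-1][0] == t:
            timeline[-1] = (t, active)  # merge events sharing a timestamp
        else:
            timeline.append((t, active))
    return timeline

# Hypothetical busy intervals for four threads, in seconds.
samples = [(0, 4), (1, 3), (2, 6), (5, 7)]
timeline = concurrency_over_time(samples)
peak = max(count for _, count in timeline)
```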

Graphite – the program used to display (graph) the measurements.
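For reference, a small sketch of feeding a measurement to Graphite using its plaintext protocol, where each metric is one line of path, value, and Unix timestamp. The metric path, host, and helper names are assumptions for illustration; port 2003 is the usual default for the plaintext listener, but a given installation may differ.

```python
import socket
import time
from typing import Optional

def graphite_line(path: str, value: float, timestamp: Optional[int] = None) -> str:
    """Format one metric as '<path> <value> <timestamp>' terminated by a newline."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {timestamp}\n"

def send_metric(line: str, host: str = "localhost", port: int = 2003) -> None:
    """Send one formatted line to a Graphite plaintext listener (host is hypothetical)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))

# Hypothetical metric name; send_metric is defined but not called here.
example = graphite_line("game.server01.thread_concurrency", 7, 1700000000)
```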

No login queues. They are replaced with overflow maps, so you can play while waiting for your real region, and you can join a friend in an overflow instance.
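The overflow idea can be sketched as a placement rule: follow a friend into their overflow if asked, otherwise use the home map if it has room, otherwise fall back to an overflow instance instead of queueing. Everything here (class names, capacities, the naming of overflow instances) is a hypothetical illustration, not ArenaNet's implementation, and moving players back to the home map when space frees up is left out.

```python
from dataclasses import dataclass, field

@dataclass
class MapInstance:
    name: str
    capacity: int
    players: set = field(default_factory=set)

    def has_room(self) -> bool:
        return len(self.players) < self.capacity

class Region:
    def __init__(self, name: str, capacity: int):
        self.home = MapInstance(name, capacity)
        self.overflows = []

    def join(self, player: str, friend: str = None) -> MapInstance:
        # 1. If asked, follow a friend into their overflow instance.
        if friend is not None:
            for inst in self.overflows:
                if friend in inst.players and inst.has_room():
                    inst.players.add(player)
                    return inst
        # 2. Otherwise prefer the home map, then any overflow with room.
        for inst in [self.home] + self.overflows:
            if inst.has_room():
                inst.players.add(player)
                return inst
        # 3. Everything is full: open a new overflow rather than queueing.
        overflow = MapInstance(
            f"{self.home.name}-overflow-{len(self.overflows) + 1}",
            self.home.capacity,
        )
        overflow.players.add(player)
        self.overflows.append(overflow)
        return overflow

region = Region("city", capacity=2)
placements = {p: region.join(p).name for p in ["ann", "bob", "cat"]}
placements["dan"] = region.join("dan", friend="cat").name
```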

EU has a 5:1 concurrency range; US has a 2.5:1 concurrency range.
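Assuming "concurrency range" means the ratio of peak to trough concurrent players over a day (the notes do not define it), the figure could be computed as below. The sample player counts are invented purely so the ratios come out to the values quoted; they are not real data.

```python
def concurrency_range(samples):
    """Ratio of the highest to the lowest concurrent-player count."""
    peak, trough = max(samples), min(samples)
    if trough <= 0:
        raise ValueError("trough must be positive to form a ratio")
    return peak / trough

# Invented concurrent-player samples across one day.
eu_samples = [20_000, 50_000, 100_000]
us_samples = [40_000, 70_000, 100_000]

eu_range = concurrency_range(eu_samples)
us_range = concurrency_range(us_samples)
```

A wider range means more hardware sits idle at the trough if capacity is sized for the peak.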

Not using VMs for game servers because of bad problems with GW1: VM schedulers behave differently from base OS schedulers, and the extra scheduler with its different rules causes utilization problems.

When you just barely use all the memory, you cause paging, which causes CPU lag that is hard to track down.

CPU measurement – the sampling rate of Windows perf counters is too low to be useful. Use Windows thread cycle counters instead.

EU to US network is good most of the time, but they claim it sucks for 4 hours a month.

500 MB of new patches released each month.
