At the recent 2010 Computational Intelligence in Games conference in Copenhagen, Denmark, there were competitions for making race car controllers, human-like FPS bots, and Ms. Pac-Man players, among others. The competition that drew my interest, however, was the Mario level design competition, which challenged entrants to create procedural level generators that could produce fun and interesting levels based on information about a particular player’s style. The restrictions on entries (use only fixed numbers of gaps, coin blocks, and Koopas) meant that the winner would have to be cunning: it wouldn’t be easy to just estimate player skill and make the level more or less difficult by adding or removing obstacles; you would instead have to find placements that made the same set of obstacles more or less difficult. And because entries would be played by players at different skill levels, they would have to be flexible and adjust their output over a broad range of difficulties to get a high score. Finally, because score was based on the audience’s relative ratings of enjoyment between pairs of levels, there would be no way to game the system and optimize some fixed set of metrics without making truly enjoyable levels. Given these constraints, the winning entry should have been a demonstration of the power of procedural content generation to adapt to players of different skill levels, which is one of several reasons that PCG is useful in games. Unfortunately, the competition design may have been a bit too clever.
I say this because the winning entry (submitted by my fellow EIS student Ben Weber) didn’t have any of the desirable properties mentioned above (you can read about its algorithm in the previous post). Ben hacked together a simple generator in a couple of hours of spare time; it built levels by scattering things about randomly over the course of a few passes. His entry ignored the data about the player, and it didn’t try to work within the constraints either, instead generating freely and then adding a final processing pass that made sure they were satisfied (which my entry did as well, to be honest). Thus, rather than showcasing smart PCG that adapts to the player in order to provide an enjoyable experience, the competition seemed to give evidence that a good randomized (or even static) level could beat the best efforts at adaption.
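For concreteness, here is a minimal sketch of what that kind of “scatter things randomly over a few passes, then fix up the constraints at the end” generator might look like. The grid width, element budgets, and function names are my own assumptions for illustration; this is not Ben’s actual code or the competition framework’s API.

```python
import random

# Hypothetical sketch of a multi-pass random generator with a final
# constraint-enforcing pass. Widths and budgets are illustrative only.
WIDTH = 200                                           # level length in tiles
BUDGETS = {"gap": 10, "coin_block": 15, "koopa": 8}   # fixed element counts (assumed)

def generate_level(player_metrics=None):
    # player_metrics is accepted but never used, mirroring the
    # non-adaptive character of the winning entry described above.
    placements = {kind: [] for kind in BUDGETS}

    # A few independent passes, each scattering one element type at random.
    for kind in BUDGETS:
        for _ in range(random.randint(5, 25)):
            placements[kind].append(random.randrange(5, WIDTH - 5))

    # Final processing pass: trim or pad each list so the fixed element
    # counts required by the competition rules are satisfied exactly.
    for kind, budget in BUDGETS.items():
        while len(placements[kind]) > budget:
            placements[kind].pop(random.randrange(len(placements[kind])))
        while len(placements[kind]) < budget:
            placements[kind].append(random.randrange(5, WIDTH - 5))

    return placements

if __name__ == "__main__":
    level = generate_level()
    print({kind: len(spots) for kind, spots in level.items()})  # each count matches its budget
```

The point of the sketch is how little machinery is involved: there is no model of the player anywhere, just random placement followed by a cleanup pass, and yet an approach in this spirit won the competition.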
So what does this tell us about the competition? Did it have a flawed design? Is the best that considerable research into procedural content generation has to offer worse than a few hours of hacking?
Personally, I think that there are a couple of things to take away from this. First and foremost, the competition was a success because it resulted in six practical, functioning generation systems. One of the main goals of any academic competition is to motivate the creation of working systems from research ideas, and this one succeeded at that. Second, I think that the design of the competition could have used some more work. It focused on adaption, which is pretty difficult to measure, and the evaluation framework wasn’t quite up to the task. A really great competition not only provides motivation for system building, but also becomes a means of comparing the practical results of different approaches to a problem, and I don’t think that the CIG 2010 results were strong enough to support that kind of comparison.
On the other hand, Ben’s victory should be taken seriously. It’s evidence that a strong level design (embedded in his algorithm, in this case) can have a large influence on fun, and that adaption (at least as done by the other competition entries) may pale in comparison. Of course, I don’t think that the competition results are quite strong enough to say that conclusively, but it certainly is worth investigating the tradeoffs between raw level design and adaption. Perhaps future competitions should pick a different aspect of PCG to encourage, or at least think hard about how they are designed as experiments. Of course, part of the problem may lie with the entrants: only one of the entries from EIS stressed adaption, since our work in this lab is more focused on having a large and interesting output space for our generators, and on working with humans during the design process. If other entrants treated the issue similarly, it may have been the case that there simply weren’t any entries with strong adaption techniques in the competition.
Despite mixed results, the competition was fun: I got to show off my own generation algorithm (which unfortunately broke during the competition itself) and meet other people to learn about their efforts. And even though the experiment design wasn’t ideal in practice, the constraints motivated me to address the issue of adaption within my system, which wouldn’t have happened without them. When I gave a talk explaining my entry, the response was pretty positive, and Alex Champandard from AiGameDev.com mentioned the competitions prominently in his review of the conference. If the level generation competition is repeated next year at CIG 2011 in Seoul, it’s likely to see continued participation from EIS, and maybe we’ll even become perennial favorites.
(For the curious, the slides from my talk at CIG and the paper that presents the algorithm I used for my entry are available from my website at http://www.cs.hmc.edu/~pmawhorter/research.html)