
A Video Game-Based Mobile Robot Simulation Environment

Josh Faust, Cheryl Simon, and William D. Smart
Department of Computer Science and Engineering
Washington University in St. Louis
One Brookings Drive
St. Louis, MO 63130
United States

Abstract— Simulation is becoming an increasingly important aspect of mobile robotics. As we are better able to simulate the real world, we can usefully perform more research in simulated environments. The key aspects of a good simulator, an accurate physics simulation and a realistic graphical rendering system, are also central to modern computer games. In this paper, we describe a robot simulation environment built from technologies typically used in computer video games. The simulator is capable of simulating multiple robots, with realistic physics and rendering. It can also support human-controlled avatars using a traditional first-person interface. This allows us to perform robot-human interaction and collaboration studies in the simulated environment. The distributed nature of the simulation allows us to perform large-scale experiments, with users participating from geographically remote locations.

I. INTRODUCTION

Simulation is becoming an increasingly important aspect of mobile robotics. As we become better able to realistically simulate the world, more and more development can usefully be done in these environments.

High-quality simulators allow us to develop software for mobile robots without having to contend with ever-present hardware failures, and without having to worry about battery life. They also allow us to perform research with large numbers of robots. Simulators also allow us to easily change the environmental parameters of the world (such as the lighting level) quickly, and without inconveniencing other members of the laboratory.

Modern computer games share much in common with modern mobile robot simulators. They include high-quality physics simulations. They are capable of rendering highly realistic views of the simulated environment. They are capable of supporting many interacting objects and players in a large world.

We have developed a robot simulation environment based on technologies commonly used to develop distributed computer games. The simulator allows us to run several mobile robots in a shared environment, with realistic physics and graphics. Additionally, unlike other existing simulators, we can also support human-controlled characters (avatars) in the world. This allows us to perform robot-human interaction and collaboration studies in the simulation environment.

In this paper, we begin by describing the component technologies used to build the simulator. We describe the overall architecture, and discuss how simulated robots are controlled through the well-known Player Application Programmer Interface [1]. We give examples of the simulator in use, and discuss our plans for future experiments in the simulated environments.

II. COMPONENT TECHNOLOGIES

The simulation environment was designed to take advantage of existing technologies used for distributed interactive computer games. Gaming applications have much in common with robotics simulations, and require many of the same features, such as high-quality computer graphics and an accurate physics simulation. In particular, we use computer games technology in three areas of our simulator: 3D graphics, physics simulation, and networking. All of the technologies that we use in the simulator are cross-platform, allowing both clients and servers to run on a variety of operating systems.

A. 3D Graphics

We use the open source Ogre3D [2] as our graphics engine. Ogre3D provides many of the advanced features seen in current commercial and open-source games. In particular, it provides support for using the vertex and pixel shaders on the graphics card, which gives high-quality graphics while reducing the load on the CPU. It also provides advanced features such as level-of-detail rendering and scripted animations, which are not present in lower-level APIs, such as OpenGL. The engine provides support for both indoor and outdoor environments, and a flexible scene management system (for specifying the environment). It also gives us easy access to "special effects" such as particle systems (for smoke and similar effects), transparency, realistic shadow effects, and realistic material properties. This is important to us, since it allows us to generate more realistic synthetic camera images. The more realistic these images are, the more likely it is that the same computer vision algorithms will work both in the simulator and in the real world.
B. Physics Simulation

For our physics simulation, we use Ageia's PhysX SDK [3]. This is an advanced commercial physics engine, designed for the computer games industry, and is free for non-commercial use. At the time of writing, PhysX is the fastest freely available physics engine of its type. In addition, Ageia also produces a hardware-accelerated physics engine which will allow us to increase the number of robots that we can support in the simulator by one or two orders of magnitude.

The PhysX engine is an appealing choice because it supports advanced material properties, such as regular and anisotropic friction. This allows us to model objects in the world accurately, and provide a more realistic simulation than would be possible with other freely-available physics engines. These material properties allow us to support accurate robot-ground interactions for a wide variety of terrain types, such as concrete, carpet, and ice.

PhysX is quite similar to the Open Dynamics Engine [4], a commonly-used physics engine in robot simulators. However, PhysX uses a faster physics integration algorithm, allowing it to simulate many more colliding objects, has more exact collision detection than ODE's heuristic system, and is widely regarded as being more stable. These features, coupled with possible hardware acceleration, make PhysX a better choice, in our opinion.

C. Networking

High-performance networking is provided by TNL, the Torque Networking Library [5]. This is a networking subsystem designed for multi-player distributed games, and has been used in a number of professional products. It is open source, and provides secure, efficient, and robust networking.

D. Interoperability

All of the technologies that we use are multi-platform and are not tied to a specific operating system. We have adopted the Player [1] API as our interface to robots in our simulator. This allows us to take advantage of a well-developed API and a large existing user community.

We have also built in support for COLLADA [6], an emerging XML standard that describes the graphics, physics, and materials of 3D artwork. We can use models created in most major 3D modeling programs, including Maya [7], 3D Studio Max [8], Blender [9], and Softimage/XSI [10], by exploiting the COLLADA format. These COLLADA descriptions can be loaded directly into our simulator. This allows designers to create complete 3D scenes in their preferred modeling software (including structural, physics, texture, and material property information), and to easily import them into the simulator. Again, this allows us to take advantage of a huge body of pre-existing models created for other purposes.

Using a professional modeling package makes it easier to develop accurate models of the robots and other objects in the environment. This is important since, without such accurate models, any simulated camera images will be unrealistic, and any software that uses them will not transfer to the real world. Although it is possible to develop similarly-detailed models using low-level graphics APIs, such as OpenGL, it requires much more effort to do so.

III. SYSTEM ARCHITECTURE

The simulator environment is organized as a client-server system. Robot and human clients connect to the server, which performs the processing necessary to simulate the world. The number of clients that a single server can accommodate is limited only by computational resources. The overall system architecture is shown in figure 1.

Fig. 1. Overall system architecture.

A. The Server

The server keeps track of the properties of all objects in the world, runs the physics simulation, calculates the simulated sensor data, and manages the connections to all the clients. All communications between the server and the clients are handled by TNL.

Commands from clients are translated by the server into sequences of actuator motions. The results of these motions are determined by the physics simulation, and the resulting new world state is communicated back to the clients.

Simulated sensor data are generated by directly querying the underlying representations of the world. These data are then modified to reflect realistic measurement error, and sent back to the robot clients. Currently, we have simple (Gaussian noise) error models for distance sensors, but it would be straightforward to add more realistic models that take the surface properties of the detected object into account.

B. The Robot Client

The robot client allows the robots in the simulated environment to be controlled. It has no graphical interface, since all interactions are performed through the sensor-actuator Application Programmer Interface (API). A single client can accept connections from many robot controllers, each controlling a different robot in the environment. Information about the robots is loaded from an XML configuration file that specifies the robot type, sensors, initial position in the world, and connection settings to communicate with the controllers.

When a robot client connects to the server, a new robot is created ("spawned") in the environment, and the appropriate TNL communication mechanisms are enabled to allow data to be passed back and forth between the client and the server. Clients request new simulated sensor data from the server, in response to API calls by the robot controllers.

One or more robot controllers can communicate with a robot client using the Player protocol [1] over a standard TCP socket. We assume that the controller and the client communicate over a low-latency link, such as a Local Area Network (LAN).

We chose to use the Player protocol for two main reasons: the Player API is well-established and many researchers are familiar with it, and API bindings already exist in a number of major languages. Users already familiar with the Player API can use their robot controllers in our simulation with no modification.

Although we have chosen to use the Player protocol initially, we have designed the robot clients to make it straightforward to add other APIs. Since the API-specific code is limited to the robot client, no changes need to be made to the server, assuming the models for the robots and sensors already exist. All that must be changed on the client side is how information is repackaged from the replicated objects for the new API. However, if a new robot or sensor is required, a model of it must be created, and support for generating simulated measurements from it added to the server.

Currently, our robot client supports differential drive robots, such as the ActivMedia Pioneer 3 [11] (see figure 2), and a subset of the Player API (position, laser range-finder, and camera proxies). Since there is no graphical interface for the robot client, it has very low computational requirements, and can be run on inexpensive, low-end systems. The robot controller, however, will have its own requirements. To ensure good performance, the client machine should be connected to the server by a high-speed (10 Mbps or better) connection.

Fig. 2. Pioneer 3 model and actual robot [11].

C. The Human Client

In addition to robot clients that interact with the world through an API, we also provide for human clients using a standard first-person game interface (figure 3(b)). This client allows an operator to control a human-shaped avatar in the simulated environment using standard input devices (mouse, keyboard, and joystick). The avatar can interact with objects in the world, and is subject to the laws of physics. Interaction with other human clients is currently possible using a text-based chat interface.

The avatar appears to other clients, including robots, as a human-shaped entity with appropriate body motions (figure 3(a)). The human operator provides high-level commands to the avatar, such as movement and gaze direction, just as in a first-person game. An automatic low-level controller then supplies appropriate limb motions to make the movement of the avatar look realistic to any observers.

D. The Observer Client

A third type of client allows human operators to observe the activity in the simulated environment, without interacting with it (figure 3(c)). Observer clients have no simulated physical presence in the world, and are not bound by the laws of physics. They can move arbitrarily about in all three dimensions and cannot be seen by other clients. This allows a third-person view of the activity in the world while running experiments. As with human clients, observers are controlled using standard input devices, such as keyboard, mouse, and joystick.

E. Client-Server Interactions

All client-server interactions are performed through TNL, and take one of two forms: event messages and replicated objects.

Event messages are implemented using TNL NetEvents. These are atomic messages used to inform a client or server of an event, such as user input, or a request for sensor data.

Replicated objects are implemented using TNL GhostConnections. These allow data structures on the server to be automatically replicated on each of the clients. Optimizations are performed to minimize the bandwidth needed by these updates. For example, only data elements that have changed are actually sent, and only objects that are currently necessary for the client are actively updated: only the world objects that are currently in the camera's view frustum are updated. Updates are performed automatically whenever the data held in the object change, and rely on serialize/deserialize functions being available for the object.
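The bandwidth optimization used for replicated objects (send only the fields that have changed since the last update) can be sketched in a few lines. This is an illustrative model of the idea only, not TNL's actual GhostConnection API; the class and method names here are hypothetical.

```python
class ReplicatedObject:
    """Server-side object whose fields are mirrored on clients.

    Tracks which fields changed since the last update, so that only
    the dirty fields are serialized -- mirroring the delta-update
    optimization described above.
    """

    def __init__(self, **fields):
        self._fields = dict(fields)
        self._dirty = set(fields)  # everything is sent on the first update

    def set(self, name, value):
        # Only mark the field dirty if its value actually changed.
        if self._fields.get(name) != value:
            self._fields[name] = value
            self._dirty.add(name)

    def serialize_update(self):
        # Pack just the changed fields, then clear the dirty set.
        update = {name: self._fields[name] for name in self._dirty}
        self._dirty.clear()
        return update


robot = ReplicatedObject(x=0.0, y=0.0, heading=0.0)
robot.serialize_update()          # first update carries all fields
robot.set("x", 1.5)
robot.set("heading", 0.0)         # unchanged value: not marked dirty
print(robot.serialize_update())   # only {'x': 1.5} is sent
```

A real implementation would also filter by relevance (e.g., the view-frustum test mentioned above) before serializing at all.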
(a) Robot’s view. (b) Human’s view. (c) Observer’s view.

Fig. 3. Robot, human, and observer points-of-view in the simulation environment. The human avatar is a model supplied with Ogre3D.
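The server's sensor pipeline described in Section III-A (query the world representation for ideal readings, then corrupt them with a simple Gaussian error model) can be sketched as follows. The noise parameters here are hypothetical placeholders, not the values used in the simulator.

```python
import random


def corrupt_ranges(true_ranges, sigma=0.01, max_range=8.0):
    """Apply a simple Gaussian error model to ideal distance readings.

    true_ranges: ideal ranges (meters) obtained by ray-casting.
    sigma: standard deviation of the sensor noise (hypothetical value).
    Readings are clamped to the sensor's valid interval [0, max_range].
    """
    noisy = []
    for r in true_ranges:
        reading = r + random.gauss(0.0, sigma)
        noisy.append(min(max(reading, 0.0), max_range))
    return noisy


random.seed(0)
print(corrupt_ranges([1.0, 2.5, 8.0]))
```

A surface-aware model of the kind mentioned in Section III-A would simply make `sigma` (and perhaps a dropout probability) a function of the material hit by the ray.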

All network communications use a novel network protocol unique to TNL, called the "Notify" protocol. This is a connection-oriented unreliable communication protocol with packet delivery notification. Processes receive receipt notifications for each packet that is successfully delivered. The receipt notifications are packed into the headers of other data packets, to conserve bandwidth. This protocol has been shown to be more efficient than either UDP or TCP in the distributed game setting.

A number of mechanisms are available in TNL to provide for guaranteed ordered delivery (like TCP), guaranteed delivery, unguaranteed delivery (like UDP), and quickest possible delivery (similar to TCP out-of-band data). Encryption mechanisms are also available for all delivery options, although they are not currently being used in our simulation environment.

Input from a human or observer client is handled in two ways. For the most common interactions (such as movement, rotation, and simple world interactions), a TNL UserAction message is sent to the server periodically. This message contains information about attempted movement, attempted rotation, and a description of the state of a set of "action keys" on the keyboard and mouse that control more complex interactions. This abstraction allows a human to locally map keys and mouse motions to different actions, and makes the addition of new input devices straightforward.

To handle other, less common actions, a special TNL NetEvent message containing all of the details of the action is dispatched.

IV. IMPLEMENTATION AND SCALING ISSUES

The most computationally expensive part of the simulation is the physics calculations. In particular, the ray-casting necessary to calculate simulated sensor readings is a severe bottleneck. As we add robots with laser range-finders to the simulation, the slow-down is very noticeable. This can be addressed by reducing the data rate or the fidelity of the sensors, but this is undesirable, since it makes the simulated sensors different from their real counterparts. However, special-purpose hardware support for the PhysX API will allow us to perform ray-casting in hardware. This will significantly increase the number of high-fidelity sensors that we can support in our environment. Currently, using a Pentium-IV class computer, with 512MB of RAM, running Windows XP, we can support approximately 10 laser range-finder sensors, each requiring 360 ray-casts, at a frequency of 10Hz.

While we can accurately simulate cameras, based on a 3D rendering of the world, this is also an expensive operation. Each camera in the simulation requires the world to be rendered from a new viewpoint. If there are many cameras, this again increases the computational load on the server. It is possible to address this by adding more, faster graphics cards to the server (since the rendering can be performed on the GPU). However, just like ray-casting, there will be a fundamental limit to how many cameras we can effectively support. Currently, on our development machine, with a single graphics card, we can support three cameras operating at 30Hz without noticing a slow-down in the simulation.

In the clients, the rendering of the scene is the major computational expense. We can improve performance by reducing the quality of the rendering for human operators. We can disable advanced features such as realistic shadows, or reduce the resolution of the rendered scene.

In order to provide better scalability, we are currently looking at ways to distribute the server computations over multiple computers, and across multiple threads in a single computer. We are also investigating how the clients can bear more of the computational load. This will allow us to scale the system to much larger environments, with many more connected clients.

V. USING THE SIMULATOR

Our simulator can be used similarly to the other systems described in section VI. Robots can operate in the simulated environment, receiving simulated sensor data and interacting with the world using a realistic physics model (see figures 4, 5, and 6). However, the ability to integrate distributed human players into the simulation allows us to perform novel experiments that are unsupported by any other simulator.

In particular, we can perform robot-human collaboration experiments. Such experiments can involve either collaborative teams or antagonistic teams. Collaborative teams are composed of human and robot clients, who work together to perform some task, such as searching a building. There are two possible variations of this task. In the first, the robots are autonomous, and all interaction with them must happen at a high level. In the second variation, the robots are tele-operated, or have mixed-initiative controllers [12]. The humans controlling the robots work with the humans controlling the avatars in the world to perform a given task. Search and rescue operations are a good example of a collaborative task.
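The ray-casting bottleneck described in Section IV is easy to quantify: ten 360-beam lasers at 10Hz already demand 36,000 ray-casts per second. The toy 2D sketch below (a ray marched through an occupancy grid) stands in for the 3D casts the physics engine performs; it is illustrative only, and the grid, step size, and range values are hypothetical.

```python
import math


def cast_scan(grid, x, y, beams=360, max_range=8.0, step=0.05):
    """Simulate one laser scan by marching each beam through a 2D
    occupancy grid (1 = obstacle) until it hits an occupied cell,
    leaves the grid, or reaches max_range. Cell size is taken as 1m.
    """
    ranges = []
    for i in range(beams):
        angle = 2.0 * math.pi * i / beams
        r = 0.0
        while r < max_range:
            cx = int(x + r * math.cos(angle))
            cy = int(y + r * math.sin(angle))
            if not (0 <= cy < len(grid) and 0 <= cx < len(grid[0])) or grid[cy][cx]:
                break
            r += step
        ranges.append(r)
    return ranges


# The server-side load quoted above: 10 robots, 360 beams each, at 10 Hz.
print(10 * 360 * 10, "ray-casts per second")  # 36000
```

Even this crude marching loop does thousands of point tests per scan, which is why hardware-accelerated ray-casting is attractive.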
In antagonistic experiments, one team tries to accomplish a goal while another tries to stop them. Both teams can have humans and robots on them. Games such as "capture the flag" are good examples of antagonistic tasks.

Some tasks, such as perimeter guarding, are critically dependent on human behavior. Our simulator allows us to verify these algorithms without venturing out into the real world, and without committing to an almost certainly flawed model of human behavior. For example, in the perimeter guarding example, we can deploy our robots on the perimeter, and have a team of human avatars try to penetrate it. We do not have to model the human avatars' behavior, since they are being directly controlled by humans. Most likely, these humans have a lot of experience with first-person computer games, and will behave in an appropriately realistic manner in the simulated environment.

Fig. 4. Several robots in a simulated environment, with laser sensor visualizations enabled.
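The differential-drive robots that the robot client supports (Section III-B, e.g. the Pioneer 3) follow standard unicycle kinematics, which is what a controller implicitly relies on when commanding translational and rotational velocities through a Player-style position interface. The sketch below is textbook kinematics, not code from the simulator.

```python
import math


def integrate_pose(x, y, theta, v, w, dt):
    """Advance a differential-drive pose (x, y, theta) by one time step.

    v: translational velocity (m/s), w: rotational velocity (rad/s).
    Uses the exact circular-arc solution when turning, and a
    straight-line step when w is (near) zero.
    """
    if abs(w) < 1e-9:
        return x + v * dt * math.cos(theta), y + v * dt * math.sin(theta), theta
    # Travel along a circular arc of radius v / w.
    radius = v / w
    new_theta = theta + w * dt
    x += radius * (math.sin(new_theta) - math.sin(theta))
    y -= radius * (math.cos(new_theta) - math.cos(theta))
    return x, y, new_theta


# Drive straight for 1 s at 0.5 m/s, then turn in place by 90 degrees.
x, y, th = integrate_pose(0.0, 0.0, 0.0, 0.5, 0.0, 1.0)
x, y, th = integrate_pose(x, y, th, 0.0, math.pi / 2, 1.0)
```

In the simulator itself the physics engine produces the motion from simulated wheel torques, but the commanded (v, w) pair is interpreted against this same model.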
VI. RELATED WORK

There are a number of robot simulation environments available currently. In this section, we briefly survey the most well-known, and discuss how they differ from our own environment. All of the environments have common elements. They all use a physics simulation, a graphics system, and a modeling language. While the choices of physics simulation and modeling language vary, most of the systems described here use (the low-level) OpenGL as their underlying graphics system.

The closest existing system to the one described in this paper is USARsim [13], a high-fidelity simulator based on Epic Games' UnrealEngine2 engine. USARsim was explicitly developed to serve as a simulator for urban search and rescue (USAR) robots, and includes models of the NIST reference test arenas for mobile robots [14]. Robot sensors and actuators can be accessed through either the Player [1] or Pyro [15] APIs. Six robot types, including the popular ActivMedia Pioneer 2 series, are supported. Since the system is based on a game engine, human-controlled avatars should be available, but these are not mentioned in the published descriptions of the system.

Fig. 5. A robot in a simulated environment, with physics debugging information and sensor visualizations enabled.

Webots [16] is a commercial simulation package produced by Cyberbotics. It uses the open source Open Dynamics Engine (ODE) [4] to simulate realistic physics, and includes an extensive library of sensors and actuators from which complete robots can be built. Users can program their simulated robots in C, C++, or Java, and this code can then be transferred to a real robot (for a subset of the supported robots). The package includes tools for editing robot configurations and environments, and can inter-operate with other modeling systems using the VRML97 standard. Webots runs on Windows, Linux, and Mac operating systems.

Fig. 6. A close-up of the robot in figure 5.
Darwin2K is an open source simulation built in support of evolutionary robotics research [17]. It models the robot system at a very low level, since the purpose of the evolutionary algorithms is to discover novel and useful configurations of primitive components, such as gears and bushings. Darwin2K runs on a Linux or Irix operating system under X Windows.

OpenSim [18] is another open source simulation package under active development. It uses ODE for physics simulation, and is focused on supporting research into the inverse kinematics of redundant manipulators.

Perhaps the most widely-used open source simulation environment is the Player/Stage project [1], [19]. Player provides device abstractions for real and simulated robots. Stage provides a traditional 2D simulated environment, while Gazebo [20] is a full 3D simulation with realistic physics. Gazebo uses ODE for its physics simulation, and can run on Linux and Mac OS X machines.

Our simulation environment differs from those described in this section in several ways. We use a different physics simulation package, described in section II-B. In itself, this is not particularly significant, since all physics simulations perform the same basic job. However, as we discussed earlier, our system can potentially use hardware acceleration and simulate much more complex systems.

We also use a fully-featured 3D graphics engine for our graphics system. This gives us a level of abstraction from the low-level graphics primitives that most other environments do not have. This means that it is easier for us to add realistic visual effects to our environments, such as smoke, transparency, and dynamic lighting conditions. It also allows us to adapt to different operating systems more easily, since the engine can use either OpenGL or DirectX primitives, depending on which is more efficient.

The simulator is built on open standards, allowing robot, world, and object development to be carried out in a wide variety of commercial modeling tools. This allows professionally-designed objects to be used easily in the simulation.

Finally, we have the ability to have human-controlled agents in our worlds, using techniques common in video games. While it is possible, in principle, to also do this in the other environments, we have never seen it reported. Having mixed teams of robots and humans allows for a much greater range of possible experiments using the simulation environment.

VII. CONCLUSIONS AND FUTURE WORK

In this paper, we have described a mobile robot simulation environment, built from technologies commonly used in computer games. The simulator includes a realistic physics simulation and a high-quality graphics rendering system. In addition to supporting simulated robots controlled using the Player API, we also support human-controlled avatars. This greatly increases the types of experiments that can be usefully carried out in the simulator. In particular, it allows us to conduct human-robot collaboration experiments.

The simulator is still under active development at the time of writing. In particular, we are focusing on adding more objects (robots, sensors, avatars, and "props") to the world, adding more complete support for the Player API, and adding more realistic, empirically-determined sensor measurement errors.

In the longer term, we would like to add a number of new features to the system. We currently have a keyboard-based chat interface to allow humans to communicate with each other. We will replace this with an audio channel that humans can directly speak into. Any humans or robots close to the speaker would be able to "hear" what was said, at an appropriate volume. We also plan on adding realistic robot and environmental noises, so that humans (and other robots) can hear robots approaching. Avatars currently have only a simple movement animation. We plan to implement a richer motion vocabulary for them, based on recent work in computer games. This will also allow an avatar to have a richer interaction with the world, and to manipulate objects directly.

ACKNOWLEDGMENTS

We would like to thank Joyce Santos for building many of the initial models for the simulation environment, including the Pioneer 3. We would also like to thank the Ogre3D developers and forum members, the TNL developers, and the Ageia PhysX developers, for providing the core technologies for this work.

REFERENCES

[1] B. P. Gerkey, R. T. Vaughan, K. Stoy, A. Howard, G. S. Sukhatme, and M. J. Matarić, "Most valuable player: A robot device server for distributed control," in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2001, pp. 1226–1231.
[2] "Ogre3D open source graphics engine."
[3] "PhysX physics engine."
[4] "Open Dynamics Engine."
[5] "TNL: The Torque Networking Library."
[6] "COLLADA: an open digital asset exchange schema for the interactive 3D industry."
[7] "Maya modeling, animation, and rendering system."
[8] "3D Studio Max."
[9] "Blender open source 3D graphics system."
[10] "Softimage/XSI digital character software."
[11] "ActivMedia Pioneer 3 robot."
[12] D. J. Bruemmer and M. Anderson, "Intelligent autonomy for remote characterization of hazardous environments," in Proceedings of the IEEE International Symposium on Intelligent Control, Houston, TX, October 2003.
[13] J. Wang, M. Lewis, S. Hughes, M. Koes, and S. Carpin, "Validating USARsim for use in HRI research," in Proceedings of the Human Factors and Ergonomics Society 49th Annual Meeting, 2005, pp. 457–
[14] A. Jacoff, E. Messina, and J. Evans, "Experiences in deploying test arenas for autonomous mobile robots," in Proceedings of the 2001 Performance Metrics for Intelligent Systems (PerMIS) Workshop, Mexico City, Mexico, 2001.
[15] D. Blank, D. Kumar, L. Meeden, and H. Yanco, "The Pyro toolkit for AI and robotics," AI Magazine, vol. 27, no. 1, pp. 39–50, Spring 2006.
[16] "Webots mobile robot simulation."
[17] C. Leger, Darwin2K: An Evolutionary Approach to Automated Design for Robotics. Boston, MA: Kluwer Academic Publishers, 2000.
[18] "OpenSim: An open 3D robotics simulator."
[19] "The Player/Stage project."
[20] N. Koenig and A. Howard, "Design and use paradigms for Gazebo, an open-source multi-robot simulator," in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2004), 2004, pp. 2149–