A New Architecture for Games and Simulations Using GPUs

Mark Joselli, Cristina Nader Vasconcelos, Esteban Clua


Multithreaded architectures are the current trend for both PCs (multi-core CPUs and GPUs) and game consoles such as the Microsoft Xbox 360 and Sony PlayStation 3. GPUs (Graphics Processing Units) have evolved into extremely powerful and flexible processors that can be used to process many kinds of data, not only graphics. This capability can be exploited in game development to optimize the game loop. As reported in the literature, GPGPU (general-purpose computing on GPUs) techniques have been used to process some steps of the game loop, while most of the game logic is still processed by the CPU. This proposal differs by presenting an architecture designed to process practically the entire game loop on the GPU. Two test cases, a crowd simulation and a 2D shooter game prototype called GpuWars, illustrate the proposed architecture.


Digital Games, Game Architecture, GPGPU, Game Physics, Game AI, Flocking Boids




This work is licensed under a Creative Commons Attribution 3.0 License.


IT in Industry (2012 - ) http://www.it-in-industry.com ISSN (Online): 2203-1731; ISSN (Print): 2204-0595