
Graphic pipeline and implementation

preliminary version Zhenyu Ye z.ye@student.tue.nl October 31, 2008

1 Introduction

This report investigates the graphics pipelines in the software rendering engine Irrlicht and in different graphics processors. Two methods of rasterization and fragment shading, triangle-based and tile-based, are investigated. A likely bottleneck of the graphics pipeline is suggested. The report also investigates the ISA of the GPU, the memory hierarchy of the GPU, the compilation of the OpenGL library, the programming model and the limitations of CUDA, and graphics processors for handheld devices.

2 Architecture

2.1 NVIDIA GeForce 6800

According to [1], the NVIDIA GeForce 6800 uses a three-stage pipeline: vertex processing, fragment processing, and pixel processing. The vertex processors output triangles to the fragment processors, which perform rasterization on a triangle-by-triangle basis. The fragment processors output the interpolated fragments via a fragment crossbar to the pixel processors for pixel blending. The memory is partitioned into different banks corresponding to the partition of the screen, and pixel processing works on a bank-by-bank basis.

2.2 Tile-Based GPUs

The graphics pipelines in mobile GPUs use tile-based rendering [2][3][4]. The Mali architecture developed by ARM [5] and the Larrabee architecture developed by Intel [6] also use the tile-based rendering technique.
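The idea behind tile-based rendering can be illustrated with a minimal sketch that assigns triangles to screen tiles by their bounding boxes, so that each tile can afterwards be rasterized independently. The tile size, the function name bin_triangles, and the bounding-box test are assumptions made for illustration; they are not taken from [2]-[6], and real hardware may bin more precisely.

```python
import math

TILE = 16  # tile edge length in pixels (assumed value, for illustration only)

def bin_triangles(triangles, width, height, tile=TILE):
    """Return a dict mapping (tile_x, tile_y) to the indices of the
    triangles whose bounding box overlaps that tile."""
    bins = {}
    max_tx = (width - 1) // tile
    max_ty = (height - 1) // tile
    for idx, tri in enumerate(triangles):
        xs = [v[0] for v in tri]
        ys = [v[1] for v in tri]
        # Skip triangles whose bounding box lies completely off screen.
        if max(xs) < 0 or max(ys) < 0 or min(xs) >= width or min(ys) >= height:
            continue
        # Convert the bounding box to a clamped range of tile coordinates.
        tx0 = max(math.floor(min(xs)) // tile, 0)
        ty0 = max(math.floor(min(ys)) // tile, 0)
        tx1 = min(math.floor(max(xs)) // tile, max_tx)
        ty1 = min(math.floor(max(ys)) // tile, max_ty)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                bins.setdefault((tx, ty), []).append(idx)
    return bins
```

After binning, a renderer of this kind would process one tile at a time, touching only the triangles listed for that tile.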

3 Graphics pipeline

3.1 Graphics pipeline in NVIDIA GeForce 8800

This section compares the implementation of the graphics pipeline in the Irrlicht software renderer and in different hardware. The analysis of the graphics pipeline in the Irrlicht software engine is based on its source code. The analysis of the graphics pipeline in GPUs is based on recent publications and technical reports from the vendors.

TU/e Electronic Systems

Report of Progress

October 31, 2008

3.2 GeForce 8800

A recent publication [7] unveils detailed information about the Tesla architecture used in the NVIDIA GeForce 8800, as shown in Figure 1. The Tesla architecture contains dedicated texture units for bilinear filtering during the rasterization and fragment shading stages of the graphics pipeline. By using the dedicated texture processing hardware, NVIDIA claims that full-speed bilinear anisotropic filtering is nearly free on the GeForce 8800 [8]. To reduce energy consumption, NVIDIA set the clock frequency of the texture units to 575 MHz, instead of the 1.35 GHz used in the streaming processors. A previous profile of a simple 3D application showed that bilinear filtering accounts for 70% of the total computation workload if the whole graphics pipeline runs on general-purpose processors.
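To make the cost of bilinear filtering concrete, the following minimal sketch samples a single-channel texture in software. The function name bilinear_sample, the list-of-rows texture layout, and the clamp-to-edge addressing are assumptions made for illustration; this is not the code of the Irrlicht renderer or of the GeForce 8800 texture units.

```python
import math

def bilinear_sample(texture, u, v):
    """Sample a 2D texture (list of rows of floats) at texel
    coordinates (u, v) using bilinear interpolation."""
    height = len(texture)
    width = len(texture[0])
    # Clamp the coordinates to the valid texel range (clamp-to-edge).
    u = min(max(u, 0.0), width - 1.0)
    v = min(max(v, 0.0), height - 1.0)
    # Integer coordinates of the four neighbouring texels.
    x0 = int(math.floor(u))
    y0 = int(math.floor(v))
    x1 = min(x0 + 1, width - 1)
    y1 = min(y0 + 1, height - 1)
    # Fractional position inside the texel cell.
    fx = u - x0
    fy = v - y0
    # Two horizontal interpolations followed by one vertical interpolation.
    top = texture[y0][x0] * (1.0 - fx) + texture[y0][x1] * fx
    bottom = texture[y1][x0] * (1.0 - fx) + texture[y1][x1] * fx
    return top * (1.0 - fy) + bottom * fy
```

Each sample needs four texel reads and three interpolations per color channel, and this work is repeated for every textured fragment, which is consistent with filtering dominating the workload on a general-purpose processor.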

Figure 1: Tesla architecture in NVIDIA GeForce 8800[7]. TPC: Texture/Processor Cluster; SM: Streaming Multiprocessor; SP: Streaming Processor; ROP: Raster Operation

According to [7][8][9] (note that these papers use different terms to describe the graphics pipeline), the mapping of the DirectX 10 graphics pipeline onto the GeForce 8800 is shown in Figure 2. The details of each stage are:

Vertex Shader : Position space transform. Performs color and texture coordinate generation.

Geometry Shader : Deals with a whole primitive and its vertices. Breaks down polygons and large triangles into smaller triangles. Generates single-pass environment maps, motion blur, and stencil shadow polygons. Physical simulation of particle systems.

[Figure 2 diagram. Stages and executing units: Input Assembler and Vertex Distribution (preprocessing); Vertex Shader (SPs and SFUs); Geometry Shader (SPs and SFUs); Rasterization (viewport/clip/setup/raster/zcull units); Pixel Distribution by tiles; Pixel Shader (texture filtering by Texture Units, other effects by SPs); Raster Operation (ROP units). Data labels: vertices, primitives, pixels.]

Figure 2: Mapping of graphics pipeline in NVIDIA GeForce 8800

Rasterization : This stage is called Rasterization for consistency of terms. In the GeForce 8800, this stage contains viewport and clipping, setup, raster, and z-culling. The viewport and clip units clip the primitives to the view frustum. The setup units generate edge equations for the rasterizer. The rasterizer generates all pixel tiles that are at least partially inside the primitive. Early z-culling performs a depth test to reduce the number of fragments.

Pixel Shader (also called fragment shader) : Texture color processing and numerous shading effects for each triangle.

Raster Operation : Corresponds to the Output Merger stage of the DirectX 10 pipeline. Z-buffer checking and pixel blending for transparent objects.
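The rasterization and raster operation stages described above can be sketched in software as follows. This is a simplified illustration under stated assumptions: it tests individual pixel centers instead of pixel tiles, uses one constant depth per triangle instead of interpolated depth, and the names edge, rasterize, and blend are hypothetical rather than taken from [7][8][9].

```python
import math

def edge(a, b, p):
    """Edge equation: signed area of (a, b, p). The sign tells on which
    side of the directed edge a->b the point p lies."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterize(tri, depth, width, height, zbuffer):
    """Return the pixels covered by tri that pass the depth test.
    zbuffer is a list of rows and is updated in place."""
    v0, v1, v2 = tri
    area = edge(v0, v1, v2)
    if area == 0:
        return []  # degenerate triangle covers no pixels
    if area < 0:
        v1, v2 = v2, v1  # reorder so all edge values are >= 0 inside
    xs = [v[0] for v in tri]
    ys = [v[1] for v in tri]
    # Clip the bounding box of the triangle to the screen.
    x_min = max(math.floor(min(xs)), 0)
    x_max = min(math.floor(max(xs)), width - 1)
    y_min = max(math.floor(min(ys)), 0)
    y_max = min(math.floor(max(ys)), height - 1)
    fragments = []
    for y in range(y_min, y_max + 1):
        for x in range(x_min, x_max + 1):
            p = (x + 0.5, y + 0.5)  # sample at the pixel center
            inside = (edge(v0, v1, p) >= 0 and edge(v1, v2, p) >= 0
                      and edge(v2, v0, p) >= 0)
            # Depth test: keep the fragment only if it is closer.
            if inside and depth < zbuffer[y][x]:
                zbuffer[y][x] = depth
                fragments.append((x, y))
    return fragments

def blend(src, dst, alpha):
    """Alpha blending of a source color over a destination color."""
    return src * alpha + dst * (1.0 - alpha)
```

Running the depth test before shading, as early z-culling does, means that hidden fragments never reach the pixel shader.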

3.3 Graphics pipeline in Irrlicht software renderer

This section introduces the Irrlicht software renderer. Irrlicht sorts the vertices and primitives into different types of nodes, and renders these nodes in the following order: light nodes, environment background, solid objects, shadows, transparent objects, and shader nodes. The renderer is chosen according to the material and properties of the nodes. The function CTRTextureGouraudNoZ2::scanline_bilinear is used in the environment background for rendering the sky. The function CTRTextureLightMap2_M4::scanline_bilinear is used in the solid objects for rendering the wall.


[Figure 3 diagram: Vertices, Primitives -> Vertex Shader -> Clipping -> Rasterization -> Pixel Shader -> Raster Operation -> Pixels]

Figure 3: Graphics pipeline of irrlicht rendering engine

Algorithm 1: Irrlicht(T)
  input : Data structure T representing the information of all triangles and their vertices
  output: Pixels P to the frame buffer
  for every type of nodes do
    for every type of materials do
      for every triangle do
        go through the graphics pipeline as shown in Figure 3
  return P
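The loop structure of Algorithm 1 can be sketched as follows. The node order is the one listed in the text above; the scene layout (a dictionary from node type to material to triangles), the function name render, and the draw_triangle callback standing in for the pipeline of Figure 3 are hypothetical and do not reflect the actual Irrlicht API.

```python
# Render order of node types as listed in the text of this section.
NODE_ORDER = [
    "lights",
    "environment background",
    "solid objects",
    "shadows",
    "transparent objects",
    "shader nodes",
]

def render(scene, draw_triangle):
    """Walk the scene in node order and collect the produced pixels.

    scene maps node type -> material -> list of triangles (assumed layout).
    draw_triangle(material, triangle) stands in for the whole pipeline
    of one triangle and returns a list of pixels.
    """
    pixels = []
    for node_type in NODE_ORDER:
        by_material = scene.get(node_type, {})
        for material, triangles in by_material.items():
            for triangle in triangles:
                pixels.extend(draw_triangle(material, triangle))
    return pixels
```

Grouping triangles by material inside each node type means the renderer has to be selected only once per material rather than once per triangle.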

4 Programming model

This section introduces the infrastructure of the programming languages and APIs for GPUs.

4.1 OpenGL

OpenGL is an API that roughly consists of a front end and a back end, as shown in Figure 4. The application uses the front end to compile the application program into an assembly-like language. This assembly-like language treats the underlying hardware as a virtual machine with a number of states, which are visible to the application. The block diagram of the machine is shown in Figure 5; the details of this machine are available in [10]. The OpenGL back end then converts this assembly-like language into the instruction set that runs on the GPU. If the GPU cannot support all the functions of OpenGL, the OpenGL back end can run some of the functions on the CPU instead.

References
[1] J. Montrym and H. Moreton, The GeForce 6800, IEEE Micro, vol. 25, no. 2, pp. 41-51, 2005.
[2] AMD, Next-gen tile-based GPUs, 2008. [Online]. Available: http://developer.amd.com/gpu assets/gdc2008 ribble maurice TileBasedGpus.pdf
[3] T. Capin, K. Pulli, and T. Akenine-Moller, The state of the art in mobile graphics research, Computer Graphics and Applications, IEEE, vol. 28, no. 4, pp. 74-84, 2008.


[Figure 4 diagram. Blocks: OpenGL Application; GLU; GLUT; OpenGL Interface Library; OpenGL State Machine; Hardware OpenGL lib; Software OpenGL lib; GPU; Frame buffer]

Figure 4: Overall infrastructure of OpenGL API

[4] T. Akenine-Moller and J. Strom, Graphics processing units for handhelds, Proceedings of the IEEE, vol. 96, no. 5, pp. 779-789, 2008.
[5] ARM, ARM Mali 3D graphics processing solution, 2006. [Online]. Available: http://www.arm.com/miscPDFs/16514.pdf
[6] L. Seiler, D. Carmean, E. Sprangle, T. Forsyth, M. Abrash, P. Dubey, S. Junkins, A. Lake, J. Sugerman, R. Cavin, R. Espasa, E. Grochowski, T. Juan, and P. Hanrahan, Larrabee: a many-core x86 architecture for visual computing, ACM Transactions on Graphics, vol. 27, no. 3, 2008.
[7] E. Lindholm, J. Nickolls, S. Oberman, and J. Montrym, NVIDIA Tesla: A unified graphics and computing architecture, IEEE Micro, vol. 28, no. 2, pp. 39-55, 2008.
[8] NVIDIA, Technical brief: NVIDIA GeForce 8800 GPU architecture overview, 2006. [Online]. Available: http://www.nvidia.com/object/IO 37100.html
[9] D. Luebke and G. Humphreys, How GPUs work, Computer, IEEE, vol. 40, no. 2, pp. 96-100, 2007.
[10] OpenGL, The OpenGL machine, 2006. [Online]. Available: http://www.opengl.org/documentation/specs/version1.1/state.pdf
[11] M. Segal and K. Akeley, The design of the OpenGL graphics interface, Silicon Graphics Computer Systems, Tech. Rep., 1994.


Figure 5: Overview of the block diagram in OpenGL [11]
