Since the early days of the personal computer there have been common interfaces for accessing the hardware. This was even true for video adapters, because in the past "IBM compatible" meant compatible down to the register level. That changed once IBM stopped adding new features, and every graphics adapter gained its own incompatible extensions. At first these were only new 2D operations, but soon the adapters started to support 3D acceleration, too. The manufacturers of these devices provided APIs to save developers from fiddling with the registers. Unfortunately the APIs were as incompatible as the register sets, so every application had to be adapted to different hardware over and over again. Today the chips are still incompatible at the lowest level, but the drivers and the Direct3D runtime provide a common view: the Direct3D pipeline.

As the image shows, the pipeline is divided into multiple stages. Three of them are programmable, while the others provide a set of predefined functions. Regardless of this difference, all stages are controlled through the same ID3D10Device interface. To make it easier to link a method to the stage it controls, Direct3D 10 prefixes these methods with two characters.
| Prefix | Stage |
| IA | Input Assembler |
| VS | Vertex Shader |
| GS | Geometry Shader |
| SO | Stream Out |
| RS | Rasterizer |
| PS | Pixel Shader |
| OM | Output Merger |
Table: Device method prefixes
Because the three shader stages are nearly identical, some of the method names differ only in their prefix. Methods without one of these prefixes are not tied to any particular stage. They are mostly responsible for creating resources or invoking draw operations.
The first stage in the Direct3D 10 pipeline is the input assembler. It is responsible for transferring the raw data from memory to the following vertex shader. To do this it can access up to 16 vertex buffers and a single index buffer. The transfer rules are encoded in an input layout object that we will discuss later. Besides this format description, the input assembler needs to know how the vertices or indices in the buffers are organized into primitives. Direct3D 10 provides nine different primitive topologies for this purpose. This information is passed along with the fetched vertex data to the following pipeline stages.

The whole input assembler is controlled with four methods that are all part of the device interface. IASetPrimitiveTopology lets you select the primitive topology. To set the input layout, pass the already created object to the IASetInputLayout method.
If your geometry uses an index buffer, it needs to be set with IASetIndexBuffer. As we will discuss later, Direct3D 10 buffers are typeless, so the method requires additional format information. Because it takes a DirectX Graphics Infrastructure (DXGI) format, you could pass any format, but only DXGI_FORMAT_R16_UINT (16 bit) and DXGI_FORMAT_R32_UINT (32 bit) will be accepted. Finally, the method takes an offset from the beginning of the buffer to the element that should be used as the first index during draw operations. Direct3D 10 requires that you specify this offset in bytes rather than in elements, whose size depends on the format.
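The byte offset follows directly from the position of the first index and the size of one index element. The following is a minimal sketch in plain C++, not Direct3D code: the `IndexFormat` enum is a hypothetical stand-in for the two accepted DXGI formats, and the helper names are made up for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-in for the two index formats named in the text
// (DXGI_FORMAT_R16_UINT and DXGI_FORMAT_R32_UINT).
enum class IndexFormat { R16Uint, R32Uint };

// Size of one index element in bytes for the given format.
constexpr std::uint32_t indexSizeInBytes(IndexFormat format) {
    return format == IndexFormat::R16Uint ? 2u : 4u;
}

// Converts "start at element firstIndex" into the byte offset
// that the text says the method expects.
constexpr std::uint32_t indexByteOffset(std::uint32_t firstIndex, IndexFormat format) {
    return firstIndex * indexSizeInBytes(format);
}
```

Under these assumptions, starting at element 100 of a 16 bit index buffer gives an offset of 200 bytes, while the same element in a 32 bit buffer sits at 400 bytes.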
The last set method is IASetVertexBuffers. It allows you to set one or more buffers with one call. As you can use up to 16 buffers at the same time, you have to specify a start slot and the number of buffers you want to set. Then you have to provide a pointer to an array of buffer interface pointers with the right number of elements. Even if you want to set only one buffer, you have to pass a pointer to the pointer of this single buffer. Unlike the index buffer, you don't need to provide any format information here; it is already stored in the input layout. But you still need to provide the size of every vertex (the stride) and the offset to the first vertex. As vertex buffers are typeless, Direct3D 10 expects both values in bytes.
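Both values can be derived from the vertex structure the application uses. The sketch below is plain C++ and makes no Direct3D calls; the `Vertex` layout and the helper names are hypothetical and only serve to show how a vertex size and a byte offset could be computed.

```cpp
#include <cstdint>

// Hypothetical vertex layout, used only for illustration.
struct Vertex {
    float position[3];
    float normal[3];
    float texCoord[2];
};

// Size of one vertex in bytes.
constexpr std::uint32_t vertexStride() {
    return static_cast<std::uint32_t>(sizeof(Vertex));
}

// Byte offset of the vertex at which reading should start.
constexpr std::uint32_t vertexByteOffset(std::uint32_t firstVertex) {
    return firstVertex * vertexStride();
}
```

With eight 4-byte floats per vertex the stride is 32 bytes, so skipping the first three vertices means an offset of 96 bytes.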
Each of these four methods has a counterpart that lets you get the current configuration of the input assembler. Their names contain Get instead of Set.
The vertex shader is the first programmable stage in the pipeline. It is based on the same common shader core as the other shaders. It can take up to 16 input register values from the input assembler and produce up to 16 output register values for the next pipeline stage. As the common shader core defines two more data sources, VSSetShader, which sets the shader object, is not the only set method. VSSetConstantBuffers lets you set one or more buffers that contain the constant values for the shader execution. To provide other shader resources such as textures, VSSetShaderResources is used. Finally, VSSetSamplers lets you set the sampler state objects that define how read operations on the shader resources are performed. These three methods take a start slot and the number of elements you want to change, followed by a pointer to the first element of an array that holds that many elements of the required type. Again, there is a Get method for every Set method.
The geometry shader is the second shader unit and sits behind the vertex shader. Instead of a single vertex, it receives the vertex data for a whole primitive. Depending on the selected primitive type, this can be up to six full data sets. On the output side, every geometry shader invocation generates a variable number of new vertices that can form multiple primitive strips.
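The number of vertex data sets per invocation depends on the kind of input primitive. The following plain C++ sketch lists the counts; the enum and the function name are hypothetical and exist only for this illustration.

```cpp
// Hypothetical enum for the kinds of primitives a geometry shader can receive.
enum class GsInputPrimitive {
    Point,
    Line,
    Triangle,
    LineWithAdjacency,
    TriangleWithAdjacency
};

// Number of vertices handed to one geometry shader invocation.
inline unsigned verticesPerPrimitive(GsInputPrimitive primitive) {
    switch (primitive) {
        case GsInputPrimitive::Point:                 return 1;
        case GsInputPrimitive::Line:                  return 2;
        case GsInputPrimitive::Triangle:              return 3;
        case GsInputPrimitive::LineWithAdjacency:     return 4;
        case GsInputPrimitive::TriangleWithAdjacency: return 6;
    }
    return 0; // not reached for valid enum values
}
```

The maximum of six is reached for a triangle with adjacency information, which carries the three triangle vertices plus one neighbouring vertex per edge.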
As a shader, this stage provides the same methods as the vertex shader. The only difference you will see is that the prefix changes from VS to GS.
The stream out unit that is attached to the geometry shader can be used as a fast exit for all the previous work. Instead of being passed to the rasterizer, the primitives are written back to memory buffers. There are two different options:
The stream output stage provides only one set method, which lets you define the target buffers. Unlike the other units that provide multiple slots, SOSetTargets doesn't allow you to specify a start slot. Therefore you have to set all needed targets with one call that implicitly starts with the first slot. Besides the targets, you have to provide an offset for every buffer that defines the position of the first written element. The method interprets an offset of -1 as a request to append new elements after the last element that was written during a previous stream out operation. This can be useful when your geometry shader produces a dynamic number of elements. As always, this stage supports a get method, too.
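The append rule can be modelled in a few lines. This is a sketch under stated assumptions, not the runtime's implementation: it assumes offsets are held as unsigned 32 bit values, so that -1 becomes the largest representable value, and `resolveWriteOffset` is a hypothetical helper name.

```cpp
#include <cstdint>

// The append request from the text: -1 stored in an unsigned 32 bit offset.
constexpr std::uint32_t kAppendOffset = static_cast<std::uint32_t>(-1);

// Hypothetical model of how the write position for one target buffer is chosen.
// bytesAlreadyWritten is where the previous stream out operation stopped.
constexpr std::uint32_t resolveWriteOffset(std::uint32_t requestedOffset,
                                           std::uint32_t bytesAlreadyWritten) {
    return requestedOffset == kAppendOffset ? bytesAlreadyWritten : requestedOffset;
}
```

In this model an explicit offset always wins, while the append value continues at the position where the last operation ended.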
The rasterizer is responsible for generating the pixels for the different primitive types. The first step is a final transformation from homogeneous clip space to the viewport. Primitives can be removed based on a cull mode before they are converted into multiple pixels. Even if the geometry has survived so far, an optional scissor test can reduce the number of pixels for the next stage. The 10 different states that control the rasterizer's behavior are bundled together in the first state object type.
This object is attached to the stage with a call to RSSetState. As the state object doesn't contain the viewports and scissor rectangles, there are two more methods (RSSetViewports and RSSetScissorRects) to set these elements. In both cases you have to set all elements with a single call that always starts with the first slot. Most of the time Direct3D 10 will only use the elements in this first slot, but the geometry shader can select another one and pass this information forward.
You may not be surprised to hear that there are get methods, but this time their usage requires some additional knowledge. As the number of valid viewports and scissor rectangles can vary, you need a way to ask how many of them contain data. To avoid additional methods, the same method is used to query both the number and the elements. If you don't know how many elements are currently stored, you can pass a NULL pointer for the data store and the method will write the number to the first parameter.
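The two-step query can be illustrated without a real device. The sketch below uses a hypothetical `MockRasterizer` class and `Viewport` struct that only imitate the behavior described above; they are not the Direct3D 10 types, and the real get methods may differ in detail.

```cpp
#include <vector>

// Hypothetical viewport record; not the Direct3D 10 structure.
struct Viewport { int x, y, width, height; };

// Hypothetical stand-in for the rasterizer stage of a device.
class MockRasterizer {
public:
    void setViewports(const std::vector<Viewport>& viewports) { viewports_ = viewports; }

    // Imitates the query pattern from the text:
    // data == nullptr: *count receives the number of stored viewports.
    // otherwise:       at most *count viewports are copied into data.
    void getViewports(unsigned* count, Viewport* data) const {
        const unsigned stored = static_cast<unsigned>(viewports_.size());
        if (data == nullptr) {
            *count = stored;
            return;
        }
        const unsigned n = *count < stored ? *count : stored;
        for (unsigned i = 0; i < n; ++i) {
            data[i] = viewports_[i];
        }
    }

private:
    std::vector<Viewport> viewports_;
};

// First ask for the count, then fetch exactly that many elements.
inline std::vector<Viewport> queryAllViewports(const MockRasterizer& rasterizer) {
    unsigned count = 0;
    rasterizer.getViewports(&count, nullptr);
    std::vector<Viewport> result(count);
    if (count > 0) {
        rasterizer.getViewports(&count, result.data());
    }
    return result;
}
```

The benefit of this design is that a single method covers both questions, at the price of calling it twice when the count is not known in advance.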
Every pixel that the rasterizer outputs is passed on to the pixel shader. This last shader program is executed once per pixel. To calculate up to 8 output colors, it can access up to 32 input registers. These are formed from the outputs of the vertex and geometry shaders and interpolated across the primitive that was responsible for this pixel.
The last shader in the pipeline uses the same methods as the other two. This time the API uses the PS prefix.
Finally the pixels reach the output merger. This unit is responsible for the render targets and the depth stencil buffer. Each of the two buffer types is controlled with a separate state object. While the render target blending operation contains 9 different states, the depth stencil handling is configured with 14 states.
As there are two different state objects in use, the output merger provides two methods to set them: OMSetDepthStencilState and OMSetBlendState. The last method, OMSetRenderTargets, sets the targets for the output merger. As with the stream out unit, you have to set all outputs, including the depth stencil view, with one single call. The next call will completely override the current settings.
As Direct3D 10 supports an early exit through the stream out unit and an optional geometry shader, there are four different ways through the pipeline. In one case the data flow is split apart.

As the only way to the stream out unit goes through the geometry shader, you will always need one. To solve this in a situation where only a vertex shader is used, you can create the necessary geometry shader based on the vertex shader.