0
101 Mar 02, 2012 at 14:06

Hi all,

Is it possible to create just one vertex shader with a big input vertex struct and selectively activate a subset of its semantics by varying only the input layout? And does that make sense, and is it convenient in general?

The D3D11 CreateInputLayout function doesn’t prevent you from doing this; it emits a warning, but that warning is non-blocking.

From the remarks:

If a data type in the input-layout declaration does not match the data type in a shader-input signature, CreateInputLayout will generate a warning during compilation. The warning is simply to call attention to the fact that the data may be reinterpreted when read from a register. You may either disregard this warning (if reinterpretation is intentional) or make the data types match in both declarations to eliminate the warning.

Now I have two questions:

1) If that is theoretically possible, will my vertex input data be interpreted correctly by the Input Assembler?
2) If yes, will this approach be efficient? I have always heard that people keep the vertex shader’s input struct as small as possible to limit the vertex shader input size.

If the answer to both questions is yes, I could keep one shader and several input layouts that selectively activate the semantics in the vertex shader’s input struct!

#### 18 Replies

0
139 Mar 02, 2012 at 20:51

If I understand correctly, you want to have a vertex shader with more input parameters than the vertex buffer has? E.g. the shader might read TEXCOORD0 and TEXCOORD1, but the vertex buffer only has TEXCOORD0?

I don’t think this makes sense. The shader still uses TEXCOORD1, so where should that data come from?

0
101 Mar 03, 2012 at 08:17

@Reedbeta

So, the basic idea is to minimize the number of shaders/permutations. What about the other way around: more input parameters in the vertex buffer and fewer in the vertex shader’s input struct? Even then I can’t see how this solves the problem, since I’m still constrained to declaring a vertex shader for each permutation:

One with just position
One with position and texcoord0
One with position,texcoord,normal

Maybe that is useful only if I want to avoid checking that a mesh’s input parameters match the vertex shader’s input signature (e.g. if I have a vertex shader that takes just the position and I bind a vertex buffer with 3 input parameters, only the position will be used in the shader and the other two coming from the vertex buffer will be ignored). If this is possible, I won’t need to check that the input signature of the vertex shader referenced by a mesh’s material exactly matches the input layout (the vertex buffer’s input parameters) …

0
139 Mar 03, 2012 at 18:43

Yes, the other way is fine; you can have extra components in the vertex buffer that are not used by the vertex shader. There may under some circumstances be a performance penalty involved in this (relative to having a vertex buffer with only the components you actually need), but the rendering will work correctly.
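
As a minimal sketch of the rule described here (every input the shader reads must be supplied, while extra data in the vertex buffer is simply ignored), the check can be modeled in plain C++. The `Attribute` struct and `layoutCoversShader` function are hypothetical names made up for illustration; this is not how the D3D runtime itself validates layouts.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one vertex attribute: a semantic name plus an
// index, e.g. {"TEXCOORD", 1}. This is not a D3D11 type.
struct Attribute {
    std::string name;
    unsigned index;
    bool operator==(const Attribute& other) const {
        return name == other.name && index == other.index;
    }
};

// Returns true when every attribute the shader reads is supplied by the
// layout. Attributes that only the layout provides are ignored.
bool layoutCoversShader(const std::vector<Attribute>& layout,
                        const std::vector<Attribute>& shaderInputs) {
    return std::all_of(shaderInputs.begin(), shaderInputs.end(),
        [&layout](const Attribute& wanted) {
            return std::find(layout.begin(), layout.end(), wanted) != layout.end();
        });
}

// Usage sketch covering the two cases discussed in this thread.
void usageExample() {
    const std::vector<Attribute> layout = {
        {"POSITION", 0}, {"TEXCOORD", 0}, {"NORMAL", 0}};
    // Shader reads less than the buffer holds: fine.
    assert(layoutCoversShader(layout, {{"POSITION", 0}, {"TEXCOORD", 0}}));
    // Shader reads TEXCOORD1, which the buffer does not have: not fine.
    assert(!layoutCoversShader(layout, {{"POSITION", 0}, {"TEXCOORD", 1}}));
}
```

An engine could run a check along these lines when a material is bound to a mesh, instead of requiring an exact match between the two.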

0
101 Mar 03, 2012 at 18:59

@Reedbeta

How big/important is this performance penalty? Should I be careful, or is this an approach that is widely used?
Or should I check every time with the profiler?

0
139 Mar 03, 2012 at 20:19

It varies depending on hardware, shaders, the scene, etc. Profiling is your best bet, as with any GPU performance question. It might not be an issue at all, as it’s only relevant if reading the vertices from memory is the bottleneck, i.e. you have to be both vertex-bound and memory-bandwidth-bound.
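
For a rough sense of scale before profiling, the amount of vertex data that sits in the buffer without being read by the shader can be estimated with simple arithmetic. The sketch below is a back-of-envelope calculation only; `FetchEstimate` and `estimateFetch` are hypothetical names, and how much of the unused data is actually fetched depends on the hardware.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: a hardware-agnostic size estimate, in bytes.
struct FetchEstimate {
    std::size_t totalBytes;   // bytes spanned by the vertex data
    std::size_t usedBytes;    // bytes the shader actually reads
    std::size_t unusedBytes;  // bytes present in the buffer but ignored
};

// Estimates how much of a vertex buffer a shader uses, given the per-vertex
// stride and the number of bytes per vertex the shader reads.
FetchEstimate estimateFetch(std::size_t vertexCount,
                            std::size_t strideBytes,
                            std::size_t usedBytesPerVertex) {
    assert(usedBytesPerVertex <= strideBytes);
    FetchEstimate e;
    e.totalBytes = vertexCount * strideBytes;
    e.usedBytes = vertexCount * usedBytesPerVertex;
    e.unusedBytes = e.totalBytes - e.usedBytes;
    return e;
}
```

For example, `estimateFetch(100000, 48, 16)` describes 100000 vertices with a 48-byte stride where only a 16-byte position is read. Whether the unused part matters in practice is exactly what the profiler has to answer.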

0
122 Mar 04, 2012 at 12:26

I think on some platforms the struct passed to the shader is expected to be packed.

float4 position;
float2 uv;
float4 colour;


If you have a structure in memory like this …

float4 position;
float  user_parameter;
float2 uv;
float  another_user_parameter;
float4 colour;


Passing that to the shader would require you to shuffle stuff around in memory.

0
139 Mar 04, 2012 at 17:08

At least on PC/console GPUs you can specify the stride and offset for each component when you set up the vertex buffer - that information is what the “input layout” object in D3D represents. I don’t have any experience with mobile GPUs (for instance); maybe some of them aren’t able to do that.
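
To make the stride/offset idea concrete, here is a minimal sketch in plain C++ that takes the interleaved structure from the previous post and describes only the three attributes the shader wants, each with its byte offset. The `Element` struct and `byteOffsetFor` helper are hypothetical simplifications. In D3D11 the per-element offset corresponds to `AlignedByteOffset` in `D3D11_INPUT_ELEMENT_DESC`, while the stride is supplied when the buffer is bound with `IASetVertexBuffers`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes 4-byte floats");

// The interleaved in-memory vertex, including two per-vertex floats that
// the shader does not need.
struct Vertex {
    float position[4];
    float user_parameter;
    float uv[2];
    float another_user_parameter;
    float colour[4];
};

// Hypothetical, simplified stand-in for one input layout element.
struct Element {
    const char* semantic;
    std::size_t byteOffset;
};

// Only the attributes the shader reads are described. The user parameters
// are skipped via the offsets, so no data has to be shuffled in memory.
constexpr Element kElements[] = {
    {"POSITION", offsetof(Vertex, position)},
    {"TEXCOORD", offsetof(Vertex, uv)},
    {"COLOR",    offsetof(Vertex, colour)},
};

// Distance in bytes from one vertex to the next.
constexpr std::size_t kStride = sizeof(Vertex);

// Looks up an element's byte offset by semantic name; asserts if missing.
std::size_t byteOffsetFor(const char* semantic) {
    for (const Element& e : kElements) {
        if (std::strcmp(e.semantic, semantic) == 0) return e.byteOffset;
    }
    assert(false && "semantic not present in layout");
    return 0;
}
```

With this layout the position is read from offset 0, the uv from offset 20 and the colour from offset 32, with a stride of 48 bytes per vertex.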

0
101 Mar 09, 2012 at 09:33

@Reedbeta

But if the other way is fine, the fact that I have to pass the bytecode of a compiled shader to the CreateInputLayout function makes me wonder … ;)

HRESULT CreateInputLayout(
[in]   const D3D11_INPUT_ELEMENT_DESC *pInputElementDescs,
[in]   UINT NumElements,
[in]   const void *pShaderBytecodeWithInputSignature,
[in]   SIZE_T BytecodeLength,
[out]  ID3D11InputLayout **ppInputLayout
);


And the next logical conclusion is that, because I’m treating input layout creation as decoupled from the vertex shader input signature, I might need to keep a dummy vertex shader around just to have bytecode to pass when I create the input layout.

A vertex shader like this could be used to provide valid bytecode:

float4 VS(in float4 pos : POSITION) : SV_Position
{
return float4(0.f,0.f,0.f,0.f);
}


CreateInputLayout will generate a warning if the vertex input parameters don’t match that signature (just the POSITION); I’ll ignore the warning and I’ll have my input layout created.

Still, even though this approach could work, it looks so unnatural to me! :(

I feel that might not be the right way to go …

0
139 Mar 10, 2012 at 08:08

Does it really give a warning if the vertex format you pass in contains more than the shader uses? That would be annoying, but you can ignore it if it is what you expect.

Anyway, I can’t tell you if this approach is “the right way to go” for you. You have to decide that after considering the possibilities.

0
101 Mar 10, 2012 at 13:14

@Reedbeta

Cool,
I ran a test where the input layout declares more vertex attributes than the shader has input semantics, and it works! If the shader has just 2 input semantics and the input layout has 3, it simply picks the matching 2 from the input layout. Someone was saying this might cause problems with some drivers, but it works on my NVIDIA 560!

And by the way, where did you find all this detailed information?

Thanks for the info.

0
139 Mar 10, 2012 at 16:48

Cool! Glad it’s working. FYI, I doubt very much this will have problems on any drivers. It’s not really that difficult to ignore some data. :)

I got the terminology a bit wrong, it turns out; the preshader I was talking about is called a “fetch shader” (because it fetches data from the vertex buffers); the word “preshader” means something else (a kind of hoisting optimization performed by the D3D frontend). Anyway, there’s no specific place I learned about it. There are various whitepapers about GPU internals floating around the Web. Here’s one that mentions fetch shaders, for example - it’s about a bit older generation of AMD GPUs, but it’s reasonable to assume that their newer GPUs and probably NVIDIA GPUs also do similar things.

Also, if you haven’t read it, the A Trip Through the Graphics Pipeline articles are extremely informative and well worth reading.

0
101 Mar 10, 2012 at 17:27

@Reedbeta

Thanks very very much, I think those articles are gold :D !

Now I can use one shader for several meshes, regardless of their vertex formats ;)

0
101 Mar 13, 2012 at 09:44

One last question:

When I create my input layout, how many TEXCOORD[n] elements can I declare? I bet the maximum is hardware-specific, but isn’t it also true that no realistic vertex input description would be big enough to justify, say, more than 4 TEXCOORDs as input?

I ask because I’m trying to work out how many elements of each type to declare in my enum to get a realistic number of them. The most complex input layout I’ve seen so far was one used for instancing …

For now I’m considering a maximum of 6 for each type (i.e. TEXCOORD0 - TEXCOORD5, TANGENT0 - TANGENT5, etc.)

0
139 Mar 13, 2012 at 17:28

TEXCOORD, TANGENT, etc. are just user-supplied labels at this point, so they can be arbitrary. It used to be that they mapped to specific hardware registers but nowadays all attributes are treated the same. You can actually make up your own names, like

void vs_main(
in float4 foo : FOO,  // FOO is an arbitrary, made-up semantic name
out float4 pos : SV_POSITION)
{
pos = foo;
}


This compiles in vs_4_0 or vs_5_0, not in lower profiles though (which have specific predefined semantic labels).
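
Since the labels are arbitrary on these profiles, an engine does not strictly need a fixed enum of semantics. `D3D11_INPUT_ELEMENT_DESC` takes the name and the numeric index as separate fields (`SemanticName` and `SemanticIndex`), so a label such as TEXCOORD5 can be split into a name and an index. The following is a minimal sketch of that split; the `Semantic` struct and `splitSemantic` function are hypothetical names, not part of any D3D API.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical engine-side representation of a semantic label.
struct Semantic {
    std::string name;  // e.g. "TEXCOORD" or any user-chosen label
    unsigned index;    // trailing number, 0 when the label has none
};

// Splits "TEXCOORD5" into {"TEXCOORD", 5}. A label without trailing digits
// gets index 0, so "POSITION" becomes {"POSITION", 0}.
Semantic splitSemantic(const std::string& label) {
    std::size_t end = label.size();
    while (end > 0 && std::isdigit(static_cast<unsigned char>(label[end - 1]))) {
        --end;
    }
    assert(end > 0 && "label must contain a name, not only digits");
    Semantic result;
    result.name = label.substr(0, end);
    result.index = (end == label.size())
        ? 0u
        : static_cast<unsigned>(std::stoul(label.substr(end)));
    return result;
}
```

With this representation, a cap such as "TEXCOORD0 to TEXCOORD5" becomes a per-platform validation rule rather than something baked into the type.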

0
101 Mar 13, 2012 at 17:37

@Reedbeta

But it’s still better to predefine them for backward compatibility, and to let the user define custom semantics beyond the ones I’ve already “predefined”. I say this because I don’t know how it’s going to work on PS3 or iOS …

Side question: in lower profiles (below vs_4_0) what were the specific predefined semantic labels? If I can pin that down precisely, I could predefine them in my engine too and allow custom semantics only on profiles 4_0 and above, while still maintaining backward compatibility.

0
139 Mar 13, 2012 at 17:43

Ah, if you’re targeting non-D3D10-11 devices then yes, you’ll have to stay within the predefined set of semantics. :) In that case, TEXCOORD likely goes up to 7 or so (varies per device of course, no idea how many you’d get on iOS). COLOR, NORMAL, and TANGENT probably only have 0 and 1. But you can also use generic names like ATTR0, ATTR1, etc, which go up to however many attributes the hardware supports. Note that those are internally aliased to the position/color/texcoord/etc. attributes, e.g. ATTR0 is the same as POSITION, so best not to mix and match the ATTR ones with TEXCOORD and friends.

0
101 Mar 13, 2012 at 18:01

@Reedbeta

Where can I find those specs? I was looking for shader model 3_0 documentation to find out how many semantics were predefined, but I couldn’t find any information. I’m also worried about other platforms like PS3, although I guess PS3 should be similar since it should support shader model 3_0.
Also, the Microsoft documentation at http://msdn.microsoft.com/en-us/library/bb509647%28v=vs.85%29.aspx says that DX9 and DX10 support all of those vertex shader input semantics, which are more than the ones you mentioned (for DX9 and therefore shader model 3_0).
And it doesn’t say what the maximum allowed value of [n] is after a given semantic (e.g. TEXCOORD[n]).

0
139 Mar 13, 2012 at 21:38

I found some information about it in the Cg docs. Here is the vs_3_0 list of semantics, but it doesn’t have the generic attribute mappings. I did find those on the vp40 page; vp40 is an OpenGL profile that I think is (roughly) equivalent with vs_3_0. I might have been mistaken: the generic “ATTR0” etc. might only be applicable to Cg/GLSL; I’m not sure whether they work in D3D9 HLSL.