shaders take way too long to compile dx11


5 replies to this topic

#1 rouncer

    Senior Member

  • Members
  • 2718 posts

Posted 30 October 2012 - 02:42 PM

I'm writing a distance field raytracer, and it needs a for loop in the pixel shader to march the rays out.

So far I've got one piece of code inside the loop, a point-to-line-segment distance test (because I'm trying to get roads happening). This one test has already pushed the compile time to about a minute! It's horrible waiting for it every time.
(If I added path tracing, it just wouldn't even compile.)

Is there anything I can do about this?
you used to be able to fit a game on a disk, then you used to be able to fit a game on a cd, then you used to be able to fit a game on a dvd, now you can barely fit one on your harddrive.

#2 Stainless

    Member

  • Members
  • 575 posts
  • LocationSouthampton

Posted 30 October 2012 - 02:55 PM

Early shader models don't have any loop logic, so the compile stage unrolls the loop. This might be your problem.

What shader model are you compiling against?

#3 rouncer

    Senior Member

  • Members
  • 2718 posts

Posted 30 October 2012 - 03:34 PM

Shader model 5, isn't that the latest? Is there a way around this unrolling?

#4 Reedbeta

    DevMaster Staff

  • Administrators
  • 5305 posts
  • LocationBellevue, WA

Posted 30 October 2012 - 05:11 PM

You can hint to the compiler not to unroll it by writing [loop] before the loop, like:

[loop] for (int i = 0; i < 256; ++i)  ...

Even in Shader Model 5, if the iteration count is constant then the compiler may choose to unroll it. However, if the iteration count is not available at compile time (e.g. it's passed in through a constant buffer) then the compiler can't unroll the loop.

Some other hints you may be interested in: [unroll] hints to unroll a loop; [branch] or [flatten] before an if-statement hint to either implement it as a hardware branch or to evaluate both sides of the if and use a selector instruction to choose the result, respectively.
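
As a rough illustration of where those four attributes go, here is a minimal sketch. The texture, sampler, and function names are made up for the example and are not from the shader in this thread, and the compiler is still free to treat these as hints in some situations.

[source]
// Hypothetical resources for illustration only.
Texture2D    tex       : register(t0);
SamplerState samLinear : register(s0);

float4 AttributeDemo(float2 uv : TEXCOORD0) : SV_Target
{
    float4 sum = float4(0, 0, 0, 0);

    // [unroll]: ask the compiler to expand the loop body 4 times.
    // The iteration count has to be known at compile time for this.
    // SampleLevel with an explicit mip level is used (an assumption of
    // this sketch) because it needs no screen-space derivatives.
    [unroll] for (int i = 0; i < 4; ++i)
    {
        sum += tex.SampleLevel(samLinear, uv + i * 0.01f, 0);
    }

    // [loop]: ask the compiler to keep this as a real loop
    // instead of expanding it 256 times.
    [loop] for (int j = 0; j < 256; ++j)
    {
        sum += tex.SampleLevel(samLinear, uv + j * 0.001f, 0);
    }

    // [branch]: ask for a real hardware branch, so only the taken side runs.
    [branch] if (sum.a > 0.5f)
    {
        sum.rgb *= 2.0f;
    }

    // [flatten]: ask for both sides to be evaluated and the result selected.
    [flatten] if (sum.r > 1.0f)
    {
        sum.r = 1.0f;
    }

    return sum;
}
[/source]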
reedbeta.com - developer blog, OpenGL demos, and other projects

#5 rouncer

    Senior Member

  • Members
  • 2718 posts

Posted 30 October 2012 - 05:18 PM

Hmm, thanks a lot for the help. I tried writing [loop] there, but it didn't make a difference... Here's a look at the shader.
From what you've told me, it seems like it can't be unrolling it... but the strange thing is that if I make the loop smaller, it compiles quicker.

[source]
struct PSTO
{
    float4 col0 : sv_target0;
    float4 col1 : sv_target1;
};
PSTO PST(PS2_INPUT Input ) : SV_TARGET
{
    // Compute the vector of the pick ray in screen space
    float3 v;
    v.x = ( Input.uv.x*2-1 ) / view._11;
    v.y = ( -(Input.uv.y*2-1) ) / view._22;
    v.z = 1.0f; //this stops fishbowl, keeping the z always 1, i think.
    float3 dir, orig;
    // Transform the screen space pick ray into 3D space
    dir.x = v.x*proj._11 + v.y*proj._21 + v.z*proj._31;
    dir.y = v.x*proj._12 + v.y*proj._22 + v.z*proj._32;
    dir.z = v.x*proj._13 + v.y*proj._23 + v.z*proj._33;
    orig.x = proj._41;
    orig.y = proj._42;
    orig.z = proj._43;
    float3 rp=orig;
    bool hit=false;

    float4 outcol=float4(0,0,0,0);
    float3 outrp;
    int raysteps=256;
    int i;
    [loop]for(i=0;i<raysteps;i++)
    {
        if(hit==false)
        {
            float4 line1=chunk0.Sample(samLinear,float2(rp.x/1000/1024.0f,rp.z/1000/1024.0f)).rgba;
            float4 line2=chunk1.Sample(samLinear,float2(rp.x/1000/1024.0f,rp.z/1000/1024.0f)).rgba;
            float d1=get_distance_to_line(float2(rp.x,rp.z),line1);
            float d2=get_distance_to_line(float2(rp.x,rp.z),line2);
            if(d2<d1) d1=d2;
            if(d1>100 && rp.y<100)
            {
                hit=true;
                outcol=float4(1,0,1,1);
                outrp=rp;
            }

            if(rp.y<=0)
            {
                hit=true;
                outcol=float4(1,1,1,1);
                outrp=rp;
            }

            float rayspeed;
            if(d1>100) rayspeed=rp.y-100;
            else rayspeed=rp.y;
            if(d1>100)
            {
                if(d1-100<rayspeed) rayspeed=d1-100;
            }
            else
            {
                if(100-d1<rayspeed) rayspeed=100-d1;
            }
            if(rayspeed<1) rayspeed=1;
            rp+=dir*rayspeed;
        }
    }
    int bss=20;
    float3 bs_start=outrp-dir*10;
    float3 bs_stop=outrp+dir*10;
    [loop]for(i=0;i<bss;i++)
    {
        bool hit=false;
        rp=(bs_start+bs_stop)/2;
        float4 line1=chunk0.Sample(samLinear,float2(rp.x/1000/1024.0f,rp.z/1000/1024.0f)).rgba;
        float4 line2=chunk1.Sample(samLinear,float2(rp.x/1000/1024.0f,rp.z/1000/1024.0f)).rgba;
        float d1=get_distance_to_line(float2(rp.x,rp.z),line1);
        float d2=get_distance_to_line(float2(rp.x,rp.z),line2);
        if(d2<d1) d1=d2;
        if(d1>100 && rp.y<100)
        {
            hit=true;
            outcol=float4(1,0,1,1);
        }

        if(rp.y<=0)
        {
            hit=true;
            outcol=float4(1,1,1,1);
        }
        if(hit)
        {
            bs_stop=rp;
        }
        else
        {
            bs_start=rp;
        }
    }

    PSTO outpt;
    if(hit) outpt.col0=float4(rp.x,rp.y,rp.z,1);
    else outpt.col0=float4(0,0,0,1);

    outpt.col1=outcol;

    return outpt;
}
[/source]

#6 Reedbeta

    DevMaster Staff

  • Administrators
  • 5305 posts
  • LocationBellevue, WA

Posted 30 October 2012 - 06:17 PM

Hmm, so I guess the compiler is ignoring the loop hint in this case. That makes me sad. :( But raysteps is still a constant 256 in your code. If you make raysteps a parameter instead and pass in the 256 from the main app, then the compiler will definitely make it a loop. It'll have to, because it won't know how many times to unroll it.
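
A minimal sketch of that change on the shader side might look like the following. The constant buffer name, its register slot, and the cut-down loop body are assumptions made for illustration; only the raysteps name comes from the shader above.

[source]
// Hypothetical constant buffer; the name and slot b0 are assumptions.
// The application writes 256 into raysteps, so the value is unknown
// to the compiler when the shader is built.
cbuffer RaymarchParams : register(b0)
{
    int raysteps;
};

float4 MarchDemo(float3 orig : TEXCOORD0, float3 dir : TEXCOORD1) : SV_Target
{
    float3 rp = orig;

    // The bound is read from the constant buffer, so there is no
    // compile-time trip count available to unroll against.
    [loop] for (int i = 0; i < raysteps; ++i)
    {
        // Stop once the ray reaches the ground plane at y = 0.
        if (rp.y <= 0)
            break;

        // Step by the height above the ground, with a minimum step of 1.
        rp += dir * max(rp.y, 1.0f);
    }

    return float4(rp, 1);
}
[/source]

On the application side the buffer would be created with a size that is a multiple of 16 bytes, filled with 256 (for example with ID3D11DeviceContext::UpdateSubresource), and bound with PSSetConstantBuffers before drawing.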




