

HDR Texture Compression. How to.


22 replies to this topic

#1 AGPX

    Member

  • Members
  • PipPip
  • 44 posts

Posted 08 September 2005 - 10:23 PM

Hi to all,

I need to introduce HDR textures in my DX9 engine. IMHO the best texture format for HDR images is A16R16G16B16. The problem is that every texel occupies 8 bytes! That is a huge amount of memory.
Compression techniques are necessary. I looked at the DX9 demos and saw a nice compression method called RGBE8. Basically, it rewrites every channel (r, g, b) in the form m * 2^e, where m is the mantissa of the channel and e is the exponent (stored in the alpha channel). The exponent is shared by all three channels.
This method requires 4 bytes per texel, which is an improvement over the naive one.
I can push this compression down to 1.5 bytes per texel in the following way: I compress the RGB channels with DXT1 (0.5 bytes per texel), then use another texture to store the exponent uncompressed (this improves quality, because exponent precision is crucial). Total: 1.5 bytes per texel.
Quite a good compression ratio.
However, there is a problem: this method works only if you use point filtering. Other filtering methods, like bilinear filtering, make things go wrong. :sad:
I have tried applying point filtering to the exponent texture (to fetch it correctly) and bilinear filtering to the RGB channels... but this still exposes too many artifacts.
I have another idea: split the 16-bit-per-channel texture into two 8-bit-per-channel textures (a low-byte and a high-byte texture) and apply DXT1 compression to both. Total: 1 byte per texel! Problem: an error in the "low byte" texture is not very noticeable, but an error in the "high byte" texture (because DXT1 is a lossy compression, as you know) is quite devastating... HELL! :angry:
A solution to the problem is to store the low-byte texture compressed with DXT1 and the high-byte texture uncompressed. Total: 3.5 bytes per texel. :huh:
Here bilinear filtering doesn't cause problems.
Good quality, but it's not great compression... I would be happy if I could compress it a bit more (down to 2 bytes per texel or less :).
I'd be very happy if you cool guys could suggest other methods.
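To make the byte counts above easier to compare, here is a minimal C++ sketch of the arithmetic. The constant and function names are hypothetical, and the 3 bytes assumed for the uncompressed high-byte texture (three 8-bit channels) is only inferred from the 3.5 total quoted above.

```cpp
#include <cassert>

// Bytes per texel for each scheme discussed in the post.
constexpr double kA16R16G16B16     = 8.0;        // 4 channels x 16 bits
constexpr double kRGBE8            = 4.0;        // 3 mantissas + shared exponent
constexpr double kDXT1PlusExponent = 0.5 + 1.0;  // DXT1 RGB + 8-bit exponent
constexpr double kTwoDXT1          = 0.5 + 0.5;  // low and high bytes, both DXT1
constexpr double kDXT1LowRawHigh   = 0.5 + 3.0;  // assumption: high bytes as 3 x 8 bits

// Size in MiB of a width x height texture (top mip level only).
inline double textureMiB(int width, int height, double bytesPerTexel)
{
    assert(width > 0 && height > 0 && bytesPerTexel > 0.0);
    return width * static_cast<double>(height) * bytesPerTexel / (1024.0 * 1024.0);
}
```

For a 1024x1024 lightmap this gives 8 MiB for A16R16G16B16 against 1.5 MiB for DXT1 plus an 8-bit exponent.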

Thanks in advance,

- AGPX

#2 geon

    Senior Member

  • Members
  • PipPipPipPip
  • 939 posts

Posted 09 September 2005 - 07:40 AM

It should work to use a lower resolution for the exponent. I am not sure about the artifacts, but they should only be noticeable at high-contrast edges.

#3 AGPX

    Member

  • Members
  • PipPip
  • 44 posts

Posted 09 September 2005 - 08:56 PM

geon said:

It should work to use a lower resolution for the exponent. I am not sure about the artifacts, but they should only be noticeable at high-contrast edges.



Actually I do the RGBE encoding with the following code:

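#include <algorithm>  // std::max
#include <cmath>      // std::ceil, std::log2, std::ldexp

// Splits an RGB color into three mantissas in [0, 1] and a shared
// power-of-two exponent, which is returned.
int encode(float r, float g, float b,
         float &encodedR,
         float &encodedG,
         float &encodedB)
{
  const float maxComponent = std::max(std::max(r, g), b);
  if (maxComponent <= 0.0f)  // log2 is undefined here: store black
  {
    encodedR = encodedG = encodedB = 0.0f;
    return 0;
  }
  const int exp = (int)std::ceil(std::log2(maxComponent));
  const float divisor = std::ldexp(1.0f, exp);  // 2^exp
  encodedR = r / divisor;
  encodedG = g / divisor;
  encodedB = b / divisor;
  return exp;
}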

So: r = encodedR * 2^exp, g = encodedG * 2^exp and b = encodedB * 2^exp;
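For reference, the decoding side could look roughly like this in C++. It is only a sketch: decodeChannel and quantize8 are hypothetical helper names, and the 8-bit rounding is an assumption about how the mantissa ends up in an 8-bit texture channel.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical counterpart to encode(): rebuilds one channel from its
// mantissa and the shared exponent, i.e. value = mantissa * 2^exp.
inline float decodeChannel(float mantissa, int exp)
{
    assert(mantissa >= 0.0f && mantissa <= 1.0f);
    return std::ldexp(mantissa, exp);
}

// Rounds a mantissa in [0, 1] to 8 bits and back, as an 8-bit texture
// channel would store it (assumed rounding mode: nearest).
inline float quantize8(float mantissa)
{
    return std::round(mantissa * 255.0f) / 255.0f;
}
```

With exp = 3, one 8-bit mantissa step is worth 8/255 in the decoded value, so the round-trip error grows with the exponent.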

I found that it works better if I return exp * 16 (instead of exp), but at high-contrast edges some halo effects appear.

Anyway, I don't know whether it is practical to do bilinear filtering manually in the pixel shader...
This sounds too nasty and slow... :dry:

The following image is a shot from my editor. Here I use plain RGBE (without multiplying by 16):

Posted Image

The following is with the exponent multiplication:

Posted Image

The last one shows the halo and artifacts (again with multiplication by 16):

Posted Image

The exponent is clamped to the range [-8, 7] before being multiplied by 16, so it lies in the range [-128, 112]. I also add 128 to remove the sign ([0, 240]), so it fits in a single byte. Yes, the dynamic range is smaller than before, but the images look much better.
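The byte packing just described can be sketched in C++ like this. The function names are hypothetical; the clamp range, the factor 16 and the bias of 128 are the ones given above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Clamp to [-8, 7], scale by 16, add 128: the result lies in [0, 240].
inline std::uint8_t packExponent(int exp)
{
    const int clamped = std::min(std::max(exp, -8), 7);
    return static_cast<std::uint8_t>(clamped * 16 + 128);
}

// Inverse mapping. It is exact only for bytes produced by packExponent();
// in-between values (e.g. after filtering) are truncated toward zero.
inline int unpackExponent(std::uint8_t stored)
{
    assert(stored <= 240);
    return (static_cast<int>(stored) - 128) / 16;
}
```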

I don't understand the nature of the artifacts at all. The halo should be due to bilinear filtering. The problem is that filtering should happen AFTER decoding, not before.

Is this what you mean by "use lower resolution for the exponent"?

Thanks,

- AGPX

#4 AGPX

    Member

  • Members
  • PipPip
  • 44 posts

Posted 09 September 2005 - 09:52 PM

Hi again,

I have fixed the artifact problem. It's due to DXT1 compression of the RGB channels (not the exponent). However, halo still occurs... (thanks to mikez for his hints).

- AGPX

#5 geon

    Senior Member

  • Members
  • PipPipPipPip
  • 939 posts

Posted 09 September 2005 - 09:53 PM

AGPX said:

Is this what you mean by "use lower resolution for the exponent"?


Not at all...


As you showed in your code, you can encode several values with the same exponent (hence RGBE). There is nothing stopping you from encoding even more values with a single exponent.

A neat usage of this would be to store the color / exponent in a separate texture, as mentioned, BUT! why store so many of those exponents? If you take a block of say 4x4 pixels, the exponents should generally be virtually the same. (Except in very high contrast.) So; just store the exponent once and reuse it.

If you use this solution, you must of course use no interpolation for the E channel, since the RGB values "expect" exactly the original exponent. The RGB channels should be fine to interpolate, though, except for seams between 4x4 blocks. (This could be really ugly.)

One could, however, take this into account. Let's say you use linear interpolation. (Can we assume the interpolation will return exactly the same values for every card/driver? I think this is very well defined by D3D... ?) Then, downsample the original exponent texture first, and encode the RGB values based on the interpolated exponents. Theoretically, this should work perfectly in most cases. (You might need to run an extra pass on the downsampled exponent texture to give each pixel the highest value of its neighbours, so that no interpolation gives less than the needed max-RGB value. Otherwise the final color would come out too dark.)

Also, if you match the block-size to the DXT compression size (4x4), the RGB values should still compress nicely.
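A rough C++ sketch of the one-exponent-per-4x4-block idea, under a few assumptions: the input is a row-major array holding max(r, g, b) per texel, the image size is a multiple of 4, and blockExponents is a hypothetical name. The extra neighbour-maximum pass mentioned above is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns one shared exponent per 4x4 block, chosen so that the largest
// component in the block still maps to a mantissa <= 1.
std::vector<int> blockExponents(const std::vector<float>& maxRGB, int width, int height)
{
    const int block = 4;
    assert(width % block == 0 && height % block == 0);
    assert(maxRGB.size() == static_cast<std::size_t>(width) * height);
    const int bw = width / block, bh = height / block;
    std::vector<int> exps(static_cast<std::size_t>(bw) * bh);
    for (int by = 0; by < bh; ++by)
        for (int bx = 0; bx < bw; ++bx) {
            float m = 0.0f;  // largest component found in this block
            for (int y = 0; y < block; ++y)
                for (int x = 0; x < block; ++x)
                    m = std::max(m, maxRGB[(by * block + y) * width + bx * block + x]);
            // An all-black block gets exponent 0 (log2 is undefined there).
            exps[by * bw + bx] = (m > 0.0f) ? static_cast<int>(std::ceil(std::log2(m))) : 0;
        }
    return exps;
}
```

The exponent texture is then 16 times smaller than the RGB texture, and every texel of a block is divided by the same power of two before the RGB data goes to the DXT compressor.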

#6 AGPX

    Member

  • Members
  • PipPip
  • 44 posts

Posted 09 September 2005 - 10:03 PM

thanks to all repliers.

#7 AGPX

    Member

  • Members
  • PipPip
  • 44 posts

Posted 09 September 2005 - 10:32 PM

Actually, I have two textures: one for RGB and the other for the exponent.
I have tried applying bilinear filtering only to the RGB texture, disabling it for the exponent. Here is the result:

Posted Image

The halo is more evident.

The problem is that mantissas belonging to two quite different exponents cannot be interpolated linearly. :sad:
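A small numeric illustration in C++ of why this fails. The struct and function names are made up for the example, and the exponent is kept as a float so that it can be blended too.

```cpp
#include <cassert>
#include <cmath>

struct Encoded { float mantissa; float exp; };

inline float decode(Encoded e) { return e.mantissa * std::exp2(e.exp); }

inline float mix(float a, float b, float t) { return a + (b - a) * t; }

// Blends mantissa and exponent separately, then decodes: this is what
// happens when the texture unit filters the encoded data.
inline float filterThenDecode(Encoded a, Encoded b, float t)
{
    assert(t >= 0.0f && t <= 1.0f);
    return decode({ mix(a.mantissa, b.mantissa, t), mix(a.exp, b.exp, t) });
}

// Decodes both texels first, then blends: the result we actually want.
inline float decodeThenFilter(Encoded a, Encoded b, float t)
{
    assert(t >= 0.0f && t <= 1.0f);
    return mix(decode(a), decode(b), t);
}
```

Take a texel of value 1 (mantissa 1.0, exponent 0) next to a texel of value 6 (mantissa 0.75, exponent 3). Halfway between them the correct value is 3.5, but filtering the encoded data gives 0.875 * 2^1.5, about 2.47. With the exponent point-sampled instead, the same blended mantissa decodes to 0.875 or 7.0 depending on which exponent is fetched, which would explain the stronger halo.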

My research continues...

Help is always appreciated, thanks.

- AGPX

#8 davepermen

    Senior Member

  • Members
  • PipPipPipPip
  • 1306 posts

Posted 10 September 2005 - 04:24 PM

the only way to get it right would be to do the linear sampling manually with 4 point samples...

on ps3.0 you could optimise it by determining whether you are actually at an edge of the exponent, and doing the full bilinear only where needed.

this way, you could gain a lot of speed and get close to point-sample performance.
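a CPU-side sketch of that idea in C++, just to show the logic (all names are hypothetical, and in a real ps3.0 shader the cheap path would presumably be a single hardware-filtered fetch rather than the explicit blend used here):

```cpp
#include <cassert>
#include <cmath>

struct Texel { float mantissa; int exp; };

inline float mix(float a, float b, float t) { return a + (b - a) * t; }

inline float bilerp(float q00, float q10, float q01, float q11, float fx, float fy)
{
    return mix(mix(q00, q10, fx), mix(q01, q11, fx), fy);
}

// q holds the 2x2 neighbourhood in the order (0,0), (1,0), (0,1), (1,1).
float sampleShared(const Texel q[4], float fx, float fy)
{
    assert(fx >= 0.0f && fx <= 1.0f && fy >= 0.0f && fy <= 1.0f);
    const bool flat = q[0].exp == q[1].exp && q[0].exp == q[2].exp && q[0].exp == q[3].exp;
    if (flat) {
        // Same exponent everywhere: blend the mantissas and decode once.
        const float m = bilerp(q[0].mantissa, q[1].mantissa, q[2].mantissa, q[3].mantissa, fx, fy);
        return std::ldexp(m, q[0].exp);
    }
    // Exponent edge: decode all four texels first, then blend.
    return bilerp(std::ldexp(q[0].mantissa, q[0].exp), std::ldexp(q[1].mantissa, q[1].exp),
                  std::ldexp(q[2].mantissa, q[2].exp), std::ldexp(q[3].mantissa, q[3].exp), fx, fy);
}
```

both paths agree when the exponents are equal, because scaling by a common 2^exp commutes with the linear blend.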
davepermen.net
-Loving a Person is having the wish to see this Person happy, no matter what that means to yourself.
-No matter what it means to myself....

#9 AGPX

    Member

  • Members
  • PipPip
  • 44 posts

Posted 10 September 2005 - 08:29 PM

Hi again,

I will go crazy... without a doubt. :wacko:

I have changed my method. I split my 16-bit integer texture (A16R16G16B16) into two 8-bit-per-channel textures (one for the low byte and the other for the high byte).
The low texture is compressed with DXT3. The other one is stored uncompressed (A8R8G8B8). According to my calculations, this method should be immune to the bilinear filtering issue. And it is... but only with the Reference Rasterizer! With the HAL device, it doesn't work. :blink:
WHY??????

Take a quick look at the pixel shader I use to reassemble the two 8-bit textures into a 16-bit one:

sampler2D lightMapLow;
sampler2D lightMapHigh;

float4 Expose(in float4 rgba, float exposure)
{
  return 1 - exp(-rgba * exposure);
}

float4 texHDR2D(in sampler2D texLow, in sampler2D texHigh, in float2 tex)
{
  return tex2D(texLow, tex) + tex2D(texHigh, tex) * 256.0;
}

float4 ps_main(float2 inTex: TEXCOORD0) : COLOR0
{
  return Expose(texHDR2D(lightMapLow, lightMapHigh, inTex), 1.2);
}
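For completeness, the CPU-side split and the arithmetic of texHDR2D can be sketched in C++ as below. The helper names are hypothetical, and asSample assumes the usual byte / 255 normalization of 8-bit texture channels.

```cpp
#include <cassert>
#include <cstdint>

struct Bytes { std::uint8_t low, high; };

// Splits one 16-bit channel value into the bytes stored in the two textures.
inline Bytes split16(std::uint16_t v)
{
    return { static_cast<std::uint8_t>(v & 0xFF), static_cast<std::uint8_t>(v >> 8) };
}

// What a point-sampled fetch of an 8-bit channel returns: byte / 255.
inline float asSample(std::uint8_t b) { return b / 255.0f; }

// Mirrors texHDR2D: low + high * 256, still in units of 1/255.
inline float reassemble(float lowSample, float highSample)
{
    assert(lowSample >= 0.0f && lowSample <= 1.0f);
    assert(highSample >= 0.0f && highSample <= 1.0f);
    return lowSample + highSample * 256.0f;
}
```

Multiplying the result by 255 recovers the original 16-bit integer. Both steps are linear, which is why exact bilinear filtering of the two textures should commute with the reassembly.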

And now take a look at the results using the Reference Rasterizer:

Posted Image

And the following is with the HAL device on a Sapphire Radeon 9800 Pro (128 MB, 256 bit):

Posted Image

HOW THE HELL IS THIS POSSIBLE?

Please help me!

P.S.: I have tried using A8R8G8B8 for the low texture instead of DXT3, but nothing changes!

P.S.2: It works well on HAL only if I switch from bilinear filtering to point filtering!

P.S.3: Here you can see the low & high textures generated by my lightmapper:

LOW:

Posted Image

HIGH:

Posted Image

The high texture is not entirely 0; it has some pixels set to 1, 2 and 3. The textures appear to be OK.

#10 AGPX

    Member

  • Members
  • PipPip
  • 44 posts

Posted 10 September 2005 - 11:27 PM

Hi,

all the problems are now solved.

I'm posting my solution here to help other people with a similar problem.
Basically, the problem is due to the high imprecision of the interpolators.
The reference rasterizer is more precise than the HAL device.
So basically, there is nothing else to do: you have to use POINT filtering and do the bilinear filtering in the pixel shader. Here is the code to perform the filtering (kindly posted to me by a guy on #flipcode. Thanks to you!)

Here is the pixel shader:

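// Manual bilinear filter built from four point-sampled fetches.
// The sampler must use POINT filtering; wh must match the texture size.
float4 bilinear(in sampler2D s, in float2 t)
{
  const float2 wh = {256.0, 256.0};     // texture width and height in texels
  const float2 dwh = 1 / wh;            // size of one texel in UV space
  float2 dxy = t * wh - floor(t * wh);  // fractional position inside the texel
  float4 a = lerp(tex2D(s, t + float2(0,   0)), tex2D(s, t + float2(dwh.x,   0)), dxy.x);
  float4 b = lerp(tex2D(s, t + float2(0, dwh.y)), tex2D(s, t + float2(dwh.x, dwh.y)), dxy.x);
  return lerp(a, b, dxy.y);
}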
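To check the shader logic on the CPU, the same four-fetch filter can be written in C++. This is only a sketch: Texture, fetchPoint and bilinearCPU are hypothetical names, a single channel is used for brevity, and clamp addressing at the borders is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Texture {
    int width, height;
    std::vector<float> texels;  // row-major, one channel
};

// Point sampling: returns the texel that contains (u, v), clamped at the borders.
inline float fetchPoint(const Texture& tex, float u, float v)
{
    const int x = std::min(std::max(static_cast<int>(std::floor(u * tex.width)), 0), tex.width - 1);
    const int y = std::min(std::max(static_cast<int>(std::floor(v * tex.height)), 0), tex.height - 1);
    return tex.texels[static_cast<std::size_t>(y) * tex.width + x];
}

inline float mix(float a, float b, float t) { return a + (b - a) * t; }

// Same structure as the bilinear() shader: four point fetches, three lerps.
float bilinearCPU(const Texture& tex, float u, float v)
{
    assert(tex.texels.size() == static_cast<std::size_t>(tex.width) * tex.height);
    const float du = 1.0f / tex.width, dv = 1.0f / tex.height;
    const float fx = u * tex.width - std::floor(u * tex.width);
    const float fy = v * tex.height - std::floor(v * tex.height);
    const float a = mix(fetchPoint(tex, u, v), fetchPoint(tex, u + du, v), fx);
    const float b = mix(fetchPoint(tex, u, v + dv), fetchPoint(tex, u + du, v + dv), fx);
    return mix(a, b, fy);
}
```

For RGBE data, each fetched texel would be decoded before the lerps, so that the blend happens on real values and not on mantissas.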

This way, RGBE encoding also works well, so I'm switching to it, because it requires less storage. Oh, well, at least... only if I can compress it with DXT1-5 without introducing artifacts... I will investigate that tomorrow and, finally, I think I'll write a tutorial on HDR compression... it could be helpful until hardware manufacturers introduce compression for 16-bit textures! (To tell the truth, some 16-bit FourCC formats exist, but they are largely not implemented.)

Thanks to all for help. :D

#11 kusma

    Valued Member

  • Members
  • PipPipPip
  • 163 posts

Posted 11 September 2005 - 09:25 PM

thanks a lot for the posts, really nice to see that someone else is struggling with bilinear interpolation of hdr-textures ;)

#12 Axel

    Valued Member

  • Members
  • PipPipPip
  • 119 posts

Posted 12 September 2005 - 12:55 AM

Quote

Reference rasterizer is more precise than HAL device.
Should read: Reference rasterizer is more precise than ATi HAL device. :rolleyes:

#13 Reedbeta

    DevMaster Staff

  • Administrators
  • 5309 posts
  • LocationSanta Clara, CA

Posted 12 September 2005 - 01:59 AM

Yup. ATI skimps on precision, using only 24-bit floats for its internal pipeline. But nVidia uses the full 32 bits.
reedbeta.com - developer blog, OpenGL demos, and other projects

#14 bonzaj

    New Member

  • Members
  • Pip
  • 6 posts

Posted 13 September 2005 - 03:27 PM

Hello. Really nice topic :). And what about RGBE cubemaps?

#15 Axel

    Valued Member

  • Members
  • PipPipPip
  • 119 posts

Posted 13 September 2005 - 08:47 PM

Quote

Yup. ATI skimps on precision, using only 24-bit floats for its internal pipeline. But nVidia uses the full 32 bits.
AFAIK the interpolators are FX8 only.

#16 Reedbeta

    DevMaster Staff

  • Administrators
  • 5309 posts
  • LocationSanta Clara, CA

Posted 13 September 2005 - 10:37 PM

Yeah, the bilinear interpolation in the texture fetch may well be only 8-bit. But even in a pixel shader, the temp registers are only 24-bit rather than 32-bit.
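A small C++ experiment showing what limited filter precision would do to the low/high byte scheme. The assumption here, which the thread does not confirm for any particular chip, is that the filtered result of each 8-bit channel is rounded back to 8 bits before it reaches the shader. All names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Linear filter of one 8-bit channel whose result is rounded back to
// 8 bits (the assumed hardware behaviour).
inline float filter8(std::uint8_t a, std::uint8_t b, float t)
{
    assert(t >= 0.0f && t <= 1.0f);
    return std::round(a + (b - a) * t);
}

// Low and high bytes filtered separately with 8-bit results, then reassembled.
inline float filteredQuantized(std::uint16_t a, std::uint16_t b, float t)
{
    const float low  = filter8(static_cast<std::uint8_t>(a & 0xFF), static_cast<std::uint8_t>(b & 0xFF), t);
    const float high = filter8(static_cast<std::uint8_t>(a >> 8), static_cast<std::uint8_t>(b >> 8), t);
    return low + high * 256.0f;
}

// Reference: filter the original 16-bit values directly.
inline float filteredExact(std::uint16_t a, std::uint16_t b, float t)
{
    return a + (b - a) * t;
}
```

Between neighbouring texels of value 255 and 256 the low byte wraps from 255 to 0 while the high byte steps from 0 to 1. Halfway, the exact result is 255.5, but with 8-bit rounding the high byte snaps to 1 and the sum becomes 384. At a quarter of the way it is 191 instead of 255.25. Errors of that size would look like the HAL screenshot, and they disappear with point filtering.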
reedbeta.com - developer blog, OpenGL demos, and other projects

#17 Axel

    Valued Member

  • Members
  • PipPipPip
  • 119 posts

Posted 13 September 2005 - 10:49 PM

No, I mean I'm pretty sure that the interpolators on Radeons are not floating point. The bilinear interpolation is only 6 bit btw...

#18 bonzaj

    New Member

  • Members
  • Pip
  • 6 posts

Posted 13 September 2005 - 10:57 PM

yes, but what about bilinear filtering on cubemaps? How do you fetch the neighbours?

#19 bonzaj

    New Member

  • Members
  • Pip
  • 6 posts

Posted 14 September 2005 - 12:02 AM

ok I've got it - It's on ATI's page.

#20 kusma

    Valued Member

  • Members
  • PipPipPip
  • 163 posts

Posted 14 September 2005 - 09:58 AM

bonzaj said:

ok I've got it - It's on ATI's page.

a link would rule.. ;)




