187 replies to this topic

### #41Nick

Senior Member

• Members
• 1227 posts

Posted 06 November 2004 - 08:50 AM

Hi Mario,

I agree that my article has close to zero practical use nowadays. The real reason I wrote it and posted it here was for the reasons explained in the first paragraph. These kind of algorithms are forgotton or simply unknown by many, while I think they are interesting for all graphics programmers. In the worst case, it's just a small math excercise, showing that there are alternatives to the classic scanline based approach.

Imagine that someone would like to optimize his terrain renderer, by identifying regions that are inside/ouside the frustum. Although this would be done in 3D, the algorithm would actually be equivalent to rasterization. Terrain vertices are the pixels, frustum planes are the half-space functions. Also the inside/outside test would use the same approach, and just like here it can be optimized by working on bigger regions first.

So, the article's primary function is not to try to let everybody start writing his own software renderer. As you say, this is not productive, and the market for software renderers is tiny or next to non-existant. But it's still very useful as a learning experience. I also don't fully agree with your optinion about the algorithmical optimization. It's clear that computing coverage of 8x8 blocks, 64 pixels, is a good shortcut to avoid rasterizing the whole bounding rectangle. I forgot to mention this in the article, but the size of the optimal block was determined experimentally. I did mention that further optimizations are best done at the assembly level, implying that further profiling would be required.

Kind regards,

Nicolas

### #42Nick

Senior Member

• Members
• 1227 posts

Posted 24 November 2004 - 03:58 PM

Bug & fix:

A couple days ago I detected a bug in my renderer. When drawing a very detailed terrain, there were some 8x8 pixel blocks flashing on the horizon. After some analysis I discovered that it's a bug in this rasterizer, when using very small triangles.

The explanation turned out to be very obvious: The half-space function (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1) >= 0 assumes that x2 - x1 and y2 - y1 are never both zero. However, this does happen with tiny triangles since the fixed-point precision is limited to 1/16. When these deltas are zero, the half-space function (corrected for top-left fill convention) becomes true for any pixel, resulting in a whole 8x8 block to be filled, while actually none of the pixels should be filled.

Fixing this is very straight-foward. For every edge we have to check that the deltas aren't both zero. We compute them already anyway, so there's virtually no extra work. Tiny triangles like this don't have to be drawn. They fall 'between' pixels so they won't even cause gaps in high-detail geometry. For the interested: a bit more triangles can be rejected this way...

### #43davepermen

Senior Member

• Members
• 1306 posts

Posted 29 November 2004 - 12:29 AM

hey, you need an athlon64 to test your work on?..

i got one now:D
davepermen.net
-Loving a Person is having the wish to see this Person happy, no matter what that means to yourself.
-No matter what it means to myself....

### #44Kettlebell

New Member

• Members
• 6 posts

Posted 19 December 2004 - 08:24 AM

Just a question about the code Nick.

What is your implementation of 'iround'?

Cheers,
Tim

### #45Nick

Senior Member

• Members
• 1227 posts

Posted 21 December 2004 - 11:00 AM

Hi Kettlebell,

iround() just performs a round to integer operation. I implemented it like this:
inline int iround(float x)
{
int t;

__asm
{
fld  x
fistp t
}

return t;
}


### #46Kettlebell

New Member

• Members
• 6 posts

Posted 22 December 2004 - 03:34 PM

Many thanks Nick.

Previously I was doing an (int) cast on all the floats, and I've since discovered that it's a very inefficient way to be doing it (on the microsoft compiler).

I've integrated your code to replace my scanline algorithm, and I can see a very clear performance boost. Awesome, awesome work.

Looking forward to any future articles.

Tim

### #47Nick

Senior Member

• Members
• 1227 posts

Posted 23 December 2004 - 02:59 AM

Kettlebell said:

Previously I was doing an (int) cast on all the floats, and I've since discovered that it's a very inefficient way to be doing it (on the microsoft compiler).
All x86 compilers have this suboptimal behaviour. The problem is that that standard rounding mode for x86 is round to nearest, while (int) cast requires round towards zero. So the cast requires two rounding mode changes, which is a rather slow operation.

Quote

I've integrated your code to replace my scanline algorithm, and I can see a very clear performance boost. Awesome, awesome work.
Great! What kind of project are you working on?

Quote

Looking forward to any future articles.
There will definitely come more...

### #48Kettlebell

New Member

• Members
• 6 posts

Posted 24 December 2004 - 05:56 AM

Nick said:

Great! What kind of project are you working on?

I'm working on my own software renderer. It's quite an undertaking, but I've so far managed to put in clipping, directional and point lights and flat shading (made a lot better thanks to your help).

I'm now working on putting in gouraud shading. Since I'm here I have a quick question for you Nick. Earlier on in the thread you made reference to calculating du/dx, du/dy etc for interpolation of texture coordinates.

I've read through part 1 of Chris Heckler's tutorial on perspective texture mapping. Though he didn't mention it explicitly in the tutorial, a sorting of the y-values is required prior to the calculation of the triangle gradients right?

Thanks for the help, have a Merry Christmas!

Tim

### #49efortier

New Member

• Members
• 5 posts

Posted 25 December 2004 - 09:52 PM

Quote

I agree that my article has close to zero practical use nowadays.
I beg to differ :). Most PocketPC devices available today (with the exceptions of the new Axim and a few other) have no 3D chip. Nor does the Zodiac gaming device and many other platforms as well.

And processing 8x8 blocks also has another hidden advantage, especially true on devices with no video memory and/or 3D chips. It makes better use of the memory cache. Processing pixels in sequence, like the scanline method, does not use the cache efficiently (on PocketPc devices at least) but using block does. I suspect that processing 16x16 on some devices would yield even better results than 8x8. Unless the polygons are very small.

It's true that people are starting to forget about software rasterizer. Nowaday it's all about shaders, normal mapping, etc. I'm having a real hard time learning.

Your article really inspired me and I decided to start working on a new engine to replace my old one (based on LaMothe's book). The technique you presented is much clearer compared to the tedious setup you need to do when doing scanline.

--Eric

### #50Nick

Senior Member

• Members
• 1227 posts

Posted 26 December 2004 - 10:02 AM

Kettlebell said:

I'm working on my own software renderer. It's quite an undertaking, but I've so far managed to put in clipping, directional and point lights and flat shading (made a lot better thanks to your help).
Cool! Is it for educational purposes or do you intend to achieve something more than that? Either way good luck!

Quote

I'm now working on putting in gouraud shading. Since I'm here I have a quick question for you Nick. Earlier on in the thread you made reference to calculating du/dx, du/dy etc for interpolation of texture coordinates.

I've read through part 1 of Chris Heckler's tutorial on perspective texture mapping. Though he didn't mention it explicitly in the tutorial, a sorting of the y-values is required prior to the calculation of the triangle gradients right?
No, sorting is no longer required. Vertices have to be passed in counter-clockwise order. This guarantees the half-space functions are oriented correctly, and also allows to consistently compute the gradients.

### #51efortier

New Member

• Members
• 5 posts

Posted 27 December 2004 - 02:55 AM

Quote

No, sorting is no longer required. Vertices have to be passed in counter-clockwise order. This guarantees the half-space functions are oriented correctly, and also allows to consistently compute the gradients.
If it's not too much to ask, would it be possible to see an example of the right way to find the gradients? I've been at this fulltime for the past two days, and I can't seem to find the right way to get those gradients...

Man, you really have to try your hands at 3D stuff to learn to appreciate the likes of you, Nick ( and Carmack too I guess :) ).

:wacko:

--Eric

### #52Kettlebell

New Member

• Members
• 6 posts

Posted 27 December 2004 - 08:33 AM

efortier said:

If it's not too much to ask, would it be possible to see an example of the right way to find the gradients? I've been at this fulltime for the past two days, and I can't seem to find the right way to get those gradients...

Hi Eric, have you checked out the tutorials by Chris Hecker?

http://www.d6.com/us...er/misctech.htm

Check out the perspective correct texture mapping series, part1.

Inside you'll find a discussion of the triangle gradients. His explanation of the gradients makes sense, but I get messed up when I draw a different triangle. Maybe you could lend us a hand Nick? :)

The formulas are still in there, and it was pretty easy to integrate 1/z buffering into Nick's rasterizer. All of this sure beats Lamothe's tedious scanline implementation.

Good luck mate,
Tim

### #53Kettlebell

New Member

• Members
• 6 posts

Posted 27 December 2004 - 08:49 AM

Nick said:

Cool! Is it for educational purposes or do you intend to achieve something more than that? Either way good luck!

Only for educational purposes. I'm using Lamothe's book. He's got some of the most hideous code I've ever seen, which is probably a good thing since it forces one to implement everything on their own. But the bits where he's explaining a concept are easy enough for a beginner like myself to grasp.

He could have easily cut the page count in half though.

### #54efortier

New Member

• Members
• 5 posts

Posted 27 December 2004 - 09:14 AM

Quote

Hi Eric, have you checked out the tutorials by Chris Hecker?
Thanks Tim, I'll check it out. I need to implement an affine mapper, but I suppose it can't hurt.

Quote

He could have easily cut the page count in half though.
Why the hell does he list page after page of prototypes?? A few listings I understand, but function names? What's the use?

I also emailed him a few times with errors in his book. Some of them are pretty apparent (typos), but some others are insidious (math stuff).

Last time I emailed with a list of errors, he didn't even send me a thank you reply (I had sent him around 15 errors I found!) or ask for any clarification...

--Eric

### #55Nick

Senior Member

• Members
• 1227 posts

Posted 27 December 2004 - 05:24 PM

efortier said:

If it's not too much to ask, would it be possible to see an example of the right way to find the gradients? I've been at this fulltime for the past two days, and I can't seem to find the right way to get those gradients...
The easiest and most robust way is to use the plane equation.

Let's say we want to interpolate some component z linearly across the polygon (note that z stands for any interpolant). We can visualize this as a plane going through the x, y and z positions of the triangle, in 3D space. Now, the equation of a 3D plane is generally:

A * x + B * y + C * z + D = 0

From this we can derive that:

z = -A / C * x - B / C * y - D

Note that for every step in the x-direction, z increments by -A / C, and likewise it increments by -B / C for every step in the y-direction. So these are the gradients we're looking for to perform linear interpolation. In the plane equation (A, B, C) is the normal vector of the plane. It can easily be computed with a cross product.

Now that we have the gradients, let's call them dz/dx (which is -A / C) and dz/dy (which is -B / C), we can easily compute z everywhere on the triangle. We know the z value in all three vertex positions. Let's call the one of the first vertex z0, and it's position coordinates (x0, y0). Then a generic z value of a point (x, y) can be computed as:

z = z0 + dz/dx * (x - x0) + dz/dy * (y - y0)

Once you've computed the z value for the center of the starting pixel this way, you can easily add dz/dx to get the z value for the next pixel, or dz/dy for the pixel below (with the y-axis going down).

I hope this helps! Don't hesitate to ask for further clarification.

### #56efortier

New Member

• Members
• 5 posts

Posted 27 December 2004 - 05:33 PM

Thanks Nick, that makes sense. What I still don't understand, given the context of the half-space rasterizing technique you presented, is how to get the proper deltas (dx,dy,du,dv) if the vertices are not y-sorted. I should mention that I am implementing an affine mapper (first).

LaMothe sorts the vertices and then checks for flat-top and flat-bottom. I am unable to come up with the right way to find the deltas if you drop these two steps.

Any suggestions?

Thanks again,

--Eric

### #57Nick

Senior Member

• Members
• 1227 posts

Posted 27 December 2004 - 10:58 PM

efortier said:

LaMothe sorts the vertices and then checks for flat-top and flat-bottom. I am unable to come up with the right way to find the deltas if you drop these two steps.
Computing the (A, B, C) normal of the plane takes care of everything. It's done with a cross product. Component-wise it looks like this:

A = (z3 - z1) * (y2 - y1) - (z2 - z1) * (y3 - y1)
B = (x3 - x1) * (z2 - z1) - (x2 - x1) * (z3 - z1)
C = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)

See, a cross product is independent of which two edge vectors of the triangle you choose, but the sign depends on the order. So as long as they are in counter-clockwise order it doesn't matter how you number them.

Good luck!

### #58efortier

New Member

• Members
• 5 posts

Posted 28 December 2004 - 04:01 AM

Hi Nick,

Thanks for the explanation. I got the affine texture mapping working! Here's what I do, let me know if I understood correctly:

- Use counter-clockwise polygon ordering

- Transform visible polygons from object all the way through screen space

- Using the plane equation you showed above, replace z1-z3 with the value I want to interpolate (u, v etc)

- dx is then -A / C, and dy = -B / C

- in the half-space loop, use dx and dy to interpolate my value

- x1-x3 and y1-y3 used in the rasterizer should be in screen space

How does it looks? That is very straightforward, and adding new values to interpolate is easier than before.

My problem is that I don't have a very strong math background, so I can't seem to think in term of math when trying to solve a problem, but instead in term of language and programming...

Thanks for your help. I'll be benchmarking both the scanline and half-space rasterizer on the PocketPC devices I have soon, and I'll post my findings.

--Eric

### #59Kettlebell

New Member

• Members
• 6 posts

Posted 29 December 2004 - 03:24 AM

That was so easy to understand, thanks Nick.

You know, you should probably consider writing a book on all of this. You are very good at explaining things clearly.

Tim

### #60Nick

Senior Member

• Members
• 1227 posts

Posted 29 December 2004 - 09:27 AM

Kettlebell said:

That was so easy to understand, thanks Nick.

You know, you should probably consider writing a book on all of this. You are very good at explaining things clearly.