Z-Buffer, C-Buffer, Span-Buffer, ...?
Started by rarefluid, Jun 11 2007 01:55 PM
10 replies to this topic
#1
Posted 11 June 2007 - 01:55 PM
I'm in the process of implementing a software render engine for a low-power device.
Have any of you implemented C-Buffers, Span-Buffers or some other culling scheme? IMHO:
- Z-Buffers are simple but cause too much overdraw and need too much memory.
- Span-Buffers need no pre-sorting of polygons and have zero overdraw, but need some memory and are harder to implement.
- C-Buffers need pre-sorting, are simple to implement and need almost no memory.
I don't really know what to choose...
#2
Posted 11 June 2007 - 09:05 PM
What low-power device are you talking about exactly? My laptop with Core 2 Duo is low-power too. ;)
The choice of algorithm also depends on the resolution and the number of polygons. How much control you have over the application side matters too. If you're rendering a BSP then most likely a C-buffer is the best choice... unless you also have other more dynamic geometry to render. If you're implementing OpenGL|ES then a z-buffer is likely the only option.
Also, what are the performance and quality requirements? Is it acceptable to have some artifacts (cfr. painter's algorithm)?
#3
Posted 11 June 2007 - 09:28 PM
The requirements are for Gameboy Advance'ish hardware meaning:
- few polygons (max. 1k?), resolution max. 240x160
- Not really memory for a decent-resolution z-Buffer (though z-values are nice-to-have...). Actually not much memory at all (96k vram, 256+32k ram :) )...
- Subpixel-correctness (like 2-4 bits or something)
- Low power (16MHz), so overdraw is expensive depending on per-pixel operations
- should handle BSPs (easy) as well as dynamic objects
We have:
- many registers
- RISC, conditional instructions
- no hardware-divide, sqrt!
...painters algorithm sounds quite wasteful to me...
I know this has probably all been done before, but now I want to do it ;)
#4
Posted 11 June 2007 - 10:08 PM
rarefluid said:
...painters algorithm sounds quite wasteful to me...
So a c-buffer is probably going to be your best bet. For BSPs just render from front to back. For other geometry sort polygons front to back. Both can make use of the c-buffer so it's a pretty uniform way of rendering.
The c-buffer itself can be implemented either with spans (requires some basic memory management), or 1 bit per pixel (requires a fast way to count bits).
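For what it's worth, the 1-bit-per-pixel variant could look roughly like the sketch below. It assumes a 240x160 indexed framebuffer and handles one horizontal span at a time; `cbuf`, `fb`, `cbuf_clear` and `cbuf_draw_span` are made-up names for illustration, not from any existing library. The inner loop tests bits one by one, which is exactly the part a shift-and-branch or a table lookup would speed up.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CB_WIDTH  240
#define CB_HEIGHT 160
#define CB_WORDS  ((CB_WIDTH + 31) / 32)      /* 8 words per scanline */

static uint32_t cbuf[CB_HEIGHT][CB_WORDS];   /* bit set = pixel already covered */
static uint8_t  fb[CB_HEIGHT][CB_WIDTH];     /* hypothetical indexed framebuffer */

void cbuf_clear(void)
{
    memset(cbuf, 0, sizeof cbuf);
}

/* Mask with bits [lo, hi) set, for 0 <= lo < 32 and 0 < hi <= 32. */
static uint32_t bit_range(int lo, int hi)
{
    uint32_t upto_hi = (hi == 32) ? 0xFFFFFFFFu : ((1u << hi) - 1u);
    uint32_t upto_lo = (1u << lo) - 1u;
    return upto_hi & ~upto_lo;
}

/* Draw the still-uncovered pixels of span [x0, x1) on row y in `colour`,
   mark the whole span as covered, and return the number of pixels written.
   Polygons must arrive front to back for the result to be correct. */
int cbuf_draw_span(int y, int x0, int x1, uint8_t colour)
{
    int drawn = 0;
    assert(y >= 0 && y < CB_HEIGHT);
    if (x0 < 0) x0 = 0;
    if (x1 > CB_WIDTH) x1 = CB_WIDTH;
    if (x0 >= x1) return 0;

    for (int w = x0 >> 5; w < CB_WORDS && (w << 5) < x1; ++w) {
        int base = w << 5;
        int lo = (x0 > base) ? x0 - base : 0;
        int hi = (x1 < base + 32) ? x1 - base : 32;
        uint32_t want = bit_range(lo, hi);
        uint32_t open = want & ~cbuf[y][w];   /* pixels nobody has drawn yet */
        cbuf[y][w] |= want;

        /* Naive bit walk; stops early once no open pixels remain. */
        for (int b = lo; open != 0 && b < hi; ++b) {
            if (open & (1u << b)) {
                fb[y][base + b] = colour;
                open &= ~(1u << b);
                ++drawn;
            }
        }
    }
    return drawn;
}
```

The whole coverage buffer is 160 rows x 8 words x 4 bytes = 5120 bytes, and a fully covered word is skipped without touching the framebuffer.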
#5
Posted 11 June 2007 - 10:24 PM
The GBA has no count-leading-zeros instruction, which is bad especially after reading Nils' article on math tricks with fixed-point...
I wanted to take a shot at the c-buffer, but I'm afraid that it needs perfect front-to-back sorting or subdivision of polygons to avoid artifacts. Is this only an issue if you have overlapping polygons?
And thanks for all the replies Nick! :)
#6
Posted 11 June 2007 - 10:36 PM
hi rarefluid,
You would be surprised how much good-looking 3D graphics you can get with just the painter's algorithm. That was the only way to get things on the screen on the ps-one, and it worked quite well.
Your hardware seems to be really low level. I would suggest that you try a c-buffer and rely on c-buffer friendly geometry. The Z- and S-buffer overhead is significant if you only run at 16 MHz.
btw - if you have no clz but fast memory accesses (likely on a 16mhz machine) you can do a bit of table work. Take a look at this method: http://graphics.stan...RightMultLookup
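To give an idea of what "table work" can mean here, below is a rough sketch of a count-leading-zeros replacement built on a 256-entry byte table. It is only one possible approach and not necessarily the method described at the link above; `clz8`, `clz_init` and `clz32` are names made up for this example.

```c
#include <stdint.h>

static uint8_t clz8[256];   /* leading zeros of each 8-bit value */

/* Fill the table once at startup (it could also be precomputed into ROM). */
void clz_init(void)
{
    clz8[0] = 8;
    for (int i = 1; i < 256; ++i) {
        int n = 0;
        for (int bit = 0x80; (i & bit) == 0; bit >>= 1)
            ++n;
        clz8[i] = (uint8_t)n;
    }
}

/* Leading zeros of a 32-bit value: find the highest non-zero byte,
   then finish with one table lookup. Returns 32 for x == 0. */
int clz32(uint32_t x)
{
    if (x >> 24) return clz8[x >> 24];
    if (x >> 16) return 8 + clz8[x >> 16];
    if (x >> 8)  return 16 + clz8[x >> 8];
    return 24 + clz8[x];
}
```

That costs 256 bytes of table and at most three compares plus one load per call, which should map well onto conditional instructions.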
Nils
My music: http://myspace.com/planetarchh <-- my music
My stuff: torus.untergrund.net <-- some diy electronic stuff and more.
#7
Posted 11 June 2007 - 10:43 PM
It's hard to avoid all artifacts. However, given the platform, I think it's going to be acceptable. Remember that the original Unreal game used polygon sorting... :yes:
Does the processor have a way to shift a register and take a conditional jump depending on the bit shifted out? Or maybe a 'test' instruction which can single out a bit?
#8
Posted 11 June 2007 - 11:26 PM
Nick said:
Does the processor have a way to shift a register and take a conditional jump depening on the bit shifted out? Or maybe a 'test' instruction which can single out a bit?
C++ addict
-
Currently working on: the 3D engine for Tomb Raider.
#9
Posted 12 June 2007 - 07:07 AM
Shifting is a pseudo-instruction on the ARM. You can do a move-register-to-register while specifying a shift value. The shift and rotate instructions update the carry flag with the last bit shifted out.
It also has a TeST instruction (non-destructive AND) and a Test for EQuality (non-destructive XOR).
It also has a 16bit mode (Thumb) you can switch to and from as you like with quite powerful instructions too.
For the interested: http://eceserv0.ece....t_reference.pdf
@Nils: That page has some excellent tricks! :D
#10
Posted 12 June 2007 - 09:53 AM
Hi,
I recently implemented a software renderer which runs on Symbian mobile phones.
I implemented a simple Z-buffer in our system. It really doesn't take up that much memory (about 72k for a 176x208 screen), and it is fast enough that one of our games runs at about 30fps on some mobiles.
#11
Posted 12 June 2007 - 10:23 AM
With 96k of VRAM on GBA you can do:
240x160 indexed mode with backbuffer, 21k left
240x160 16bit mode w/o backbuffer, 21k left
160x128 indexed mode with backbuffer, 56k left (16bit backbuffer possible)
160x128 16bit mode with backbuffer, 16k left
There isn't even space for some hierarchical z-Buffer tricks...
And IMHO a z-buffer should at least need only one memory access per pixel if you're wasting so much memory and bandwidth on it... :)
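The budget above is simple arithmetic; a small sketch for checking it is below. It assumes 96 KiB of VRAM and 16-bit z-values, and `vram_left` and `zbuffer_bytes` are hypothetical helper names used only here.

```c
enum { VRAM_BYTES = 96 * 1024 };   /* 98304 bytes */

/* Bytes left after `buffers` framebuffers of w x h pixels,
   each pixel taking `bytes_per_pixel` bytes. */
long vram_left(int w, int h, int bytes_per_pixel, int buffers)
{
    return VRAM_BYTES - (long)w * h * bytes_per_pixel * buffers;
}

/* Size of a z-buffer with 16-bit depth values (an assumption). */
long zbuffer_bytes(int w, int h)
{
    return (long)w * h * 2;
}

/* vram_left(240, 160, 1, 2) -> 21504 bytes (21 KiB)
   vram_left(240, 160, 2, 1) -> 21504 bytes (21 KiB)
   vram_left(160, 128, 1, 2) -> 57344 bytes (56 KiB)
   vram_left(160, 128, 2, 2) -> 16384 bytes (16 KiB)
   zbuffer_bytes(240, 160)   -> 76800 bytes (75 KiB), more than is left
                                in either 240x160 mode */
```

Only the 160x128 indexed mode leaves room for a full 16-bit z-buffer (40960 bytes) next to both framebuffers.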