# OpenGL & Transparency

38 replies to this topic

### #21Vilem Otte

Valued Member

• Members
• 345 posts

Posted 27 March 2012 - 11:06 AM

Actually, the only truly simple solution to this is the "real" one: path tracing. In real physics there is no such thing as transparency or shadow maps - there are just photons that can be reflected, refracted or absorbed, and that's all.

The technology is IMO going the right way, as GPUs are becoming general computing processors (note that half of my renderer is now written in OpenCL, with the rest in C++ on the CPU). General computing processors are a good thing; it likely means that in the future OpenGL, D3D and hopefully others (even ray-tracing APIs) will be implemented in C/C++-like code running on GPUs, much as OpenCL kernels are today.
My blog about game development (and not just game development) - http://gameprogramme...y.blogspot.com/

If you don't know how to speed up application, go "roarrrrrr!", hit the compiler with the club and use -O3 :D

### #22Alienizer

Member

• Members
• 435 posts

Posted 27 March 2012 - 12:45 PM

A jack of all trades is a master of none, and I think that rule should also apply to the GPU. Imagine a microchip designed to do nothing but progressive photon mapping in real time. Imagine a 3D design package built around such an IC: editing your materials and moving/creating objects in fully rendered mode. That would be so cool.

Anyway, back to that transparency problem. How do you truly do transparency on the GPU? One that is 100% correct, even when zooming, rotating and panning?
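Part of what makes this hard is that the standard alpha-blending ("over") operator is not commutative, so the result depends on the order the surfaces are drawn in. A minimal CPU sketch of the problem (the colours and alphas here are made-up values, not from any real scene):

```python
def over(src_rgb, src_a, dst_rgb):
    # Porter-Duff "over": src composited on top of dst (dst already opaque)
    return tuple(src_a * s + (1.0 - src_a) * d for s, d in zip(src_rgb, dst_rgb))

white = (1.0, 1.0, 1.0)
red, blue = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)

# the same two 50%-transparent surfaces, composited in opposite orders
red_in_front = over(red, 0.5, over(blue, 0.5, white))
blue_in_front = over(blue, 0.5, over(red, 0.5, white))

print(red_in_front)   # (0.75, 0.25, 0.5)
print(blue_in_front)  # (0.5, 0.25, 0.75) - a different colour
```

Draw the surfaces in the wrong order and you get the wrong colour; that is why "100% correct" transparency needs either sorting or one of the order-independent schemes discussed below.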

### #23Vilem Otte

Valued Member

• Members
• 345 posts

Posted 27 March 2012 - 05:30 PM

Depth peeling
There is a way to do full order-independent transparency through depth peeling: you render the scene N times, where N is the number of "transparent layers", each time discarding fragments whose depth is less than or equal to the value stored in the previous layer's depth buffer.

This solution works and handles order-independent transparency without any problem, but! Imagine the second screen TheNut posted - you'd need to render the scene 4 times into 4 different buffers (that's overkill) - so this solution is practically useless for games, as it needs a huge amount of fillrate and memory.

So I don't think this one will help.
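For illustration, the peeling loop can be mimicked on the CPU. This is a hypothetical sketch, not GPU code: each pass keeps, per pixel, the nearest fragment strictly behind the previously peeled depth (which is what the depth test plus discard achieves on the GPU), and the peeled layers are then composited back-to-front:

```python
def depth_peel(fragments, layers):
    # Repeatedly "peel" the nearest fragment strictly behind the last
    # peeled depth, returning up to `layers` fragments, nearest first.
    peeled, last_depth = [], float("-inf")
    for _ in range(layers):
        candidates = [f for f in fragments if f[0] > last_depth]
        if not candidates:
            break
        nearest = min(candidates, key=lambda f: f[0])
        peeled.append(nearest)
        last_depth = nearest[0]
    return peeled

def composite_back_to_front(peeled, bg):
    # Standard "over" compositing, farthest peeled layer first
    colour = bg
    for depth, rgb, a in reversed(peeled):
        colour = tuple(a * s + (1.0 - a) * d for s, d in zip(rgb, colour))
    return colour

# three unsorted 50%-transparent fragments covering one pixel (made-up values)
frags = [(0.7, (0.0, 0.0, 1.0), 0.5),
         (0.2, (1.0, 0.0, 0.0), 0.5),
         (0.5, (0.0, 1.0, 0.0), 0.5)]
print(composite_back_to_front(depth_peel(frags, 4), (1.0, 1.0, 1.0)))
# (0.625, 0.375, 0.25)
```

Note how the fragments come out correctly ordered without ever sorting them; the cost is one full scene pass per layer, which is exactly the fillrate problem described above.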

Hacky way (and I do mean "hacky")
My current in-development game needs a fairly specific transparency solution, so I tried several approaches and wasn't happy with any of them (couldn't do the cool stuff with any). Note that I'm using deferred shading extensively. I could use the ray tracer, but it has other work to do (it currently handles just reflections and VPL spawning - ray tracing that is quite heavy by itself; adding refractions, i.e. transparency, would make it a lot slower). What proved really useful was a single depth-peel pass at half resolution (to save fillrate; it still looks good, even for glass). It is still quite heavy, but I think that with a little fine-tuning I'll be able to put it to good use. Of course this limits everything (two panes of glass, one in front of the other, will look wrong), but it has several advantages:
1.) GI and ray-traced effects are visible in and on transparent objects
2.) No (re)sorting is needed; you always render front-to-back
3.) It is more efficient than it seems
4.) When more quality is needed (or a Radeon 7990 is present), you can add more depth-peel layer(s)
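Point 2 works because compositing can also be done front-to-back with the "under" operator: walk the layers nearest-first while tracking how much light still gets through. A small sketch of the idea (made-up values, not engine code):

```python
def composite_front_to_back(layers, bg):
    # "under" operator: accumulate colour nearest-first while tracking the
    # remaining transmittance (how much of what lies behind is still visible)
    rgb = [0.0, 0.0, 0.0]
    transmittance = 1.0
    for colour, alpha in layers:  # nearest layer first
        for i in range(3):
            rgb[i] += transmittance * alpha * colour[i]
        transmittance *= 1.0 - alpha
    return tuple(c + transmittance * b for c, b in zip(rgb, bg))

# a 50% red pane in front of a 50% blue pane, over a white background
print(composite_front_to_back([((1.0, 0.0, 0.0), 0.5), ((0.0, 0.0, 1.0), 0.5)],
                              (1.0, 1.0, 1.0)))  # (0.75, 0.25, 0.5)
```

The result is identical to compositing the same layers back-to-front with "over", which is why a front-to-back renderer never needs to resort its transparent surfaces.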

So it really depends on what you actually need! Summarize what you need for your project and try to come up with a solution.

### #24Alienizer

Member

• Members
• 435 posts

Posted 27 March 2012 - 07:30 PM

I see what you mean, quite involved just to do transparency.

I do not have a specific need, but suppose, for example, I were adding a glass maze to my game, with hundreds of walls of different heights and colors and PNG-transparency materials. Then what? Would it be just about impossible to make such a game in OpenGL? Would I have to use a real-time raytracer or something?

### #25Reedbeta

DevMaster Staff

• 5307 posts
• LocationBellevue, WA

Posted 27 March 2012 - 08:56 PM

It's easy for programmers to say "why can't they just make a microchip to do XYZ", but hardware designers also operate under constraints and trade-offs. They are trying to do the best they can with the design and manufacturing technology available. There are many things they would like to do but it's just not practical / feasible with current tech.

Programmers have a similar relationship to hardware designers as artists have to programmers. One has to play within the sandbox built by the other, and sometimes they get frustrated that the sandbox is too small, or the sand is too rough and not sticky enough or something, and they think "imagine what I could do if only I had better sand / a lot more sand! Why should I have to compromise my creativity with the arbitrary limitations of this sandbox?" And then it's easy to be angry at the sandbox builder. But the limitations aren't arbitrary - the one who built the sandbox isn't lazy or trying to shortchange you; she'd love to give you a bigger sandbox or better sand, but she can't do that without compromising something else, as she has to work within her own practical limitations.

So, the artist may be frustrated that the programmer can't give him more polygons, more textures, etc. but the programmer knows her code must run in real-time, within a certain memory limit etc. And she may be frustrated that the hardware designer can't give her more memory, faster shaders, etc. but the HW guy knows his chip must run without taking too much power or getting too hot, must behave consistently, must not cost too much to manufacture, can't break the speed of light, etc. And the HW guy might be frustrated that the scientists can't design a smaller / faster / more reliable / less power-hungry transistor, that the fabs can't manufacture ICs more quickly / cheaply, and so on and so forth...

Enough rant. Back on the subject(s) - it's worth noting that special-purpose raytracing hardware has been (and is still being) tried. They've made progress, but so far it does not work well enough to bring to market. There has also been a lot of research on software ray tracing on the GPU, and there's been good progress there as well. The key problem with parallel raytracing is lack of coherence in memory access when rays diverge, so the "secret sauce" in all these things is finding a way to efficiently regroup rays to regain coherence at certain stages in the process.

And as for transparency - there are many ways to handle it with varying degrees of correctness, such as per-object sorting, depth peeling (already mentioned), the A-buffer (a linked list of surfaces per pixel, sorted and composited - this can be done with compute shaders on GPUs today), adaptive transparency (which computes an approximate depth-to-opacity mapping for each pixel), and inferred lighting. All these approaches have trade-offs - the main one being that the more accurate methods are very slow, probably too slow for a real game application unless you have a very simple world, while the more approximate methods are faster. So, pick your poison based on what best fits your use-case!
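As a concrete illustration of the A-buffer idea, here is a hedged CPU sketch (Python, not shader code): fragments arrive in any order and are appended to a per-pixel list, then each list is sorted by depth and composited back-to-front, which is what the compute-shader linked-list variants do per pixel:

```python
from collections import defaultdict

def render_abuffer(fragments, background):
    # fragments: iterable of (pixel, depth, rgb, alpha); returns pixel -> colour.
    # Mimics a per-pixel linked list: gather, sort by depth, composite.
    buckets = defaultdict(list)
    for pixel, depth, rgb, alpha in fragments:
        buckets[pixel].append((depth, rgb, alpha))
    image = {}
    for pixel, frags in buckets.items():
        colour = background
        for depth, rgb, alpha in sorted(frags, reverse=True):  # farthest first
            colour = tuple(alpha * s + (1.0 - alpha) * d
                           for s, d in zip(rgb, colour))
        image[pixel] = colour
    return image

# one pixel covered by a far blue pane and a near red pane, submitted out of order
frags = [((0, 0), 0.7, (0.0, 0.0, 1.0), 0.5),
         ((0, 0), 0.2, (1.0, 0.0, 0.0), 0.5)]
print(render_abuffer(frags, (1.0, 1.0, 1.0)))  # {(0, 0): (0.75, 0.25, 0.5)}
```

The GPU version's cost is dominated by the unbounded per-pixel storage and the per-pixel sort, which is why it is exact but expensive.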
reedbeta.com - developer blog, OpenGL demos, and other projects

### #26Alienizer

Member

• Members
• 435 posts

Posted 27 March 2012 - 10:08 PM

That's just my point, Reedbeta. It's like cars: they know how to build one that lasts 100 years, but they don't, because of money - they want return business. Same for hardware; I learned that while working for a firm in Florida back in the 90s. They had already designed much better hardware, but they release only a little bit of it every year, so the users keep buying. Just like CPU cores: Intel already made an 80+ core prototype, but we're only going to see the 2, 4, 8, 16, 32, 64, 128 core parts over the next 1, 2, 3, 4, 5, 6, 7 years, or more.

Everything the big guys do is not because they don't know how, or don't have the means to do it; it's about the money - making sure that our technology today is a dinosaur in a year and we have to buy the latest, year after year. That's what I'm angry about.

Say I build you a 1x1 sandbox this year; next year I'll make you a 1x2, then the year after a 2x2, so I keep getting money from you. But if I were to make you the best that can be built, a 50x50, you would not buy again for another 10 years, until I have the technology to make a 100x100.

http://www.physorg.c...cores-chip.html

and we're not going to see this technology anytime soon are we?

Money is the root of all evil

### #27Reedbeta

DevMaster Staff

• 5307 posts
• LocationBellevue, WA

Posted 27 March 2012 - 10:27 PM

Imagine if they did bring a 1,000-core processor to market. How much would they have to charge for it to break even on the cost of manufacturing? And recoup the startup costs of creating the manufacturing process? How many people would buy it then? The higher the price, the fewer the customers, so there very well might not be any price that would make that a viable market proposition.

Sure, they may have already designed a processor that isn't going to hit the market for 5 years, or 10 years, but I think it's a mistake to assume they're cynically holding it back to drag out our dependence on them, or something. More likely, they're holding it back because it wouldn't be profitable to bring it to market now. They have to make at least as much money as they're spending, so the company can continue to operate and not go bankrupt, etc. If you think that's evil, then that's your right, but I think that's a very naive viewpoint. What are they supposed to do - sell it at a loss, rack up massive debt and then go out of business in a couple years?

(And FYI, from the article it sounds like on that 1,000-core processor, the cores are just special-purpose video decoding units or something, not general-purpose CPUs. Bit of a different thing. There are GPUs on the market today with over 1,000 ALUs on them, which is a closer comparison.)

### #28Vilem Otte

Valued Member

• Members
• 345 posts

Posted 28 March 2012 - 12:49 AM

Reedbeta has a point! Of course, if one tries hard enough, one can achieve real-time ray tracing (as the writer of the Ray Tracing series here on DevMaster.net knows - he inspired lots of people to write real-time ray tracers).

Also, I'd like to note that everything is about money in the first, second and last place. The good thing is: as long as there is another company doing the same thing, there is progress. A simple (fictional) example:
1.) I release a CPU for $400 with 100 GFlops
2.) I design a better CPU with 1000 GFlops, planned for release 10 years from now
3.) I profit for 3 months
4.) My "very beloved" competitor releases a CPU for $400, but with 200 GFlops
5.) I stop profiting within 3 months, so I release a new CPU (based on the one I already designed) that has 300 GFlops
6.) And it goes on...
7.) And on...
...
n.) I release the CPU designed in step 2, actually only 1.5 years from the beginning
n+1.) Now I have to design an even better CPU

As you can see, competition drives progress - as long as there is another company, things develop at huge speed; when competition doesn't exist, it's bad, very bad. (Be honest: how much improvement have we seen in Windows since version 95, not counting prettier icons, security (which only needed improvement due to "exterior effects", i.e. the spread of the internet) and ever more interrupting messages? Of course, one can use Linux like me.)

The same applies to everything: game development, API development, hardware, the car industry, the phone industry... even the toilet industry.

### #29Alienizer

Member

• Members
• 435 posts

Posted 28 March 2012 - 02:13 AM

I do agree with that 100%: competition is a must for innovation, but because of money everything slows down. Take the food industry for example: the content shrinks, but the box stays the same size, just way thinner. Fillers are added to the food, and on and on. There are always new buyers; it's not as if everyone in the world will buy the 100 GFlops chip within 3 months and then there are no more buyers for 3 years. Even if they did, these companies would be billionaires in 3 months, and would have plenty of money and time to make a 400 GFlops chip within a few years and be billionaires again.

I guess we are all right, depending on one's perspective.

### #30TheNut

Senior Member

• Moderators
• 1699 posts
• LocationThornhill, ON

Posted 28 March 2012 - 03:21 AM

You're only as fast as your slowest component and humans are sadly the slowest component. I'm sure AMD and Intel wouldn't mind putting faster chips on the market every X days/weeks/months, but due to the way manufacturing works nothing would ever hit the market. Machines build machines and it takes time to get that up and running in remote places. Once they're finally up and running, obviously it's in the company's best interest to work them to the grave. The longer they are in use, the more they pay themselves off and the prices of hardware drops. When new tech comes out, the cycle continues. If you could eliminate the bottlenecks of manufacturing, you would effectively escalate production dramatically while keeping prices down. Who knows, perhaps the evolution of 3D printing could solve these problems.

Sure, it's nice to have 1000 cores, but having 6 or 8 cores isn't that bad. I have loads of fun with my hexacore, and I have 4 other boxes to tap into with distributed computing + GPGPU. This is all research stuff, but it's my fun and that's what counts. When it comes to games, though, don't think you need to deliver the cream of the crop. Entertainment is the #1 focus; everything else is just bonus. I could flood you with a list of basic-looking games that offer tremendous amounts of entertainment. Don't try to go for ray-traced image quality on day 1.
http://www.nutty.ca - Being a nut has its advantages.

### #31Alienizer

Member

• Members
• 435 posts

Posted 28 March 2012 - 03:31 AM

Yeah, I suppose so. I just want the ultimate even if it doesn't exist yet, though I know it's on the drawing board. It's like a candy in the box that I can't have just yet.

Now they should work on a Star Trek "replicator" so we can have those 1000-core CPUs yesterday for $10.

### #32geon

Senior Member

• Members
• 939 posts

Posted 28 March 2012 - 08:08 AM

The problem with the 1000-core CPU is not so much the cost of manufacturing or the market, but that we as programmers have no clue how to use it. So far we have pretty much assumed a single-core, single-thread programming model when designing software, because that's the hardware we have had.

Massive concurrency and parallelism are the bleeding edge of research. The problem is that most algorithms, and perhaps the way we have become used to thinking about programming over the last 50 years, simply don't fit a 1000-core architecture.

Already, when we run our code on merely 2 cores and a handful of threads, it is almost impossible to avoid shooting ourselves in the feet, arms, heads and shoulders - every time in a different order.

Actually keeping 1000 cores busy will be a very hard problem. It will likely require new languages, or at least something other than the C, C++ and C# we have now.

Purely functional languages could be useful, since parts of an expression can be evaluated independently, each on its own core. Immutable data structures could let memory be safely shared between cores. The problem is that these features are found in languages like Haskell, where heavy memory use and garbage collection make high-performance code under tight constraints close to impossible.
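To illustrate the purity argument (as a stand-in sketch in Python, not Haskell): when a function's result depends only on its arguments and touches no shared mutable state, its calls can be farmed out across workers without locks, and the results are deterministic regardless of scheduling. The `shade` function and its workload here are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor

def shade(pixel):
    # a pure function: the result depends only on the argument, so calls
    # can run on any core in any order without locks (hypothetical workload)
    x, y = pixel
    return (x * x + y * y) % 256

pixels = [(x, y) for y in range(4) for x in range(4)]

with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(shade, pixels))  # results come back in input order

# identical to the sequential computation, with no synchronization needed
assert parallel == [shade(p) for p in pixels]
```

The moment `shade` wrote to a shared structure, the lock-free property would be lost; that is exactly the guarantee purely functional code gives you for free.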

On the hardware side, there are already lots-of-cores processors available, like this one:
http://www.intellasy...id=60&Itemid=75

It's designed to run FORTH on 40 cores, each one a complete computer with ROM, RAM and IO. Code has to be assigned to each core manually, though. I'm not sure whether the cores can be reprogrammed dynamically, but I don't think so.

### #33Stainless

Member

• Members
• 581 posts
• LocationSouthampton

Posted 28 March 2012 - 08:18 AM

Well, it's not all linear; we have actually lost a lot along the way.

Back in about 1990-1995 we had real-time ray tracing; it ran on a transputer farm running the Tao OS.

http://www.uruk.org/emu/Taos.html
http://en.wikipedia.org/wiki/Tao_Group

But it never took off; the main product of all our work was a JVM that ran 147 times faster than Sun's.

Then we lost parallelism for a long time, before it started to come back in the form of GPUs.

The real problem was that you had to be GOOD, I mean REALLY GOOD, to write fast code in Occam. GPUs are much easier to code for, as their parallel nature is largely hidden from the programmer.

Now, for the replicator to work, we need to understand the relationship between energy and mass. We know the high-level relationship (Albert said E = mc^2), but we don't know the relationship at the quantum level.

So if you want your wishes granted, donate money to science and let them find the Higgs.

@geon

That's really interesting. I already have code that converts Forth into GLSL; I wonder if I can go the other way...

### #34Vilem Otte

Valued Member

• Members
• 345 posts

Posted 28 March 2012 - 10:04 AM

Quote

Sure it's nice to have 1000 cores, but having 6 or 8 cores isn't that bad.
Yes! There is also another problem with 1000 cores: limited memory and memory access (you can't fit a whole 3D scene in cache), and this can mess things up. Accessing the same block of memory from 4 cores is already slow; imagine accessing it from 1000 cores. (Of course, you can keep duplicates of the data in separate memory for each group of cores - which smells like distributed computing.)

Quote

Actually keeping 1000 cores busy will be a very hard problem. Likely it will require new languages, or at least something else than the C, C++, C# we have now.
Not necessarily a replacement for C/C++/C# - we may just need a better way to work with them (a library, or extending the language, could do the job).

Using Haskell is cool. Writing useful stuff in it that actually isn't that slow (only about 2 or 3 times slower than C) makes one feel like a boss. Not that I can write Haskell code that good.

### #35geon

Senior Member

• Members
• 939 posts

Posted 28 March 2012 - 12:49 PM

Vilem Otte, on 28 March 2012 - 10:04 AM, said:

limited memory and memory access (you can't fit a whole 3D scene in cache) can mess things up. Accessing the same block of memory from 4 cores is already slow; imagine accessing it from 1000 cores. (Of course, you can keep duplicates of the data in separate memory for each group of cores - which smells like distributed computing.)

Perhaps memory "broadcast" where several cores read the bus at the same time? Sounds like it would cause horrible locking issues, though.

Vilem Otte, on 28 March 2012 - 10:04 AM, said:

Using Haskell is cool. Writing useful stuff in it that actually isn't that slow (only about 2 or 3 times slower than C) makes one feel like a boss. Not that I can write Haskell code that good.

Right! Haskell is pretty nice once you get into the right mindset for it. I wrote a simple Othello/Reversi min/max AI in it and it worked OK - about as fast as my C implementation, I'd say. (I never benchmarked them, nor was the C version fast to begin with.)

It really requires you to rethink a lot of algorithms and data structures, though. I couldn't use mutable arrays as in C, so I came up with a representation of the board as a map from a tuple (the board coordinate) to an "enum" (I don't know the actual name for it in Haskell).
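That board representation can be sketched in Python terms (a hypothetical translation, not geon's actual code): an immutable mapping from coordinate tuples to a disc enum, where an "update" builds a new map instead of mutating the old one, much like Haskell's persistent Data.Map:

```python
from enum import Enum

class Disc(Enum):
    BLACK = 0
    WHITE = 1

# the standard Othello starting position as coordinate -> disc
board = {(3, 3): Disc.WHITE, (3, 4): Disc.BLACK,
         (4, 3): Disc.BLACK, (4, 4): Disc.WHITE}

def place(board, coord, disc):
    # "mutation" returns a fresh board and leaves the old one untouched,
    # so previous game states remain valid (handy for min/max search undo)
    return {**board, coord: disc}

new_board = place(board, (2, 3), Disc.BLACK)
assert (2, 3) not in board            # original board is unchanged
assert new_board[(2, 3)] is Disc.BLACK
```

For a min/max AI this persistence is convenient: each search node can hold its own board without copying logic or explicit undo moves.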

I remember reading a paper from Naughty Dog (or perhaps Epic?), discussing the features of the next generation of programming languages for games. The author suggested a mix of paradigms with purely functional features for the math, etc.

### #36Vilem Otte

Valued Member

• Members
• 345 posts

Posted 28 March 2012 - 02:49 PM

I have actually considered writing game scripts in a functional paradigm - it is really easy to implement and work with, although I'm a bit afraid of what anyone who had to work with it would say.

For example, a simple definition of an enemy in C-like scripting such as:
struct enemy_t
{
    skinned_model enemy_mdl;
    float life;
};

void enemy_t::onhit(float dmg)
{
    life -= dmg;
    if (life < 0)
    {
        life = 0.0f;
        enemy_mdl.setanim(NULL);
        enemy_mdl.setragdoll();
    }
}


could be reduced to:
data Enemy = Enemy Model Float

enemyOnHit :: Enemy -> Float -> Enemy
enemyOnHit (Enemy mdl life) dmg
  | life - dmg < 0 = Enemy (ragdoll mdl) 0
  | otherwise      = Enemy mdl (life - dmg)



### #37geon

Senior Member

• Members
• 939 posts

Posted 28 March 2012 - 03:14 PM

Vilem Otte, on 28 March 2012 - 02:49 PM, said:

data Enemy = Enemy Model Float

enemyOnHit :: Enemy -> Float -> Enemy
enemyOnHit (Enemy mdl life) dmg
  | life - dmg < 0 = Enemy (ragdoll mdl) 0
  | otherwise      = Enemy mdl (life - dmg)


Hmm. Haskell for game scripting... Why not?

I'm all for functional-style programming, but I'd go for a more conventional scripting language - perhaps Lua or Python. Even JavaScript (V8?) would work, and it's a pretty nice language that a lot of people already know.

### #38Vilem Otte

Valued Member

• Members
• 345 posts

Posted 28 March 2012 - 11:37 PM

There is even something called Hint that lets you embed a Haskell interpreter in your applications (considered cool). I hope I'll find some time in the next few weeks to try it out for game-like scripting.


### #39Stainless

Member

• Members
• 581 posts
• LocationSouthampton

Posted 29 March 2012 - 08:37 AM

Quote

Actually keeping 1000 cores busy will be a very hard problem. Likely it will require new languages, or at least something else than the C, C++, C# we have now

Automatic load balancing has been around for ages; we did it in Taos.

The problem comes when the hardware does not support cross-core communication efficiently. If core 1 is waiting for a calculation to finish on core 2, you want that to happen as quickly and simply as possible.
