Ray tracing library interface

Vilem_Otte 117 Sep 21, 2013 at 20:34 raytracing library

Hello,

this one will be one of the longer ones, but I would be glad if someone could answer me. My ray tracing library is now in quite good shape. It supports a fair number of features (several different acceleration structures, both CPU and GPU ray tracing, textures, etc.). Now I would like to push it to a state where another user (a programmer) could work with the library. How do you design an interface for a library like this?

So far I've thought about this: the user creates a scene (either a static or a dynamic one) with a description of which acceleration structure he wants to use (BVH, SBVH, KdTree, etc.). At startup (or, for dynamic scenes, at runtime) he generates and sets the triangle buffers he wants to draw (along with materials, though that part is still to be thought out). At the end of setup he calls acceleration structure creation for the given scene.
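Roughly, I imagine the setup flow like this (just a sketch - all the names here are illustrative, nothing final):

```cpp
#include <vector>

enum class AccelType { BVH, SBVH, KdTree };

struct Vertex { float x, y, z; };

class Scene {
public:
    explicit Scene(AccelType accel, bool dynamic = false)
        : accel_(accel), dynamic_(dynamic) {}

    // The user hands over a triangle soup; the library copies and
    // re-arranges it internally.
    void setTriangles(const std::vector<Vertex>& vertices,
                      const std::vector<unsigned>& indices) {
        vertices_ = vertices;
        indices_  = indices;
    }

    // Called once at the end of setup (or per frame for dynamic scenes).
    void buildAccelerationStructure() {
        // ... dispatch to the BVH / SBVH / kd-tree builder based on accel_ ...
    }

private:
    AccelType accel_;
    bool dynamic_;
    std::vector<Vertex> vertices_;
    std::vector<unsigned> indices_;
};

int main() {
    Scene scene(AccelType::SBVH, /*dynamic=*/false);
    scene.setTriangles({{0,0,0}, {1,0,0}, {0,1,0}}, {0, 1, 2});
    scene.buildAccelerationStructure();
}
```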

So much for the scene management (this part is okay as far as I'm concerned).

Now the bigger problem: I would like to allow the user to trace either a single ray or, better, a batch of rays through the scene. The user creates a ray buffer and calls its traversal, receiving the results of the ray cast. I also need a way to get the important data out of a ray cast result (like color at the hit, position, normal, etc.).
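Something along these lines, perhaps (again purely hypothetical names, just to show the shape):

```cpp
#include <vector>

struct Ray { float origin[3]; float dir[3]; float tMax; };

struct Hit {
    bool  valid;        // did the ray hit anything?
    float t;            // distance along the ray
    float position[3];  // world-space hit point
    float normal[3];    // shading normal at the hit
    int   triangleId;   // which primitive was hit
};

class Tracer {
public:
    // Trace a whole buffer of rays in one call; one Hit per input Ray.
    std::vector<Hit> trace(const std::vector<Ray>& rays) {
        std::vector<Hit> hits(rays.size());
        // ... traverse the acceleration structure for each ray,
        //     ideally in parallel / in SIMD packets ...
        return hits;
    }
};
```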

How does that sound? Is it missing anything? Any cleverer approaches?

Note: take this as a little brain dump, and hint, comment, or ask away - just to keep my brain busy thinking about "how to do this to make a usable library".

7 Replies


Reedbeta 167 Sep 22, 2013 at 02:55

You might want to look into things like OptiX or other raytracing APIs to see how they handle the interface.

TheNut 179 Sep 22, 2013 at 04:27

I can throw in my 2 cents. In my engine, I kept my tracer system as simple as possible. I don't force the user into micro-managing calculations and memory. In fact, most of the time the user only has to write custom shaders to render the results. The core functionality is rarely touched, but the option is there to customize it with new calculations and such. I won't discuss the task scheduling system I have, but the raytracer breaks down into two main components:

Caster - The role of the caster is to manage the one or many rays that colour a single pixel. Typically the first ray comes from the camera, which is handled automatically behind the scenes (but that's something you can customize if necessary). Each ray dispatches an event to let the delegate know everything you could possibly want to know about the ray. Some of it is calculated in the sample ray implementation, some can be customized (such as energy falloff), and information about the object the ray intersected (position, uv, normal, material) is requested from the scene.

From here, you can tell the caster that you want to fire another set of rays - as many as you want, and for whatever purpose. If you wanted to calculate ambient occlusion, for example, you would fire off several dozen secondary rays to sample from. This works by registering child rays with the caster (which you can also group for organizing) and associating them with your current ray (the parent). The caster then queues and schedules those rays for work. Those child rays can in turn create more child rays. When all work items are done (all child rays completed), the caster returns to the parent ray for further processing. This continues all the way up to the first ray cast by the camera. Once all that's done, the data is passed over to the renderer for colouring the pixel.
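A stripped-down sketch of that idea (not my actual code - the names are made up here just to show the shape of it):

```cpp
#include <functional>
#include <queue>
#include <vector>

struct RayTask {
    int parent;   // index of the parent ray, -1 for the camera ray
    // ... origin, direction, accumulated energy, etc. would live here ...
};

class Caster {
public:
    // The delegate is notified about each traced ray and may spawn children.
    using Delegate = std::function<void(Caster&, int rayIndex, const RayTask&)>;

    explicit Caster(Delegate d) : delegate_(std::move(d)) {}

    // Register a child ray under a parent (e.g. ambient occlusion samples).
    int spawnChild(int parent) {
        tasks_.push_back({parent});
        int index = (int)tasks_.size() - 1;
        pending_.push(index);
        return index;
    }

    // Process rays until all work items (including late-spawned children)
    // are done; the real thing would schedule these across threads.
    void run() {
        while (!pending_.empty()) {
            int i = pending_.front();
            pending_.pop();
            // ... intersect tasks_[i] against the scene here ...
            RayTask task = tasks_[i];      // copy: the delegate may grow tasks_
            delegate_(*this, i, task);
        }
    }

private:
    Delegate delegate_;
    std::vector<RayTask> tasks_;
    std::queue<int> pending_;
};

int main() {
    Caster caster([](Caster& c, int idx, const RayTask& t) {
        if (t.parent == -1)                      // camera ray: fire a few
            for (int k = 0; k < 8; ++k)          // secondary AO samples
                c.spawnChild(idx);
    });
    caster.spawnChild(-1);                       // seed with one camera ray
    caster.run();
}
```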

Renderer - The renderer is similar to a fragment shader, though being software based it's a lot more robust. Shaders are organized in a graph structure, very similar to the node system in Blender (as well as my texgen tool). Shaders are linked to other shaders so they can be post-processed. Most of this is done in parallel, so one pixel flows through the entire graph; however, there are special nodes that can request to wait for the entire image to complete before running. This is done, for example, when you need to use convolution filters.
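In sketch form, the graph looks something like this (illustrative only - the structure follows what I described, not my actual classes):

```cpp
#include <memory>
#include <vector>

struct Color { float r, g, b; };

// A shader node transforms a pixel colour; nodes are linked into a graph
// so the output of one feeds the next (like Blender's node system).
class ShaderNode {
public:
    virtual ~ShaderNode() = default;
    virtual Color shade(Color in) const = 0;
    // Nodes such as convolution filters would override this to request the
    // whole image before running, instead of streaming pixel by pixel.
    virtual bool needsFullImage() const { return false; }
};

class Exposure : public ShaderNode {
public:
    explicit Exposure(float gain) : gain_(gain) {}
    Color shade(Color in) const override {
        return {in.r * gain_, in.g * gain_, in.b * gain_};
    }
private:
    float gain_;
};

// One pixel flows through the whole chain of linked shaders.
Color runGraph(const std::vector<std::unique_ptr<ShaderNode>>& chain, Color c) {
    for (const auto& node : chain) c = node->shade(c);
    return c;
}
```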

And that's pretty much the gist of it. It's lightweight, and the event-delegate system allows me to customize key parts without dealing with inheritance or writing tons of boilerplate code. The sample implementations are easy to pick up and customize (because I tend to forget the code I write :). You can also check out OptiX like Reed suggested, although IMO it's not a high-calibre API. nVidia is a company of researchers who crave performance and low-level programming; I find their APIs reflect that and take away the essence of simplicity.

Stainless 151 Sep 22, 2013 at 10:27

I think the caster and renderer can have any API you feel comfortable with. After all, it's your code, and once you have an API in place that you like, you can always write a high-level API on top of it so end users get a very simple interface.

For me the key point of making it usable will be asset importing.

I have three uses for a good raycaster.

1) Traditional triangle rasterisation (meshes imported from any standard 3D format)
2) Volume rendering (assets imported as textures)
3) Voxel rendering (sparse voxel octrees or similar)

And that's just me!

Other people may want to render standard primitives (spheres, cubes, icosakaiheptagons…), or distance fields, or equations, or dreams of electric sheep.

If you can come up with a plugin architecture for asset importing, I think you would have a massively useful tool.
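Something this simple would do the job (entirely hypothetical names, just to show the shape):

```cpp
#include <memory>
#include <string>
#include <vector>

struct ImportedAsset {
    std::vector<float>    positions;  // flattened xyz triples
    std::vector<unsigned> indices;    // triangle indices (empty for volumes)
};

class AssetImporter {
public:
    virtual ~AssetImporter() = default;
    virtual bool canImport(const std::string& path) const = 0;
    virtual ImportedAsset import(const std::string& path) = 0;
};

class ImporterRegistry {
public:
    void add(std::unique_ptr<AssetImporter> imp) {
        importers_.push_back(std::move(imp));
    }
    // The first plugin that recognises the file wins (mesh, volume, voxel, ...).
    AssetImporter* find(const std::string& path) {
        for (auto& imp : importers_)
            if (imp->canImport(path)) return imp.get();
        return nullptr;
    }
private:
    std::vector<std::unique_ptr<AssetImporter>> importers_;
};
```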

TheNut 179 Sep 22, 2013 at 14:59

That's the third part of the raytracer system I didn't go into detail about. Fundamentally, the raytracer should be independent of the scene you use, be it triangles, mathematical objects, voxels, or whatever. In my engine the caster has a pointer to a scene interface class with a single method that fetches object data for the current ray. Beyond that, you can implement it any way you want: write your own scene with a mathematically defined sphere in the centre, or go all out and hook it into your game world to render a high-quality screenshot. You could add that kind of support in your framework, but I think it should be optional. I wouldn't want to force someone to use my scene management, because that's a second component they would have to adopt, and a pretty big one too. I see that more as a second framework you could download: people looking for a complete package can get both; otherwise, just download the raytracer engine and plug it into your own framework/engine.
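The interface really is that small. Sketched out (names are invented here, and the sphere is just the "mathematically defined scene" example I mentioned):

```cpp
#include <cmath>

struct Ray     { float origin[3], dir[3]; };
struct HitInfo { float position[3], normal[3]; };

class ISceneQuery {
public:
    virtual ~ISceneQuery() = default;
    // The only thing the caster needs: object data for the current ray.
    virtual bool fetch(const Ray& ray, HitInfo& out) const = 0;
};

// Example: a unit sphere at the origin, defined mathematically - no mesh.
class SphereScene : public ISceneQuery {
public:
    bool fetch(const Ray& r, HitInfo& out) const override {
        float b = 0, c = -1;                    // quadratic coefficients for
        for (int i = 0; i < 3; ++i) {           // |o + t*d|^2 = 1, |d| = 1
            b += r.origin[i] * r.dir[i];
            c += r.origin[i] * r.origin[i];
        }
        float disc = b * b - c;
        if (disc < 0) return false;             // ray misses the sphere
        float t = -b - std::sqrt(disc);         // nearest intersection
        if (t < 0) return false;                // sphere is behind the ray
        for (int i = 0; i < 3; ++i) {
            out.position[i] = r.origin[i] + t * r.dir[i];
            out.normal[i]   = out.position[i];  // unit sphere: normal == point
        }
        return true;
    }
};
```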

Vilem_Otte 117 Sep 22, 2013 at 23:24

Okay, I finally got around to answering.

Basically my ray tracer is just a triangle ray tracer, a high-performance one. Even though adding different primitives wouldn't be that hard for me (although it would hit performance), it is NOT necessary for my purpose. So I guess it is scene dependent (triangular meshes only) - but for a quite obvious reason: speed.

Second, it is independent of scene management. You basically draw triangle buffers with applied transformation matrices; the ray tracer copies the data (re-arranging it into a more usable format) and builds the acceleration structure on it (either SplitBVH or SAH KdTree - you can choose which one to use). This is basically how it works for static scenes.

For dynamic scenes it sadly works the same way, although it builds a different acceleration structure (for now a SahBVH, but with a highly optimized approximate ("guessed") SAH; in the future I'd like to use HLBVH for this … BIH might also be interesting here - I have some implementation lying around, so I might as well add that one - and again you can choose which one to use).

So I guess it is independent of game scene management. As for copying the data, I know it uses redundant memory - but the performance gain from a better triangle layout is so high that I'm willing to sacrifice some memory and memcpy calls. Implementing a faster memcpy is also next on my todo list.
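To give an idea of the kind of re-arrangement I mean (a simplified sketch - not the actual layout my library uses): indexed triangles get flattened into one contiguous, traversal-friendly array so intersection never chases the original index buffer.

```cpp
#include <vector>

struct PackedTri {
    float v0[3], e1[3], e2[3];   // one vertex plus two edges, ready for a
                                 // Moeller-Trumbore style intersection test
};

std::vector<PackedTri> repack(const std::vector<float>&    positions,
                              const std::vector<unsigned>& indices) {
    std::vector<PackedTri> out(indices.size() / 3);
    for (size_t t = 0; t < out.size(); ++t) {
        const float* a = &positions[3 * indices[3 * t + 0]];
        const float* b = &positions[3 * indices[3 * t + 1]];
        const float* c = &positions[3 * indices[3 * t + 2]];
        for (int i = 0; i < 3; ++i) {
            out[t].v0[i] = a[i];
            out[t].e1[i] = b[i] - a[i];   // precompute the edges once here,
            out[t].e2[i] = c[i] - a[i];   // instead of once per ray
        }
    }
    return out;
}
```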

All in all I really like TheNut's idea of separating the ray caster and the renderer, where the renderer might not even be part of the library and could be another library built on top of the ray caster one.

As for now the library keeps growing, and I hope to be able to show you a basic example within a few days.

Now to the purpose (slightly off topic). I'm creating a proof-of-concept game (no more details provided) using a less complex renderer that heavily utilizes ray tracing for certain graphics effects - so I go for "tons of speed". The library has been working inside the engine, but I decided to refactor it a bit and separate it from the engine (so possibly some other guys could try it, if and when I release the library to the public).

Reedbeta 167 Sep 23, 2013 at 00:56

I'd want to be able to save and reload the optimized internal structures for static geometry (the acceleration structure plus whatever internal format you store the triangles in), so I can generate it once, save it to disk, and reload it with no extra memory copies or setup.
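For instance (a rough sketch, making no assumptions about your actual format): if the nodes store child links as offsets into one blob instead of pointers, the whole thing can be written and read back in a single pass, with no fix-up step.

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

struct FileHeader {
    uint32_t magic;         // sanity check, e.g. 'RTAC'
    uint32_t nodeCount;     // BVH/kd-tree nodes follow the header
    uint32_t triCount;      // packed triangles follow the nodes
};

struct FlatNode {
    float    bboxMin[3], bboxMax[3];
    uint32_t leftChild;     // index into the node array, not a pointer
    uint32_t rightChild;    // leaves would encode a triangle range instead
};

bool save(const char* path, const std::vector<FlatNode>& nodes,
          const std::vector<float>& triData) {
    FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    FileHeader h{0x52544143u, (uint32_t)nodes.size(),
                 (uint32_t)(triData.size() / 9)};   // 9 floats per triangle
    std::fwrite(&h, sizeof h, 1, f);
    std::fwrite(nodes.data(), sizeof(FlatNode), nodes.size(), f);
    std::fwrite(triData.data(), sizeof(float), triData.size(), f);
    std::fclose(f);
    return true;
}
```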

Also, do you support instancing at all? I’d think this would be an important part of a high-performance pro raytracer, for reducing memory overhead of repeated parts of scenes.

Vilem_Otte 117 Sep 23, 2013 at 06:05

The basic functionality to save/load the acceleration structure plus scene (in exactly the same layout as it is in memory) is done.

As for instancing, it is on my TODO list (although I basically have everything prepared - e.g. an acceleration structure node can be another, transformed, acceleration structure).
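Roughly the idea (a simplified sketch, not my actual code): a node of the top-level structure points at a whole other, shared acceleration structure plus a transform.

```cpp
#include <memory>

struct Mat4  { float m[16]; };   // world-to-local transform for the instance

struct Accel { /* a built BVH/kd-tree over one mesh */ };

struct InstanceNode {
    std::shared_ptr<const Accel> structure;  // shared by all instances, so
                                             // the mesh is only stored once
    Mat4 worldToLocal;                       // rays get transformed into the
                                             // instance's local space
};
// To traverse: transform the ray by worldToLocal, descend into `structure`,
// then transform hit position/normal back to world space.
```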

As for now, I'm still trying to beat Arauna in CPU performance (I'm pretty close, but only with SIMD ray packets - which is basically the same thing Arauna does to keep its speed).

On the GPU, my goal would be to get close to the Aila & Laine results, although that isn't really possible (they use architecture-dependent code on NVidia; I use architecture-independent OpenCL) - but I think I'm quite close under the circumstances.