

Inline functions


11 replies to this topic

#1 gardon

    Valued Member

  • Members
  • 282 posts

Posted 28 October 2007 - 04:17 AM

I'm making a 3D engine, and I want to wrap all DirectX work into the engine to make life easier. One of the things I'm going to do is wrap simple function calls into calls of my own in the engine:

Say I want to call the

IDirect3DDevice9->SetTransform(D3DTS_WORLD, &mtxWorld)

method, but I don't want the user to have to worry about the Device Object, or the SetTransform method. To fix this, I would create a function like so:

void SetWorldPosition(const D3DXMATRIX& position)
{
     // Pass by const reference to avoid copying the 4x4 matrix on every call.
     Device->SetTransform( D3DTS_WORLD, &position );
}


and could call it from the game like this:


Engine->SetWorldPosition( m_pPlayer->Position() );


The Player doesn't know about the matrix, only that the player has a position, so he would pass it subconsciously.


Now, I want my engine to be as fast as possible. If I were to declare a function like this, would it waste time since it has to call another function inside its body (SetWorldPosition() -> SetTransform())? Would it be faster if I let users handle the device and the SetTransform function themselves (so just SetTransform() instead of both functions)?

If I 'inline' this function, will that help solve my problem?


Thanks,

-Gardon

#2 J22

    Member

  • Members
  • 92 posts

Posted 28 October 2007 - 06:36 AM

Inlining will help in builds where inlining is enabled (namely non-debug builds). However, you shouldn't rely extensively on inline function optimizations, because this can make your debug builds run much slower, forcing you to use non-debug builds more regularly during development. I have been on projects where the debug build ran, say, 5x slower than the release build, making it impossible to do any sensible work with it. Many developers think inlining makes the call free and thus bloat their code with things like setter/getter functions, so keep that in mind.
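
A minimal sketch of the kind of wrapper under discussion, assuming stand-in types (Matrix, TransformState, FakeDevice) in place of the real DirectX ones so that it compiles without the SDK. Whether the call is actually folded into the caller depends on the compiler and the build settings:

```cpp
// Stand-in types so the sketch compiles without the DirectX SDK (assumptions).
struct Matrix { float m[16]; };
enum TransformState { TS_WORLD };

// Hypothetical device that just records what it was given.
struct FakeDevice {
    int calls = 0;
    Matrix world{};
    void SetTransform(TransformState, const Matrix* mtx) {
        world = *mtx;
        ++calls;
    }
};

class Engine {
public:
    explicit Engine(FakeDevice* device) : device_(device) {}

    // Defined inside the class body, so it is implicitly inline. An
    // optimizing build may fold it into the caller; a debug build
    // typically keeps it as a real call.
    void SetWorldPosition(const Matrix& position) {
        device_->SetTransform(TS_WORLD, &position);
    }

private:
    FakeDevice* device_;
};
```

In a debug build every such wrapper tends to stay a real call, which is where the slowdown described above comes from.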

#3 gardon

    Valued Member

  • Members
  • 282 posts

Posted 28 October 2007 - 06:38 AM

Thanks J22.

How about on the topic of engine speed? What's the overhead of calling an additional function every once in a while?

I think what I'm going to try is to add more code to each function, so that it does more work per call (rather than just calling SetTransform(), for example). We'll see how it goes.

#4 Reedbeta

    DevMaster Staff

  • Administrators
  • 4979 posts
  • Location: Bellevue, WA

Posted 28 October 2007 - 07:42 AM

How often do you need to set the world matrix? Something like once per frame? The overhead of an additional function call won't be noticeable on that scale whether it's inlined or not. Don't worry about optimizing things that take up < 1% of your CPU time anyway - just write them in the way you want to write them, and don't worry about performance.

On the other hand, if this situation arises with something like setting shader parameters, i.e. something that happens hundreds or thousands of times a frame, the overhead could add up to something significant.

In any case, you should profile the application and determine what's actually taking up the most time, then focus your optimization efforts on that. Programmers making guesses about what's eating performance are usually wrong - not because they're incompetent programmers, just because modern architectures are complicated enough that it's extremely difficult to predict how code will perform without running it and seeing.
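
As a rough sketch of measuring instead of guessing, assuming a hypothetical ScopedTimer helper and a placeholder UpdateShaderParams function (neither is from any real engine). A proper profiler gives far better data; this only shows the idea of timing a suspected hot spot:

```cpp
#include <chrono>

// Hypothetical helper: adds the lifetime of the object, in milliseconds,
// to a caller-owned accumulator.
class ScopedTimer {
public:
    explicit ScopedTimer(double& totalMs)
        : totalMs_(totalMs), start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const auto end = std::chrono::steady_clock::now();
        totalMs_ +=
            std::chrono::duration<double, std::milli>(end - start_).count();
    }

private:
    double& totalMs_;
    std::chrono::steady_clock::time_point start_;
};

// Placeholder for work that happens many times per frame.
double UpdateShaderParams(int count) {
    double sum = 0.0;
    for (int i = 0; i < count; ++i) {
        sum += i * 0.5;
    }
    return sum;
}
```

Wrapping the per-frame call in a block such as `{ ScopedTimer t(frameMs); UpdateShaderParams(n); }` accumulates its cost in `frameMs`, which can then be compared against the total frame time.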
reedbeta.com - developer blog, OpenGL demos, and other projects

#5 J22

    Member

  • Members
  • 92 posts

Posted 28 October 2007 - 11:10 AM

Well, I think programmers too often excuse themselves from thinking about optimization & performance early because they claim that optimization & profiling can be done at a later stage in development. Here's actually a good article I just recently read related to that: http://www.acm.org/u...24_fallacy.html

So, if you for example utilize bad programming practices, it doesn't show up as a spike in your profiler because badly performing code is evenly distributed over your entire codebase.

I don't know if I can give you any good guideline when you should inline functions and when not though. Personally I just make a call based on the expected frequency of the function call, the complexity of the function and how well it may potentially be optimized when inlined (parameter passing & inter-function optimizations)

#6 Nils Pipenbrinck

    Senior Member

  • Members
  • 597 posts

Posted 28 October 2007 - 01:17 PM

Modern compilers know pretty well what to inline and what not to. With whole program optimization passes it's not even necessary anymore to mark functions for inlining and write them into header files instead of .c files. The compiler will inline and specialize functions across different translation units.

Forcing functions to inline can (and in most cases will) even degrade performance, since it increases the code size. This may reduce the efficiency of the code cache. Think about it: is it worth saving, let's say, 10 calls if that causes the instruction cache to page in a kilobyte of code (and throw out another kilobyte that you may need some ticks later)? I'd say the memory traffic is the dominant cost here.

There are cases where the compiler is too dumb to do the job, and there are some clever programmers out there who know what to inline and what not to. However, those efforts might only give you a performance boost of, let's say, 5% in the end. Imho it's not worth worrying about. Better to spend your time improving your algorithms and data-access patterns, and working on the general structure of your code.

Quote

So, if you for example utilize bad programming practices, it doesn't show up as a spike in your profiler because badly performing code is evenly distributed over your entire codebase.

I 100% agree with you on that. A poorly chosen algorithm might never show up as a peak in the performance chart if it's buried under badly performing code.

I've seen game code where a good part of the CPU time was wasted on needless memory management. They had lots of Lua script running and decided to pass all parameters to and from Lua via strings. The conversion function created and destroyed string objects on the heap so frequently that memory management became the number one bottleneck.

As far as I know the game never got released. The string thing was just one of many problems. The code was so hopelessly badly written that refactoring it would have taken more time than a rewrite from scratch.
My music: http://myspace.com/planetarchh <-- my music

My stuff: torus.untergrund.net <-- some diy electronic stuff and more.

#7 J22

    Member

  • Members
  • 92 posts

Posted 28 October 2007 - 02:12 PM

I don't think the major reason for inlining is to avoid the call instructions, but rather to allow the compiler to make better optimizations once the code is inlined. Let's say for simple vector operations (mul, add, dot, etc.) the code potentially gets substantially optimized when the compiler can optimize for the specific sequence of vector operations. Or if you think about getter/setter functions (car->set_speed(10.0f); ... void car::set_speed(float speed) {m_speed=speed;}), the non-inlined call, with its overhead of passing values to the function, etc., means much more code executed and fetched into cache than when the value is assigned directly to a specific memory location after inlining.
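
A small sketch of the two cases mentioned here, assuming a hypothetical Vec3 type and an integrate helper made up for illustration. With the operators visible as inline functions, the compiler is free to optimize the whole expression in integrate as one unit rather than as separate calls:

```cpp
// Hypothetical 3-component vector used only for this illustration.
struct Vec3 {
    float x, y, z;
};

inline Vec3 operator+(const Vec3& a, const Vec3& b) {
    return {a.x + b.x, a.y + b.y, a.z + b.z};
}

inline Vec3 operator*(const Vec3& v, float s) {
    return {v.x * s, v.y * s, v.z * s};
}

inline float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// A short operation sequence; once the operators are inlined the compiler
// can keep the intermediate values in registers.
inline Vec3 integrate(const Vec3& pos, const Vec3& vel, float dt) {
    return pos + vel * dt;
}

// The setter from the post, written as an in-class (implicitly inline)
// member; inlined, it boils down to a single store to m_speed.
class car {
public:
    void set_speed(float speed) { m_speed = speed; }
    float speed() const { return m_speed; }

private:
    float m_speed = 0.0f;
};
```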

But agreed that whole program optimizations can cure things when available, though it can add quite a bit to your link times.

#8 .oisyn

    DevMaster Staff

  • Moderators
  • 1822 posts

Posted 28 October 2007 - 02:59 PM

While I don't necessarily agree with the phrase "Premature optimizations are the root of all evil"™, I don't think you should spend your time on determining which functions should be inlined or not. Think about overall design, what goals you have and what the fastest ways are to get to those goals (in terms of number of operations, rather than actual code execution speed). To go back to the example of shader parameters someone brought up earlier, a function setting a single parameter is nice and clean, but it would probably benefit you to have a structure containing all the parameters, which you can set in one go.
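
To illustrate the "set in one go" idea, here is a sketch that assumes a hypothetical ShaderParams block of six float4 registers and a fake device that only counts calls. The names and the register layout are invented for the example and are not a real API:

```cpp
// Number of float4 registers in the hypothetical parameter block
// (e.g. four matrix rows, a light direction and a tint).
constexpr int kVec4Count = 6;

struct ShaderParams {
    float regs[kVec4Count * 4];
};

// Stand-in for a device; each call models one trip into the driver.
struct FakeShaderDevice {
    int calls = 0;
    int vec4sUploaded = 0;

    void SetFloat4Array(int startRegister, const float* data, int vec4Count) {
        (void)startRegister;
        (void)data;
        ++calls;
        vec4sUploaded += vec4Count;
    }
};

// One call per parameter: clean, but six trips into the device.
inline void SetParamsOneByOne(FakeShaderDevice& dev, const ShaderParams& p) {
    for (int i = 0; i < kVec4Count; ++i) {
        dev.SetFloat4Array(i, &p.regs[i * 4], 1);
    }
}

// Whole block at once: the same data, a single trip.
inline void SetParamsAtOnce(FakeShaderDevice& dev, const ShaderParams& p) {
    dev.SetFloat4Array(0, p.regs, kVec4Count);
}
```

Both functions upload the same six registers; the difference is purely in the number of operations, which is the point being made rather than the speed of any individual call.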

Also, from the code you have given, I think your "engine" is too low level. Game code doesn't care about setting world matrices and vertex buffers, or doing drawprim calls. It just wants to render a model using a set of bone matrices. And that model, as far as the game is concerned, is just an abstract object rather than the actual implementation containing the necessary vertex buffers, index buffers, textures and shader parameters.

So you should be able to do
Engine->DrawModel(playerModel, playerBoneMatrices);
// or:
playerModel->Draw(playerBoneMatrices);
and let the engine implementation take care of the rest, rather than
Engine->SetBoneMatrices(playerBoneMatrices);
Engine->SetVertexBuffer(playerModelVertexBuffer);
Engine->SetIndexBuffer(playerModelIndexBuffer);
Engine->DrawIndexedPrimitive(...);
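
The higher-level form could be sketched roughly like this, assuming a stand-in Renderer that only logs calls and a hypothetical Model that owns its buffer handles. None of these types come from a real engine:

```cpp
#include <string>
#include <vector>

struct Matrix { float m[16]; };

// Stand-in for the low-level layer; it records calls instead of drawing.
class Renderer {
public:
    void SetBoneMatrices(const std::vector<Matrix>& bones) {
        log_.push_back("bones:" + std::to_string(bones.size()));
    }
    void SetVertexBuffer(int id) { log_.push_back("vb:" + std::to_string(id)); }
    void SetIndexBuffer(int id) { log_.push_back("ib:" + std::to_string(id)); }
    void DrawIndexedPrimitive(int triangleCount) {
        log_.push_back("draw:" + std::to_string(triangleCount));
    }
    const std::vector<std::string>& log() const { return log_; }

private:
    std::vector<std::string> log_;
};

// Game code only sees Draw(); the buffer handles stay private to the model.
class Model {
public:
    Model(int vertexBuffer, int indexBuffer, int triangleCount)
        : vertexBuffer_(vertexBuffer),
          indexBuffer_(indexBuffer),
          triangleCount_(triangleCount) {}

    void Draw(Renderer& renderer, const std::vector<Matrix>& bones) const {
        renderer.SetBoneMatrices(bones);
        renderer.SetVertexBuffer(vertexBuffer_);
        renderer.SetIndexBuffer(indexBuffer_);
        renderer.DrawIndexedPrimitive(triangleCount_);
    }

private:
    int vertexBuffer_;
    int indexBuffer_;
    int triangleCount_;
};
```

The four low-level calls still happen, but only inside Model::Draw, so the game never touches vertex or index buffers directly.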

C++ addict
-
Currently working on: the 3D engine for Tomb Raider.

#9 J22

    Member

  • Members
  • 92 posts

Posted 28 October 2007 - 03:34 PM

A seasoned programmer doesn't need to spend any more time thinking about which functions should be inlined than about whether a function should be virtual, what name to choose for a variable, how to comment code, etc. If it doesn't come naturally to you, you might just need some practice.

There are multiple different levels of abstractions in a game engine and I think having world matrix setup like this (or any d3d/opengl level calls) is reasonable in the lowest level. Well, at least I have all the d3d calls wrapped to inline functions to avoid having to do manual checks of error codes for each one for example.
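
A sketch of what such an error-checking wrapper might look like, assuming stand-in result codes and a fake device instead of the real HRESULT and D3D types. Check and SetWorldTransform are hypothetical names, and throwing an exception is just one possible way to handle a failure:

```cpp
#include <stdexcept>
#include <string>

// Stand-ins for HRESULT-style result codes (assumptions, not the D3D ones).
typedef long Result;
const Result kOk = 0;
const Result kFail = -1;

inline bool Failed(Result r) { return r < 0; }

// Hypothetical helper: turns a failed result code into an exception so that
// call sites do not have to test every return value by hand.
inline void Check(Result r, const char* what) {
    if (Failed(r)) {
        throw std::runtime_error(std::string(what) + " failed");
    }
}

// Fake device whose call fails once it is marked as lost.
struct FakeDevice {
    bool lost = false;
    Result SetTransform(int /*state*/, const float* /*matrix*/) {
        return lost ? kFail : kOk;
    }
};

// Thin inline wrapper: one place that performs the call and the check.
inline void SetWorldTransform(FakeDevice& dev, const float* matrix) {
    Check(dev.SetTransform(0, matrix), "SetTransform");
}
```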

#10 .oisyn

    DevMaster Staff

  • Moderators
  • 1822 posts

Posted 28 October 2007 - 05:18 PM

J22 said:

There are multiple different levels of abstractions in a game engine and I think having world matrix setup like this (or any d3d/opengl level calls) is reasonable in the lowest level. Well, at least I have all the d3d calls wrapped to inline functions to avoid having to do manual checks of error codes for each one for example.
Well of course, but you won't be calling that an "Engine", right? :)

#11 J22

    Member

  • Members
  • 92 posts

Posted 28 October 2007 - 05:30 PM

I call it a wrapper which is part of the engine :)

#12 gardon

    Valued Member

  • Members
  • 282 posts

Posted 28 October 2007 - 06:20 PM

Thanks everyone. I wasn't expecting this many replies, but the more the better!

Basically I'm writing an engine for myself, to make life easier in designing my game. I'd like to make it high level for certain things (like LoadMesh("mesh.x")) that I can set general flags for to take the burden off having huge amounts of code for optimizing the mesh, readjusting the attribute buffer, error checking, etc, within my code.

So I guess you could call it a wrapper, rather than an engine, but I'm looking to make it as high level as possible without compromising flexibility. We'll see how it goes.




