dev performance optimization

Performance Optimization Tips

GDC 2005 presentations

    http://developer.nvidia.com/object/gdc_2005_presentations.html

locking & parallel processing

The most obvious way to keep the CPU from queuing up frames faster than the GPU can render them is to lock the back buffer once per frame in Direct3D (analogous to calling glFinish() in OpenGL). This forces the CPU to wait until the GPU has completed all pending graphics commands. However, it also removes any potential for asynchronous processing: the CPU cannot start on the next frame until the current frame has finished rendering.

A better solution is double-buffered texture locking, a generalization of locking the back buffer. At the end of your frame you render a single triangle to a tiny (2x2) texture, then lock that texture and read its contents. So far this is equivalent to locking the back buffer and suffers the same kind of stalls: it ensures that the CPU never gets more than 1 frame ahead of the GPU.

Now generalize it: use two tiny textures and alternately render to them and alternately lock them:

Render frame 1

Render a triangle to texture 0

Lock and read texture 1

Render frame 2

Render a triangle to texture 1

Lock and read texture 0

Render frame 3

Render a triangle to texture 0

Lock and read texture 1

Render frame 4

Render a triangle to texture 1

Lock and read texture 0

...

Now the GPU is not stalled, and the CPU never gets more than 2 frames ahead of the GPU. Lag is up to one frame, but overall efficiency is higher since the GPU is always busy (if you are GPU bound). You can generalize further to triple-buffered textures, and you may even be able to insert multiple sync points per frame to get finer control over lag.
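To make the trade-off concrete, here is a minimal timing sketch. It is a simplified model, not Direct3D code: the function name simulateTotalMs, its parameters, and the assumption of fixed per-frame CPU and GPU costs are all hypothetical and chosen only for illustration. With one sync texture the CPU locks the texture it has just rendered to (the full-sync case); with two it locks the texture written one frame earlier, as in the sequence above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical timing model (illustration only, no real graphics API calls).
// cpuMs:        CPU time to build and submit one frame.
// gpuMs:        GPU time to render one frame.
// syncTextures: 1 = lock the texture written this frame (full sync),
//               2 = double-buffered, 3 = triple-buffered, ...
// Returns the time at which the GPU finishes the last frame.
long long simulateTotalMs(long long cpuMs, long long gpuMs,
                          int frames, int syncTextures) {
    assert(frames > 0 && syncTextures >= 1);
    std::vector<long long> gpuDone(frames, 0);
    long long cpuClock = 0;
    for (int i = 0; i < frames; ++i) {
        cpuClock += cpuMs;  // CPU builds and submits frame i
        // The GPU starts frame i once it is submitted and frame i-1 is done.
        long long gpuFree = (i > 0) ? gpuDone[i - 1] : 0;
        gpuDone[i] = std::max(cpuClock, gpuFree) + gpuMs;
        // Locking a sync texture blocks the CPU until the frame that
        // rendered into it has completed on the GPU.
        int lockedFrame = i - (syncTextures - 1);
        if (lockedFrame >= 0) {
            cpuClock = std::max(cpuClock, gpuDone[lockedFrame]);
        }
    }
    return gpuDone[frames - 1];
}
```

Under this model, with 6 ms of CPU work and 10 ms of GPU work per frame, 100 frames take 1600 ms with a single sync texture (CPU and GPU run strictly one after the other) and 1006 ms with two (the GPU is busy almost the whole time). A third texture does not improve the total in this GPU-bound case; it only lets the CPU run further ahead.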

A second solution is to use DirectX 9's asynchronous query functionality (analogous to using fences in OpenGL). At the end of your frame, insert a D3DQUERYTYPE_EVENT query into your rendering stream. You can then poll whether the GPU has reached this event by calling GetData. As with double-buffered texture locking, you can thus ensure (by busy-waiting on the CPU) that the CPU never gets more than 2 frames ahead of the GPU, while the GPU never goes idle. It is likewise conceivable to insert multiple queries per frame to get finer control over lag.
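The polling logic can be sketched independently of the graphics API. The FrameLimiter class below is hypothetical and not part of DirectX: it only keeps a queue of "has the GPU reached this event yet?" callbacks and busy-waits on the oldest one when too many are pending. In a Direct3D 9 application the callback would presumably wrap an event query created with IDirect3DDevice9::CreateQuery(D3DQUERYTYPE_EVENT, ...), issued with Issue(D3DISSUE_END), and polled with GetData until it stops returning S_FALSE; that wiring is an assumption and is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical helper (not a DirectX type). Each frame hands in a callback
// that returns true once the GPU has reached that frame's end-of-frame event.
class FrameLimiter {
public:
    // maxPendingEvents = 0 waits for the current frame (full sync);
    // maxPendingEvents = 1 waits for the previous frame, so the CPU can be
    // building frame N+1 while the GPU is still rendering frame N.
    explicit FrameLimiter(std::size_t maxPendingEvents)
        : maxPending_(maxPendingEvents) {}

    // Call once per frame, right after issuing the end-of-frame event.
    // Returns the number of unsuccessful polls spent busy-waiting.
    std::size_t endFrame(std::function<bool()> eventReached) {
        pending_.push_back(std::move(eventReached));
        std::size_t failedPolls = 0;
        while (pending_.size() > maxPending_) {
            // Busy-wait on the oldest outstanding event.
            while (!pending_.front()()) {
                ++failedPolls;
            }
            pending_.pop_front();
        }
        return failedPolls;
    }

    std::size_t pending() const { return pending_.size(); }

private:
    std::size_t maxPending_;
    std::deque<std::function<bool()>> pending_;
};
```

A real application would probably not spin in a tight loop but do useful CPU work between polls. Inserting several events per frame, as suggested above, amounts to calling endFrame more than once per frame with a correspondingly larger limit.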
