The vvvv pipeline in DirectX 11 is broadly similar to the one in DirectX 9, with some differences.
Between the moment your patch is evaluated and the moment something finally shows up on screen, a lot of processing happens under the hood.
This processing is divided into different stages:
The GPU pipeline is asynchronous. Let's see what that means using this simple example:
Now let's consider for a minute that we have a synchronous pipeline:
The update stage uploads Quad/Sphere and DynamicBuffer to your GPU (if they are not there already).
Rendering then reaches Constant, which issues a draw command. Since we imagine our pipeline as synchronous, that means (CPU-wise)
our GPU would work like this:
As you can clearly see, this would be totally inefficient: both of your processing units would permanently wait for each other. This is called idle time.
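To put rough numbers on that idle time, here is a minimal sketch in Python of the imaginary synchronous pipeline. The command names and per-command costs are made-up values for illustration only, not measurements, and `synchronous_timeline` is a hypothetical helper.

```python
# Hypothetical costs in milliseconds: (CPU time to issue, GPU time to execute).
commands = {
    "draw Quad": (0.5, 2.0),
    "draw Sphere": (0.5, 3.0),
}


def synchronous_timeline(commands):
    """Model a pipeline where the CPU issues a command, then blocks until the GPU is done."""
    cpu_busy = sum(cpu for cpu, _ in commands.values())
    gpu_busy = sum(gpu for _, gpu in commands.values())
    # Nothing overlaps, so the frame takes the sum of both units' work.
    total = cpu_busy + gpu_busy
    return {
        "total_ms": total,
        "cpu_idle_ms": total - cpu_busy,  # CPU waits while the GPU draws
        "gpu_idle_ms": total - gpu_busy,  # GPU waits while the CPU issues
    }


timeline = synchronous_timeline(commands)
print(timeline)
```

In this toy model each unit is idle for exactly as long as the other one is busy, which is the waste the text describes.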
So instead the pipeline works this way: calls to the pipeline are asynchronous, which means that when you send a graphics command, it is not processed right away. Instead it is stored in a command buffer. Your GPU then consumes commands from this buffer when it decides it needs to.
If we take our previous example, this means your CPU is immediately free to do more work once commands are sent, and your GPU picks up the commands and processes them at some point.
So your GPU might actually be drawing your Quad while your CPU is issuing draw calls for Sphere number 5.
An important aspect is that you don't know when the GPU is processing commands (order is of course preserved). And technically you don't really want to know: letting both units do their job is much easier.
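The command buffer idea can be sketched as a simple queue. This is a toy model in Python, not a DirectX API: `CommandBuffer`, `submit` and `consume` are hypothetical names, and the real driver decides on its own when to consume.

```python
from collections import deque


class CommandBuffer:
    """Toy stand-in for the command buffer (not a DirectX API)."""

    def __init__(self):
        self.pending = deque()   # commands issued by the CPU, not yet processed
        self.executed = []       # commands the GPU has processed, in order

    def submit(self, command):
        # CPU side: the call only queues the command and returns immediately.
        self.pending.append(command)

    def consume(self, count):
        # GPU side: process up to `count` commands, oldest first.
        for _ in range(min(count, len(self.pending))):
            self.executed.append(self.pending.popleft())


buffer = CommandBuffer()
buffer.submit("draw Quad")
for i in range(1, 5):
    buffer.submit(f"draw Sphere {i}")

buffer.consume(1)                    # the GPU gets around to the Quad...
buffer.submit("draw Sphere 5")       # ...while the CPU is still issuing draw calls

print(buffer.executed)               # ['draw Quad']
print(list(buffer.pending))          # Sphere 1 to Sphere 5, still waiting

buffer.consume(len(buffer.pending))  # eventually the GPU drains the rest
```

The CPU never waits in `submit`, and the GPU always sees commands in the order they were issued, which matches the two properties described above.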
There are a few ways to instruct your GPU to immediately process the commands in its buffer:
Now you have both of your units running happily in parallel. There is one issue, though.
Look at ConstantInstanced, which is fed by a dynamic buffer.
We need to upload our transformations to the buffer, which is done in 3 stages:
One very important point is that the Map command also tells the GPU that it is not allowed to use this resource (since we are currently writing to it).
This means that if your GPU's next command is a draw call using that buffer, it will wait until Unmap is called before proceeding.
This is called a stall. It means your GPU will go idle until the upload is done.
Please note that this works both ways: if your GPU is currently using the resource for a draw call and you call Map, your CPU will wait for the GPU to signal that it is done with it before proceeding. This is also a stall, but this time on the CPU side.
Since most of the time we upload resources during the update stage, this generally minimizes the risk of a stall happening (at the cost of an extra stage).
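Both kinds of stall described above can be sketched with a toy timeline. This Python model only mirrors the behaviour as explained in the text; `ToyDynamicBuffer` and its methods are hypothetical, all times are made-up milliseconds, and no real Map/Unmap call is involved.

```python
class ToyDynamicBuffer:
    """Tracks until when each unit holds the buffer (times in ms, invented values)."""

    def __init__(self):
        self.cpu_writing_until = 0.0  # time at which Unmap is called
        self.gpu_reading_until = 0.0  # time at which the last draw using it ends

    def map_write_unmap(self, cpu_now, write_ms):
        # CPU side: Map has to wait until the GPU is done with the buffer.
        start = max(cpu_now, self.gpu_reading_until)
        self.cpu_writing_until = start + write_ms
        return start - cpu_now        # CPU stall

    def draw(self, gpu_now, draw_ms):
        # GPU side: the draw has to wait until Unmap has been called.
        start = max(gpu_now, self.cpu_writing_until)
        self.gpu_reading_until = start + draw_ms
        return start - gpu_now        # GPU stall


buf = ToyDynamicBuffer()
first_upload_stall = buf.map_write_unmap(cpu_now=0.0, write_ms=2.0)
gpu_stall = buf.draw(gpu_now=1.0, draw_ms=4.0)         # GPU is ready before Unmap
cpu_stall = buf.map_write_unmap(cpu_now=3.0, write_ms=2.0)  # draw still running

print(first_upload_stall, gpu_stall, cpu_stall)
```

With these numbers the GPU waits 1 ms for the first upload to finish, and the second Map makes the CPU wait 3 ms for the draw call to complete.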
A few nodes are an exception to this: data retrievers.
Sometimes we need to access a GPU resource and bring it back to our CPU (a simple readback).
What we need to do is:
This is also done using Map, but this time we tell it that we want to read.
What now happens is:
Let's take this simple (but oh so common) example:
Now, what happens in there?
We have an Info node. To output its data, the node needs everything above it to be fully evaluated (which is handled by the vvvv graph), but also fully updated and rendered. And we are still at the evaluation stage, since we need to deliver the data on output pins.
So here we create what is called an "Early Render".
Basically our Info node (during evaluate) will instruct the DX11 pipeline that it needs this resource ready.
Our vvvv pipeline will call Update/Render using Info as a starting point, so everything above it will be processed. Since we also need the runtime to fully complete the draw calls, our Info node (which doesn't do much itself) will actually provoke quite a big chain of actions under the hood.
By doing this, and since our node still needs to wait for that data, we lose the benefit of having our CPU and GPU work concurrently.
That's why your node will suddenly appear to eat a lot of CPU when you run in Debug Mode, which is misleading since this part would have been rendered anyway.
Please note that the DX11 rendering knows that this part has already been rendered, so when you reach update/render it will not process it a second time. This means the performance loss can be anywhere from fairly minimal to very bad (it really depends on a lot of factors). More importantly, it also makes the rendering path unpredictable (which removes opportunities for internal optimization).
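The "rendered only once" behaviour can be sketched as follows. This is a toy Python model under stated assumptions: `ToyPipeline` and the node names `Geometry`, `OffscreenRenderer` and `MainRenderer` are hypothetical and do not come from the patch discussed above or from the actual DX11 implementation.

```python
class ToyPipeline:
    """Renders each node at most once per frame, upstream nodes first."""

    def __init__(self, upstream):
        self.upstream = upstream  # node name -> list of nodes it depends on
        self.rendered = set()     # nodes already rendered this frame
        self.log = []             # (stage, node) in processing order

    def render(self, node, stage):
        if node in self.rendered:
            return                # already done, e.g. by an early render
        for dependency in self.upstream.get(node, []):
            self.render(dependency, stage)
        self.rendered.add(node)
        self.log.append((stage, node))


upstream = {
    "OffscreenRenderer": ["Geometry"],
    "MainRenderer": ["OffscreenRenderer"],
}
pipeline = ToyPipeline(upstream)

# Early render: an Info-like node asks for the offscreen result during evaluate.
pipeline.render("OffscreenRenderer", stage="evaluate")
# Regular render stage: the upstream part is skipped, only the rest is processed.
pipeline.render("MainRenderer", stage="render")

for entry in pipeline.log:
    print(entry)
```

Nothing is rendered twice, but part of the work has moved from the render stage into the evaluate stage, which is what makes the rendering path harder to predict.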
To put it simply, reading back resources has a lot of implications and needs to be handled with care. Please note that in some cases it is needed and cannot be avoided.
In the case of Info, this should never be needed anymore in dx11 (except for debug purposes).
In some cases, reading back data for CPU usage is unavoidable (for example, sending stream compaction results over the network, or exporting a model/texture).
Please note that retrieving average brightness of a texture to connect to a Damper is NOT a use case ;)
If the data is not needed within the same frame, write a node that waits for the presentation stage and Maps your resource after that (since your pipeline will have fully executed by then, you minimize the chance of a big fat stall).
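A node following this advice could be shaped roughly like the sketch below. It is written in Python purely to show the control flow; `DelayedReadback`, `evaluate`, `on_present` and `fake_map_read` are hypothetical names, not part of the vvvv or DX11 plugin interfaces, and the resource is faked with a dictionary.

```python
class DelayedReadback:
    """Outputs the previous frame's data instead of forcing an early render."""

    def __init__(self):
        self._requested = None  # resource to read once the frame is presented
        self.value = None       # last value read back (None until first readback)

    def evaluate(self, resource):
        # Evaluation stage: only remember what we want, do not Map here.
        self._requested = resource
        return self.value       # data is one frame late

    def on_present(self, map_read):
        # After presentation the pipeline has fully executed, so the read
        # is much less likely to wait on pending GPU work.
        if self._requested is not None:
            self.value = map_read(self._requested)
            self._requested = None


def fake_map_read(resource):
    # Stand-in for mapping the resource for reading and copying its content.
    return resource["data"]


node = DelayedReadback()
outputs = []
for frame_value in (10, 20, 30):
    resource = {"data": frame_value}         # what the GPU produces this frame
    outputs.append(node.evaluate(resource))  # what the node outputs this frame
    node.on_present(fake_map_read)

print(outputs)  # [None, 10, 20]
```

The trade-off is one frame of latency on the output pin in exchange for leaving the evaluate stage free of readbacks.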