How does vsg solve performance bottlenecks? #1349
-
Replies: 3 comments 3 replies
-
I'm afraid you've dived into low-level details before we understand the wider context of what you are trying to do. It's far more helpful to start from the top down rather than the bottom up. What do you mean by EDA software?
-
You could try vsg::Instrumentation, specifically the vsg::Profiler version - see the vsginstrumentation example, or any other example that assigns instrumentation. This will output high-level stats on CPU and GPU costs to the console, or to a file if you assign one. At this point I have no idea why you're only getting 10fps, when the screenshot is simple enough that I'd expect tens of thousands of FPS with vsync off. You aren't running the app through a virtual machine or across a network or anything? How many nodes do you have in your scene? How many state and bind-vertex etc. calls? How many vertices and primitives?
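For reference, wiring the profiler into a viewer looks roughly like this. This is a sketch based on the vsginstrumentation example, not a drop-in implementation - the exact Settings fields and the report call may differ between VSG versions, so check the example in vsgExamples for your version:

```cpp
#include <vsg/all.h>
#include <iostream>

// Sketch: attach a vsg::Profiler to an existing viewer, run the frame
// loop, then dump the collected CPU/GPU stats to the console.
void runProfiled(vsg::ref_ptr<vsg::Viewer> viewer)
{
    auto settings = vsg::Profiler::Settings::create();
    auto profiler = vsg::Profiler::create(settings);

    // Hand the instrumentation to the viewer before the main loop so
    // per-frame CPU and GPU costs are collected.
    viewer->assignInstrumentation(profiler);

    while (viewer->advanceToNextFrame())
    {
        viewer->handleEvents();
        viewer->update();
        viewer->recordAndSubmit();
        viewer->present();
    }

    // Write the collected stats out; pass a std::ofstream instead of
    // std::cout to log to a file.
    profiler->log->report(std::cout);
}
```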
-
99 million vertices is quite a lot for a 3050, so I wouldn't expect amazing performance if you can't cull anything. It's almost certainly going to be better to split your draw calls and set up cull nodes so you only pay for the few hundred or few thousand circles that you're showing at once. That doesn't mean one draw call per circle, but there'll be a middle ground that performs much better than what you're seeing now, where the CPU overhead from more draw calls and more frustum intersection tests is well worth the reduction in work for the GPU.

As for why one of your huge draw calls is being attributed all of the time cost and none of the others are, it's unclear. The obvious things to blame (e.g. being able to skip rasterising more primitives, or fewer samples passing the depth and stencil tests) have comparable numbers in the RenderDoc screenshot, so it might just come down to the timing values being misleading. As GPUs are free to do things like reorder draw calls as long as the observable effects are the same, it's sometimes hard to say which time was spent on which draw call.

Finally, I wouldn't put too much faith in the figures that the Windows Task Manager gives for GPU performance. The overall load is typically a mix of several factors, so if one part of the GPU is so overloaded that the rest is work-starved, you can get really low usage figures. In your specific case, you might expect vertex processing or the rasteriser to be entirely busy, but the texture units and render outputs to be waiting with nothing to do most of the time. Sometimes you can avoid this by changing one of the dropdowns above the graphs to view a different metric, but that only covers the things Microsoft have decided to expose a graph for.
You might get more useful information from your graphics driver or a third-party tool like MSI Afterburner - at a minimum, you'll be able to see whether the GPU is running at idle or load clock speeds, and whether you'll need to persuade the driver that this is a high-performance application.
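To make the chunk-and-cull suggestion concrete, here's a minimal self-contained sketch of the idea - split the geometry into chunks, give each a bounding volume, and only draw chunks whose bounds overlap the view. This isn't vsg API (vsg's CullNode does this for you against the view frustum); the `Chunk`/`View` types and the 2D circle-vs-rectangle test are hypothetical simplifications for illustration:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical types: geometry pre-split into chunks, each with a
// bounding circle, culled against a 2D view rectangle.
struct Chunk { float cx, cy, radius; std::size_t vertexCount; };
struct View  { float minX, minY, maxX, maxY; };

// True if the chunk's bounding circle overlaps the view rectangle:
// clamp the circle's centre to the rectangle and compare the distance
// to the radius.
bool visible(const Chunk& c, const View& v)
{
    float nx = std::clamp(c.cx, v.minX, v.maxX);
    float ny = std::clamp(c.cy, v.minY, v.maxY);
    float dx = c.cx - nx, dy = c.cy - ny;
    return dx * dx + dy * dy <= c.radius * c.radius;
}

// Count the vertices that actually need drawing for this view; with a
// zoomed-in view this is a tiny fraction of the whole dataset.
std::size_t verticesToDraw(const std::vector<Chunk>& chunks, const View& view)
{
    std::size_t total = 0;
    for (const auto& c : chunks)
        if (visible(c, view)) total += c.vertexCount;
    return total;
}
```

The point of the middle ground: each per-chunk test costs a few CPU cycles, but every culled chunk saves the GPU its entire vertex workload, so the win is enormous as long as chunks are coarse enough that the test count stays small.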