You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
In my profiles, bezier tessellation is consuming considerable CPU time, while the GPU is mostly idle.
I've tried optimizing bezier tessellation on the front-end with little success.
I was wondering about the feasibility of supporting path tessellation (nvg__flattenPaths) on the GPU. This would involve a change to the front-end, but could be easily made compatible with other back-ends.
Bezier tessellation on the GPU could work as follows. The input is a series of Bezier segments to be flattened. A compute kernel is responsible for flattening. Each segment is processed by a thread. If the segment is flat (either by a flag on the segment, or a test if the flag isn't set), then the segment is outputted. If not flat, two segments are outputted according to bezier subdivision. Atomic operations would be used to append to the output buffer (note that this wouldn't preserve segment order, and that might be an issue. If it is, then a prefix-sum could be used to generate output indices).
This kernel would be run in multiple passes over the curves according to the maximum subdivision level (hardcoded at 10 in the front-end). Buffers would be sized to accept the maximum number of segments (2^10).
The back end would have to handle generating segments for strokes, line-cap, line-end, etc.
The other GPU path rendering approaches I'm aware of involve sophisticated fragment shaders, so this may be easier to implement.
It may also be worth looking at https://github.com/linebender/piet-metal. It currently doesn't seem to handle curves on the GPU (Beziers are subdivided on the CPU), but takes a distance-field approach to path rendering.
MetalNanoVG follows the style of the corresponded NanoVG project. To do what you suggested here, we need to touch the original NanoVG code and MetalNanoVG won't share the same nanovg.c anymore. How about implementing such improvement in NanoVG/OpenGL first?
Unfortunately, OpenGL doesn't have compute kernels, so doing it the way I described isn't possible with OpenGL. If they switched to a Vulkan backend, then it's possible.
I think the changes to nanovg.c might be minimal. I'd be more concerned about the amount of time required to implement this.
In my profiles, bezier tessellation is consuming considerable CPU time, while the GPU is mostly idle.
I've tried optimizing bezier tessellation on the front-end with little success.
I was wondering about the feasibility of supporting path tessellation (
nvg__flattenPaths
) on the GPU. This would involve a change to the front-end, but could be easily made compatible with other back-ends.Bezier tessellation on the GPU could work as follows. The input is a series of Bezier segments to be flattened. A compute kernel is responsible for flattening. Each segment is processed by a thread. If the segment is flat (either by a flag on the segment, or a test if the flag isn't set), then the segment is outputted. If not flat, two segments are outputted according to bezier subdivision. Atomic operations would be used to append to the output buffer (note that this wouldn't preserve segment order, and that might be an issue. If it is, then a prefix-sum could be used to generate output indices).
Here is some pseudocode:
This kernel would be run in multiple passes over the curves according to the maximum subdivision level (hardcoded at 10 in the front-end). Buffers would be sized to accept the maximum number of segments (2^10).
The back end would have to handle generating segments for strokes, line-cap, line-end, etc.
The other GPU path rendering approaches I'm aware of involve sophisticated fragment shaders, so this may be easier to implement.
It may also be worth looking at https://github.com/linebender/piet-metal. It currently doesn't seem to handle curves on the GPU (Beziers are subdivided on the CPU), but takes a distance-field approach to path rendering.
See also http://hhoppe.com/ravg.pdf and https://www.microsoft.com/en-us/research/wp-content/uploads/2005/01/p1000-loop.pdf.
The text was updated successfully, but these errors were encountered: