
could path tessellation be done using the GPU? #18

Open
wtholliday opened this issue May 16, 2019 · 2 comments

@wtholliday
In my profiles, bezier tessellation is consuming considerable CPU time, while the GPU is mostly idle.

I've tried optimizing bezier tessellation on the front-end with little success.

I was wondering about the feasibility of supporting path tessellation (nvg__flattenPaths) on the GPU. This would involve a change to the front-end, but could be easily made compatible with other back-ends.

Bezier tessellation on the GPU could work as follows. The input is a series of Bezier segments to be flattened, and a compute kernel is responsible for flattening them, with each segment processed by one thread. If a segment is flat (either because a flag on the segment says so, or because a flatness test passes when the flag isn't set), it is output unchanged; otherwise two segments are output according to Bezier subdivision. Atomic operations would be used to append to the output buffer. Note that this wouldn't preserve segment order, which might be an issue; if it is, a prefix sum could be used to generate output indices instead.

Here is some pseudocode:

kernel void subdivideBezier(device const BezierSegment* input [[ buffer(0) ]],
                            device BezierSegment* output      [[ buffer(1) ]],
                            device atomic_uint& outCount      [[ buffer(2) ]],
                            uint gid [[ thread_position_in_grid ]]) {
  BezierSegment out[2];
  uint n = 1;

  BezierSegment s = input[gid];

  if (s.flat) {
    // Flattened in an earlier pass; pass it through unchanged.
    out[0] = s;
  } else if (flatnessTest(s)) {
    // Flat enough now; mark it so later passes skip the test.
    s.flat = true;
    out[0] = s;
  } else {
    // Split at t = 0.5 and emit both halves for the next pass.
    out[0] = leftChild(s);
    out[1] = rightChild(s);
    n = 2;
  }

  // Reserve n slots in the output buffer (does not preserve order).
  uint i = atomic_fetch_add_explicit(&outCount, n, memory_order_relaxed);
  output[i] = out[0];
  if (n == 2) output[i + 1] = out[1];
}
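
For reference, the helpers the kernel calls could look like the following. This is only a sketch, written in plain C for readability and assuming cubic segments (an MSL port would swap Vec2 for float2); the flatness test mirrors the chord-distance test nvg__tesselateBezier uses on the CPU, and the split is plain de Casteljau at t = 0.5:

#include <math.h>
#include <stdbool.h>

typedef struct { float x, y; } Vec2;

typedef struct {
  Vec2 p0, p1, p2, p3;  // cubic control points
  bool flat;            // set once the segment passes the flatness test
} BezierSegment;

static Vec2 mid(Vec2 a, Vec2 b) {
  return (Vec2){ 0.5f * (a.x + b.x), 0.5f * (a.y + b.y) };
}

// Flat when both inner control points lie close to the chord p0-p3.
static bool flatnessTest(BezierSegment s) {
  float dx = s.p3.x - s.p0.x, dy = s.p3.y - s.p0.y;
  float d1 = fabsf((s.p1.x - s.p3.x) * dy - (s.p1.y - s.p3.y) * dx);
  float d2 = fabsf((s.p2.x - s.p3.x) * dy - (s.p2.y - s.p3.y) * dx);
  float tessTol = 0.25f;  // tolerance; nanovg scales this by device pixel ratio
  return (d1 + d2) * (d1 + d2) < tessTol * (dx * dx + dy * dy);
}

// de Casteljau split at t = 0.5: the left half keeps p0, the right keeps p3.
static BezierSegment leftChild(BezierSegment s) {
  Vec2 p01 = mid(s.p0, s.p1), p12 = mid(s.p1, s.p2), p23 = mid(s.p2, s.p3);
  Vec2 p012 = mid(p01, p12), p123 = mid(p12, p23);
  Vec2 p0123 = mid(p012, p123);
  return (BezierSegment){ s.p0, p01, p012, p0123, false };
}

static BezierSegment rightChild(BezierSegment s) {
  Vec2 p01 = mid(s.p0, s.p1), p12 = mid(s.p1, s.p2), p23 = mid(s.p2, s.p3);
  Vec2 p012 = mid(p01, p12), p123 = mid(p12, p23);
  Vec2 p0123 = mid(p012, p123);
  return (BezierSegment){ p0123, p123, p23, s.p3, false };
}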

This kernel would be run in multiple passes over the curves, up to the maximum subdivision level (hardcoded at 10 in the front-end). Buffers would be sized to accept the worst-case number of segments (2^10 per input curve).
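
To make the pass structure concrete, here is a CPU reference in C built on the helpers above. On the GPU, each call to subdividePass would become one dispatch of subdivideBezier, count would be the atomic outCount (read back between passes, or fed to an indirect dispatch), and the two buffers would ping-pong between input and output roles:

enum { MAX_LEVEL = 10 };  // matches the front-end's hardcoded subdivision limit

// One pass: the CPU analogue of a single subdivideBezier dispatch.
static int subdividePass(const BezierSegment* in, int inCount,
                         BezierSegment* out) {
  int count = 0;  // plays the role of the atomic outCount
  for (int gid = 0; gid < inCount; gid++) {  // one "thread" per segment
    BezierSegment s = in[gid];
    if (s.flat) {
      out[count++] = s;
    } else if (flatnessTest(s)) {
      s.flat = true;
      out[count++] = s;
    } else {
      out[count++] = leftChild(s);
      out[count++] = rightChild(s);
    }
  }
  return count;
}

// Runs up to MAX_LEVEL passes over ping-ponged buffers; returns the buffer
// holding the flattened result and updates *count.
static BezierSegment* flattenAll(BezierSegment* bufA, BezierSegment* bufB,
                                 int* count) {
  for (int level = 0; level < MAX_LEVEL; level++) {
    int next = subdividePass(bufA, *count, bufB);
    if (next == *count) return bufB;  // nothing subdivided; finished early
    *count = next;
    BezierSegment* tmp = bufA; bufA = bufB; bufB = tmp;  // ping-pong
  }
  return bufA;
}

A real driver could likewise stop as soon as a pass subdivides nothing, rather than always running all ten levels.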

The back end would also have to handle generating geometry for strokes, line caps, line ends, etc.

The other GPU path rendering approaches I'm aware of involve sophisticated fragment shaders, so this may be easier to implement.

It may also be worth looking at https://github.com/linebender/piet-metal. It currently doesn't seem to handle curves on the GPU (Beziers are subdivided on the CPU), but it takes a distance-field approach to path rendering.

See also http://hhoppe.com/ravg.pdf and https://www.microsoft.com/en-us/research/wp-content/uploads/2005/01/p1000-loop.pdf.

@olliwang
Member

olliwang commented May 16, 2019

MetalNanoVG follows the style of the corresponding NanoVG project. To do what you suggest here, we would need to touch the original NanoVG code, and MetalNanoVG would no longer share the same nanovg.c. How about implementing such an improvement in NanoVG/OpenGL first?

@wtholliday
Author

wtholliday commented May 16, 2019

Unfortunately, the OpenGL versions NanoVG targets don't have compute shaders (compute only arrived in OpenGL 4.3, and macOS tops out at 4.1), so doing it the way I described isn't possible there. If NanoVG switched to a Vulkan backend, it would be possible.

I think the changes to nanovg.c might be minimal. I'd be more concerned about the amount of time required to implement this.
