ffx_variable_shading.h compute shader #4

Open
pemgithub opened this issue Oct 6, 2022 · 0 comments

Can you explain how the compute shader and related threadgroups work?

From C++ code:

FFX_VariableShading_GetDispatchInfo(data, AdditionalShadingRates(), w, h);
pCmdLst->Dispatch(w, h, 1);
// coarse tiles are potentially 2x2, so each thread computes 2x2 pixels
if (cb->tileSize == 8)
{
    //each threadgroup computes 4 VRS tiles
    numThreadGroupsX = FFX_VariableShading_DivideRoundingUp(vrsImageWidth, 2);
    numThreadGroupsY = FFX_VariableShading_DivideRoundingUp(vrsImageHeight, 2);
}

From VRSImageGenCS.hlsl (static const uint FFX_VariableShading_ThreadCount1D = 8;)

[numthreads(FFX_VariableShading_ThreadCount1D, FFX_VariableShading_ThreadCount1D, 1)]
void mainCS(
    uint3 Gid  : SV_GroupID,
    uint3 Gtid : SV_GroupThreadID,
    uint  Gidx : SV_GroupIndex)
{
    FFX_VariableShading_GenerateVrsImage(Gid, Gtid, Gidx);
}

From ffx_variable_shading.h:

// sample source texture (using motion vectors)
while (index < FFX_VariableShading_SampleCount)
{
    int2 index2D = 2 * int2(index % FFX_VariableShading_SampleCount1D, index / FFX_VariableShading_SampleCount1D);
    float4 lum = 0;
    lum.x = FFX_VariableShading_GetLuminance(baseOffset + index2D + int2(0, 0));
    lum.y = FFX_VariableShading_GetLuminance(baseOffset + index2D + int2(1, 0));
    lum.z = FFX_VariableShading_GetLuminance(baseOffset + index2D + int2(0, 1));
    lum.w = FFX_VariableShading_GetLuminance(baseOffset + index2D + int2(1, 1));
    ...
    index += FFX_VariableShading_ThreadCount;
}

For example, suppose we have an image that's 1080x3200 with 8x8 tiles, so the VRS image is 135x400. Suppose our example supports up to 2x2 coarse pixel size, so numThreadGroupsX is 68 and numThreadGroupsY is 200. Why 68x200? Each threadgroup covers 4 VRS tiles (2x2 tiles), so 135 / 2 rounds up to 68 and 400 / 2 is 200. Each threadgroup is 8x8 threads and each thread gets luminance for 2x2 pixels, so one threadgroup covers 16x16 pixels, which is 2x2 VRS tiles of 8x8 pixels each. I think that sort of makes sense...

However, what is while (index < FFX_VariableShading_SampleCount) doing? With the constants below, it amounts to while (index < 100) { do stuff; index += 64; }. What is this loop for, and is it okay to comment it out?

static const uint FFX_VariableShading_ThreadCount1D = 8;
static const uint FFX_VariableShading_SampleCount1D = FFX_VariableShading_ThreadCount1D + 2; // 10
static const uint FFX_VariableShading_SampleCount = FFX_VariableShading_SampleCount1D * FFX_VariableShading_SampleCount1D; // 100
static const uint FFX_VariableShading_ThreadCount = FFX_VariableShading_ThreadCount1D * FFX_VariableShading_ThreadCount1D; // 64