On SLM usage with multi work groups #4176

GHGmc2 · 2021-07-24T14:27:23Z


Taking SIMD8 as an example, the maximum number of work-groups that can be resident on a DualSubSlice (DSS) is (16 EUs * 8 threads per EU * SIMD8) / (work-group size of 512) = 2.

Say a DSS has 64 KB of SLM and each work-group needs 40 KB. Does the runtime know that only one work-group can then be scheduled on a DSS (because 40 KB * 2 > 64 KB), or must the developer make sure on the application side that a single work-group uses no more than 32 KB of SLM?
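The arithmetic behind the question can be written down as a small host-side model. This is only a sketch of the two limits being compared. `DssModel`, `groups_by_threads`, `groups_by_slm`, and `max_resident_groups` are hypothetical names, the hardware numbers are the ones quoted above rather than values queried from a device, and taking the minimum of the two limits is an assumption about how a scheduler would behave, not a statement of what any runtime actually does.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of one DualSubSlice (DSS). The numbers plugged in below
// are the ones quoted in the question, not values queried from a device.
struct DssModel {
    std::uint32_t eu_count;        // EUs per DSS
    std::uint32_t threads_per_eu;  // hardware threads per EU
    std::uint32_t slm_bytes;       // SLM shared by all work-groups on the DSS
};

// Work-groups that fit if hardware threads were the only limit.
constexpr std::uint32_t groups_by_threads(const DssModel& dss,
                                          std::uint32_t wg_size,
                                          std::uint32_t simd_width) {
    assert(wg_size > 0 && simd_width > 0);
    // Each hardware thread executes simd_width work-items; round up.
    const std::uint32_t threads_per_wg = (wg_size + simd_width - 1) / simd_width;
    return (dss.eu_count * dss.threads_per_eu) / threads_per_wg;
}

// Work-groups that fit if SLM capacity were the only limit.
constexpr std::uint32_t groups_by_slm(const DssModel& dss,
                                      std::uint32_t slm_per_wg) {
    assert(slm_per_wg > 0);
    return dss.slm_bytes / slm_per_wg;
}

// Assumption: the number of resident work-groups is the smaller of the two limits.
constexpr std::uint32_t max_resident_groups(const DssModel& dss,
                                            std::uint32_t wg_size,
                                            std::uint32_t simd_width,
                                            std::uint32_t slm_per_wg) {
    const std::uint32_t by_threads = groups_by_threads(dss, wg_size, simd_width);
    const std::uint32_t by_slm = groups_by_slm(dss, slm_per_wg);
    return by_threads < by_slm ? by_threads : by_slm;
}

// The configuration from the question: 16 EUs, 8 threads per EU, 64 KB SLM.
constexpr DssModel kDss{16, 8, 64 * 1024};

static_assert(groups_by_threads(kDss, 512, 8) == 2, "thread limit alone allows 2");
static_assert(max_resident_groups(kDss, 512, 8, 40 * 1024) == 1, "40 KB per group: SLM-bound");
static_assert(max_resident_groups(kDss, 512, 8, 32 * 1024) == 2, "32 KB per group: both fit");
```

Under this model, 40 KB per work-group drops the DSS from two resident work-groups to one, while 32 KB keeps both; whether the hardware and driver actually behave this way is exactly what is being asked.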

bader (Maintainer) · 2021-07-26T09:47:48Z

As far as I understand, the runtime takes SLM size restrictions into account when scheduling work-groups, but you can easily verify this with a simple app, right?

NOTE: by "runtime" here I mean a low-level runtime such as Level Zero or OpenCL, so it's better to forward this question to the https://github.com/intel/compute-runtime/ project. The DPC++ runtime just propagates information from the user to the low-level runtime; the mapping of work-items and work-groups to HW is done by the low-level runtime.


GHGmc2 (Author) · Jul 26, 2021

Got it, thanks!
