As far as I understand, the runtime takes SLM size restrictions into account when scheduling work, and you can easily verify this with a simple app. NOTE: by "runtime" here I mean the low-level runtime (Level Zero, OpenCL, etc.), so it's better to forward this question to the https://github.com/intel/compute-runtime/ project. The DPC++ runtime just propagates the information from the user to the low-level runtime; the mapping of work-items and work-groups to HW is done by the low-level runtime.
Taking SIMD8 as an example, the maximum number of work-groups on a DualSubSlice (DSS) is: (16 EUs * 8 threads * SIMD8) / (wg_size 512) = 2.
Say a DSS has 64 KB of SLM and a work-group needs 40 KB of SLM. Does the runtime know that the maximum number of work-groups that can be scheduled on a DSS is then 1 (because 40 * 2 > 64), or must the developer make sure on the application side not to use more than 32 KB of SLM per work-group?
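To make the arithmetic concrete, here is a minimal sketch of the occupancy bound described above. It only models the numbers assumed in this question (16 EUs, 8 threads per EU, 64 KB of SLM per DSS); the `DssLimits` struct and `max_resident_work_groups` function are hypothetical names, and nothing here queries the device or reflects what the driver actually does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical description of one DualSubSlice (DSS). The values used with it
// come from the question, not from a real device query.
struct DssLimits {
    std::uint32_t eu_count;        // EUs per DSS
    std::uint32_t threads_per_eu;  // hardware threads per EU
    std::uint32_t slm_bytes;       // SLM capacity per DSS, in bytes
};

// Upper bound on the number of work-groups that fit on one DSS at the same
// time: the smaller of the thread-limited and the SLM-limited count.
std::uint32_t max_resident_work_groups(const DssLimits& dss,
                                       std::uint32_t simd_width,
                                       std::uint32_t wg_size,
                                       std::uint32_t slm_per_wg_bytes) {
    assert(simd_width > 0 && wg_size > 0);

    // Each work-group occupies ceil(wg_size / simd_width) hardware threads.
    const std::uint32_t threads_per_wg = (wg_size + simd_width - 1) / simd_width;
    const std::uint32_t by_threads =
        (dss.eu_count * dss.threads_per_eu) / threads_per_wg;

    // A kernel that uses no SLM is limited by threads only.
    if (slm_per_wg_bytes == 0) {
        return by_threads;
    }

    const std::uint32_t by_slm = dss.slm_bytes / slm_per_wg_bytes;
    return std::min(by_threads, by_slm);
}
```

With the numbers from the question, `max_resident_work_groups({16, 8, 64 * 1024}, 8, 512, 40 * 1024)` evaluates to 1, while 32 KB per work-group gives 2. Whether the low-level runtime applies exactly this bound is the open question here.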