Using kernel specific max work group size instead of device max work … #542

Merged
merged 3 commits into from
Jul 11, 2024


fengyuan14
Contributor

…group size.

The max work group size of a kernel is not a static, device-only property. In SYCL it now depends on the driver/compiler implementation. The device max work group size is the largest work group size the device may allow, but the actual limit for a given kernel depends on the driver/compiler implementation, for example on compilation optimizations. Querying the kernel-specific max work group size returns the limit that is actually allowed. For example, on Xe, if the compiler chooses SIMD16 and large GRF (32 HW threads per SS), the actual max work group size is 512 (16 * 32), not the 1024 reported by device::info::max_work_group_size.

@fengyuan14 fengyuan14 requested a review from gujinghui July 6, 2024 13:13
…group size.

The max work group size of a kernel is not a static, device-only property. In SYCL it now depends on the driver/compiler implementation.
The device max work group size is the largest work group size the device may allow, but the actual limit for a given kernel depends on
the driver/compiler implementation, for example on compilation optimizations. Querying the kernel-specific max work group size returns
the limit that is actually allowed. For example, on Xe, if the compiler chooses SIMD16 and large GRF (32 HW threads per SS), the actual
max work group size is 512 (16 * 32), not the 1024 reported by device::info::max_work_group_size.

Signed-off-by: Feng Yuan <feng1.yuan@intel.com>
@majing921201
Contributor

majing921201 commented Jul 8, 2024

I have a question about the syclMaxWorkItemsPerEU API, which is defined by us. We used it to compute the actual work group size per EU when the compiler did not give us the correct number. Now it seems redundant, right?

@fengyuan14
Contributor Author

I have a question about the syclMaxWorkItemsPerEU API, which is defined by us. We used it to compute the actual work group size per EU when the compiler did not give us the correct number. Now it seems redundant, right?

Yes, we can make syclMaxWorkItemsPerEU and syclMaxWorkItemsPerTile report actual values, since we now ask the compiler for the SIMD width of the kernel.

@majing921201 majing921201 reopened this Jul 8, 2024
@fengyuan14 fengyuan14 added this pull request to the merge queue Jul 11, 2024
Merged via the queue into main with commit 0253fb9 Jul 11, 2024
2 checks passed
@fengyuan14 fengyuan14 deleted the fy/ker-wg-size branch July 11, 2024 02:44