Use stricter host buffer alignment (64B) required by modern CPUs. #121

Merged

pioto1225
Contributor

No description provided.

@pioto1225
Contributor Author

Hi,

clpeak currently does not show the full device-to-host (D2H) transfer potential of the Arc GPU, because its host buffers do not meet the stricter alignment requirements on the host.
With a Core Ultra 9 285K, the Arc A770 struggles to reach any reasonable D2H bandwidth (about 1.6 GB/s). A 14th-gen Raptor Lake system did better but was still subpar (8 GB/s):

$ ./clpeak --transfer-bandwidth

Platform: Intel(R) OpenCL Graphics
  Device: Intel(R) Arc(TM) A770 Graphics
    Driver version  : 24.39.31294.12 (Linux x64)
    Compute units   : 512
    Clock frequency : 2400 MHz

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 20.25
      enqueueReadBuffer               : 1.69
      enqueueWriteBuffer non-blocking : 21.28
      enqueueReadBuffer non-blocking  : ^C

With the fix:

$ ./clpeak --transfer-bandwidth

Platform: Intel(R) OpenCL Graphics
  Device: Intel(R) Arc(TM) A770 Graphics
    Driver version  : 24.39.31294.12 (Linux x64)
    Compute units   : 512
    Clock frequency : 2400 MHz

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 20.58
      enqueueReadBuffer               : 21.01
      enqueueWriteBuffer non-blocking : 21.86
      enqueueReadBuffer non-blocking  : 22.39
      enqueueMapBuffer(for read)      : 20.38
        memcpy from mapped ptr        : 26.46
      enqueueUnmap(after write)       : 21.94
        memcpy to mapped ptr          : 27.08
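
To illustrate the idea behind the fix, here is a minimal sketch of allocating a 64-byte-aligned host buffer in C++17. It is not the code from the merged commit: the names `kHostAlignment`, `roundUp`, `allocHostBuffer`, `isAligned` and `AlignedBuffer` are hypothetical, and the 64-byte value is taken from the PR title.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>

// Assumed alignment: 64 bytes, the value named in the PR title.
constexpr std::size_t kHostAlignment = 64;

// Round n up to the next multiple of alignment (alignment must be a power of two).
constexpr std::size_t roundUp(std::size_t n, std::size_t alignment) {
    return (n + alignment - 1) & ~(alignment - 1);
}

// Deleter so that std::unique_ptr releases the buffer with std::free,
// which is the matching call for std::aligned_alloc.
struct FreeDeleter {
    void operator()(void* p) const noexcept { std::free(p); }
};

using AlignedBuffer = std::unique_ptr<float[], FreeDeleter>;

// Hypothetical helper: allocate room for `count` floats, aligned to kHostAlignment.
// std::aligned_alloc expects the size to be a multiple of the alignment,
// so the byte count is rounded up first.
inline AlignedBuffer allocHostBuffer(std::size_t count) {
    assert(count > 0);
    const std::size_t bytes = roundUp(count * sizeof(float), kHostAlignment);
    return AlignedBuffer(static_cast<float*>(std::aligned_alloc(kHostAlignment, bytes)));
}

// True if p sits on an `alignment`-byte boundary.
inline bool isAligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

A buffer obtained this way (for example `auto host = allocHostBuffer(n);`) would then be passed via `host.get()` as the host pointer of the read and write calls being benchmarked. `std::aligned_alloc` is not available with MSVC, where `_aligned_malloc` and `_aligned_free` are the usual substitutes.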

For more information, please refer to intel/compute-runtime#775.
Please consider merging.

@krrishnarraj krrishnarraj merged commit f66ac42 into krrishnarraj:master Nov 25, 2024
@krrishnarraj
Owner

Thanks for the find
