ULP requirements for fp16 divide #1278

lakshmih · 2024-11-05T21:25:19Z

The OpenCL C specification gives the following ULP requirements:

Single precision divide (x/y) and reciprocal (1.0/x) must be accurate to ≤ 2.5 ulp.

For half precision, however, both operations are defined as needing to be 'correctly rounded'.

We would like to propose that these be defined with a specific ULP bound (that is, a relaxed precision requirement), following the pattern for other built-ins.
Specifically, we would like to set a precision requirement of <= 1 ulp for both of these cases for fp16.

The double precision requirements for these cases have the same discrepancy (a specific ULP bound for float, correctly rounded for double), so these should be reviewed as well.
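
To make the difference between 'correctly rounded' (an error of at most 0.5 ulp) and '<= 1 ulp' concrete, here is a minimal Python sketch that measures the error of an fp16 quotient in ulps using exact rational arithmetic. The helper names (`to_half`, `half_ulp`, `divide_ulp_error`) are made up for this illustration, and `half_ulp` uses a simplified ulp definition (the spacing of binary16 values in the binade of the exact result), which may differ from the specification's wording in corner cases.

```python
import struct
from fractions import Fraction

HALF_MANT_BITS = 10   # explicit fraction bits in IEEE 754 binary16
HALF_MIN_EXP = -14    # exponent of the smallest normal binary16 value

def to_half(x: float) -> float:
    """Round a Python float to the nearest binary16 value (ties to even)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def half_ulp(q: Fraction) -> Fraction:
    """Spacing of binary16 values around a positive real q (simplified)."""
    e = q.numerator.bit_length() - q.denominator.bit_length()
    if Fraction(2) ** e > q:   # e was one too large; now e == floor(log2(q))
        e -= 1
    return Fraction(2) ** (max(e, HALF_MIN_EXP) - HALF_MANT_BITS)

def divide_ulp_error(result: float, x: float, y: float) -> Fraction:
    """Exact error of `result` versus the real quotient x / y, in binary16 ulps."""
    exact = Fraction(x) / Fraction(y)
    return abs(Fraction(result) - exact) / half_ulp(abs(exact))

nearest = to_half(1.0 / 3.0)       # correctly rounded fp16 quotient
neighbour = nearest + 2.0 ** -12   # the next binary16 value above it

print(divide_ulp_error(nearest, 1.0, 3.0))    # prints 1/3
print(divide_ulp_error(neighbour, 1.0, 3.0))  # prints 2/3
```

Both results would satisfy a <= 1 ulp requirement, but only the first one is correctly rounded.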

svenvh · 2024-11-06T15:39:48Z

As an additional consideration, for fp32 the specification defines the CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT value for the CL_DEVICE_SINGLE_FP_CONFIG query:

CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT - divide and sqrt are correctly rounded as defined by the IEEE754 specification.

In case implementations want to keep advertising a correctly rounded divide for half/double, we could consider extending the CL_DEVICE_HALF_FP_CONFIG and CL_DEVICE_DOUBLE_FP_CONFIG queries with CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT. That may also require revisiting the sqrt ULP requirements.
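
As a rough sketch of what checking such a capability bit could look like, the Python snippet below tests a `cl_device_fp_config` bitfield for CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT. The bit values mirror the definitions in `CL/cl.h`. The function name and the sample bitfields are hypothetical, and reporting this bit for the half or double queries is only the proposal above, not something the specification defines today.

```python
# Bit values of the cl_device_fp_config bitfield, as defined in CL/cl.h.
CL_FP_DENORM = 1 << 0
CL_FP_INF_NAN = 1 << 1
CL_FP_ROUND_TO_NEAREST = 1 << 2
CL_FP_ROUND_TO_ZERO = 1 << 3
CL_FP_ROUND_TO_INF = 1 << 4
CL_FP_FMA = 1 << 5
CL_FP_SOFT_FLOAT = 1 << 6
CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT = 1 << 7

def advertises_correct_divide_sqrt(fp_config: int) -> bool:
    """True if the bitfield reports correctly rounded divide and sqrt."""
    return bool(fp_config & CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT)

# Hypothetical values a device might return for a *_FP_CONFIG query.
relaxed_device = CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST
precise_device = relaxed_device | CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT

print(advertises_correct_divide_sqrt(relaxed_device))  # prints False
print(advertises_correct_divide_sqrt(precise_device))  # prints True
```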

bashbaug · 2024-11-11T18:57:44Z

Discussed in the November 5th teleconference. The decisions we need to make are:

Should we reduce the precision requirements for an fp16 divide? (Leaning towards “yes”.)
If we reduce the precision requirements for an fp16 divide, what should we reduce it to? <= 1 ULP? <= 0.5 ULP? Something else?
Do we need a build option like “-cl-fp32-correctly-rounded-divide-sqrt” to get back to a more precise fp16 divide?
Do we want to relax the precision requirements for any other fp16 operations, or is this just an issue with the fp16 divide? Consider 1/x in particular.
Do we want to relax the precision requirements for any fp64 operations?

kpet · 2024-11-12T17:49:31Z

See related CTS issue: KhronosGroup/OpenCL-CTS#1996

lakshmih · 2024-11-12T19:06:47Z

Agreed on WG call 11/12 to:

Fix fp16 CTS tests for divide, reciprocal and sqrt to use the correctly rounded reference values. Covered in Mobica issues "correctly rounded divide test for half is not using a correctly rounded reference" (OpenCL-CTS#1996), "Re-enable reciprocal test in math_brute_force suite" (OpenCL-CTS#2145) and "test for fp16 sqrt is not using a correctly rounded reference" (OpenCL-CTS#2146).
Once this is done, all vendors will run the CTS to check whether they pass and, if not, what ULP values would be acceptable.
Based on the above results and discussion, amend the spec to reflect the changes to fp16 ULP values.
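
The kind of measurement vendors would be reporting can be sketched in a few lines of Python. This is not the CTS code: it is a simplified brute-force loop over every positive normal fp16 input of the reciprocal, comparing two hypothetical implementations (`recip_nearest`, which rounds to nearest even, and `recip_truncated`, which rounds toward zero) against the exact result. It reuses the simplified ulp definition (spacing of binary16 values in the binade of the exact result), and all helper names are assumptions made for this example.

```python
import struct
from fractions import Fraction

HALF_MANT_BITS = 10   # explicit fraction bits in IEEE 754 binary16
HALF_MIN_EXP = -14    # exponent of the smallest normal binary16 value

def half_from_bits(bits: int) -> float:
    """Reinterpret a 16-bit pattern as a binary16 value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

def half_ulp(q: Fraction) -> Fraction:
    """Spacing of binary16 values around a positive real q (simplified)."""
    e = q.numerator.bit_length() - q.denominator.bit_length()
    if Fraction(2) ** e > q:   # e was one too large; now e == floor(log2(q))
        e -= 1
    return Fraction(2) ** (max(e, HALF_MIN_EXP) - HALF_MANT_BITS)

def recip_nearest(x: float) -> Fraction:
    """Reference: exact 1/x rounded to nearest binary16, ties to even."""
    exact = 1 / Fraction(x)
    ulp = half_ulp(exact)
    return round(exact / ulp) * ulp   # round() on a Fraction ties to even

def recip_truncated(x: float) -> Fraction:
    """Hypothetical cheaper implementation: exact 1/x rounded toward zero."""
    exact = 1 / Fraction(x)
    ulp = half_ulp(exact)
    return int(exact / ulp) * ulp     # int() truncates toward zero

def max_ulp_error(impl) -> Fraction:
    """Worst-case error of impl over all positive normal binary16 inputs."""
    worst = Fraction(0)
    for bits in range(0x0400, 0x7C00):   # smallest normal .. largest finite
        x = half_from_bits(bits)
        exact = 1 / Fraction(x)
        worst = max(worst, abs(impl(x) - exact) / half_ulp(exact))
    return worst

worst_nearest = max_ulp_error(recip_nearest)
worst_truncated = max_ulp_error(recip_truncated)
print(f"round to nearest: {float(worst_nearest):.4f} ulp")
print(f"truncation:       {float(worst_truncated):.4f} ulp")
```

Under these assumptions the rounding implementation stays within 0.5 ulp, while the truncating one exceeds 0.5 ulp but stays below 1 ulp, so it would fail a 'correctly rounded' requirement and pass a <= 1 ulp one.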

bashbaug added this to OpenCL specification maintenance Nov 11, 2024

github-project-automation bot moved this to To do in OpenCL specification maintenance Nov 11, 2024

bashbaug moved this from To do to Needs WG discussion in OpenCL specification maintenance Nov 11, 2024
