Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FP Denorm ACT #36

Open
13 tasks
jjscheel opened this issue Sep 27, 2023 · 46 comments
Open
13 tasks

FP Denorm ACT #36

jjscheel opened this issue Sep 27, 2023 · 46 comments
Assignees

Comments

@jjscheel
Copy link
Contributor

jjscheel commented Sep 27, 2023

Technical Group

Architecture Test SIG

ratification-pkg

all FP extensions

Technical Liaison

Allen Baum

Task Category

Arch Tests

Task Sub Category

  • gcc
  • binutils
  • gdb
  • intrinsics
  • Java
  • KVM
  • ld
  • llvm
  • Linux kernel
  • QEMU
  • Spike

Ratification Target

Legacy

Statement of Work (SOW)

Component names:
D,F,Q, Zfh, Zfinx, Zdinx, Zhinx

Requirements:
Add directed test cases for FP ops that generate a normalized result, but also set the underflow flag

Origin
For an obscure IEEE FP reason, it is possible to generate a normalized FP result while still setting the underflow
flag. This is because the underflow is set after rounding – but for some corner cases, two roundings can occur.
This happens when a denormalized number , after rounding, becomes the largest denorm.
When converted to a denorm format, a second rounding (if the result is odd) occurs, which can round up and
overflow for some rounding modes, and it becomes normalized.

Test Details of Required Tests

For each FP op (FADD, FSUB, FMUL, FDIV, FSQRT, FMUL[N][ADD/SUB},
                             FCVT.S.D, FCVT.D.Q, FCVT.H.x
                             (any FPCVT from a longer FP format to a shorter FP format)

Choose operands that result in raw values for each FP format (before any rounding) of
   EXP = -bias, (is 127 for single, 2047 for double, etc)
   SGN= 0 and1
   MANT = all 1s & all1-1 (including hidden bit) for each FP format
   All 4 combinations of guard and sticky bits

Deliverables:
Since the real requirement is generation of an internal rounded result of all 1s, which is not architecturally visible, we can’t write the coverage except to verify the output is either the smallest normalized number or the largest denorm.

  • Assembly language tests that meet the TestFormatSpec,
  • Coverage models using riscv-ctg YAML formatted schema Acceptance criteria

Acceptance Criteria:

  • Tests pass using the riscof framework

Projected timeframe: 3 months

SOW Signoffs: (delete those not needed)

  • Task group liaison sign-off
  • Development partner sign-off
  • ACT SIG sign-off (if ACT work)

Waiver

  • Freeze
  • Ratification

Pull Request Details

TBD

@jjscheel
Copy link
Contributor Author

@ptprasanna, please review this SOW and ask any questions here. The text came from @allenjbaum. He will be your primary resource on this item.

We should discuss more in our next meeting.

@jjscheel
Copy link
Contributor Author

Updated typos in the description section.

@jjscheel
Copy link
Contributor Author

Comments from @allenjbaum via email:

I'm wondering if we should include a requirement to show that the intermediate result is in the required range (e.g. with a manual calculation). That's pretty easy for most of these on RV32 (with a hex calculator) except divide and sqrt perhaps,
but more difficult with longer word lengths than hex calculators normally support.

Actually, the tricky bit (sic) is to see that guard/round bits have all 4 combinations

@jjscheel jjscheel changed the title FP denorm SOW FP Denorm Oct 10, 2023
@jjscheel jjscheel changed the title FP Denorm FP Denorm ACT Oct 10, 2023
@ptprasanna
Copy link

@jjscheel and @allenjbaum, Had a detailed discussion with @anuani21 on this, looks like we do have the tests available and running on the given combination. It's just that the underflow flag which we failed to set on fcsr to acknowledge the result is a denorm, is bit of a gap in the development, which is been caught now. @anuani21 is working on it to fix this up.

@jjscheel
Copy link
Contributor Author

jjscheel commented Nov 3, 2023

Thanks, @ptprasanna! Please keep working with @allenjbaum and the sig-arch-test group.

@allenjbaum
Copy link

allenjbaum commented Nov 6, 2023 via email

@jjscheel
Copy link
Contributor Author

@ptprasanna, any chance on an update here?

@anuani21
Copy link

@jjscheel Test generated for F,Zfh,Zfinx and Zhinx extensions and the execution is successful.Generated test case will produce a normalized result and also set an underflow flag.

Tests and coverage reports for F,Zfh,Zfinx and Zhinx are placed in the below link
https://gitlab.com/ptprasanna/actreports/-/tree/main/FP_denorm?ref_type=heads

@allenjbaum
Copy link

allenjbaum commented Nov 29, 2023 via email

@anuani21
Copy link

@allenjbaum, I am generating test case for D, Zdinx and Q extension for the given combination. Along with the coverage report for these extensions, I will raise a PR in appropriate riscv github.

@allenjbaum
Copy link

allenjbaum commented Dec 11, 2023 via email

@anuani21
Copy link

@jjscheel, Test generated for D extensions and the execution is successful.Generated test case will produce a normalized result and also set an underflow flag.

I am generating test case for Zdinx and Q extension for the given combination.Along with coverage report for these extensions, I will a raise a PR in appropriate riscv github.

@allenjbaum
Copy link

allenjbaum commented Jan 23, 2024 via email

@anuani21
Copy link

anuani21 commented Jan 23, 2024 via email

@allenjbaum
Copy link

allenjbaum commented Jan 23, 2024 via email

@anuani21
Copy link

@jjscheel, Here are the updates from IITM,

Zdinx- Three operand instructions like (fmadd,fmsub,fnmadd,fnmsub) test generated but few coverpoints are not met up.I am fixing this issue.
Q extension- yet to start the test.

@jjscheel
Copy link
Contributor Author

@anuani21, I have set the state of the project to "Planning". I'd appreciate some outlook as to when the work will be done and confirmation that this is a "Medium" sized project (3-6 months of work).

@anuani21
Copy link

No progress made since last Call. Was busy on Debug ACT for Native Triggers.

@anuani21
Copy link

anuani21 commented Sep 3, 2024

No progress made since last Call. Was busy on Debug ACT for Native Triggers.

@jjscheel
Copy link
Contributor Author

jjscheel commented Sep 3, 2024

Thanks. Understand.

@anuani21
Copy link

@jjscheel,

Updates from IITM,

1.Zdinx extension - Few instructions are pending.
2. Q-extension - yet to start the test.

We will restart the work by this week.

@davidharrishmc
Copy link

I see a directory of fadd tests that say "underflow flag set". Do any of these cases actually set the underflow flag? The ones I spot checked in fadd_b3 are of the form a - a, which produces an exact 0 result.

https://gitlab.com/ptprasanna/actreports/-/tree/main/FP_denorm/RV32F-denorm/fadd/RV32F-fadd-all?ref_type=heads

I'm having trouble thinking of a specific floating-point addition that would set the underflow flag. This only happens when the (possibly intermediate) result is subnormal and the result is inexact. All the addition operations I can think of that produce subnormal results are exact.

I'm confident fsqrt can't set the underflow flag because the result is never subnormal.

According to the RISC-V unprivileged spec, fadd/fsub/fsqrt never set the underflow flag.

image

I agree that mul, div, and convert operations can set the underflow flag and have the interesting corner case of doing so even when the final result is the minimum normalized number.

@davidharrishmc
Copy link

I’m also concerned that the wrong convert test cases are being generated in
https://gitlab.com/ptprasanna/actreports/-/tree/main/FP_denorm?ref_type=heads

I see RV32F/fcvt.{w/wu}.s and RV32Zfh-denorm/fcvt.{w/wu}.h and RV64D-denorm/{fcvt.{l/lu}.d, none of which should produce underflow flag per the RISC-V spec Tables 19-22.
I don’t see any of the converts that should be able to set an underflow flag.

@jordancarlin
Copy link

Additionally, based on the Sail log, it seems that something might have gone wrong in the test generation. Looking at the RV32F fadd_b3 test suite, the first test case results in the following instructions being run by the Sail model:

mem[X,0x80000114] -> 0xAF87
mem[X,0x80000116] -> 0x0001
[74] [M]: 0x80000114 (0x0001AF87) flw ft11, 0(gp)
mem[R,0x80010010] -> 0x7F222105
f31 <- 0xFFFFFFFF7F222105


mem[X,0x80000118] -> 0xAF07
mem[X,0x8000011A] -> 0x0041
[75] [M]: 0x80000118 (0x0041AF07) flw ft10, 4(gp)
mem[R,0x80010014] -> 0xFF222105
f30 <- 0xFFFFFFFFFF222105


mem[X,0x8000011C] -> 0x0113
mem[X,0x8000011E] -> 0x0020
[76] [M]: 0x8000011C (0x00200113) addi sp, zero, 2
x2 <- 0x00000002


mem[X,0x80000120] -> 0x1073
mem[X,0x80000122] -> 0x0031
[77] [M]: 0x80000120 (0x00311073) csrrw zero, fcsr, sp
CSR fcsr -> 0x00000000
CSR fcsr <- 0x00000002 (input: 0x00000002)


mem[X,0x80000124] -> 0xFFD3
mem[X,0x80000126] -> 0x01EF
[78] [M]: 0x80000124 (0x01EFFFD3) fadd.s ft11, ft11, ft10, dyn
f31 <- 0xFFFFFFFF00000000


mem[X,0x80000128] -> 0x2273
mem[X,0x8000012A] -> 0x0030
[79] [M]: 0x80000128 (0x00302273) csrrs tp, fcsr, zero
CSR fcsr -> 0x00000002
x4 <- 0x00000002


mem[X,0x8000012C] -> 0xA027
mem[X,0x8000012E] -> 0x01F0
[80] [M]: 0x8000012C (0x01F0A027) fsw ft11, 0(ra)
mem[0x80012514] <- 0x00000000


mem[X,0x80000130] -> 0xA223
mem[X,0x80000132] -> 0x0040
[81] [M]: 0x80000130 (0x0040A223) sw tp, 4(ra)
mem[0x80012518] <- 0x00000002

Sail log

This seems to be writing 0x02 to fcsr(which corresponds to the underflow bit) before the actual fadd.s operation. Since the floating point flags are sticky, this will persist until it is cleared. It is not cleared before the fadd.s operation, so the underflow flag will appear set when it is checked after the operation even if the fadd.s does not set it. Instead of setting fcsr to 0x02, it probably needs to be cleared to 0x0 before each test to ensure previous results do not mask the flags.

@allenjbaum
Copy link

allenjbaum commented Oct 15, 2024 via email

@allenjbaum
Copy link

allenjbaum commented Oct 15, 2024 via email

@davidharrishmc
Copy link

@allenjbaum do you mean it should be csrrc, not csrrs or csrsw?

@davidharrishmc
Copy link

The flags are subtle. In a system with floating-point exceptions (such as x86), the underflow exception is raised for tiny results. However, the underflow flag is only raised for tiny results that are inexact. RISC-V has an underflow flag but no underflow exception.

See the IEEE 754-2019 spec at Section 7.5:
In addition, under default exception handling for underflow, if the rounded result is inexact — that is, it
differs from what would have been computed were both exponent range and precision unbounded — the
underflow flag shall be raised and the inexact (see 7.6) exception shall be signaled. If the rounded result is
exact, no flag is raised and no inexact exception is signaled. This is the only case in this standard of an
exception signal receiving default handling that does not raise the corresponding flag. Such an underflow
signal has no observable effect under default handling.

@allenjbaum
Copy link

allenjbaum commented Oct 15, 2024 via email

@davidharrishmc
Copy link

I would suggest csrw fflags, x0

This will clear all the flags, as you suggest, without impacting the rounding mode.

But implementation details aside, does anyone believe that fadd should raise the underflow flag, and if so, could you give a specific pair of inputs that should do so? @anuani21 can you describe the theory of operation about how the inputs were chosen in the test cases and whether you believe any of the test cases are raising the underflow?

@allenjbaum
Copy link

allenjbaum commented Oct 15, 2024 via email

@davidharrishmc
Copy link

davidharrishmc commented Oct 15, 2024 via email

@allenjbaum
Copy link

allenjbaum commented Oct 15, 2024 via email

@allenjbaum
Copy link

allenjbaum commented Oct 15, 2024 via email

@davidharrishmc
Copy link

davidharrishmc commented Oct 16, 2024 via email

@allenjbaum
Copy link

allenjbaum commented Oct 16, 2024 via email

@davidharrishmc
Copy link

davidharrishmc commented Oct 16, 2024 via email

@allenjbaum
Copy link

allenjbaum commented Oct 16, 2024 via email

@davidharrishmc
Copy link

davidharrishmc commented Oct 16, 2024 via email

@allenjbaum
Copy link

allenjbaum commented Oct 16, 2024 via email

@davidharrishmc
Copy link

davidharrishmc commented Oct 16, 2024 via email

@allenjbaum
Copy link

allenjbaum commented Oct 16, 2024 via email

@anuani21
Copy link

@jjscheel,

As per Allen and David discussion, I need to do some changes in the test in all FP extensions.Probably, I will raise a PR by next week.

@allenjbaum
Copy link

allenjbaum commented Oct 29, 2024 via email

@anuani21
Copy link

@jjscheel,

Need to do some changes for three operand instruction test cases in all FP extensions.I will raise a PR soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Blocked
Development

No branches or pull requests

6 participants