
Power op PCC drop issue #13874

Merged
merged 1 commit into from
Oct 24, 2024
Conversation

@umadevimcw (Contributor) commented Oct 16, 2024

Ticket

#8593

Problem description

PCC drop for power op

What's changed

  • The reference code has an issue due to the bf16 conversion, hence the test is updated accordingly
  • In TT we explicitly added a condition to return NaN if the input values are negative. In torch with BF16, NaN comes out as large finite values instead of NaN, which was causing the PCC drop; hence the conversion step is removed from the torch reference code in this file

The reason this change is required is explained in #8593 (comment).
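The bf16 precision loss behind the mismatch can be sketched in plain Python, without torch: bfloat16 keeps only the top 16 bits of a float32 (sign, exponent, 7 mantissa bits), so most base values shift slightly before pow is applied, and pow amplifies that shift. The `to_bf16` helper below is a hypothetical illustration using truncation (real hardware rounds to nearest), which is enough to show the effect:

```python
import struct

def to_bf16(x: float) -> float:
    """Truncate a float to bfloat16 precision by keeping only the top
    16 bits of its float32 representation. Truncation is a simplification
    (bfloat16 conversion normally rounds), but it shows the same loss."""
    f32_bits = struct.pack(">f", x)          # 4-byte big-endian float32
    bf16_bits = f32_bits[:2] + b"\x00\x00"   # zero the low 16 bits
    return struct.unpack(">f", bf16_bits)[0]

base = 5.7
bf16_base = to_bf16(base)

# The base itself shifts once cast down to bfloat16 precision...
print(base, bf16_base)
# ...and pow amplifies the shift, which is what drags the PCC down
# when the reference tensor is cast to bfloat16 but the comparison is not.
print(base ** 3, bf16_base ** 3)
```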

Checklist

@@ -15,7 +15,7 @@
 def run_pow_tests(input_shape, dtype, dlayout, in_mem_config, output_mem_config, data_seed, device):
     torch.manual_seed(data_seed)

-    x = torch.Tensor(size=input_shape[0]).uniform_(-100, 100).to(torch.bfloat16)
+    x = torch.Tensor(size=input_shape[0]).uniform_(-100, 100)
Contributor:

  • Should we move this test out of the non_working_unit_tests folder?
  • Previously we were testing the bfloat16 pow operation; now it's the pow operation for float32, right?
    If so, should we just update the range for x (e.g. ...uniform_(-5, 5).to(torch.bfloat16)) to better match the original test's intent?

Contributor Author:

Moved the file.

Contributor Author:

Even if we minimize the range, the values will still come out as large numbers when cast to bf16 in torch, and if we skip the conversion the result will be NaN. In the TT version of pow we explicitly check for negative values; if the value is negative, it returns NaN.
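The TT-side behavior described above can be mimicked in a few lines. `tt_pow` here is a hypothetical stand-in for the device kernel, not the actual TT-NN implementation; it only illustrates the stated semantics (NaN for any negative base):

```python
import math

def tt_pow(base: float, exponent: float) -> float:
    """Mimic the TT pow semantics described above: any negative base
    returns NaN outright, rather than raising (like math.pow does for
    fractional exponents) or yielding a complex value (like Python's **)."""
    if base < 0:
        return float("nan")
    return math.pow(base, exponent)

print(tt_pow(4.0, 0.5))   # positive base behaves like ordinary pow
print(tt_pow(-3.0, 2.0))  # negative base -> NaN, per the described TT check
```

A torch reference that instead produces large finite values for these inputs will necessarily disagree with NaN outputs, which is consistent with the PCC drop reported in this PR.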

Contributor Author:

In TT we explicitly added a condition to return NaN if the input values are negative. In torch with BF16, the result comes out as large finite values instead of NaN, which was causing the PCC drop; hence the conversion step was removed from the torch reference code in this file.

@eyonland (Contributor) left a comment:

Please move this test.

@umadevimcw umadevimcw merged commit fe9560b into main Oct 24, 2024
7 checks passed
@umadevimcw umadevimcw deleted the umadevimcw/power_bug branch October 24, 2024 08:10