This is not a bug in how Hatchet reads the data, but users may be confused by some of the Spot Caliper data. Tracking this Caliper discussion here.
This case can happen if node N (and its subgraph) occurs on only a subset of ranks.
Caliper computes the metrics from the records it has, e.g. if some node N exists on 4 out of 8 ranks it computes the average (and min) for only those 4 records, whereas the result for the root would be based on all 8 ranks.
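To make the effect concrete, here is a minimal Python sketch with made-up numbers. It is not Caliper's actual aggregation code, and the per-rank times, the function names `F1`/`F6` used as labels, and the helper functions are all assumptions for illustration. F1 runs on all 8 ranks, while its child F6 runs on only 4 of them.

```python
from statistics import mean

NUM_RANKS = 8

# Hypothetical per-rank inclusive times in seconds (illustrative only).
# F1 runs on every rank; its child F6 runs only on ranks 0-3.
f1_times = {rank: 10.0 for rank in range(4)}
f1_times.update({rank: 1.0 for rank in range(4, NUM_RANKS)})
f6_times = {rank: 9.0 for rank in range(4)}


def avg_over_present(times):
    """Average over only the ranks that have a record for the node."""
    return mean(times.values())


def avg_over_all(times, num_ranks):
    """Average over all ranks, counting missing records as zero."""
    return sum(times.values()) / num_ranks


f1_avg = avg_over_present(f1_times)             # F1 has records on all 8 ranks
f6_avg_present = avg_over_present(f6_times)     # divides by 4
f6_avg_all = avg_over_all(f6_times, NUM_RANKS)  # divides by 8

print(f"sum F1 = {sum(f1_times.values())}, sum F6 = {sum(f6_times.values())}")
print(f"avg F1 = {f1_avg}")
print(f"avg F6 over ranks with records = {f6_avg_present}")
print(f"avg F6 over all ranks = {f6_avg_all}")
```

With these numbers the sum for F1 still bounds the sum for F6, yet the per-record average of F6 exceeds the average of its parent, because the two averages use different divisors. Dividing both by the total number of ranks restores the expected ordering.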
One of the issues here is maintaining compatibility with existing Spot data. If we change the way Caliper computes the min/max/avg, it'll change the metric name and we won't be able to compare new data with old data anymore, not just in Hatchet but also in the Spot web GUI.
The issue is that in the Average tree, F6 is 5x larger than its parent, F1. I do not understand how that is possible mathematically, as the global sum of F1 should include the global sum of F6, and therefore ave_F1 >= ave_F6 (the division by num_procs should not change that).