add topology option #354
@xrmzju Hi! Thanks for your submission! Can you tell me a little more about why you want to disable collection of NUMA topology information?
I'd also love to know about the use case. My guess is that in some scenario the collection of topology data fails? If that guess is right, I'd rather fix that failure.
DisableNodeCaches bool
DisableNodeAreas bool
DisableNodeDistances bool
Do we need this (kinda unprecedented) level of detail, or can we just have a single option controlling everything?
DisableNodeCaches: option.EnvOrDefaultDisableNodeCaches(),
DisableNodeAreas: option.EnvOrDefaultDisableNodeAreas(),
DisableNodeDistances: option.EnvOrDefaultDisableNodeDistances(),
I'm inclined to say we should stay consistent with how the other options are handled, even though we should perhaps refactor later. In the meantime, I'd be consistent.
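For readers unfamiliar with the pattern being discussed, here is a minimal sketch of what an `EnvOrDefault...` style option helper could look like. It is not the project's actual implementation: the environment variable name `EXAMPLE_DISABLE_NODE_CACHES`, the `boolFromEnv` helper, and the `lookupFunc` type are all hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// envDisableNodeCaches is a hypothetical variable name used only for this sketch.
const envDisableNodeCaches = "EXAMPLE_DISABLE_NODE_CACHES"

// lookupFunc has the same signature as os.LookupEnv, so a fake can be
// substituted when testing.
type lookupFunc func(key string) (string, bool)

// boolFromEnv (hypothetical helper) returns the boolean value stored under
// key, or def when the variable is unset or cannot be parsed as a boolean.
func boolFromEnv(lookup lookupFunc, key string, def bool) bool {
	raw, ok := lookup(key)
	if !ok {
		return def
	}
	val, err := strconv.ParseBool(raw)
	if err != nil {
		return def
	}
	return val
}

// EnvOrDefaultDisableNodeCaches mirrors the naming seen in the diff; the
// body is an assumption about how such a function might work.
func EnvOrDefaultDisableNodeCaches() bool {
	return boolFromEnv(os.LookupEnv, envDisableNodeCaches, false)
}

func main() {
	fmt.Println("node caches disabled:", EnvOrDefaultDisableNodeCaches())
}
```

Keeping the default at `false` means the collection stays enabled unless the user explicitly opts out, which matches the "disable" naming of the options in the diff.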
@@ -46,32 +46,37 @@ func topologyNodes(ctx *context.Context) []*Node {
nodeID, err := strconv.Atoi(filename[4:])
if err != nil {
ctx.Warn("failed to determine node ID: %s\n", err)
ctx.Err = err
This use case makes sense, but I'd rather add a new function or enhance the Warn helper to also set err, without exposing the field (just yet).
Sorry, I did not get your point. Could you give some example code, please?
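One possible reading of the suggestion above, as a rough sketch only: a helper that logs the warning and also records the error in an unexported field, with a read-only accessor. The `Context` type here is a simplified stand-in, and `WarnErr`, `Err`, and `parseNodeID` are hypothetical names, not existing API.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// Context is a hypothetical, simplified stand-in for the library's context.
type Context struct {
	err error // first recorded error; deliberately unexported
}

// WarnErr (hypothetical) prints a warning, as Warn would, and also
// remembers the first error it is given.
func (c *Context) WarnErr(err error, format string, args ...interface{}) {
	fmt.Fprintf(os.Stderr, "WARNING: "+format, args...)
	if c.err == nil {
		c.err = err
	}
}

// Err returns the first recorded error, or nil if none was recorded.
func (c *Context) Err() error { return c.err }

// parseNodeID extracts N from a directory name of the form "nodeN".
func parseNodeID(ctx *Context, filename string) (int, bool) {
	if !strings.HasPrefix(filename, "node") {
		return 0, false
	}
	nodeID, err := strconv.Atoi(filename[4:])
	if err != nil {
		ctx.WarnErr(err, "failed to determine node ID: %s\n", err)
		return 0, false
	}
	return nodeID, true
}

func main() {
	ctx := &Context{}
	for _, name := range []string{"node0", "nodeX", "node1"} {
		if id, ok := parseNodeID(ctx, name); ok {
			fmt.Println("found node", id)
		}
	}
	// The caller can now detect that collection was incomplete.
	if err := ctx.Err(); err != nil {
		fmt.Println("topology collection was incomplete:", err)
	}
}
```

With this shape, call sites keep a single call per failure, and callers learn about partial failures through `Err()` without the field itself becoming part of the public surface.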
Yes, your guess is correct. Collecting the memory areas failed during one collection of topology data, but no error was raised. As a result, my program continued to run with incorrect topology information. In my scenario, I only require the CPU NUMA topology information and do not need NodeCaches, NodeAreas, or NodeDistances, so I made the modification above.
Thanks for clarifying. Could you please share a description of the hardware on which the collection fails? E.g., was it a regular NUMA x86 machine? Perhaps you were using (relatively) new technology like CXL? Or was it Arm? In general I'm reluctant to add such fine-grained control over the collection of information: it adds too many knobs and makes the code less regular, so I'd like to learn more about the use case.
@xrmzju thanks for sharing. I'll try to think about a more generic solution. I'll get back ASAP.
DisableNodeCaches, DisableNodeAreas, DisableNodeDistances when collecting topology