Geometry API Rethink #102
Comments
Hey, these sound like my kind of questions! Would love to be in the loop if there are further discussions here.
Definitely be interested to talk more! It might be worth having a telecon with interested parties at some point... although also at some point you'll just be here. ;)
Yes, exactly. Similar to how Athena++ writes its loops over k,j and then an inner loop over a pencil. We've had mixed success with this too. On GPUs, doing things this way introduces a dependence on meshblock size, since you need to saturate the GPU.
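For readers unfamiliar with the pattern, here is a minimal serial sketch of "loops over k,j, then an inner loop over a pencil". The function name `SumByPencils` and the flat `std::vector` storage are assumptions made for illustration only; they are not how Athena++ or Phoebus actually write their loops.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: sums a 3D field stored as a flat array with i fastest.
// The outer loops walk over (k, j); the inner loop sweeps one contiguous
// "pencil" of ni cells along i.
double SumByPencils(const std::vector<double> &field, std::size_t nk,
                    std::size_t nj, std::size_t ni) {
  double total = 0.0;
  for (std::size_t k = 0; k < nk; ++k) {
    for (std::size_t j = 0; j < nj; ++j) {
      // Start of the pencil for this (k, j) pair.
      const double *pencil = field.data() + (k * nj + j) * ni;
      for (std::size_t i = 0; i < ni; ++i) {
        total += pencil[i];
      }
    }
  }
  return total;
}
```

If only the outer (k, j) loops are distributed across threads, the amount of independent work is nk * nj pencils, which is one way to see why this style ties performance to meshblock size.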
Yes, I actually did a little experiment and found the latter to be about a factor of 2 faster... what's not clear is how much of a speedup that translates to in Phoebus. It also requires writing some variable packing machinery in Phoebus because I currently leverage
I'm pretty proud of the analytic machinery I've come up with... I think performance with it is pretty good, and it's easy to add new spacetimes/transformations thanks to a template-based hierarchy. But you're right that predictable performance may be more valuable than optimal performance. I think that's the real reason to move away from analytic coordinate systems.
I am interested in listening to the discussion (at least), if this becomes a meeting or con-call. I don't know if it's relevant, but I had an experience in early GPGPU days, where I discovered that I got significantly better performance doing redundant computations, rather than trying to cache intermediate values. That was a specific thing, though; I wouldn't assume it applies to your code.
Yeah, I think that may be the case, @jti-lanl. Currently I don't think the cached machinery is much faster than the analytic formulae.
This issue is motivated by a discussion with @jdolence . A few related questions arose about geometry:
GEOM_ELEM,k,j,i. But perhaps, given the way we use the geometry package, k,j,i,GEOM_ELEM would be more performant.

I don't think we currently have any answers, so we decided to punt for now and move forward. But I wanted to have these questions written down so we can revisit them later.
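
To make the two orderings concrete, here is a minimal sketch of how each would map onto a flat array, assuming row-major storage with the rightmost index varying fastest. The `Extents` struct and both function names are hypothetical and are not taken from Phoebus.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical extents of a cached geometry array.
struct Extents {
  std::size_t nelem, nk, nj, ni;
};

// Ordering GEOM_ELEM,k,j,i: the element index is slowest, i is fastest.
inline std::size_t IndexElemSlowest(const Extents &e, std::size_t elem,
                                    std::size_t k, std::size_t j,
                                    std::size_t i) {
  assert(elem < e.nelem && k < e.nk && j < e.nj && i < e.ni);
  return ((elem * e.nk + k) * e.nj + j) * e.ni + i;
}

// Ordering k,j,i,GEOM_ELEM: the element index is fastest.
inline std::size_t IndexElemFastest(const Extents &e, std::size_t elem,
                                    std::size_t k, std::size_t j,
                                    std::size_t i) {
  assert(elem < e.nelem && k < e.nk && j < e.nj && i < e.ni);
  return ((k * e.nj + j) * e.ni + i) * e.nelem + elem;
}
```

With the first ordering, a single element is contiguous along i, while the elements belonging to one cell sit nk * nj * ni entries apart. With the second, all elements of one cell are adjacent in memory. Which of these is faster depends on the access pattern and the hardware, which is the question left open here.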