
Enhance user experience with MPI #997

Merged: 5 commits, Oct 31, 2024

Conversation

@abussy (Contributor) commented on Aug 21, 2024

This PR aims to enhance the user experience when using MPI parallelization by not stopping execution when n_ranks > n_kpt.

The current solution of stopping execution with an error message is not optimal. Indeed, for an arbitrary system, it is not trivial to know the number of irreducible K-points in advance. This is particularly annoying when calculations are launched in an automated fashion, where the only safe bet is to not use MPI at all.

The solution proposed here is simple. When a calculation is started with more MPI ranks than K-points, a new MPI communicator is created. This communicator has as many ranks as there are K-points, while the remaining processes exit the program.
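For illustration, a minimal MPI.jl sketch of this kind of split (the helper name is made up here and this is not the actual DFTK code; the approach is revised later in this thread):

```julia
using MPI

# Minimal sketch: keep only the first n_kpt ranks of `comm` and let the
# surplus ranks leave the program, as described above. Illustrative only.
function shrink_communicator(n_kpt::Integer, comm=MPI.COMM_WORLD)
    MPI.Init()
    rank  = MPI.Comm_rank(comm)
    color = rank < n_kpt ? 0 : 1       # group ranks with / without a k-point
    small = MPI.Comm_split(comm, color, rank)
    if rank >= n_kpt
        MPI.Finalize()
        exit()                         # surplus ranks exit here
    end
    small                              # communicator with at most n_kpt ranks
end
```

On 4 ranks with 3 k-points, ranks 0-2 would continue with the returned 3-rank communicator and rank 3 would exit.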

While this may lead to idling CPU time, I believe that not crashing improves the experience. Moreover, a warning is printed describing the situation to the user, so that they can optimize their next run. I would also argue that this is not worse than under-parallelizing in instances where the number of k-points is not a multiple of the number of MPI ranks (maybe we should also issue a warning in such a case?).

I also took the opportunity to fix testing with the :mpi tag. The current check on parse(Bool, get(ENV, "CI", "false")) during PlaneWaveBasis creation, which essentially disables MPI testing on the CI, has been removed. Instead, all tests involving kgrids with a single k-point have received the :dont_test_mpi tag. The number of MPI ranks for MPI testing is also hardcoded to 2: because all tests are run as a single calculation, killing processes when n_ranks > n_kpt would make the tests hang. This also fixes local testing via Pkg.test("DFTK"; test_args = ["mpi"]).

@mfherbst (Member) left a comment

Great idea, thanks for the PR.

Two comments:

  • Calling exit from within PlaneWaveBasis is super unexpected and can easily lead to spurious bugs. I'm thinking of a case where one uses MPI both for DFTK and to run other things, and all of a sudden part of the processes in the global communicator are just gone and the program hangs or does undefined stuff.
  • Instead I propose to actually provide a mechanism to do the appropriate splitting of the communicator outside of a PlaneWaveBasis call. What I'm thinking is to essentially call build_kpoints internally to determine length(kcoords_global), plus a helper function to do the "shrinking" of the communicator, potentially calling exit from there. The user would then explicitly call 3 functions: one to determine the actual number of k-points, then the split-and-exit function, and then PlaneWaveBasis(...) with the communicator of the appropriate length.

Regarding the tests: I had in mind for a long time to actually switch the logic. Instead of "disabling" MPI tests where needed, I think it's better to annotate those which do support MPI instead (i.e. instead of a :dont_test_mpi have a simple :mpi on the counter-set). If it's too much work, leave it as is and we do a follow-up ...

@mfherbst (Member) commented on Aug 21, 2024

BTW the test failures in the "normal" tests seem fake (because of a too tightly chosen test tolerance, I think), but the cancelled MPI test points to an issue in your implementation (could be related to the exit() btw.)

@abussy (Contributor, Author) commented on Aug 22, 2024

> Calling exit from within PlaneWaveBasis is super unexpected and can easily lead to spurious bugs. I'm thinking of a case where one uses MPI both for DFTK and to run other things, and all of a sudden part of the processes in the global communicator are just gone and the program hangs or does undefined stuff.

Yes, I agree actually. In fact, I was facing this exact problem when trying to run the tests with more than 2 MPI ranks.

Ideally, for robust behavior, the extra MPI ranks should not exit the program, but wait at a synchronization barrier at the end of execution. I am not quite sure how to implement that in a lightweight way, without an arbitrarily large if block after the creation of the sub-communicator. I will give it some thought.
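To make the barrier idea concrete, a rough sketch (names are made up; this is essentially the if block mentioned above, so it does not yet solve the elegance problem):

```julia
using MPI

# Rough sketch of the barrier idea: surplus ranks skip the work and simply
# wait for the others at the end instead of calling exit().
function run_with_spare_ranks(do_work::Function, n_kpt::Integer, comm=MPI.COMM_WORLD)
    MPI.Init()
    if MPI.Comm_rank(comm) < n_kpt
        do_work()          # ranks owning k-points run the calculation
    end
    MPI.Barrier(comm)      # everybody synchronizes at the end of execution
end
```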

I think there are two important questions of form to be addressed:

  • Should an input script look any different when a parallel run is intended?
  • Should the user have to think about creating a sub-communicator or setting up parallelization when writing their input?

I would tend to say no to both of the above. I'll try to come up with something along these lines.

@abussy (Contributor, Author) commented on Aug 23, 2024

The latest commit brings major simplifications to this MPI issue. Instead of building sub-communicators and killing processes in a dangerous fashion, I propose to duplicate some k-points over the empty MPI ranks. The weights are adjusted so that the results remain exact. A warning is issued as well.

This is the lightest fix I could think of. It is also safe and robust, and the user does not have to think about parallelism in their script. The only potential issue I see: if the user queries the k-point mesh after the calculation, they could get a non-irreducible k-mesh with duplicates. One possible solution would be to add an extra field to the basis with the original mesh, or to write a function that always returns the irreducible mesh.
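A rough sketch of the duplication idea (a hypothetical helper, not DFTK's actual implementation): the original k-points stay at the front of the arrays, duplicates are appended at the end, and each duplicated k-point's weight is split evenly among its copies so that all weighted sums are unchanged.

```julia
# Sketch only: pad the k-point list so that every MPI rank owns at least one
# k-point, splitting weights of duplicated k-points evenly among their copies.
function pad_kpoints(kcoords::AbstractVector, kweights::AbstractVector, n_ranks::Integer)
    n_kpt   = length(kcoords)
    n_extra = n_ranks - n_kpt
    n_extra <= 0 && return collect(kcoords), collect(kweights)

    # Number of copies (original included) each k-point ends up with.
    counts = ones(Int, n_kpt)
    for i = 1:n_extra
        counts[mod1(i, n_kpt)] += 1
    end

    # Originals keep their position with reduced weight; duplicates go at the end.
    kcoords_new  = collect(kcoords)
    kweights_new = [kweights[j] / counts[j] for j = 1:n_kpt]
    for j = 1:n_kpt, _ = 2:counts[j]
        push!(kcoords_new,  kcoords[j])
        push!(kweights_new, kweights[j] / counts[j])
    end
    @assert sum(kweights_new) ≈ sum(kweights)   # weighted sums are preserved
    kcoords_new, kweights_new
end
```

For example, pad_kpoints([[0.0, 0.0, 0.0], [0.5, 0.0, 0.0]], [0.5, 0.5], 3) would return three k-points with weights [0.25, 0.5, 0.25], the first k-point being duplicated.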

> Regarding the tests: I had in mind for a long time to actually switch the logic. Instead of "disabling" MPI tests where needed, I think it's better to annotate those which do support MPI instead (i.e. instead of a :dont_test_mpi have a simple :mpi on the counter-set). If it's too much work, leave it as is and we do a follow-up ...

I don't mind changing from :dont_test_mpi to :mpi tags in the tests. However, if this PR is accepted, I believe all tests could actually be run in parallel (to be checked by me). In such a case, we could get rid of this tag completely.

> BTW the test failures in the "normal" tests seem fake (because of a too tightly chosen test tolerance, I think), but the cancelled MPI test points to an issue in your implementation (could be related to the exit() btw.)

In my experience, it seems that the anderson.jl test is particularly unstable.

@mfherbst (Member) commented:

I'm in favour of this. @abussy Please update and then we can merge this!

@antoine-levitt Objections ?

@mfherbst (Member) commented:

@abussy The only thing that we have to be careful with is unfolding the k-point mesh when not using symmetries and during IO operations.

@antoine-levitt (Member) commented:

I haven't looked at it in detail, but it looks reasonable, go ahead!

@abussy (Contributor, Author) commented on Oct 25, 2024

Notes:

  • Added an n_irreducible_kpoints field to the PlaneWaveBasis --> used in I/O so that only irreducible k-points are considered
  • Added irreducible_kcoords() and irreducible_kweights() functions for the same reasons
  • BZ unfolding is currently not implemented with MPI, so duplicated k-points won't affect it. Added an extra error message in case MPI support gets implemented
  • Rebased for compatibility with master branch

Comment on lines 286 to 287
@warn("Attempting to parallelize $n_kpt k-points over $n_procs MPI ranks. DFTK does " *
"not support processes empty of k-point. Some k-points were duplicated over the " *
Review comment (Member): That line is a bit long.

Comment on lines 581 to 582
for i_dupl in basis.n_irreducible_kpoints+1:length(basis.kweights_global)
for i_irr in 1:basis.n_irreducible_kpoints
Review comment (Member): We use = when loops are over integers.
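For context, the convention being referred to, as a generic example (not code from this PR):

```julia
for i = 1:3                 # loop over an integer range: use `=`
    println(i)
end
for label in ["Γ", "X"]     # loop over a collection: use `in`
    println(label)
end
```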

@@ -5,6 +5,7 @@ import MPI
Number of processors used in MPI. Can be called without ensuring initialization.
"""
mpi_nprocs(comm=MPI.COMM_WORLD) = (MPI.Init(); MPI.Comm_size(comm))
mpi_rankid(comm=MPI.COMM_WORLD) = (MPI.Init(); MPI.Comm_rank(comm))
Review comment (Member): Unused function ... remove or test.

src/symmetry.jl Outdated
@@ -414,6 +414,10 @@ function unfold_array(basis_irred, basis_unfolded, data, is_ψ)
if !(basis_irred.comm_kpts == basis_irred.comm_kpts == MPI.COMM_WORLD)
error("Brillouin zone symmetry unfolding not supported with MPI yet")
end
if basis_irred.n_irreducible_kpoints < mpi_nprocs(basis_irred.comm_kpts)
#Note: if this routine is ever generalised for MPI, need special care for duplicated KP
Review comment (Member): Space after #. May also be a bit too long.

test/PlaneWaveBasis.jl: resolved review thread
src/input_output.jl: resolved review thread
# Assume that duplicated k-points are appended at the end of the kcoords/kweights array
for i_dupl in basis.n_irreducible_kpoints+1:length(basis.kweights_global)
for i_irr in 1:basis.n_irreducible_kpoints
if maximum(abs.(basis.kcoords_global[i_dupl]-basis.kcoords_global[i_irr])) < eps(T)
Review comment (Member): A few comments:

  • maximum(abs, ... ) is better (no allocation).
  • Hmm. This mechanism may be a little brittle. I'd be much more defensive here, e.g. make sure each i_irr only matches a single k-point, e.g. by using only() in combination with findall() or so. Similarly, I'd put an assertion that the weights add up to the total electron count or so (see how it's done in other places; the correct number depends on the computational setup).

Also too long a line 😄.
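As an illustration of this suggestion, a rough sketch of what a more defensive version could look like (hypothetical helper name, not the merged implementation; see also the later discussion about band-structure k-paths, where genuine duplicates can make exact-match-only logic too strict):

```julia
# Sketch of the reviewer's suggestion: fold the weight of every duplicated
# k-point back onto exactly one irreducible k-point and sanity-check the total.
function folded_kweights_sketch(basis)
    n_irr   = basis.n_irreducible_kpoints
    T       = eltype(eltype(basis.kcoords_global))
    weights = copy(basis.kweights_global[1:n_irr])
    for i_dupl = n_irr+1:length(basis.kweights_global)
        # `only` throws unless exactly one irreducible k-point matches.
        i_irr = only(findall(k -> maximum(abs, k - basis.kcoords_global[i_dupl]) < eps(T),
                             basis.kcoords_global[1:n_irr]))
        weights[i_irr] += basis.kweights_global[i_dupl]
    end
    # Placeholder check; the appropriate total depends on the computational setup.
    @assert sum(weights) ≈ sum(basis.kweights_global)
    weights
end
```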

src/PlaneWaveBasis.jl: four resolved review threads
@abussy (Contributor, Author) commented on Oct 30, 2024

Addressed most of the concerns raised by @mfherbst during his review. I'd like to raise two points for discussion:

  1. I gave some thought to changing the :dont_test_mpi test tag to :mpi. I think that, by default, DFTK should be MPI compatible. Any method or function not implemented for MPI is then an exception, and using the :dont_test_mpi test flag makes that clear. I am afraid that changing to :mpi instead would lower the impact of the tag.

  2. When reworking the irreducible_kweights_global() function in PlaneWaveBasis.jl, I realised that in some cases duplicated k-points already exist (even in non-parallel runs), and I refrained from using functions such as findall(). In particular, when calculating band structures with the compute_bands() function, the k-point path might go through special points multiple times, on purpose. As a result, I kept the logic of looking for replicated k-points explicitly in basis.kcoords_global[basis.n_irreducible_kpoints+1:end]. I might change the nomenclature though, as the original set of k-points is not necessarily the irreducible one.

@mfherbst (Member) commented on Oct 30, 2024

> I gave some thought to changing the :dont_test_mpi test tag to :mpi. I think that, by default, DFTK should be MPI compatible. Any method or function not implemented for MPI is then an exception, and using the :dont_test_mpi test flag makes that clear. I am afraid that changing to :mpi instead would lower the impact of the tag.

Hmm. Yes, but I think that's ok, because it's fine if only the core is tested with MPI and not the rest. But let's think about this some more and leave as is for now.

> I realised that in some cases, duplicated k-points already exist (even in non-parallel runs), ... In particular, when calculating band structures with the compute_bands() function, ...

True, good point.

But still there is a problem: the coordinates can be the same if the spin is different. I'm not sure you handle this properly right now?

Update: Ah no, this is before the entire spin business.

src/PlaneWaveBasis.jl: two further review threads
@mfherbst merged commit ea0ffe4 into JuliaMolSim:master on Oct 31, 2024
7 of 8 checks passed