Merge branch 'develop'
mccoys committed Nov 19, 2024
2 parents 8d36780 + 8581e3a commit 3989d92
Showing 49 changed files with 1,116 additions and 1,067 deletions.
2 changes: 2 additions & 0 deletions doc/Sphinx/Understand/GPU_offloading.rst
@@ -20,6 +20,8 @@ the announced exaflopic supercomputers will include GPUs.
* Cartesian geometry in 1D, 2D and in 3D , for order 2
* Diagnostics: Field, Probes, Scalar, ParticleBinning, TrackParticles
* Moving Window
* Boundary conditions for Fields: Periodic, reflective and silver-muller are supported (no PML or BM)
* Boundary conditions for Particles: Periodic, Reflective, thermal, remove and stop are supported

* A few key features remain to be implemented (AM geometry, ionization, PML, envelope,
additional physics), but the fundamentals of the code are ported.
6 changes: 2 additions & 4 deletions doc/Sphinx/Use/GPU_version.rst
@@ -16,8 +16,6 @@ This page contains the links of this documentation to compile and run SMILEI on

----

Known issues
^^^^^^^^^^^^
Important note:

2D and 3D runs may crash with A2000 & A6000 GPUs (used in laptops and workstations respectively;
they are not 'production GPUs', which are designed for 64-bit floating-point operations)
The biggest challenge in running SMILEI on an accelerator is installing the openmpi library correctly. It needs to be compiled with nvc++ after configuring it (i.e. ./configure --options) with the options appropriate to your system.
20 changes: 20 additions & 0 deletions doc/Sphinx/Use/install_linux_GPU.rst
@@ -6,6 +6,9 @@ First, make sure you have a recent version of CMAKE, and the other libraries
to compile Smilei on CPU as usual. In particular, for this example, you
need GCC <= 12.

The installation protocol shown below uses the openmpi included in nvhpc. This approach often results in a segfault at runtime (note that nvidia will remove openmpi from nvhpc in the future).
The "proper" way, which is much harder, consists in installing openmpi compiled with nvhpc (

Make a directory to store all the nvidia tools. We call it $NVDIR:

.. code:: bash
@@ -72,3 +75,20 @@ To run:
source nvidia_env.sh
smilei namelist.py
As an example of a "simple" openmpi installation
Openmpi dependencies such as zlib, hwloc and libevent should first be compiled with nvc++

.. code:: bash
export cuda=PATH_TO_YOUR_NVHPC_FOLDER/Linux_x86_64/24.5/cuda
wget https://download.open-mpi.org/release/open-mpi/v4.1/openmpi-4.1.5.tar.gz
tar -xzf openmpi-4.1.5.tar.gz
cd openmpi-4.1.5
mkdir build
cd build
CC=nvc++ CXX=nvc++ CFLAGS=-fPIC CXXFLAGS=-fPIC ../configure --with-hwloc --enable-mpirun-prefix-by-default --prefix=PATH_TO_openmpi/openmpi-4.1.5/build --enable-mpi-cxx --without-verbs --with-cuda=$cuda --disable-mpi-fortran --with-libevent=PATH_TO_libevent/libevent-2.1.12-stable/build
make -j 4 all
make install
Because configuring openmpi is complex, we recommend asking your supercomputer's support team for help with running Smilei on GPUs.
6 changes: 3 additions & 3 deletions src/Collisions/BinaryProcesses.cpp
@@ -162,7 +162,7 @@ void BinaryProcesses::calculate_debye_length( Params &params, Patch *patch )
// compute debye length squared in code units
patch->debye_length_squared[ibin] = 1./inv_D2;
// apply lower limit to the debye length (minimum interatomic distance)
-double rmin2 = pow( coeff*density_max, -2./3. );
+double rmin2 = 1.0 / cbrt( coeff*density_max * coeff*density_max ) ;
if( patch->debye_length_squared[ibin] < rmin2 ) {
patch->debye_length_squared[ibin] = rmin2;
}
@@ -292,8 +292,8 @@ void BinaryProcesses::apply( Params &params, Patch *patch, int itime, vector<Dia
double dt_corr = every_ * params.timestep * ((double)ncorr) * inv_cell_volume;
n1 *= inv_cell_volume;
n2 *= inv_cell_volume;
-D.n123 = pow( n1, 2./3. );
-D.n223 = pow( n2, 2./3. );
+D.n123 = cbrt(n1*n1);
+D.n223 = cbrt(n2*n2);

// Now start the real loop on pairs of particles
// See equations in http://dx.doi.org/10.1063/1.4742167
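
The two hunks above replace `pow` calls that use fractional exponents with `cbrt` of a square. As a minimal sketch of why both forms agree, the identities x^(2/3) = cbrt(x*x) and x^(-2/3) = 1/cbrt(x*x) can be checked against `std::pow`. The helper names `pow_two_thirds`, `pow_minus_two_thirds` and `relative_difference` are hypothetical and are not part of Smilei.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helpers for illustration only; none of these exist in Smilei.

// x^(2/3), written as the cube root of the square (the form used for n123 and n223).
inline double pow_two_thirds( double x )
{
    assert( x >= 0. );
    return std::cbrt( x * x );
}

// x^(-2/3), the form used for the lower limit of the squared Debye length.
inline double pow_minus_two_thirds( double x )
{
    assert( x > 0. );
    return 1.0 / std::cbrt( x * x );
}

// Relative difference between a value and a non-zero reference.
inline double relative_difference( double value, double reference )
{
    return std::fabs( value - reference ) / std::fabs( reference );
}
```

One caveat, stated as a general floating-point fact rather than something the commit discusses: `x * x` overflows for |x| above roughly 1.3e154, whereas `std::pow( x, 2./3. )` would not, so the rewritten form assumes the inputs stay well inside that range.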
2 changes: 1 addition & 1 deletion src/Collisions/CollisionalIonization.cpp
@@ -248,7 +248,7 @@ void CollisionalIonization::calculate( double gamma_s, double gammae, double gam
// Lose incident electron energy
if( U2 < Wi/We ) {
// Calculate the modified electron momentum
-double pr = sqrt( ( pow( gamma_s-e, 2 )-1. )/p2 );
+double pr = sqrt( ( ( gamma_s - e ) * ( gamma_s - e ) - 1. ) / p2 );
pe->momentum( 0, ie ) *= pr;
pe->momentum( 1, ie ) *= pr;
pe->momentum( 2, ie ) *= pr;
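
The rescaling factor `pr` in this hunk can be read in isolation. The sketch below wraps the same expression in a hypothetical free function, `momentum_scale`, which is not part of Smilei. It assumes normalized units in which gamma^2 - 1 = p^2, and that `p2` is the squared momentum before the energy loss; both are assumptions made for illustration and are not stated in the diff.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical free function mirroring the expression for `pr` in the diff.
// Assumption: p2 is the squared momentum before the loss, in normalized units.
inline double momentum_scale( double gamma_s, double e, double p2 )
{
    assert( p2 > 0. );
    // Lorentz factor after losing the energy e
    const double gamma_new = gamma_s - e;
    // Explicit product instead of pow( gamma_new, 2 )
    return std::sqrt( ( gamma_new * gamma_new - 1. ) / p2 );
}
```

With no energy loss (`e = 0`) and `p2 = gamma_s*gamma_s - 1`, the factor is 1, so multiplying the three momentum components by it leaves them unchanged.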
2 changes: 1 addition & 1 deletion src/Collisions/Collisions.cpp
@@ -22,7 +22,7 @@ Collisions::Collisions(
coeff1_ = 4.046650232e-21*params.reference_angular_frequency_SI; // h*omega/(2*me*c^2)
coeff2_ = 2.817940327e-15*params.reference_angular_frequency_SI/299792458.; // re omega / c
coeff3_ = coeff2_ * coulomb_log_factor_;
-coeff4_ = pow( 3.*coeff2_, -1./3. );
+coeff4_ = 1.0 / cbrt( 3.*coeff2_);
}


25 changes: 14 additions & 11 deletions src/Diagnostic/DiagnosticScreen.cpp
@@ -75,7 +75,7 @@ DiagnosticScreen::DiagnosticScreen(
if( params.nDim_particle > 1 ) {
screen_vector_a[0] = -screen_unitvector[1];
screen_vector_a[1] = screen_unitvector[0];
-double norm = sqrt( pow( screen_vector_a[0], 2 ) + pow( screen_vector_a[1], 2 ) );
+double norm = sqrt( screen_vector_a[0] * screen_vector_a[0] + screen_vector_a[1] * screen_vector_a[1] );
if( norm < 1.e-8 ) {
screen_vector_a[0] = 0.;
screen_vector_a[1] = 1.;
@@ -132,7 +132,7 @@ DiagnosticScreen::DiagnosticScreen(
ERROR( errorPrefix << ": axis `theta` not available for `" << screen_shape << "` screen" );
}
for( idim=0; idim<params.nDim_particle; idim++ ) {
-coefficients[params.nDim_particle+idim] = screen_vector[idim] / pow( screen_vectornorm, 2 );
+coefficients[params.nDim_particle+idim] = screen_vector[idim] / ( screen_vectornorm * screen_vectornorm );
}
} else if( type == "phi" ) {
if( screen_type == 0 ) {
@@ -187,7 +187,7 @@ void DiagnosticScreen::run( Patch *patch, int, SimWindow *simWindow )
} else if( screen_type == 1 ) { // sphere
double distance_to_center = 0.;
for( unsigned int idim=0; idim<ndim; idim++ ) {
-distance_to_center += pow( patch->center_[idim] - screen_point[idim], 2 );
+distance_to_center += ( patch->center_[idim] - screen_point[idim] ) * ( patch->center_[idim] - screen_point[idim] );
}
distance_to_center = sqrt( distance_to_center );
if( abs( screen_vectornorm - distance_to_center ) > patch->radius ) {
@@ -196,10 +196,10 @@ void DiagnosticScreen::run( Patch *patch, int, SimWindow *simWindow )
} else if( screen_type == 2 ) { // cylinder
double distance_to_axis = 0.;
for( unsigned int idim=0; idim<ndim; idim++ ) {
-distance_to_axis += pow(
-( patch->center_[(idim+1)%ndim] - screen_point[(idim+1)%ndim] ) * screen_unitvector[(idim+2)%ndim]
--( patch->center_[(idim+2)%ndim] - screen_point[(idim+2)%ndim] ) * screen_unitvector[(idim+1)%ndim]
-, 2 );
+
+distance_to_axis += ( ( patch->center_[(idim+1)%ndim] - screen_point[(idim+1)%ndim] ) * screen_unitvector[(idim+2)%ndim]
+-( patch->center_[(idim+2)%ndim] - screen_point[(idim+2)%ndim] ) * screen_unitvector[(idim+1)%ndim] ) * ( ( patch->center_[(idim+1)%ndim] - screen_point[(idim+1)%ndim] ) * screen_unitvector[(idim+2)%ndim]
+-( patch->center_[(idim+2)%ndim] - screen_point[(idim+2)%ndim] ) * screen_unitvector[(idim+1)%ndim] );
}
distance_to_axis = sqrt( distance_to_axis );
if( abs( screen_vectornorm - distance_to_axis ) > patch->radius ) {
@@ -260,8 +260,9 @@ void DiagnosticScreen::run( Patch *patch, int, SimWindow *simWindow )
double side_old = 0.;
double dtg = dt / s->particles->LorentzFactor( ipart );
for( unsigned int idim=0; idim<ndim; idim++ ) {
-side += pow( s->particles->Position[idim][ipart] - screen_point[idim], 2 );
-side_old += pow( s->particles->Position[idim][ipart] - dtg*( s->particles->Momentum[idim][ipart] ) - screen_point[idim], 2 );
+side += ( s->particles->Position[idim][ipart] - screen_point[idim] ) * ( s->particles->Position[idim][ipart] - screen_point[idim] );
+side_old += ( s->particles->Position[idim][ipart] - dtg*( s->particles->Momentum[idim][ipart] ) - screen_point[idim] ) *
+( s->particles->Position[idim][ipart] - dtg*( s->particles->Momentum[idim][ipart] ) - screen_point[idim] ) ;
}
side = screen_vectornorm-sqrt( side );
side_old = screen_vectornorm-sqrt( side_old );
@@ -284,10 +285,12 @@ void DiagnosticScreen::run( Patch *patch, int, SimWindow *simWindow )
for( unsigned int idim=0; idim<ndim; idim++ ) {
double u1 = s->particles->Position[(idim+1)%ndim][ipart] - screen_point[(idim+1)%ndim];
double u2 = s->particles->Position[(idim+2)%ndim][ipart] - screen_point[(idim+2)%ndim];
-side += pow( u1 * screen_unitvector[(idim+2)%ndim] - u2 * screen_unitvector[(idim+1)%ndim], 2 );
+side += ( u1 * screen_unitvector[(idim+2)%ndim] - u2 * screen_unitvector[(idim+1)%ndim] ) *
+( u1 * screen_unitvector[(idim+2)%ndim] - u2 * screen_unitvector[(idim+1)%ndim] );
u1 -= dtg * s->particles->Momentum[(idim+1)%ndim][ipart];
u2 -= dtg * s->particles->Momentum[(idim+1)%ndim][ipart];
-side_old += pow( u1 * screen_unitvector[(idim+2)%ndim] - u2 * screen_unitvector[(idim+1)%ndim], 2 );
+side_old += ( u1 * screen_unitvector[(idim+2)%ndim] - u2 * screen_unitvector[(idim+1)%ndim] ) *
+( u1 * screen_unitvector[(idim+2)%ndim] - u2 * screen_unitvector[(idim+1)%ndim] );
}
side = r2 - side;
side_old = r2 - side_old;
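
In the cylinder branches of this file, the commit replaces `pow( ..., 2 )` by writing each long expression twice. An alternative that keeps the explicit multiplication without the duplication is a small squaring helper. The sketch below is one way to do it and is not what the commit does: `square` and `distance_to_axis` are hypothetical names, the function is restricted to 3D, and `unit` is assumed to be normalized. It computes the distance from a point to an axis as the norm of the cross product, component by component, as in the loop over `idim`.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical helper: squares a value with a single evaluation of its argument.
template<typename T>
inline T square( T x )
{
    return x * x;
}

// Hypothetical 3D sketch: distance from `center` to the axis that passes
// through `point` along the unit vector `unit`, i.e. |(center - point) x unit|.
inline double distance_to_axis( const std::array<double, 3> &center,
                                const std::array<double, 3> &point,
                                const std::array<double, 3> &unit )
{
    // Assumption: `unit` is normalized
    assert( std::fabs( square( unit[0] ) + square( unit[1] ) + square( unit[2] ) - 1. ) < 1e-12 );
    const unsigned int ndim = 3;
    double d2 = 0.;
    for( unsigned int idim = 0; idim < ndim; idim++ ) {
        const unsigned int i1 = ( idim + 1 ) % ndim;
        const unsigned int i2 = ( idim + 2 ) % ndim;
        // One component of the cross product, squared
        d2 += square( ( center[i1] - point[i1] ) * unit[i2]
                     -( center[i2] - point[i2] ) * unit[i1] );
    }
    return std::sqrt( d2 );
}
```

Because `square` takes its argument by value, the long expression is evaluated once, and an optimizing compiler would typically inline the call to the same multiplication.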