Description
I am getting segmentation faults with a setup that uses Particle Injectors. The segmentation faults occur on two machines, while on a third machine, surprisingly, a short run completes without crashing. I attach the setup file together with the stdout and stderr files from the Raven (.4464174) and Justus (8714049) jobs. On Raven I used the modules intel/21.2.0, impi/2021.2, mkl/2021.2, anaconda/3/2020.02, and hdf5-mpi/1.12.0, while on Justus I used lib/hdf5/1.12.1-intel-19.1.2-impi-2019.8 and numlib/python_scipy/1.5.0_numpy-1.19.0_python-3.8.3.
A colleague at the Justus cluster administration compiled the current Smilei (fetched from GitHub) with the debug option:
```
module load lib/hdf5/1.12.1-intel-19.1.2-impi-2019.8
HDF5 1.12.1 has been loaded
$ export PYTHONEXE=python3
$ export HDF5_ROOT_DIR=$HDF5_HOME
$ make config=debug env
VERSION       : 4.7-248-ge563595d9-master
SMILEICXX     : mpicxx
OPENMP_FLAG   : -fopenmp -D_OMP
HDF5_ROOT_DIR : /opt/bwhpc/common/lib/hdf5/1.12.1-intel-19.1.2-impi-2019.8
FFTW3_LIB_DIR :
SITEDIR       : /home/xx/xx_xxxxxx/xx_xx/.local/lib/python3.6/site-packages
PYTHONEXE     : python3
PY_CXXFLAGS   : -I/usr/include/python3.6m -I/usr/include/python3.6m -I/usr/lib64/python3.6/site-packages/numpy/core/include -DSMILEI_USE_NUMPY -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION
PY_LDFLAGS    : -lpython3.6m -lpthread -ldl -lutil -lm -Xlinker -export-dynamic
CXXFLAGS      : -D__VERSION=\"4.7-248-ge563595d9-master\" -std=c++11 -Wall -I/opt/bwhpc/common/lib/hdf5/1.12.1-intel-19.1.2-impi-2019.8/include -Isrc -Isrc/Profiles -Isrc/MultiphotonBreitWheeler -Isrc/ElectroMagnSolver -Isrc/ParticleBC -Isrc/MovWindow -Isrc/Radiation -Isrc/DomainDecomposition -Isrc/Collisions -Isrc/SmileiMPI -Isrc/Patch -Isrc/PartCompTime -Isrc/Tools -Isrc/ElectroMagnBC -Isrc/ParticleInjector -Isrc/Field -Isrc/Merging -Isrc/Diagnostic -Isrc/Particles -Isrc/Python -Isrc/Pusher -Isrc/ElectroMagn -Isrc/Interpolator -Isrc/Projector -Isrc/Ionization -Isrc/Params -Isrc/Species -Isrc/Checkpoint -Isrc/picsar_interface -Ibuild/src/Python -I/usr/include/python3.6m -I/usr/include/python3.6m -I/usr/lib64/python3.6/site-packages/numpy/core/include -DSMILEI_USE_NUMPY -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -g -pg -D__DEBUG -O0 -fopenmp -D_OMP
LDFLAGS       : -L/opt/bwhpc/common/lib/hdf5/1.12.1-intel-19.1.2-impi-2019.8/lib -lhdf5 -lpython3.6m -lpthread -ldl -lutil -lm -Xlinker -export-dynamic -lm -fopenmp -D_OMP
$ make config=debug -j8
```
and collected the following backtrace of the crashed simulation run:
```
(gdb) bt
#0  raise (sig=11) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00000000012f56f7 in backward::SignalHandling::sig_handler (signo=11, info=0x150c97ff7f70, _ctx=0x150c97ff7e40) at src/Tools/backward.hpp:2260
#2  <signal handler called>
#3  0x000000000064621f in std::vector<double, std::allocator<double> >::data (this=0x0) at /usr/include/c++/8/bits/stl_vector.h:1056
#4  0x000000000107b5c3 in Particles::getPtrPosition (this=0x150c88007eb0, idim=0) at src/Particles/Particles.h:429
#5  0x00000000010e9253 in VectorPatch::injectParticlesFromBoundaries (this=0x7ffca0e7b908, params=..., timers=..., itime=1) at src/Patch/VectorPatch.cpp:699
#6  0x000000000129c68e in L_main_534__par_region4_2_45 () at src/Smilei.cpp:544
#7  0x0000150d42f3e3f3 in __kmp_invoke_microtask () from /opt/bwhpc/common/compiler/intel/compxe.2020.2.254/compilers_and_libraries_2020.2.254/linux/compiler/lib/intel64_lin/libiomp5.so
#8  0x0000150d42ec2273 in __kmp_invoke_task_func (gtid=0) at ../../src/kmp_runtime.cpp:7515
#9  0x0000150d42ec121e in __kmp_launch_thread (this_thr=0x0) at ../../src/kmp_runtime.cpp:6109
#10 0x0000150d42f3e8cc in _INTERNAL8aaf6219::__kmp_launch_worker (thr=0x0) at ../../src/z_Linux_util.cpp:593
#11 0x0000150d4573e1cf in start_thread (arg=<optimized out>) at pthread_create.c:479
#12 0x0000150d42869e73 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb)
```
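For what it's worth, frames #3 to #5 point at a null inner vector: `getPtrPosition` is called on a valid `Particles` object (`this=0x150c88007eb0`), but the `std::vector<double>` whose `data()` is taken sits at address 0x0. That is the signature one typically gets when the outer per-dimension container of a freshly created particle buffer was never sized. Below is a minimal sketch of that failure mode; it assumes a vector-of-vectors layout and uses illustrative names (`ParticlesSketch`), not Smilei's actual code:

```cpp
#include <vector>

// Illustrative sketch only, not Smilei's real Particles class: it assumes
// positions are stored as one std::vector<double> per dimension inside an
// outer vector that must be sized before raw pointers are taken.
struct ParticlesSketch {
    std::vector<std::vector<double>> Position;

    double *getPtrPosition( int idim )
    {
        // If Position was never resized, Position.data() is nullptr, so
        // Position[idim] yields a "vector" living at address 0x0 and its
        // data() runs with this == 0x0, as in frame #3 of the backtrace.
        return Position[idim].data();
    }
};

int main()
{
    ParticlesSketch p;                  // Position left empty, mimicking an
                                        // uninitialized injector buffer
    double *x = p.getPtrPosition( 0 );  // segfaults (undefined behaviour)
    return x != nullptr;
}
```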
Can you please investigate this? I believe this might also be relevant for #593 and #521, and for #293, which I raised in the past. Please let me know if you need more information.

Attachments:
d15_th75_mi25.py.txt
tjob_hybrid.err.4464174.txt
tjob_hybrid.out.4464174.txt
tjob_hybrid-8714049.err.txt
tjob_hybrid-8714049.out.txt
I found the bug, and it could indeed be the same as #593, but I am not sure it also affects the other issues you mentioned. It seems to have been introduced by other, fairly recent changes elsewhere in the code.

Anyway, it should be fixed in the develop branch: 9f6da66
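For anyone stuck on an older checkout until they can update past 9f6da66, a defensive guard of the following shape avoids this particular null dereference. This is a hypothetical sketch, not the actual change in that commit:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical workaround, not the fix applied in 9f6da66: hand out a raw
// position pointer only when the per-dimension buffers actually exist.
double *getPtrPositionChecked( std::vector<std::vector<double>> &Position,
                               std::size_t idim )
{
    if( idim >= Position.size() ) {
        return nullptr;  // buffer not initialized; caller must skip injection
    }
    return Position[idim].data();  // may still be nullptr for an empty vector,
                                   // which callers should also tolerate
}
```

Updating to the develop branch remains the proper solution.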