You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hello. I hit a strange behavior when my app crashes and it took me a while to figure out what's going on.
I think everything starts in DOGM::gridCellOccupancyUpdate function. Here the accumulate(weight_array, weights_accum); function is executed which calls thrust::inclusive_scan. The thrust documentation says:
inclusive_scan is similar to std::partial_sum in the STL. The primary difference between the two functions is that std::partial_sum guarantees a serial summation order, while inclusive_scan requires associativity of the binary operation to parallelize the prefix sum.
Results are not deterministic for pseudo-associative operators (e.g., addition of floating-point types). Results for pseudo-associative operators may vary from run to run.
After the inclusive scan it can happen that weights_accum array has NON-INCREASING values, i.e. weights_accum[i] < weights_accum[i - 1], in other words, the next value in the array can be greater than the previous. Later this array is used in gridCellPredictionUpdateKernel kernel. Because weight_array_accum has the non-increasing numbers, the line
can return a negative value. I can observe this with simple printf in the kernel. The negative m_occ_pred can lead to a negative rho_b value (also observed by printing the value), which is stored in born_masses_array.
Later in DOGM::initializeNewParticles() the array born_masses_array is used in inclusive scan to update particle_orders_accum, which is later used in normalize_particle_orders function. Inside this function it is assumed that the last value of the array is the maximum and the normalization happens:
void normalize_particle_orders(float* particle_orders_array_accum, int particle_orders_count, int v_B)
{
thrust::device_ptr<float> particle_orders_accum(particle_orders_array_accum);
float max = 1.0f;
cudaMemcpy(&max, &particle_orders_array_accum[particle_orders_count - 1], sizeof(float), cudaMemcpyDeviceToHost);
thrust::transform(
particle_orders_accum, particle_orders_accum + particle_orders_count, particle_orders_accum,
GPU_LAMBDA(float x) { return x * (v_B / max); });
}
But because born_masses_array has negative values, it can happen that the last value of particle_orders_array_accum is NOT maximum. This can lead that after the thrust::transform, the particle_orders_array_accum array can have values, greater than v_B (which is new_born_particle_count).
Later the array particle_orders_array_accum is passed to the initNewParticlesKernel1. Inside the kernel the start_idx is calculated, which, as mentioned, can be greater than new_born_particle_count (observed with printf inside the kernel). Next, this wrong index is used to update the birth_particle_array and this causes write to out-of-bounds write - compute sanitizer complains in this kernel and later a thrust::vector throws an error, because it's designed to throw in a destructor.
The problem is that all this is difficult to reproduce in a small demo, but in our bigger application it crashes all the time. Atm, I'm simply checking rho_b and set it to 0 in case it's negative.
The text was updated successfully, but these errors were encountered:
Hello. I hit a strange behavior when my app crashes and it took me a while to figure out what's going on.
I think everything starts in DOGM::gridCellOccupancyUpdate function. Here the
accumulate(weight_array, weights_accum);
function is executed which callsthrust::inclusive_scan
. Thethrust
documentation says:After the inclusive scan it can happen that
weights_accum
array has NON-INCREASING values, i.e.weights_accum[i] < weights_accum[i - 1]
, in other words, the next value in the array can be greater than the previous. Later this array is used in gridCellPredictionUpdateKernel kernel. Becauseweight_array_accum
has the non-increasing numbers, the linem_occ_pred = subtract(weight_array_accum, start_idx, end_idx);
can return a negative value. I can observe this with simple
printf
in the kernel. The negativem_occ_pred
can lead to a negativerho_b
value (also observed by printing the value), which is stored inborn_masses_array
.Later in DOGM::initializeNewParticles() the array
born_masses_array
is used in inclusive scan to updateparticle_orders_accum
, which is later used in normalize_particle_orders function. Inside this function it is assumed that the last value of the array is the maximum and the normalization happens:But because
born_masses_array
has negative values, it can happen that the last value ofparticle_orders_array_accum
is NOT maximum. This can lead that after thethrust::transform
, theparticle_orders_array_accum
array can have values, greater thanv_B
(which isnew_born_particle_count
).Later the array
particle_orders_array_accum
is passed to the initNewParticlesKernel1. Inside the kernel the start_idx is calculated, which, as mentioned, can be greater thannew_born_particle_count
(observed withprintf
inside the kernel). Next, this wrong index is used to update the birth_particle_array and this causes write to out-of-bounds write - compute sanitizer complains in this kernel and later athrust::vector
throws an error, because it's designed to throw in a destructor.The problem is that all this is difficult to reproduce in a small demo, but in our bigger application it crashes all the time. Atm, I'm simply checking
rho_b
and set it to 0 in case it's negative.The text was updated successfully, but these errors were encountered: