When a thread reaches __syncthreads(), it will wait there for the rest of the threads within its block to reach the same point. This allows for the sharing of information.
One caveat of __syncthreads() is that using it within if-else statements can lead to undefined behavior. In the example below, the barrier is called from both branches of an if-else statement:
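__global__ void incorrect_barrier(int n) {
    ...
    if (threadIdx.x % 2 == 0) {
        ...
        __syncthreads();  // barrier reached only by even-numbered threads
    }
    else {
        ...
        __syncthreads();  // a different barrier, reached only by odd-numbered threads
    }
}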
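For a barrier inside an if-else statement to be well defined, either all the threads in the block must execute the if path or all of them must execute the else path. In incorrect_barrier, even-numbered and odd-numbered threads take different paths, so they wait at different barriers and the behavior is undefined.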
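One way to avoid the problem is to keep the barrier outside any thread-dependent branch, so that every thread in the block reaches the same __syncthreads() call. The sketch below illustrates this while also showing the barrier used to share data through shared memory. The names reverse_in_block, BLOCK_SIZE, and reverse_on_device are hypothetical and chosen only for illustration. The sketch assumes a single block of BLOCK_SIZE threads and an input no longer than BLOCK_SIZE, and it omits error checking on the CUDA runtime calls for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int BLOCK_SIZE = 256;  // hypothetical block size for this sketch

// Reverses the first n elements of data (n <= BLOCK_SIZE) within one block.
__global__ void reverse_in_block(int *data, int n) {
    __shared__ int tile[BLOCK_SIZE];
    int i = threadIdx.x;

    // Only in-bounds threads load, but no barrier lives inside this branch.
    if (i < n) {
        tile[i] = data[i];
    }

    // Every thread in the block reaches this single barrier, so the call is
    // well defined and all writes to tile are visible afterwards.
    __syncthreads();

    // Each in-bounds thread reads a value written by a different thread.
    if (i < n) {
        data[i] = tile[n - 1 - i];
    }
}

// Host helper (hypothetical): copies the input to the device, launches one
// block, and returns the reversed values. Assumes input.size() <= BLOCK_SIZE.
std::vector<int> reverse_on_device(const std::vector<int> &input) {
    int n = static_cast<int>(input.size());
    std::vector<int> output(input.size());
    int *d_data = nullptr;
    cudaMalloc(&d_data, n * sizeof(int));
    cudaMemcpy(d_data, input.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    reverse_in_block<<<1, BLOCK_SIZE>>>(d_data, n);
    // The blocking copy below also waits for the kernel to finish.
    cudaMemcpy(output.data(), d_data, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    return output;
}
```

On a machine with a CUDA-capable device, calling reverse_on_device on a small vector such as {1, 2, 3, 4, 5} should return its elements in reverse order. The boundary check still lives in if statements, but because the barrier sits between them rather than inside them, out-of-bounds threads reach it too.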
Threads with consecutive indices (threadIdx) are assigned to the same warp.
Threads within a warp can diverge at if statements (e.g. a check that threadIdx.x, or whatever coordinate, is within the bounds of the input array).
They can also diverge in for loops if different threads complete their iterations early. In this case, they're deactivated on subsequent steps.
When the divergence comes from a boundary check (i.e. whether threadIdx is within the bounds of the data), the amount of impact it has on runtime decreases as the overall size of the input data (total number of threads) grows.
__syncwarp()