How does OpenMP treat nested loops? For example, how would it treat the well-known mat_mul() example below?
#include <omp.h>

void mat_mul(double * out, double * A, double * B, int M, int N, int K)
{
    omp_set_num_threads(M * K); // intent: one thread per output element
    #pragma omp parallel for \
        schedule(static, N)
    for(int i = 0; i < M; ++i)
        for(int k = 0; k < K; ++k)
            for(int j = 0; j < N; ++j) {
                // assumption: no need for a critical section, since we have one thread per output element
                // what is bad here is that we do not make use of spatial locality
                out[i*K + k] += A[i*N + j] * B[j*K + k];
            }
}
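For context on the question: a `parallel for` directive applies only to the loop that immediately follows it, so in the code above only the `i` loop is divided among threads, while the `k` and `j` loops run sequentially inside each thread. At most M threads therefore receive work, whatever value is passed to `omp_set_num_threads()`, and `schedule(static, N)` hands out chunks of N consecutive `i` iterations. There is still no data race, because each `out[i*K + k]` is written only by the thread that owns that `i`. The sketch below shows one possible way to spread the work over all M * K output elements using the `collapse` clause. The name `mat_mul_collapsed` and the local accumulator `sum` are illustrative assumptions, not part of the original code.

```c
/* Hypothetical variant of mat_mul(). All matrices are row-major:
   out is M x K, A is M x N, B is N x K. As in the original, out is
   assumed to be initialized by the caller (for example to zero). */
void mat_mul_collapsed(double *out, const double *A, const double *B,
                       int M, int N, int K)
{
    /* collapse(2) fuses the i and k loops into a single iteration space
       of M * K iterations, which the runtime divides among the threads.
       The j loop still runs sequentially inside each thread. */
    #pragma omp parallel for collapse(2) schedule(static)
    for (int i = 0; i < M; ++i)
        for (int k = 0; k < K; ++k) {
            double sum = 0.0; /* declared in the loop body, so private per iteration */
            for (int j = 0; j < N; ++j)
                sum += A[i*N + j] * B[j*K + k];
            out[i*K + k] += sum; /* one write per output element */
        }
}
```

With GCC this would be built with `-fopenmp`; without that flag the pragma is ignored and the function runs serially with the same result. The number of threads is left to the runtime here, since requesting M * K threads rarely helps once M * K exceeds the number of available cores.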