c++ - Should I use GNU parallel mode functions inside an OpenMP parallel region (for loop, tasks)?
I have a program accelerated with OpenMP. Inside the parallel region, the functions std::nth_element, std::sort, and std::partition are called. Each call processes the part of the array that belongs to the calling OpenMP thread.

Recently I found that g++ has implemented parallel versions of these functions, and I wonder whether I should use a function such as __gnu_parallel::nth_element inside a #pragma omp task or #pragma omp for region. If I use parallel mode, will the total number of threads exceed the limit set by omp_set_num_threads(), and will that lead to worse speedup?
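For concreteness, the setup described in the question might look roughly like the sketch below. The helper name sort_chunks and the fixed chunk size are assumptions made for illustration, not the asker's actual code. Each loop iteration sorts one block with the serial std::sort, so every call runs on a single OpenMP thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: sorts each consecutive block of `chunk` elements
// independently. Blocks are distributed across the OpenMP threads.
void sort_chunks(std::vector<int>& data, std::size_t chunk) {
    assert(chunk > 0);
    const long long n = static_cast<long long>(data.size());
    const long long step = static_cast<long long>(chunk);
    const long long blocks = (n + step - 1) / step;  // round up

    // Without -fopenmp the pragma is ignored and the loop runs serially.
    #pragma omp parallel for schedule(static)
    for (long long b = 0; b < blocks; ++b) {
        auto first = data.begin() + b * step;
        auto last = data.begin() + std::min(n, (b + 1) * step);
        std::sort(first, last);  // serial sort, runs on the calling thread only
    }
}
```

Replacing std::sort here with a parallel mode variant would put a second level of parallelism inside each iteration, which is exactly the situation the question asks about.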
The trivial (and best) answer: benchmark it, and post your findings.

Less definitive: in my experience, the parallel versions of most algorithms are less efficient than the comparable serial ones; they rely on multiple processors to compensate in wall time. Regarding the number of threads, I don't think OpenMP will spawn new threads if you are already at the limit. I also remember that nested #pragma omp for regions don't result in each of the outer threads spawning more "inner threads" without a specific flag (which I don't remember off the top of my head).
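One way to check the nesting behaviour on your own machine is a small probe like the one below. It is a sketch, not a benchmark: the function name max_inner_team_size is made up for illustration, and it assumes that the relevant knob is the OpenMP nesting limit set with omp_set_max_active_levels(), which is my reading of the OpenMP API rather than something stated in the answer above. As far as I know, libstdc++ parallel mode is itself built on OpenMP, so a __gnu_parallel call made inside a parallel region should be subject to the same limit.

```cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical probe: opens an outer parallel region, then an inner one in
// each outer thread, and returns the largest inner team size observed.
int max_inner_team_size(int outer_threads, int inner_threads,
                        int max_active_levels) {
    assert(outer_threads > 0 && inner_threads > 0 && max_active_levels > 0);
    int largest = 1;
#ifdef _OPENMP
    const int saved_levels = omp_get_max_active_levels();
    omp_set_max_active_levels(max_active_levels);

    #pragma omp parallel num_threads(outer_threads) reduction(max : largest)
    {
        int seen = 1;  // private to each outer thread, shared by its inner team
        #pragma omp parallel num_threads(inner_threads)
        {
            #pragma omp single
            seen = omp_get_num_threads();  // size of the inner team
        }
        if (seen > largest) largest = seen;
    }

    omp_set_max_active_levels(saved_levels);  // restore the previous setting
#else
    // Built without OpenMP: everything runs on one thread.
    (void)outer_threads;
    (void)inner_threads;
    (void)max_active_levels;
#endif
    return largest;
}
```

With max_active_levels set to 1, the inner region is inactive and each outer thread gets an inner team of exactly one thread, so no extra threads appear. With a value of 2 or more the runtime is allowed, but not required, to give each outer thread a larger inner team, which is when oversubscription can occur.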