F# - Parallel Filtering of CUDA Array (Compute Architecture)

April 15, 2015

I am trying to develop a framework for performing standard functional sequence operations (map, filter, fold, etc.) in F#, where the computations are performed on the GPU (using CUDA) instead of the CPU.

I've had success implementing a CUDA map function, such that I can write:

    let result =
        cudaArray
        |> CudaArray.map <@ fun x -> x ** 3.0 @>
        |> Array.ofCudaArray

This was relatively straightforward because all of the operations are performed elementwise.

Now, I'm interested in writing a similar system for filtering by a predicate. I'm thinking of implementing the predicate as a map function that returns a boolean array, but I then need to find a way of reducing the array of interest to only those elements whose matching boolean array element has a value of true.
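To make the two-step idea concrete, here is a minimal sketch using Thrust, which ships with the CUDA toolkit. It is only an illustration under assumptions: the predicate (keep strictly positive values), the functor names IsPositive and IsSet, and the helper filterPositiveThrust are all hypothetical and not part of the framework described above. thrust::transform plays the role of the predicate map, and the stencil overload of thrust::copy_if does the reduction to the matching elements.

```cuda
#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/copy.h>
#include <vector>

// Hypothetical predicate: keep strictly positive values.
struct IsPositive {
    __host__ __device__ bool operator()(double x) const { return x > 0.0; }
};

// Tests the flag itself: a non-zero flag means "keep this element".
struct IsSet {
    __host__ __device__ bool operator()(int flag) const { return flag != 0; }
};

// Hypothetical helper: copies `input` to the GPU, filters it there,
// and returns the kept elements in their original order.
std::vector<double> filterPositiveThrust(const std::vector<double>& input) {
    thrust::device_vector<double> data(input.begin(), input.end());

    // Step 1: the "predicate map", producing one 0/1 flag per element.
    thrust::device_vector<int> flags(data.size());
    thrust::transform(data.begin(), data.end(), flags.begin(), IsPositive());

    // Step 2: compaction, keeping data[i] wherever flags[i] is set.
    thrust::device_vector<double> kept(data.size());
    thrust::device_vector<double>::iterator last =
        thrust::copy_if(data.begin(), data.end(), flags.begin(),
                        kept.begin(), IsSet());
    kept.resize(last - kept.begin());

    // Copy the compacted result back to the host.
    std::vector<double> result(kept.size());
    thrust::copy(kept.begin(), kept.end(), result.begin());
    return result;
}
```

The flag array is kept as a separate step here only to mirror the description above; copy_if also has an overload that applies the predicate to the data directly, which would avoid the intermediate array.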

I found this article (http://devblogs.nvidia.com/parallelforall/cuda-pro-tip-optimized-filtering-warp-aggregated-atomics/) which describes what looks like a nice method of solving the problem. It boils down to using the following function to index elements in the destination array:

    // warp-aggregated atomic increment
    __device__ int atomicAggInc(int *ctr) {
      int mask = __ballot(1);
      // select the leader
      int leader = __ffs(mask) - 1;
      // leader does the update
      int res;
      if (lane_id() == leader)
        res = atomicAdd(ctr, __popc(mask));
      // broadcast the result
      res = warp_bcast(res, leader);
      // each thread computes its own value
      return res + __popc(mask & ((1 << lane_id()) - 1));
    } // atomicAggInc

The problem is that, as I understand the article, warp_bcast is only supported on cards with compute capability 3.0 or greater.
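One possible stand-in for warp_bcast on older cards is to hand the leader's result to the rest of the warp through shared memory. The sketch below is an illustration under assumptions, not the article's code: the kernel name filterSharedBcast, the helper filterPositiveWarp, the `value > 0` predicate and the WARP_BALLOT / WARP_SYNC macros are all made up here. It assumes 1D blocks whose size is a multiple of 32, and it has every thread of the warp take part in the vote by passing the predicate to the ballot instead of branching first. Note that the ballot itself still needs compute capability 2.0 or later.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Toolkit compatibility (assumption): CUDA 9 and later expect the *_sync
// warp intrinsics; older toolkits only provide __ballot.
#if defined(CUDART_VERSION) && CUDART_VERSION >= 9000
#define WARP_BALLOT(p) __ballot_sync(0xffffffffu, (p))
#define WARP_SYNC()    __syncwarp()
#else
#define WARP_BALLOT(p) __ballot(p)
#define WARP_SYNC()    ((void)0)
#endif

#define MAX_WARPS_PER_BLOCK 32  // enough for blocks of up to 1024 threads

// Keeps src[i] > 0. Output order is NOT preserved across warps.
__global__ void filterSharedBcast(int *dst, int *count, const int *src, int n) {
    // One slot per warp; this replaces the register broadcast.
    __shared__ volatile int warpBase[MAX_WARPS_PER_BLOCK];

    int i    = blockIdx.x * blockDim.x + threadIdx.x;
    int lane = threadIdx.x % 32;
    int warp = threadIdx.x / 32;

    int value = (i < n) ? src[i] : 0;
    int keep  = (i < n) && (value > 0);

    // Every lane votes, so the mask marks exactly the lanes that keep a value.
    unsigned int mask = WARP_BALLOT(keep);

    // The lowest keeping lane reserves space for the whole warp
    // and publishes the base offset in shared memory.
    if (mask != 0 && lane == __ffs(mask) - 1)
        warpBase[warp] = atomicAdd(count, __popc(mask));
    WARP_SYNC();

    // Each keeping lane adds its rank among the keeping lanes below it.
    if (keep)
        dst[warpBase[warp] + __popc(mask & ((1u << lane) - 1u))] = value;
}

// Hypothetical host wrapper, for illustration only (no error checking).
std::vector<int> filterPositiveWarp(const std::vector<int>& input) {
    if (input.empty()) return std::vector<int>();
    int n = static_cast<int>(input.size());
    int *dSrc = 0, *dDst = 0, *dCount = 0;
    cudaMalloc(&dSrc, n * sizeof(int));
    cudaMalloc(&dDst, n * sizeof(int));
    cudaMalloc(&dCount, sizeof(int));
    cudaMemcpy(dSrc, input.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(dCount, 0, sizeof(int));

    int threads = 256;  // must be a multiple of 32
    int blocks  = (n + threads - 1) / threads;
    filterSharedBcast<<<blocks, threads>>>(dDst, dCount, dSrc, n);

    int count = 0;
    cudaMemcpy(&count, dCount, sizeof(int), cudaMemcpyDeviceToHost);
    std::vector<int> result(count);
    if (count > 0)
        cudaMemcpy(result.data(), dDst, count * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dSrc); cudaFree(dDst); cudaFree(dCount);
    return result;
}
```

This still issues one global atomicAdd per warp, as the article's version does; the only change is that the base offset travels through a shared-memory slot instead of a register shuffle. Whether it keeps most of the speed-up on a given card would need measuring.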

I'm interested to know whether there is an alternative to warp_bcast for compute capability < 3.0, or whether there is some other approach I can use to solve this problem without sacrificing the huge performance gains described in the article. (To be clear, I'm absolutely open to approaches different from the one described in the article if they can help me crack this.)
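As one example of such a different approach, the aggregation can be done per block instead of per warp, which needs no warp-level intrinsics at all. The following is a rough sketch under assumptions: the names filterBlockAtomics and filterPositiveBlock and the `value > 0` predicate are hypothetical, and the launch uses 1D blocks. Each block counts its matches with a shared-memory atomicAdd, one thread then reserves space in the output with a single global atomicAdd, and every matching thread writes at the block's base offset plus its own local position.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Keeps src[i] > 0. Output order is NOT preserved.
__global__ void filterBlockAtomics(int *dst, int *count, const int *src, int n) {
    // First the number of matches in this block, later the block's base offset.
    __shared__ int blockCount;

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (threadIdx.x == 0) blockCount = 0;
    __syncthreads();

    int  value = (i < n) ? src[i] : 0;
    bool keep  = (i < n) && (value > 0);

    // Matching threads take a slot within the block.
    int pos = 0;
    if (keep) pos = atomicAdd(&blockCount, 1);
    __syncthreads();

    // One thread reserves the block's range in the global output.
    if (threadIdx.x == 0) blockCount = atomicAdd(count, blockCount);
    __syncthreads();

    if (keep) dst[blockCount + pos] = value;
}

// Hypothetical host wrapper, for illustration only (no error checking).
std::vector<int> filterPositiveBlock(const std::vector<int>& input) {
    if (input.empty()) return std::vector<int>();
    int n = static_cast<int>(input.size());
    int *dSrc = 0, *dDst = 0, *dCount = 0;
    cudaMalloc(&dSrc, n * sizeof(int));
    cudaMalloc(&dDst, n * sizeof(int));
    cudaMalloc(&dCount, sizeof(int));
    cudaMemcpy(dSrc, input.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(dCount, 0, sizeof(int));

    int threads = 256;
    int blocks  = (n + threads - 1) / threads;
    filterBlockAtomics<<<blocks, threads>>>(dDst, dCount, dSrc, n);

    int count = 0;
    cudaMemcpy(&count, dCount, sizeof(int), cudaMemcpyDeviceToHost);
    std::vector<int> result(count);
    if (count > 0)
        cudaMemcpy(result.data(), dDst, count * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dSrc); cudaFree(dDst); cudaFree(dCount);
    return result;
}
```

This cuts global atomics to one per block, but it adds block-wide synchronisation and shared-memory atomics, so it may well give up part of the gain reported in the article; I have not benchmarked it.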
