Multi-GPU programming strategies using CUDA
I need advice on a project I am about to undertake. I am planning to run simple kernels (yet to be decided, but I am leaning toward embarrassingly parallel ones) on a multi-GPU node using CUDA 4.0, following the strategies listed below. The intention is to profile the node by launching the kernels under the different strategies that CUDA provides in a multi-GPU environment (a minimal sketch of one of them follows the list).
- single host thread - multiple devices (shared context)
- single host thread - concurrent execution of kernels on single device (shared context)
- multiple host threads - an equal number of devices, one per thread (independent contexts)
- single host thread - sequential kernel execution on 1 device
- multiple host threads - concurrent execution of kernels on 1 device (independent contexts)
- multiple host threads - sequential execution of kernels on 1 device (independent contexts)
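For concreteness, here is a minimal sketch of the second strategy: one host thread launching kernels concurrently on a single device via CUDA streams. The kernel busyKernel, the stream count, and the problem size are placeholders of my own, not part of the question.

```cuda
#include <cuda_runtime.h>

__global__ void busyKernel(float *data, int n)
{
    // Trivial, embarrassingly parallel work.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] = data[i] * 2.0f + 1.0f;
}

int main(void)
{
    const int nStreams = 4;
    const int n = 1 << 20;
    cudaStream_t streams[nStreams];
    float *d_data[nStreams];

    for (int s = 0; s < nStreams; ++s) {
        cudaStreamCreate(&streams[s]);
        cudaMalloc(&d_data[s], n * sizeof(float));
    }

    // Kernels launched into different non-default streams may overlap
    // on one device, subject to compute capability and free resources.
    for (int s = 0; s < nStreams; ++s)
        busyKernel<<<(n + 255) / 256, 256, 0, streams[s]>>>(d_data[s], n);

    cudaDeviceSynchronize();

    for (int s = 0; s < nStreams; ++s) {
        cudaStreamDestroy(streams[s]);
        cudaFree(d_data[s]);
    }
    return 0;
}
```

Whether the kernels actually overlap depends on the device's compute capability and available resources, which is exactly the kind of thing profiling the node should reveal.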
Am I missing any categories? Opinions on the test categories I have chosen, and any general advice with respect to multi-GPU programming, are welcome.
Thanks,
Sayan
Edit:
I thought the previous categorization involved some redundancy, so I modified it.
Most workloads are light enough on CPU work that you can juggle multiple GPUs from a single thread, which became possible starting with CUDA 4.0. Before CUDA 4.0, you had to call cuCtxPopCurrent()/cuCtxPushCurrent() to change which context was current to a given thread. Starting with CUDA 4.0, you can call cudaSetDevice() to set the current context to correspond to a given device.
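As a minimal sketch of that post-4.0 pattern (the kernel scale and the buffer size are illustrative placeholders): one host thread drives every visible device by making each current in turn with cudaSetDevice(). Kernel launches are asynchronous, so the single thread keeps all GPUs busy.

```cuda
#include <cuda_runtime.h>

__global__ void scale(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= 2.0f;
}

int main(void)
{
    const int n = 1 << 20;
    int nDevices = 0;
    cudaGetDeviceCount(&nDevices);
    if (nDevices > 8)
        nDevices = 8;          // cap for this sketch's fixed-size array

    float *d_data[8];

    // Launch on each device in turn; launches do not block the host,
    // so all GPUs end up working concurrently.
    for (int dev = 0; dev < nDevices; ++dev) {
        cudaSetDevice(dev);    // make this device's context current
        cudaMalloc(&d_data[dev], n * sizeof(float));
        scale<<<(n + 255) / 256, 256>>>(d_data[dev], n);
    }

    // Synchronize each device before using its results.
    for (int dev = 0; dev < nDevices; ++dev) {
        cudaSetDevice(dev);
        cudaDeviceSynchronize();
        cudaFree(d_data[dev]);
    }
    return 0;
}
```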
Your option 1) is a misnomer, though, because there is no "shared context": GPU contexts are still separate, and device memory and objects such as CUDA streams and CUDA events are affiliated with the GPU context in which they were created.
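A small sketch of that affiliation, assuming a node with at least two GPUs (the no-op kernel is a placeholder): a stream created while device 0 is current belongs to device 0's context, so device 0 must be made current again before launching into that stream.

```cuda
#include <cuda_runtime.h>

__global__ void noop(void) {}

int main(void)
{
    cudaStream_t s0, s1;

    cudaSetDevice(0);
    cudaStreamCreate(&s0);      // s0 belongs to device 0's context
    cudaSetDevice(1);
    cudaStreamCreate(&s1);      // s1 belongs to device 1's context

    // Correct: device 1 is current, launch into its own stream.
    noop<<<1, 1, 0, s1>>>();

    // Launching into s0 here would pair a device-0 stream with device 1
    // and fail, so switch back to device 0 first.
    cudaSetDevice(0);
    noop<<<1, 1, 0, s0>>>();

    cudaStreamSynchronize(s0);
    cudaStreamDestroy(s0);
    cudaSetDevice(1);
    cudaStreamSynchronize(s1);
    cudaStreamDestroy(s1);
    return 0;
}
```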