MtxVec VCL
Submits the kernel to cmdQueue for computation with the specified WorkSize.
Setting CPUAdjust to true reduces WorkSize by a factor of OPENCL_BLOCKLEN and assumes that the kernel contains an internal for-loop. On CPU devices, kernel-internal for-loops can significantly speed up execution by lowering the function call overhead. The CPUAdjust parameter is used only if the device is of CPU type. The for-loop pattern expected inside the kernel looks like this:
where BLOCK_LEN matches OPENCL_BLOCKLEN.
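The original kernel listing is not reproduced on this page. As a rough illustration only, the blocked-loop idea described above can be sketched in plain C, which stands in for OpenCL C so that it can be run on the host. Everything here is an assumption made for illustration: the value of BLOCK_LEN, the function names scale_block, adjusted_work_size and run_scale, the scaling operation itself, and the rounding up of the reduced work size. None of it is taken from the MtxVec sources.

```c
#include <stddef.h>

/* Hypothetical value; in MtxVec it would have to match OPENCL_BLOCKLEN. */
#define BLOCK_LEN 4

/* Stand-in for one work-item of a kernel with an internal for-loop.
   In a real OpenCL kernel, gid would come from get_global_id(0).
   Each work-item handles up to BLOCK_LEN consecutive elements. */
static void scale_block(size_t gid, const float *src, float *dst,
                        size_t len, float factor)
{
    size_t start = gid * BLOCK_LEN;
    for (size_t i = start; i < start + BLOCK_LEN && i < len; i++) {
        dst[i] = src[i] * factor;
    }
}

/* Assumed host-side adjustment: the work size shrinks by a factor of
   BLOCK_LEN, rounded up so that a partial last block is still covered. */
static size_t adjusted_work_size(size_t len)
{
    return (len + BLOCK_LEN - 1) / BLOCK_LEN;
}

/* Emulates submitting the kernel: a sequential loop over the reduced
   number of work-items instead of a real command queue. */
static void run_scale(const float *src, float *dst, size_t len, float factor)
{
    size_t work = adjusted_work_size(len);
    for (size_t gid = 0; gid < work; gid++) {
        scale_block(gid, src, dst, len, factor);
    }
}
```

With 10 elements and BLOCK_LEN set to 4, this sketch runs 3 work-items instead of 10, so the per-call overhead is paid fewer times while every element is still processed exactly once.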
Copyright (c) 1999-2025 by Dew Research. All rights reserved.