Function cuOccupancyMaxPotentialBlockSize

pub unsafe extern "C" fn cuOccupancyMaxPotentialBlockSize(
    minGridSize: *mut c_int,
    blockSize: *mut c_int,
    func: CUfunction,
    blockSizeToDynamicSMemSize: CUoccupancyB2DSize,
    dynamicSMemSize: usize,
    blockSizeLimit: c_int,
) -> CUresult

\brief Suggest a launch configuration with reasonable occupancy

Returns in \p *blockSize a block size that can achieve the maximum occupancy (that is, the maximum number of active warps with the fewest blocks per multiprocessor), and in \p *minGridSize the minimum grid size needed to achieve that occupancy.
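The returned \p *minGridSize is the smallest grid that reaches maximum occupancy; it is not tied to the amount of work. When a launch is sized by the data instead, a common pattern is to round the element count up to a whole number of blocks of the suggested size. A minimal sketch of that arithmetic follows; the helper name `blocks_to_cover` is hypothetical and not part of the bindings:

```rust
use std::os::raw::c_int;

/// Number of blocks needed to cover `n` elements when each block has
/// `block_size` threads (ceiling division).
/// Returns 0 if `block_size` is not positive.
fn blocks_to_cover(n: usize, block_size: c_int) -> usize {
    if block_size <= 0 {
        return 0;
    }
    let bs = block_size as usize;
    // Written this way to avoid overflow in `n + bs - 1` for very large `n`.
    n / bs + usize::from(n % bs != 0)
}
```

For example, with a suggested block size of 256, 1000 elements need 4 blocks.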

If \p blockSizeLimit is 0, the configurator will use the maximum block size permitted by the device / function instead.

If per-block dynamic shared memory allocation is not needed, the user should leave \p blockSizeToDynamicSMemSize as NULL and \p dynamicSMemSize as 0.

If per-block dynamic shared memory allocation is needed, then if the dynamic shared memory size is constant regardless of block size, the size should be passed through \p dynamicSMemSize, and \p blockSizeToDynamicSMemSize should be NULL.

Otherwise, if the per-block dynamic shared memory size varies with the block size, the user needs to provide a unary function through \p blockSizeToDynamicSMemSize that computes the dynamic shared memory needed by \p func for any given block size. In this case, \p dynamicSMemSize is ignored. An example signature is:

\code
// Take block size, returns dynamic shared memory needed
size_t blockToSmem(int blockSize);
\endcode
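On the Rust side, the three shared-memory cases described above can be sketched as follows. This is an illustration under stated assumptions, not the bindings' own API: it assumes `CUoccupancyB2DSize` is an `Option` of an `unsafe extern "C"` function pointer (check the crate's definition), and the names `B2DSize`, `DynSmem`, `to_args`, and `block_to_smem` are hypothetical. The callback assumes a kernel that uses one `f32` of dynamic shared memory per thread.

```rust
use std::mem::size_of;
use std::os::raw::c_int;

/// Assumed shape of `CUoccupancyB2DSize` in the bindings (not confirmed here).
type B2DSize = Option<unsafe extern "C" fn(c_int) -> usize>;

/// Hypothetical callback: one `f32` of dynamic shared memory per thread.
/// Negative block sizes are treated as zero.
extern "C" fn block_to_smem(block_size: c_int) -> usize {
    (block_size.max(0) as usize) * size_of::<f32>()
}

/// The three documented ways to describe dynamic shared memory usage.
enum DynSmem {
    /// No dynamic shared memory is needed.
    Unused,
    /// A constant number of bytes, independent of block size.
    Fixed(usize),
    /// A size that depends on the block size.
    PerBlock(unsafe extern "C" fn(c_int) -> usize),
}

impl DynSmem {
    /// Returns the `(blockSizeToDynamicSMemSize, dynamicSMemSize)` pair.
    fn to_args(&self) -> (B2DSize, usize) {
        match *self {
            DynSmem::Unused => (None, 0),
            DynSmem::Fixed(bytes) => (None, bytes),
            // `dynamicSMemSize` is ignored when a callback is supplied.
            DynSmem::PerBlock(f) => (Some(f), 0),
        }
    }
}
```

Under these assumptions, the pair returned by `DynSmem::PerBlock(block_to_smem).to_args()` would be passed as the fourth and fifth arguments of `cuOccupancyMaxPotentialBlockSize`.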

Note that the API can also be used with a context-less kernel (::CUkernel): query the handle using ::cuLibraryGetKernel() and pass it to the API by casting it to ::CUfunction. In this case, the current context is used for the calculations.
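The cast itself can be sketched in Rust as a plain pointer cast. The snippet below assumes that both handle types are raw pointers to opaque structs, which is how generated bindings commonly represent them; the local definitions are stand-ins for illustration, and `kernel_as_function` is a hypothetical helper:

```rust
#![allow(non_camel_case_types)]

// Stand-in opaque types; the real bindings are assumed to define equivalents.
#[repr(C)]
pub struct CUfunc_st {
    _private: [u8; 0],
}
#[repr(C)]
pub struct CUkern_st {
    _private: [u8; 0],
}

pub type CUfunction = *mut CUfunc_st;
pub type CUkernel = *mut CUkern_st;

/// Reinterprets a kernel handle as a function handle without changing its value.
pub fn kernel_as_function(kernel: CUkernel) -> CUfunction {
    kernel.cast()
}
```

The handle value is unchanged by the cast; only the pointer type differs.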

\param minGridSize - Returned minimum grid size needed to achieve the maximum occupancy
\param blockSize - Returned maximum block size that can achieve the maximum occupancy
\param func - Kernel for which launch configuration is calculated
\param blockSizeToDynamicSMemSize - A function that calculates how much per-block dynamic shared memory \p func uses based on the block size
\param dynamicSMemSize - Dynamic shared memory usage intended, in bytes
\param blockSizeLimit - The maximum block size \p func is designed to handle

\return ::CUDA_SUCCESS, ::CUDA_ERROR_DEINITIALIZED, ::CUDA_ERROR_NOT_INITIALIZED, ::CUDA_ERROR_INVALID_CONTEXT, ::CUDA_ERROR_INVALID_VALUE, ::CUDA_ERROR_UNKNOWN
\notefnerr

\sa ::cudaOccupancyMaxPotentialBlockSize