Function cuOccupancyMaxPotentialBlockSizeWithFlags

pub unsafe extern "C" fn cuOccupancyMaxPotentialBlockSizeWithFlags(
    minGridSize: *mut c_int,
    blockSize: *mut c_int,
    func: CUfunction,
    blockSizeToDynamicSMemSize: CUoccupancyB2DSize,
    dynamicSMemSize: usize,
    blockSizeLimit: c_int,
    flags: c_uint,
) -> CUresult

\brief Suggest a launch configuration with reasonable occupancy

An extended version of ::cuOccupancyMaxPotentialBlockSize. In addition to the arguments passed to ::cuOccupancyMaxPotentialBlockSize, ::cuOccupancyMaxPotentialBlockSizeWithFlags also takes a \p flags parameter.

The \p flags parameter controls how special cases are handled. The valid flags are:

  • ::CU_OCCUPANCY_DEFAULT, which keeps the same default behavior as ::cuOccupancyMaxPotentialBlockSize;

  • ::CU_OCCUPANCY_DISABLE_CACHING_OVERRIDE, which suppresses the default behavior on platforms where global caching affects occupancy. On such platforms, the launch configuration that produces maximal occupancy might not support global caching. Setting ::CU_OCCUPANCY_DISABLE_CACHING_OVERRIDE guarantees that the produced launch configuration is compatible with global caching, at a potential cost in occupancy. More information about this feature can be found in the “Unified L1/Texture Cache” section of the Maxwell tuning guide.
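A minimal sketch of choosing between the two flags follows. The constants are local mirrors of the `CUoccupancy_flags` values in cuda.h, declared here only so the example compiles without the bindings crate; how this crate exposes them (constants, enum, or module) is not shown in this page. `occupancy_flags` is a hypothetical helper, not part of the API.

```rust
use std::os::raw::c_uint;

// Local mirrors of the CUoccupancy_flags values from cuda.h.
// Assumption: the bindings crate exposes equivalent values under its own names.
const CU_OCCUPANCY_DEFAULT: c_uint = 0x0;
const CU_OCCUPANCY_DISABLE_CACHING_OVERRIDE: c_uint = 0x1;

/// Hypothetical helper: pick the value to pass as the `flags` argument.
/// When the kernel relies on global caching, ask the occupancy calculator
/// for a caching-compatible configuration, possibly at lower occupancy.
fn occupancy_flags(require_global_caching: bool) -> c_uint {
    if require_global_caching {
        CU_OCCUPANCY_DISABLE_CACHING_OVERRIDE
    } else {
        CU_OCCUPANCY_DEFAULT
    }
}
```

The returned value would be passed unchanged as the final `flags` argument of the function.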

Note that the API can also be used with context-less kernel ::CUkernel by querying the handle using ::cuLibraryGetKernel() and then passing it to the API by casting to ::CUfunction. Here, the context to use for calculations will be the current context.

\param minGridSize - Returned minimum grid size needed to achieve the maximum occupancy
\param blockSize - Returned maximum block size that can achieve the maximum occupancy
\param func - Kernel for which launch configuration is calculated
\param blockSizeToDynamicSMemSize - A function that calculates how much per-block dynamic shared memory \p func uses based on the block size
\param dynamicSMemSize - Dynamic shared memory usage intended, in bytes
\param blockSizeLimit - The maximum block size \p func is designed to handle
\param flags - Options
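The sketch below illustrates two pieces that surround a call to this function: a callback suitable for \p blockSizeToDynamicSMemSize, and a grid-size calculation based on the returned \p blockSize. It assumes that `CUoccupancyB2DSize` is the usual bindgen alias for an optional `unsafe extern "C"` function pointer (mirrored locally as `B2DSize`), and that the kernel needs one `f32` of dynamic shared memory per thread. `smem_per_block`, `smem_callback`, and `grid_size_for` are hypothetical names introduced for illustration.

```rust
use std::mem::size_of;
use std::os::raw::c_int;

// Assumed to match the bindgen-generated CUoccupancyB2DSize alias.
type B2DSize = Option<unsafe extern "C" fn(block_size: c_int) -> usize>;

/// Hypothetical callback: the kernel is assumed to use one f32 of dynamic
/// shared memory per thread, so usage grows linearly with the block size.
unsafe extern "C" fn smem_per_block(block_size: c_int) -> usize {
    (block_size.max(0) as usize) * size_of::<f32>()
}

/// Value to pass as `blockSizeToDynamicSMemSize`.
fn smem_callback() -> B2DSize {
    Some(smem_per_block as unsafe extern "C" fn(c_int) -> usize)
}

/// Number of blocks needed to cover `n` work items with the block size
/// written to `blockSize`. Returns None for a non-positive block size.
fn grid_size_for(n: usize, block_size: c_int) -> Option<usize> {
    if block_size <= 0 {
        return None;
    }
    let b = block_size as usize;
    // Ceiling division without risking overflow in `n + b - 1`.
    Some(n / b + usize::from(n % b != 0))
}
```

If the shared memory size does not depend on the block size, the documentation for ::cuOccupancyMaxPotentialBlockSize describes passing a null callback (`None` under the assumed alias) and supplying the fixed size through \p dynamicSMemSize instead.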

\return ::CUDA_SUCCESS, ::CUDA_ERROR_DEINITIALIZED, ::CUDA_ERROR_NOT_INITIALIZED, ::CUDA_ERROR_INVALID_CONTEXT, ::CUDA_ERROR_INVALID_VALUE, ::CUDA_ERROR_UNKNOWN \notefnerr

\sa ::cudaOccupancyMaxPotentialBlockSizeWithFlags