Function cuOccupancyMaxPotentialClusterSize

pub unsafe extern "C" fn cuOccupancyMaxPotentialClusterSize(
    clusterSize: *mut c_int,
    func: CUfunction,
    config: *const CUlaunchConfig,
) -> CUresult

\brief Given the kernel function (\p func) and launch configuration (\p config), return the maximum cluster size in \p *clusterSize.

The cluster dimensions in \p config are ignored. If \p func has a required cluster size set (see ::cudaFuncGetAttributes / ::cuFuncGetAttribute), \p *clusterSize will reflect the required cluster size.

By default this function will always return a value that’s portable on future hardware. A higher value may be returned if the kernel function allows non-portable cluster sizes.

This function will respect the compile time launch bounds.

This API can also be used with a context-less kernel (::CUkernel): obtain the kernel handle with ::cuLibraryGetKernel() and pass it to this function by casting it to ::CUfunction. In that case, the context used for the calculation is taken from the stream specified in \p config->hStream, or is the current context if that stream is NULL.

\param clusterSize - Returned maximum cluster size that can be launched for the given kernel function and launch configuration
\param func - Kernel function for which maximum cluster size is calculated
\param config - Launch configuration for the given kernel function

\return ::CUDA_SUCCESS, ::CUDA_ERROR_DEINITIALIZED, ::CUDA_ERROR_NOT_INITIALIZED, ::CUDA_ERROR_INVALID_CONTEXT, ::CUDA_ERROR_INVALID_VALUE, ::CUDA_ERROR_UNKNOWN
\notefnerr

\sa ::cudaFuncGetAttributes, ::cuFuncGetAttribute
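
The out-parameter and status-code convention described above can be wrapped in a small helper that returns a `Result`. The following is a minimal sketch, not part of this crate: `max_cluster_size`, `ClusterSizeQuery` and `stub_query` are hypothetical names, and the type declarations are simplified stand-ins for the generated bindings (whose `CUresult` and `CUlaunchConfig` definitions may differ). The stub replaces the driver so the sketch compiles and runs without a GPU; the value 8 it writes is arbitrary.

```rust
use std::os::raw::c_int;

// Stand-in declarations (assumption): real code would import these from the
// generated CUDA driver bindings instead of declaring them here.
#[allow(non_camel_case_types)]
#[repr(C)]
pub struct CUfunc_st {
    _private: [u8; 0],
}
pub type CUfunction = *mut CUfunc_st;

#[repr(C)]
pub struct CUlaunchConfig {
    _private: [u8; 0],
}

pub type CUresult = u32;
pub const CUDA_SUCCESS: CUresult = 0;
pub const CUDA_ERROR_INVALID_VALUE: CUresult = 1;

/// Function-pointer type with the same shape as
/// `cuOccupancyMaxPotentialClusterSize`.
pub type ClusterSizeQuery =
    unsafe extern "C" fn(*mut c_int, CUfunction, *const CUlaunchConfig) -> CUresult;

/// Hypothetical helper: runs the query and converts the status code plus
/// out-parameter into a `Result`.
///
/// # Safety
/// `func` and `config` must satisfy whatever `query` requires of them.
pub unsafe fn max_cluster_size(
    query: ClusterSizeQuery,
    func: CUfunction,
    config: *const CUlaunchConfig,
) -> Result<c_int, CUresult> {
    let mut cluster_size: c_int = 0;
    // The callee writes the result through the first pointer argument.
    let status = unsafe { query(&mut cluster_size, func, config) };
    if status == CUDA_SUCCESS {
        Ok(cluster_size)
    } else {
        Err(status)
    }
}

/// Stub standing in for the driver call: rejects null pointers, otherwise
/// reports an arbitrary cluster size of 8.
pub unsafe extern "C" fn stub_query(
    cluster_size: *mut c_int,
    _func: CUfunction,
    config: *const CUlaunchConfig,
) -> CUresult {
    if cluster_size.is_null() || config.is_null() {
        return CUDA_ERROR_INVALID_VALUE;
    }
    unsafe { *cluster_size = 8 };
    CUDA_SUCCESS
}
```

With the real bindings, the same helper would presumably be called as `max_cluster_size(cuOccupancyMaxPotentialClusterSize, func, &config)` inside an `unsafe` block, provided the crate's `CUresult` type is adapted to match. Taking the query as a function pointer is a design choice made here only so the conversion logic can be exercised without a CUDA device.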