Function cudaOccupancyMaxPotentialClusterSize

pub unsafe extern "C" fn cudaOccupancyMaxPotentialClusterSize(
    clusterSize: *mut c_int,
    func: *const c_void,
    launchConfig: *const cudaLaunchConfig_t,
) -> cudaError_t

\brief Given the kernel function (\p func) and launch configuration (\p launchConfig), return the maximum cluster size in \p *clusterSize.

The cluster dimensions in \p launchConfig are ignored. If \p func has a required cluster size set (see ::cudaFuncGetAttributes), \p *clusterSize will reflect the required cluster size.

By default, this function always returns a value that is portable to future hardware. A higher value may be returned if the kernel function allows non-portable cluster sizes (see ::cudaFuncAttributeNonPortableClusterSizeAllowed).

This function respects the kernel's compile-time launch bounds.

\param clusterSize - Returned maximum cluster size that can be launched for the given kernel function and launch configuration
\param func - Kernel function for which the maximum cluster size is calculated
\param launchConfig - Launch configuration for the given kernel function

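\return ::cudaSuccess, ::cudaErrorInvalidDeviceFunction, ::cudaErrorInvalidValue, ::cudaErrorUnknown

\note This function may also return error codes from previous, asynchronous launches.
\note This function may also return ::cudaErrorInitializationError, ::cudaErrorInsufficientDriver or ::cudaErrorNoDevice if this call tries to initialize internal CUDA RT state.
\note As specified by ::cudaStreamAddCallback, no CUDA function may be called from a callback. ::cudaErrorNotPermitted may, but is not guaranteed to, be returned as a diagnostic in such a case.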
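The return codes listed above can be folded into a Rust error type. The following is a minimal sketch, not part of these bindings: `ClusterQueryError` and `classify_status` are hypothetical names, and the status is assumed to be available as a plain integer (how `cudaError_t` is represented depends on how the bindings were generated). The numeric literals are assumed to match the `cudaError` enum in the CUDA toolkit headers; in real code, prefer the constants exported by the bindings.

```rust
use std::os::raw::c_int;

/// Hypothetical error type covering the codes this function documents.
#[derive(Debug, PartialEq, Eq)]
pub enum ClusterQueryError {
    /// cudaErrorInvalidValue (assumed value 1)
    InvalidValue,
    /// cudaErrorInvalidDeviceFunction (assumed value 98)
    InvalidDeviceFunction,
    /// cudaErrorUnknown (assumed value 999)
    Unknown,
    /// Any other code, e.g. an initialization error or an error left over
    /// from a previous asynchronous launch.
    Other(c_int),
}

/// Maps a raw status code to `Ok(())` for cudaSuccess (0) or a typed error.
pub fn classify_status(status: c_int) -> Result<(), ClusterQueryError> {
    match status {
        0 => Ok(()),
        1 => Err(ClusterQueryError::InvalidValue),
        98 => Err(ClusterQueryError::InvalidDeviceFunction),
        999 => Err(ClusterQueryError::Unknown),
        other => Err(ClusterQueryError::Other(other)),
    }
}
```

The `Other` variant is kept deliberately, because the function may also report codes that are not in its own list.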

\sa ::cudaFuncGetAttributes, \ref ::cudaOccupancyMaxPotentialClusterSize(int*, T, const cudaLaunchConfig_t*) "cudaOccupancyMaxPotentialClusterSize (C++ API)", ::cuOccupancyMaxPotentialClusterSize
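A call of this shape (out-parameter plus status code) is usually wrapped so that callers get a `Result`. The sketch below illustrates that pattern under stated assumptions. So that it compiles without the CUDA toolkit, the raw call is passed in as a function pointer, `LaunchConfig` and `CudaError` are hypothetical stand-ins for `cudaLaunchConfig_t` and `cudaError_t`, and `mock_query` is a mock rather than the real runtime. With the real bindings, the wrapper would call `cudaOccupancyMaxPotentialClusterSize` directly and use the crate's own types.

```rust
use std::os::raw::{c_int, c_void};

/// Hypothetical, opaque stand-in for the bindings' `cudaLaunchConfig_t`.
/// It is NOT layout-compatible with the real struct.
#[repr(C)]
#[derive(Default)]
pub struct LaunchConfig {
    _opaque: [u8; 0],
}

/// Hypothetical stand-in for `cudaError_t`; 0 plays the role of cudaSuccess.
pub type CudaError = c_int;

/// Function-pointer type with the same shape as the binding documented here,
/// expressed with the stand-in types above.
pub type ClusterSizeQuery =
    unsafe extern "C" fn(*mut c_int, *const c_void, *const LaunchConfig) -> CudaError;

/// Calls `query` and turns the out-parameter plus status code into a `Result`.
///
/// # Safety
/// `func` must be whatever `query` expects (for the real binding, a valid
/// kernel function handle), and `query` must only write to the `c_int` it is given.
pub unsafe fn max_potential_cluster_size(
    query: ClusterSizeQuery,
    func: *const c_void,
    config: &LaunchConfig,
) -> Result<c_int, CudaError> {
    let mut cluster_size: c_int = 0;
    // The out-parameter points at a live local; `config` is a valid reference.
    let status = unsafe {
        query(
            &mut cluster_size as *mut c_int,
            func,
            config as *const LaunchConfig,
        )
    };
    if status == 0 {
        Ok(cluster_size)
    } else {
        Err(status)
    }
}

/// Mock standing in for the real runtime call: reports an arbitrary
/// cluster size of 8 and a success status. Not real CUDA behaviour.
pub unsafe extern "C" fn mock_query(
    cluster_size: *mut c_int,
    _func: *const c_void,
    _config: *const LaunchConfig,
) -> CudaError {
    unsafe { *cluster_size = 8 };
    0
}
```

With the mock, `unsafe { max_potential_cluster_size(mock_query, std::ptr::null(), &LaunchConfig::default()) }` yields `Ok(8)`. The wrapper stays `unsafe` because the validity of `func` cannot be checked on the Rust side.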