Function cudaOccupancyMaxActiveClusters

pub unsafe extern "C" fn cudaOccupancyMaxActiveClusters(
    numClusters: *mut c_int,
    func: *const c_void,
    launchConfig: *const cudaLaunchConfig_t,
) -> cudaError_t

\brief Given the kernel function (\p func) and launch configuration (\p launchConfig), return the maximum number of clusters that could co-exist on the target device in \p *numClusters.

If the function already has a required cluster size set (see ::cudaFuncGetAttributes), the cluster size from \p launchConfig must either be unspecified or match the required size. If the function has no required cluster size, the cluster size must be specified in \p launchConfig; otherwise the function returns an error.
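
The cluster-size rule above can be sketched as a small pure-Rust model. This is only an illustration of the rule as documented, not the CUDA runtime's implementation: it makes no CUDA calls, and `ClusterDim`, `resolve_cluster_size`, and the error strings are hypothetical names introduced here. Which `cudaError_t` value the real function returns in each failing case is not stated in this description, so the sketch does not model error codes.

```rust
/// Hypothetical stand-in for a cluster dimension (x, y, z); not a CUDA type.
type ClusterDim = (u32, u32, u32);

/// Models the documented rule for choosing the cluster size.
/// `required` is the size fixed on the kernel itself (if any);
/// `requested` is the size given in the launch configuration (if any).
fn resolve_cluster_size(
    required: Option<ClusterDim>,
    requested: Option<ClusterDim>,
) -> Result<ClusterDim, &'static str> {
    match (required, requested) {
        // Required size set, launch configuration leaves it unspecified: accepted.
        (Some(req), None) => Ok(req),
        // Required size set and the launch configuration matches it: accepted.
        (Some(req), Some(cfg)) if req == cfg => Ok(req),
        // Required size set but the launch configuration disagrees: error.
        (Some(_), Some(_)) => Err("cluster size does not match the required size"),
        // No required size: the launch configuration must supply one.
        (None, Some(cfg)) => Ok(cfg),
        // Neither source specifies a cluster size: error.
        (None, None) => Err("cluster size must be specified in the launch configuration"),
    }
}
```

For instance, under this model `resolve_cluster_size(Some((2, 1, 1)), None)` yields `Ok((2, 1, 1))`, while `resolve_cluster_size(None, None)` yields an error.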

Note that various attributes of the kernel function may affect the occupancy calculation. The runtime environment may also affect how the hardware schedules the clusters, so the calculated occupancy is not guaranteed to be achievable.

\param numClusters - Returned maximum number of clusters that could co-exist on the target device
\param func - Kernel function for which the maximum number of clusters is calculated
\param launchConfig - Launch configuration for the given kernel function

\return ::cudaSuccess, ::cudaErrorInvalidDeviceFunction, ::cudaErrorInvalidValue, ::cudaErrorInvalidClusterSize, ::cudaErrorUnknown
\notefnerr
\note_init_rt
\note_callback

\sa ::cudaFuncGetAttributes, \ref ::cudaOccupancyMaxActiveClusters(int*, T, const cudaLaunchConfig_t*) "cudaOccupancyMaxActiveClusters (C++ API)", ::cuOccupancyMaxActiveClusters