Function cuDevSmResourceSplitByCount

pub unsafe extern "C" fn cuDevSmResourceSplitByCount(
    result: *mut CUdevResource,
    nbGroups: *mut c_uint,
    input: *const CUdevResource,
    remaining: *mut CUdevResource,
    useFlags: c_uint,
    minCount: c_uint,
) -> CUresult

\brief Splits \p CU_DEV_RESOURCE_TYPE_SM resources.

Splits \p CU_DEV_RESOURCE_TYPE_SM resources into \p nbGroups groups, adhering to the minimum SM count specified in \p minCount and the usage flags in \p useFlags. If \p result is NULL, the API simulates a split and writes the number of groups that would be created to \p nbGroups. Otherwise, \p nbGroups must point to the number of elements in \p result, and on return the API overwrites \p nbGroups with the number of groups actually created. The groups are written to the array in \p result. \p nbGroups can be less than the total number of possible groups if fewer groups are needed.
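The query-then-fill calling convention described above can be sketched as follows. This is a minimal illustration only: `FakeResource` and `fake_split_by_count` are hypothetical stand-ins that mimic the documented pointer contract (NULL \p result queries the count, \p nbGroups is overwritten on return) and do not call the CUDA driver. The stand-in's partitioning rule (plain integer division, no flags) is an assumption and is far simpler than what the real driver does.

```rust
use std::os::raw::c_uint;
use std::ptr;

/// Hypothetical stand-in for `CUdevResource`; the real type is a driver struct.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct FakeResource {
    pub sm_count: c_uint,
}

/// Hypothetical stand-in for `cuDevSmResourceSplitByCount` (flags omitted).
/// It models only the documented pointer contract. Returns 0 on success and
/// 1 for an invalid argument.
pub unsafe fn fake_split_by_count(
    result: *mut FakeResource,
    nb_groups: *mut c_uint,
    input: *const FakeResource,
    remaining: *mut FakeResource,
    min_count: c_uint,
) -> i32 {
    unsafe {
        if nb_groups.is_null() || input.is_null() {
            return 1;
        }
        let total = (*input).sm_count;
        // This model rejects 0 to avoid dividing by zero; how the real
        // driver treats a minCount of 0 is not modeled here.
        if min_count == 0 || min_count > total {
            return 1;
        }
        let possible = total / min_count;
        // Query mode reports every possible group; fill mode is capped by
        // the caller-provided array length.
        let produced = if result.is_null() {
            possible
        } else {
            possible.min(*nb_groups)
        };
        if !result.is_null() {
            for i in 0..produced as usize {
                *result.add(i) = FakeResource { sm_count: min_count };
            }
        }
        if !remaining.is_null() {
            *remaining = FakeResource {
                sm_count: total - produced * min_count,
            };
        }
        *nb_groups = produced;
        0
    }
}

/// Two-pass usage: first query the group count, then allocate and fill.
pub fn split_two_pass(
    input: &FakeResource,
    min_count: c_uint,
) -> Result<(Vec<FakeResource>, FakeResource), i32> {
    let mut nb: c_uint = 0;
    // Pass 1: NULL result asks only for the number of groups.
    let status =
        unsafe { fake_split_by_count(ptr::null_mut(), &mut nb, input, ptr::null_mut(), min_count) };
    if status != 0 {
        return Err(status);
    }
    // Pass 2: nb now holds the array length going in, and the produced count coming out.
    let mut groups = vec![FakeResource::default(); nb as usize];
    let mut remaining = FakeResource::default();
    let status =
        unsafe { fake_split_by_count(groups.as_mut_ptr(), &mut nb, input, &mut remaining, min_count) };
    if status != 0 {
        return Err(status);
    }
    groups.truncate(nb as usize);
    Ok((groups, remaining))
}
```

With the real binding, the same two calls would presumably pass `*mut CUdevResource` pointers and check the returned `CUresult` against `CUDA_SUCCESS`; only the pointer handling shown here is taken from the description above.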

This API is used to spatially partition the input resource. The input resource needs to come from one of ::cuDeviceGetDevResource, ::cuCtxGetDevResource, or ::cuGreenCtxGetDevResource. A limitation of the API is that the output results cannot be split again without first creating a descriptor and a green context with that descriptor.

When creating the groups, the API takes into account the performance and functional characteristics of the input resource, and guarantees a split that creates a disjoint set of symmetrical partitions. This may produce fewer groups than simply dividing the total SM count by \p minCount, due to cluster requirements or to alignment and granularity requirements on \p minCount.
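To illustrate why alignment can reduce the group count, here is a rough arithmetic sketch. The `granularity` parameter is hypothetical: the actual alignment rules are hardware-dependent and not specified here, so this only models the general effect of rounding \p minCount up to a multiple of some granularity before dividing.

```rust
/// Estimates (group count, leftover SMs) when each group's size is rounded up
/// to a multiple of a hypothetical `granularity`. This is not the driver's
/// algorithm, only a model of the rounding effect.
pub fn estimate_groups(total_sms: u32, min_count: u32, granularity: u32) -> Option<(u32, u32)> {
    if min_count == 0 || granularity == 0 || min_count > total_sms {
        return None;
    }
    // Round the requested minimum up to the next multiple of the granularity.
    let per_group = (min_count + granularity - 1) / granularity * granularity;
    let groups = total_sms / per_group;
    let leftover = total_sms - groups * per_group;
    Some((groups, leftover))
}
```

For example, with 132 SMs and a \p minCount of 12, naive division gives 11 groups. Under an assumed granularity of 8, each group is rounded up to 16 SMs, so `estimate_groups(132, 12, 8)` yields 8 groups with 4 SMs left over.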

The \p remaining set does not have the same functional or performance guarantees as the groups in \p result. Its use should be carefully planned, and further partitioning of the \p remaining set is discouraged.

The following flags are supported:

  • \p CU_DEV_SM_RESOURCE_SPLIT_IGNORE_SM_COSCHEDULING : Lower the minimum SM count and alignment, and treat each SM independently of its hierarchy. This allows finer-grained partitions, but at the cost of advanced features (such as large clusters on compute capability 9.0+).
  • \p CU_DEV_SM_RESOURCE_SPLIT_MAX_POTENTIAL_CLUSTER_SIZE : Compute Capability 9.0+ only. Attempt to create groups that may allow for maximally sized thread clusters. This can be queried post green context creation using ::cuOccupancyMaxPotentialClusterSize.

A successful API call must have either:

  • A valid array of \p result pointers of size passed in \p nbGroups, with \p input of type \p CU_DEV_RESOURCE_TYPE_SM. Value of \p minCount must be between 0 and the SM count specified in \p input. \p remaining may be NULL.
  • NULL passed in for \p result, with a valid integer pointer in \p nbGroups and \p input of type \p CU_DEV_RESOURCE_TYPE_SM. Value of \p minCount must be between 0 and the SM count specified in \p input. \p remaining may be NULL. This queries the number of groups that would be created by the API.

Note: The API is not supported on 32-bit platforms.

\param result - Output array of \p CUdevResource resources. Can be NULL to query the number of groups.
\param nbGroups - Pointer specifying the number of groups that would be or should be created, as described above.
\param input - Input SM resource to be split. Must be a valid \p CU_DEV_RESOURCE_TYPE_SM resource.
\param remaining - If the input resource cannot be cleanly split among \p nbGroups, the remainder is placed here. Can be omitted (NULL) if the user does not need the remaining set.
\param useFlags - Flags specifying how these partitions are used or which constraints to abide by when splitting the input. Zero is valid for default behavior.
\param minCount - Minimum number of SMs required.

\return ::CUDA_SUCCESS, ::CUDA_ERROR_DEINITIALIZED, ::CUDA_ERROR_NOT_INITIALIZED, ::CUDA_ERROR_INVALID_DEVICE, ::CUDA_ERROR_INVALID_VALUE, ::CUDA_ERROR_INVALID_RESOURCE_TYPE, ::CUDA_ERROR_INVALID_RESOURCE_CONFIGURATION

\sa ::cuGreenCtxGetDevResource, ::cuCtxGetDevResource, ::cuDeviceGetDevResource