LHC Tunnel

Tuesday, 25 July 2017

Nested quota models

At the Boston Forum, there were many interesting discussions on models which could be used for nested quota management (https://etherpad.openstack.org/p/BOS-forum-quotas).

Some of the background for the use has been explained previously in the blog (http://openstack-in-production.blogspot.fr/2016/04/resource-management-at-cern.html), but the subsequent discussions have also led to further review.

With the agreement to store the quota limits in Keystone (https://specs.openstack.org/openstack/keystone-specs/specs/keystone/ongoing/unified-limits.html), the investigations are now focussing on the exact semantics of nested project quotas. These semantics become especially important once the nesting goes beyond 2 levels.

There are a variety of perspectives on this complex problem, and there is not yet a consensus on the right model. The policy itself should be replaceable so that different use cases can implement alternative algorithms according to their needs.

The question we are faced with is the default policy engine to implement. This is a description of some scenarios considered at CERN.

The following use cases arise in the context of the CERN cloud:
  1. An LHC experiment, such as ATLAS, is given a pledge of resources of X vCPUs. The experiment should decide the priority of allocation of resources across its working groups (e.g. ATLAS Higgs studies) and their applications. These allocations between different ATLAS teams should not be arbitrated by the CERN cloud provider. Instead, they should be decided within the envelope of the allocation for the experiment, following the ratio of capacity chosen by the ATLAS experiment. This would typically produce around 50 child projects for each parent and a nesting level of 2 to 3.
  2. A new user to the CERN cloud is allocated a small resource limit (typically up to 10 cores) for testing and prototyping. This means a new user can experiment with how to create VMs, use containers through Magnum, and use Ceph shares and block storage. This can lead to under-utilised resources (such as tests where the resources have not been deleted afterwards) or inappropriate usage such as 'crowd-sourcing' resources for production. With thousands of users of the CERN cloud, this can become a significant share of the cloud. Using nested projects, these users could be associated with an experiment and a total cap placed on their usage. The experiment would then arbitrate between different users. This would give up to 400 child projects per parent and a nesting level of 2.
We would not expect nesting levels to exceed 3 based on the current scenarios.

Currently, we have around 3,400 projects in the CERN cloud.

A similar use case is the channel partner model for some SMEs using the cloud. The SME owns the parent project with the cloud provider and then allocates resources out to customers of the SME's services (such as a useful dataset). The SME charges these customers an uplift on the cloud provider's standard pricing to cover its development and operations costs.

Looking through the review of the different options (https://review.openstack.org/#/c/441203/), from the CERN perspective,
  • CERN child projects would be initially created with a limit of 0. Having more than 0 would mean potentially that projects could not be created in the event of the parent project quota being exceeded.
  • We would like to have resources allocated at any level of the tree. If I have a project which I want to split into two sub-projects, I would need to create the two sub-projects and then arrange to move/re-create the VMs. Requiring the resources to only be in the leaves would make it difficult to maintain application availability during this operation (thus I agree with Chet in the review).
  • Overbooking should not be permitted in the default policy engine. Thus, the limits on the child projects should sum up to no more than the limit on the parent. This was a topic of much debate in the CERN team, but it was felt that permitting overbooking would require a full traversal of the tree for each new resource creation, which would be very expensive in cases like the Personal tenants. Without overbooking, the limit that applies to a project is also visible to the user of that project, rather than the user seeing an out of quota error because a project higher up in the tree has a restriction.
  • The limits for a project should be set, at minimum, by the parent project administrator. It is not clear for CERN that there would be a use case where, in a tree of 3 or more levels, the administrators higher up the tree than the parent project would need to be able to change the limits of a project lower down. A policy.json for setting the limit would allow a mixture of implementations if needed.
  • It should be possible for an administrator to lower the limit on a child project below the current usage. This allows a resource co-ordinator to make the decision regarding resource allocation and inform the child project administrators to proceed with implementation. Any project with usage over its limit would not be able to create new resources. These would also be the natural semantics in the unified limits proposal, where the limits are moved to Keystone, and they avoid callbacks to the relevant project when changes are made to the limits.
  • Each project would have one parent in a tree-like structure.
  • There is no need for user_id limits so the only unit to consider is the project. The one use case where we had been considering using this is now replaced by the 2nd example given above.
  • These can be part of a new API, subject to the following implementation constraints:
    • Limits would be stored in Keystone and thus any call back to Nova, Cinder, Swift, … would be discouraged
    • A chatty protocol on resource creation which required multiple iterations with the service is non-ideal.

Within the options described in the review, this comes near to the Strict Hierarchy Default Closed model.

For consistency, I'll define the following terms:
  • Limit is the maximum amount of a particular resource which can be consumed by a project. This is assumed to be stored in Keystone as an attribute of the project.
  • Used is the resources actually used by the project itself.
  • Hierarchical Usage is the usage of all of the child projects below the project in the tree.

To give an example, the ATLAS project has a limit (L) of 100 cores but no resources used (U) inside that project. However, it has two child projects, Physics and Operations, whose resources sum into the hierarchical usage (HU) of ATLAS. Each of these has two child projects of its own (Higgs and Simulation under Physics, Workflow and Web under Operations), each with its own limit and allocated resources.

The following rules would be applied:

  • The quota administrator would not be able to increase the limit of a child project such that the sum of the limits of the child projects exceeds the limit of the parent.
    • The ATLAS administrator could not increase the limit for either Physics or Operations
    • The Physics administrator could increase the limit for either the Higgs or Simulation project by 10
    • The Operations administrator could not increase the limit for either Workflow or Web
  • The quota administrator cannot set the limit on a project such that the sum of the limits on its children exceeds that limit.
    • The ATLAS administrator could not set the Operations limit to be 50 as Limit(Workflow)+Limit(Web) would then exceed Limit(Operations)
  • The quota administrator can set the limit below the usage.
    • The Physics administrator could lower the Simulation limit to 5 even though the used resources are at 10
  • Creating a new resource requires that, after the creation, the usage is still less than or equal to the limit at all levels of the tree.
    • No additional resources could be created in Simulation since Used(Simulation)>=Limit(Simulation)
    • No additional resources could be created in Higgs as HierarchicalUsage(Physics)>=Limit(Physics). The error message for this case would need to indicate the quota limit is in Physics.
    • Up to 25 new resources could be created in Web since Used(Web)+25<=Limit(Web), HierarchicalUsage(Operations)+25<=Limit(Operations) and HierarchicalUsage(ATLAS)+25<=Limit(ATLAS). After this operation:
      • Used(Web)=30
      • HierarchicalUsage(Operations)=60
      • HierarchicalUsage(ATLAS)=80
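
The rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the proposed Keystone implementation. The Project class and the functions can_set_limit, blocking_project and create are hypothetical names. Hierarchical usage is assumed to be a project's own usage plus the usage of all its descendants, and the creation check compares that value with the limit at every level. Only the ATLAS limit of 100, the ATLAS usage of 0 and the Simulation usage of 10 come from the text; the remaining limits and usages are assumed values, chosen so that each statement in the rules holds.

```python
class Project:
    """Hypothetical in-memory project; not a Keystone API object."""

    def __init__(self, name, limit, used=0, parent=None):
        self.name = name
        self.limit = limit      # Limit: maximum the project may consume
        self.used = used        # Used: resources held by the project itself
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def hierarchical_usage(self):
        # Assumption: own usage plus the usage of every descendant.
        return self.used + sum(c.hierarchical_usage() for c in self.children)


def can_set_limit(project, new_limit):
    """No overbooking: the sibling limits must still fit in the parent and
    the children's limits must fit in the new limit. Current usage is
    deliberately not checked, so a limit may be set below the usage."""
    if project.parent is not None:
        others = sum(c.limit for c in project.parent.children if c is not project)
        if others + new_limit > project.parent.limit:
            return False
    return sum(c.limit for c in project.children) <= new_limit


def blocking_project(project, amount):
    """Walk up from `project` and return the name of the first project whose
    limit would be exceeded by `amount` more resources, or None."""
    node = project
    while node is not None:
        if node.hierarchical_usage() + amount > node.limit:
            return node.name
        node = node.parent
    return None


def create(project, amount):
    blocker = blocking_project(project, amount)
    if blocker is not None:
        # The error names the project whose limit is the restriction.
        raise RuntimeError(f"quota exceeded in project {blocker}")
    project.used += amount


# Assumed example tree (illustrative values, see the paragraph above).
atlas = Project("ATLAS", limit=100, used=0)
physics = Project("Physics", limit=20, used=8, parent=atlas)
higgs = Project("Higgs", limit=4, used=2, parent=physics)
simulation = Project("Simulation", limit=6, used=10, parent=physics)
operations = Project("Operations", limit=80, used=0, parent=atlas)
workflow = Project("Workflow", limit=50, used=30, parent=operations)
web = Project("Web", limit=30, used=5, parent=operations)

print(can_set_limit(physics, physics.limit + 1))  # ATLAS admin raising Physics
print(can_set_limit(higgs, higgs.limit + 10))     # Physics admin raising Higgs by 10
print(can_set_limit(operations, 50))              # children's limits would exceed 50
print(can_set_limit(simulation, 5))               # below current usage of 10
print(blocking_project(simulation, 1))            # project that blocks a creation
print(blocking_project(higgs, 1))
create(web, 25)
print(web.used, operations.hierarchical_usage(), atlas.hierarchical_usage())
```

With these assumed values the checks print False, True, False, True, then Simulation and Physics as the blocking projects, and finally 30 60 80 after the 25 resources are created in Web. Because blocking_project returns the name of the restricting project, the error can point the user at Physics rather than Higgs.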

Based on past experience with Nova quotas, the aim would be to calculate all the usages (both Used and HierarchicalUsage) dynamically at resource creation time. The calculation of the hierarchical usage could be expensive, however, since it requires navigating the whole tree for each resource creation. Some additional Keystone calls to get the entire project tree would be needed. This may limit the use in large scale environments, but the performance would only be affected where the functionality was used.
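
To give a feel for this cost, the following sketch counts how many project records a naive dynamic check would read for one resource creation in the second use case (one parent with 400 child projects). It is only an illustration under assumptions: the Project class and both functions are hypothetical, a "read" stands in for whatever Keystone or service lookup a real implementation would need, and the check is assumed to recompute the hierarchical usage from scratch at each level up to the root.

```python
class Project:
    """Hypothetical in-memory project; not a Keystone API object."""

    def __init__(self, name, used=0, parent=None):
        self.name = name
        self.used = used
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def subtree_usage(project, counter):
    """Sum the usage at and below `project`, counting every record read."""
    counter["reads"] += 1
    return project.used + sum(subtree_usage(c, counter) for c in project.children)


def reads_for_one_creation(project):
    """Naive check: recompute hierarchical usage at each level up to the root."""
    counter = {"reads": 0}
    node = project
    while node is not None:
        subtree_usage(node, counter)
        node = node.parent
    return counter["reads"]


# One experiment project with 400 personal child projects (nesting level 2).
experiment = Project("experiment")
personal = [Project(f"user-{i}", used=1, parent=experiment) for i in range(400)]

print(reads_for_one_creation(personal[0]))
```

In this model a single creation in one personal project reads 402 records: 1 for the project itself and 401 for the parent's subtree. A real implementation could reduce this, for example by fetching the tree once per request, but the number of projects to consider still grows with the size of the tree.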