Using preemption or other mechanisms to enable urgent simulations on supercomputers is not new. However, the traditional way to implement preemption is to run such jobs in a special queue to which only a fixed set of users is granted access. The policy, queue configuration, and set of users on each machine, and particularly at each site, must be carefully negotiated (and usually frequently renegotiated). These procedures are usually not documented, so it is difficult and time-consuming to add new users for urgent computing or to change the configuration of machines, for example to accommodate larger simulations. To resolve some of these issues, the Special PRiority and Urgent Computing Environment (SPRUCE) [7] was incorporated into the workflow. SPRUCE is a specialized software system that supports urgent or event-driven computing on both traditional supercomputers and distributed Grids. It is being developed by the University of Chicago and Argonne National Laboratory and presently functions as a TeraGrid science gateway.
SPRUCE uses a token-based authentication system for resource allocation. Users are provided with right-of-way tokens, which are unique 16-character strings that can be activated through a web portal. Each token is created against the Common Name (CN) value of the administrator. When a token is activated, several other parameters are set, including:
- Resources for urgent jobs: the activated token can be used to access any resource that is specified in this list and can be used by any person registered in it.
- Lifetime of the token: Each token is given a specific time period. Once active, the token can be used during this time period.
- Maximum urgency that can be requested, specified by the colors red, orange and yellow
- People to be notified when the token is used (e.g., the local administrators)
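The activation parameters listed above can be pictured as a small record. The following Python sketch is illustrative only: the class name `RightOfWayToken`, its field names, the `permits` method, the example values, and the numeric ordering of the urgency colors are all assumptions made for this example and are not part of the SPRUCE interface.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from enum import IntEnum


class Urgency(IntEnum):
    """Urgency colors named in the text; the numeric ordering is an assumption."""
    YELLOW = 1
    ORANGE = 2
    RED = 3


@dataclass
class RightOfWayToken:
    """Hypothetical model of an activated token and its parameters."""
    token_id: str                 # unique 16-character string
    resources: set[str]           # resources usable for urgent jobs
    users: set[str]               # people registered on the token
    activated_at: datetime        # moment of activation
    lifetime: timedelta           # period during which the token is usable
    max_urgency: Urgency          # highest urgency that may be requested
    notify: list[str] = field(default_factory=list)  # e.g. local administrators

    def permits(self, user: str, resource: str,
                urgency: Urgency, now: datetime) -> bool:
        """Return True if the request falls within the token's parameters."""
        expires_at = self.activated_at + self.lifetime
        return (
            self.activated_at <= now <= expires_at
            and user in self.users
            and resource in self.resources
            and urgency <= self.max_urgency
        )


# Example values are invented for illustration.
token = RightOfWayToken(
    token_id="A1B2C3D4E5F6G7H8",
    resources={"cluster-a"},
    users={"alice"},
    activated_at=datetime(2007, 11, 12, 9, 0),
    lifetime=timedelta(hours=24),
    max_urgency=Urgency.ORANGE,
    notify=["admin@example.org"],
)
```

Under these assumptions, a request is honored only while the token is within its lifetime, for a listed resource and registered user, and at or below the maximum urgency set at activation.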
SPRUCE is grid middleware that integrates with the resource manager on the system. When SPRUCE is installed, the resource manager is equipped with an authentication filter that checks for a valid token associated with the submitting user name or Distinguished Name (DN). If an active token is found, the job is submitted to a queue with a higher priority level.
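The behavior of such a filter can be sketched in a few lines of Python. This is a simplified illustration, not the SPRUCE implementation: the function name `select_queue`, the queue names, the example DNs, and the idea of one queue per urgency color are assumptions made for the example.

```python
# Hypothetical queue names; a real site would configure its own.
QUEUE_FOR_URGENCY = {
    "yellow": "urgent-yellow",
    "orange": "urgent-orange",
    "red": "urgent-red",
}
DEFAULT_QUEUE = "normal"


def select_queue(user_dn: str, active_tokens: dict[str, str]) -> str:
    """Choose the queue for a job submitted by `user_dn`.

    `active_tokens` maps a Distinguished Name to the urgency color of
    the token currently active for that DN (an assumed data layout).
    """
    urgency = active_tokens.get(user_dn)
    if urgency is None:
        # No active token: the job goes through the ordinary queue.
        return DEFAULT_QUEUE
    return QUEUE_FOR_URGENCY[urgency]


# Example DNs are invented for illustration.
active = {"/O=Grid/OU=Example/CN=Alice": "orange"}
urgent_queue = select_queue("/O=Grid/OU=Example/CN=Alice", active)
normal_queue = select_queue("/O=Grid/OU=Example/CN=Bob", active)
```

The point of the design is that the decision depends only on whether a token is active for the submitter, not on a fixed list of privileged accounts.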
The SCOOP on-demand system was demonstrated at the SuperComputing 2007 conference in Reno, Nevada, using the resources of SURAgrid and the Louisiana Optical Network Initiative (LONI). The demo illustrated how a hurricane event triggered the use of on-demand resources, and how the priority-aware scheduler was able to schedule the runs on the appropriate queues in the appropriate order. Because each ensemble member runs as soon as its input data has been generated, it is possible to guarantee that the set of runs chosen as high priority will complete before the six-hour deadline. Separate work benchmarking the models on different architectures was used to estimate how much CPU time a model would need to complete, given the number of on-demand processors available.
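As a rough illustration of how a benchmarked run time translates into such a guarantee, the following Python sketch counts how many runs can finish before a deadline on a fixed pool of on-demand processors. It is a back-of-the-envelope model under stated assumptions: every run has the same benchmarked duration, input data is available when a run is due to start, and runs proceed in back-to-back batches. The function name and the one-hour run time in the example are hypothetical; only the six-hour deadline comes from the text.

```python
def max_guaranteed_runs(deadline_s: float, runtime_s: float,
                        procs_per_run: int, on_demand_procs: int) -> int:
    """Upper bound on runs that finish before the deadline.

    Assumes identical run times and that runs are executed in
    back-to-back batches that fill the on-demand processors.
    """
    concurrent_runs = on_demand_procs // procs_per_run  # runs per batch
    batches = int(deadline_s // runtime_s)               # batches that fit
    return concurrent_runs * batches


# Six-hour deadline (from the text); a one-hour run on eight of 16
# on-demand processors is an invented example.
n_runs = max_guaranteed_runs(
    deadline_s=6 * 3600,
    runtime_s=3600,
    procs_per_run=8,
    on_demand_procs=16,
)
```

With these example inputs, two runs fit side by side and six batches fit before the deadline, so at most twelve runs could be marked as high priority.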
SPRUCE was used to acquire the on-demand processors on some resources, and doing so highlighted several advantages. For example, the SCOOP workflow no longer had to run as particular special users, which also meant that access to the on-demand queues no longer had to be negotiated with the resource owners. SPRUCE also gave resource owners the ability to restrict on-demand usage of their systems while still providing on-demand resources to anyone who needs them. In the past this could only be done by adding and deleting user access on a case-by-case basis. SPRUCE tokens can now be handed out to users by an allocation committee, which relieves system administrators of the burden of evaluating users' need for on-demand resources.
Figures 7(a) and 7(b) show the execution and wait times for the various stages of the SCOOP workflow. Figure 7(a) shows the execution with only best-effort resources. The pink bars depict the execution and queue wait times of the core Wave Watch III execution on eight processors. It can be seen that the queue wait times account for most of the total time. Figure 7(b) depicts the ensemble execution using on-demand resources. In this case, 16 processors were available for on-demand use, so two ensemble members ran simultaneously while the others waited for them to finish.
A closer look at Figure 7(b) indicates that ensemble members p38 and p02 executed first, followed by p14 and e10. The pink bars for p14 and e10 are twice as long as those for p38 and p02, showing that they began execution right after the first two members finished. Comparing the two graphs, the last run finished in about 700 seconds with on-demand resources, compared with about 2100 seconds without them, roughly a threefold reduction. Note that the tests were performed using a short three-hour forecast run that completes in about 90 seconds on the chosen platform.
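The doubling of the bar lengths follows from simple arithmetic, which the Python sketch below reproduces for the four members named above. It is an idealized model, not a reconstruction of the measured timings: it assumes every member takes exactly the quoted 90 seconds, uses eight processors, and starts the instant a slot is free, and it ignores the other workflow stages. The function name `wave_schedule` is hypothetical.

```python
def wave_schedule(members: list[str], runtime_s: int,
                  procs_per_member: int, on_demand_procs: int
                  ) -> dict[str, tuple[int, int]]:
    """Return (start, finish) times in seconds for each member.

    Members run in order, in batches sized to fill the on-demand
    processors; each batch starts when the previous one finishes.
    """
    slots = on_demand_procs // procs_per_member
    if slots < 1:
        raise ValueError("not enough processors for a single member")
    schedule = {}
    for index, name in enumerate(members):
        start = (index // slots) * runtime_s
        schedule[name] = (start, start + runtime_s)
    return schedule


schedule = wave_schedule(
    ["p38", "p02", "p14", "e10"],
    runtime_s=90,          # short three-hour forecast run
    procs_per_member=8,    # Wave Watch III on eight processors
    on_demand_procs=16,    # processors available for on-demand use
)
```

In this model p38 and p02 occupy the interval from 0 to 90 seconds and p14 and e10 the interval from 90 to 180 seconds, so the combined wait and execution time of the second pair is twice that of the first, consistent with the bar lengths described for Figure 7(b).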