Urgent Computing: Exploring Supercomputing’s New Role
Pete Beckman, Mathematics and Computer Science Division, Argonne National Laboratory
Large-scale parallel simulation and modeling have changed our world. Today, supercomputers are not just for research and development or scientific exploration; they have become an integral part of many industries. A brief look at the Top 500 list of the world’s largest supercomputers shows some of the business sectors that now rely on supercomputers: finance, entertainment and digital media, transportation, pharmaceuticals, aerospace, petroleum, and biotechnology. While supercomputing may not yet be considered commonplace, the world has embraced high-performance computing (HPC). Demand for skilled computational scientists is high, and colleges and universities are struggling to meet the need for cross-disciplinary engineers who are skilled in both computation and an applied scientific domain. It is on this stage that a new breed of high-fidelity simulations is emerging: applications that need urgent access to supercomputing resources.
For some simulations, insights gained through supercomputer computation have immediate application. Consider, for example, an HPC application that could quickly calculate the exact location and magnitude of tsunamis immediately after an undersea earthquake. Since the evacuation of local residents is both costly and potentially dangerous, promptly beginning an orderly evacuation in only those areas directly threatened could save lives. Similarly, imagine a parallel wildfire simulation that coupled weather, terrain, and fuel models and could accurately predict the path of a wildfire days in advance. Firefighters could cut firebreaks exactly where they would be most effective. For these urgent computations, late results are useless results. As the HPC community builds increasingly realistic models, applications are emerging that need on-demand computation. Looking into the future, we might imagine event-driven and data-driven HPC applications running on demand to predict everything from where to look for a lost boater after a storm to where a toxic plume will travel after an industrial or transportation accident.
Of course, as we build confidence in these emerging computations, they will move from the scientist’s workbench into critical decision-making paths. Where will the supercomputer cycles come from? It is straightforward to imagine building a supercomputer specifically for these emerging urgent computations. Even if such a system led the Top 500 list, however, it would not be as powerful as the combined computational might of the world’s five largest computers. Aggregating the country’s largest resources to solve a critical, national-scale computational challenge could provide an order of magnitude more power than attempting to rely on a prebuilt system for on-demand computation.
Furthermore, costly public infrastructure that sits idle except during an emergency is inefficient. A better approach, when practical, is to temporarily use public resources during times of crisis. For example, rather than build a nationwide set of radio towers and transmitters to disseminate emergency information, the government requires that large TV and radio stations participate in the Emergency Alert System. When public broadcasts are needed, most often because of localized severe weather, regular programming is automatically interrupted, and critical information is shared with the public.
As high-fidelity computation becomes more capable of predicting the future and is increasingly used for immediate decision support, governments and local municipalities must build infrastructures that can link together the largest resources from the NSF, DOE, NASA, and the NIH and use them to run time-critical urgent computations. For embarrassingly parallel applications, we might look to the emerging market for “cloud computing.” Many of the world’s largest Internet companies have embraced a model for providing software as a service. Amazon’s Elastic Compute Cloud (EC2), for example, can provide thousands of virtual machine images rapidly and cost-effectively. For applications with relatively small network communication needs, it might be most effective for urgent, on-demand computations simply to be injected into the nation’s existing Internet infrastructure supported by Amazon, Yahoo, Google, and Microsoft.
In April 2007, an urgent computing workshop at Argonne National Laboratory brought together an international group of scientists to discuss how on-demand HPC computations might be supported and how they might change the landscape of predictive modeling. The organizers of that workshop realized that CTWatch Quarterly would be the ideal venue for exploring this new field. This issue describes how applications, urgent-computing infrastructures, and computational resources can support this new role for computing.