Friday, April 27, 2007

Challenges & Promises of Petascale Computing

CITRIS researchers will soon have access to a new generation of high-performance supercomputers far more powerful than those available today. Known as petascale computers, these new machines will be capable of performing 10^15 floating-point operations per second (a petaflop). These parallel machines may employ more than a million processors and will be able to handle huge data sets. Until now, such machines have mainly been the domain of military and other national security applications. With the delivery of a new petascale computer to Lawrence Berkeley National Laboratory (LBNL), and with the possibility of a Berkeley team helping to host another at Lawrence Livermore National Laboratory (LLNL), researchers working on climate analysis, genomics, environmental monitoring, protein analysis, earthquake simulation, nanoscience, and other CITRIS-related fields will gain access to powerful new modeling tools within the next four years.


Before these new giants can be fully exploited, however, some big challenges must be addressed. UC Berkeley computer science professor Katherine Yelick is working with colleagues in the Parallelism Lab to bring such CITRIS-type applications together with petascale hardware and systems software.

"We are trying to expose the best features of the underlying hardware to the software," says Yelick. "The hardware designers are trying to innovate and put in fast networks or networks with very interesting connectivity patterns, and we want to take full advantage of that," she says.

Yelick has one foot in the world of system-level software and the other in that of hardware development, which makes her particularly valuable to the coordination effort. She and her team have developed new compilers and programming languages (one based on C and another based on Java) for the new petascale computers.
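One of the chores such languages and compilers take off the programmer's hands is the addressing arithmetic for arrays that are spread across many processors. Here is a minimal sketch, in Python for readability, of a block-cyclic global-to-local index mapping of the kind a partitioned-global-address-space compiler might generate; the function name, block size, and processor count are illustrative assumptions, not details of any actual compiler.

```python
# Sketch: global-to-local index translation for a block-cyclic array
# distribution -- the addressing a PGAS-style compiler generates so any
# processor can name any element of a shared array. All names and
# parameters here are illustrative assumptions, not from a real compiler.

def owner_and_offset(global_idx, block_size, num_procs):
    """Map a global array index to (owning processor, local offset)."""
    block = global_idx // block_size     # which block holds the element
    proc = block % num_procs             # blocks are dealt out round-robin
    local_block = block // num_procs     # full blocks already on that proc
    offset = local_block * block_size + global_idx % block_size
    return proc, offset

# Example: 4 processors, blocks of 2 elements.
# Global indices 0..7 land on procs 0,0,1,1,2,2,3,3; index 8 wraps to proc 0.
print(owner_and_offset(9, block_size=2, num_procs=4))  # (0, 3)
```

The round-robin dealing of blocks is what spreads neighboring regions of the array across the machine, which helps with the load-balance problem described below.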

One big challenge is the problem of pacing and managing the information flow through hundreds of thousands of processors. "It is like trying to get a million people coordinated and doing their jobs at exactly the same time," says Yelick.

Petascale machines not only have more chips, but each chip has more processors than in earlier-generation supercomputers. Coordinating the flow and sharing of so much activity requires new algorithms and new approaches to applications programming, too, says Yelick.

This is a big problem because the work the computer is trying to do is not equally distributed among all of its processors. In modeling weather, for example, the Earth's surface can be divided into equal-sized parts, with each part assigned a dedicated processor. But if there is a hailstorm somewhere, there will suddenly be a lot of activity in the processors responsible for those parts of the model. If the rest of the system has to wait for the processors working on the hailstorm, it can lose a lot of time, says Yelick.
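The cost of that waiting is easy to quantify. Below is a toy model of a static domain decomposition, with all numbers invented for illustration: every processor gets an equal-sized patch, a few "storm" patches cost far more to compute, and the whole timestep runs at the speed of the slowest processor.

```python
# Toy model of load imbalance under static domain decomposition.
# All costs and counts are invented for illustration.

def step_time(patch_costs):
    """A timestep finishes only when the busiest processor finishes."""
    return max(patch_costs)

def ideal_time(patch_costs):
    """Perfect balance: the same total work spread evenly."""
    return sum(patch_costs) / len(patch_costs)

# 1000 processors, one patch each; a storm makes 10 patches 100x costlier.
costs = [1.0] * 990 + [100.0] * 10
actual = step_time(costs)    # 100.0: everyone waits for the storm patches
ideal = ideal_time(costs)    # 1.99:  the same work, perfectly balanced
print(f"imbalance factor: {actual / ideal:.1f}x")  # roughly 50x the ideal
```

Dynamic load balancing (shifting patches between processors as the storm moves) aims to close exactly this gap, at the price of extra coordination and communication.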

In addition to such load imbalance issues, the team is working to minimize the time it takes for information to travel around these computers, some of which can be as big as a tennis court.

"Light travels pretty slowly," explains UC Berkeley professor James Demmel, a longtime collaborator of Yelick's. "If processors on opposite sides of the computer have to send huge amounts of information back and forth, the time adds up fast."
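That quip checks out with back-of-the-envelope arithmetic. The sketch below uses assumed round numbers, not measurements of any real machine: a roughly room-sized system and a one-petaflop peak rate.

```python
# Back-of-the-envelope cost of one signal crossing a room-sized machine.
# All figures are assumed round numbers for illustration.

SPEED_OF_LIGHT = 3.0e8   # m/s in vacuum; signals in cables are even slower
MACHINE_SIZE = 20.0      # meters across, roughly tennis-court scale
PEAK_FLOPS = 1.0e15      # one petaflop per second

one_way_seconds = MACHINE_SIZE / SPEED_OF_LIGHT
flops_forfeited = one_way_seconds * PEAK_FLOPS

print(f"one-way light delay: {one_way_seconds * 1e9:.0f} ns")
print(f"operations the machine could have done meanwhile: {flops_forfeited:.0f}")
```

Even at the speed of light, a single crossing costs tens of millions of floating-point operations of potential work, which is why hiding and minimizing communication is such a central concern.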

I love that last quote. It's so true for our business, too.

There's also the issue, for what it's worth, of making the models themselves capable of scaling that high. A lot of algorithms top out around 1,024 processors. We'll need to explore that space a lot more if we're going to run programs on millions of processors.
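One classical reason algorithms stop scaling is Amdahl's law: any serial fraction of the work caps the achievable speedup no matter how many processors you add. A quick sketch, where the 0.1% serial fraction is an assumed figure for illustration:

```python
# Amdahl's law: speedup on p processors when a fraction s of the work
# is inherently serial. The 0.1% figure is an illustrative assumption.

def amdahl_speedup(p, serial_fraction):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)

s = 0.001  # just 0.1% serial work
for p in (1024, 65536, 1_000_000):
    print(f"{p:>9} processors -> {amdahl_speedup(p, s):7.1f}x speedup")
# The ceiling is 1/s = 1000x: a million processors buys almost nothing
# over 65,536 once the serial fraction dominates the runtime.
```

So "scaling past 1,024 processors" isn't just an engineering problem; it often means redesigning the algorithm to shrink its serial and communication-bound fractions.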
