University of Utah School of Computing professor Ganesh Gopalakrishnan and assistant professor Pavel Panchekha have received one of the five DOE/ASCR X-Stack awards from the U.S. Department of Energy to adapt scientific software for next-generation high-performance computing (HPC) systems.
Gopalakrishnan and Panchekha’s project, called “ComPort: Rigorous Testing Methods to Safeguard Software Porting,” also involves the University of California, Davis; the University of Washington; Lawrence Livermore National Laboratory; and Pacific Northwest National Laboratory. Together, the participating institutions have received a total of $3.4 million in DOE funding. This joint effort will address one of the major challenges for scientific computing: making sure HPC code returns similar answers when run on new HPC platforms, despite novel hardware, architectures, and compilers.
The team led by Gopalakrishnan and Panchekha will test, and attempt to automatically repair, software to ensure that it is ported correctly and operates accurately on these newer scientific computers. High-performance computers are routinely used by universities and scientific institutes for a range of projects, from weather prediction to drug discovery. In the exascale era, these computers will carry out more than one quintillion calculations per second, playing critical roles in, for example, characterizing the COVID-19 spike protein.
Science and engineering research is increasingly based on the combined use of high-performance computing and machine learning. To perform these calculations, researchers are turning to the latest HPC systems, which rely on the fastest dedicated graphics processors and other forms of computational accelerators. While the hardware has gotten faster, it often has to be used in newer ways, Gopalakrishnan said.
So his team will test software developed specifically for these modern high-speed computers to make sure it yields accurate results. “We test and ensure that the software is ported over reliably,” Gopalakrishnan said. “That is, the numerical results that are generated are reliable and explainable and support sustained scientific research, despite the changes of machines and special architectural features they employ.”
Along with testing, Gopalakrishnan and Panchekha will develop automated diagnosis and repair methods to help recoup the billions of dollars invested in codes that have been in use for decades but must now be ported to newer machines.
The X-Stack awards, announced on July 21, fund five advanced-computing projects across nine states, with a total DOE investment of more than $13 million. In announcing these awards, Dr. Steve Binkley, Acting Director of DOE’s Office of Science, said, “As we move through and beyond the era of exascale computing, the coming generation of supercomputers will bring a huge boost in capabilities for scientific investigation and discovery.”
More details are available in a Communications of the ACM magazine article from February.