Having everyone in the world complete arithmetic problems for nearly five years sounds like the worst math test ever. But it gives you an idea of just how fast exascale computers really are. Exascale computers get their name from how many calculations they can do per second – measured as floating-point operations per second, or FLOPS. With the ability to crunch more than a quintillion calculations per second, one of these computers could complete in a mere second the same monumental task that would take the entire world population nearly five years.
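That comparison holds up to some back-of-the-envelope arithmetic. The sketch below assumes a round figure of roughly 7 billion people (an assumption, not a number from the article), each doing one calculation per second:

```python
# Back-of-the-envelope check of the opening comparison.
# Assumption (not from the article): ~7 billion people, each doing
# one calculation per second, versus a 1-exaflop machine.

EXAFLOP = 1e18            # calculations per second for an exascale computer
POPULATION = 7e9          # rough world population (assumption)
SECONDS_PER_YEAR = 365 * 24 * 3600

seconds_needed = EXAFLOP / POPULATION           # total person-seconds of effort
years_needed = seconds_needed / SECONDS_PER_YEAR

print(f"{years_needed:.1f} years")  # roughly 4-5 years, in line with the article
```

The exact figure shifts with the population estimate used, but it lands in the multi-year range either way.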
The Department of Energy’s (DOE) Office of Science recently launched the world’s first exascale computer, Frontier, at the Oak Ridge Leadership Computing Facility, a DOE Office of Science user facility. As of November 14, 2022, Frontier ranks number 1 on the TOP500 list of the world’s fastest supercomputers.
DOE is also in the process of launching two more exascale systems. Slated for delivery in late 2022, the Office of Science’s next system will be Aurora at the Argonne Leadership Computing Facility, another DOE user facility. It’s expected to top 2 exaflops. Another arm of DOE, the National Nuclear Security Administration (NNSA), will deploy its own exascale system in the next year.
It's one thing to build that computing power. Putting it to work is the next step. Fortunately, there are plenty of challenges – from making product packaging that creates less waste to modeling wind farms – that are well suited for these high-performance computers. These computers will be tackling some of the biggest questions in science and industry, enabling new discoveries and solutions.
“Our science communities, from climate modeling to nuclear reactor simulations to epidemiology to control the spread of diseases, have increased their use of high-performance computing in recent decades,” said Barb Helland, associate director of the Office of Science’s Advanced Scientific Computing Research (ASCR) program.
From Petascale to Exascale
These three systems signal a new era of computing capability. But they didn’t happen in a vacuum. Rather, they’re the result of more than a decade of discussion and development. In that time, the Office of Science has launched progressively faster and more energy-efficient pre-exascale supercomputers for science.
“There are only a handful of places in the world where you could do this, and DOE is one of them. It’s incredibly important that the country do this to maintain progress,” said Rick Stevens, Associate Laboratory Director for Computing, Environment, and Life Sciences at Argonne National Laboratory (ANL), future home of Aurora.
This modern era started back in 2008. That year, the NNSA’s Roadrunner supercomputer at Los Alamos National Laboratory and the Office of Science’s Jaguar supercomputer at Oak Ridge National Laboratory (ORNL) broke the petascale barrier. Aptly named for fast animals, these systems crunched more than a quadrillion calculations per second – a 1 followed by 15 zeroes.
“The petascale revolution allowed us to run very large jobs. You could see the change in the certain kinds of analyses we were able to do,” said Arvind Ramanathan, a computational biologist and computational science leader at ANL.
Ramanathan’s team, along with collaborators Rommie Amaro (University of California San Diego), Lillian Chong (University of Pittsburgh), and others recently used Office of Science petascale machines to model the spike protein of the SARS-CoV-2 virus during the COVID-19 pandemic. That feat required simulating nearly 2 million moving atoms. The techniques used in this work built off many gains in computational biology developed over a decade at petascale, including the use of artificial intelligence and data science.
At petascale, supercomputers were not only generating more modeling and simulation data for analysis, but they were enabling scientists to analyze more experimental data. Since the early 2000s, the Office of Science’s 28 national user facilities – including light sources and nanoscale science centers – have been producing massive volumes of data. With petascale machines in place, DOE leadership began looking ahead.
“We had just reached a petaflop, and Ray Orbach, the former Director of the Office of Science, asked ‘What’s next?’” recalled Helland.
Orbach wasn’t the only one asking that question. High performance computing (HPC) already had bipartisan support in Congress. In fact, the leadership computing facilities at ANL and ORNL had recently been established in response to the Department of Energy High-End Computing Revitalization Act of 2004. In 2007, several DOE labs held Exascale Town Halls to discuss the challenges of breaking the exascale barrier. These and other workshops were followed by seminal reports, including a 2010 Office of Science report called The Opportunities and Challenges of Exascale Computing.
Based on petascale systems, scientists predicted that exascale supercomputers would face key challenges in power efficiency, data movement, and resiliency.
“When we started, we weren’t even sure we could do it,” Stevens said. “Huge challenges had to be overcome.”
The energy-efficiency problem loomed large. Exascale systems would potentially devour up to 60 times more energy than their petascale predecessors. Petascale machines of the day were already consuming as much power as several thousand homes.
To tackle this issue, the Office of Science set out to reduce energy consumption in the existing petascale systems. In 2012, the Mira supercomputer at the Argonne Leadership Computing Facility swapped energy-intensive air cooling for more efficient water cooling. The same year, the Oak Ridge Leadership Computing Facility introduced a new type of processing unit in the Titan supercomputer. Titan coupled traditional central processing units (CPUs) to NVIDIA graphics processing units (GPUs). GPUs were originally designed to render graphics in video games. By connecting a GPU to a CPU, developers were able to divide and specialize the work. CPUs carried out complex communication tasks; GPUs rapidly crunched numbers and carried out repetitive tasks. With only a slight increase in energy use, computer scientists were able to make computers work 10 times faster.
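The division of labor between CPUs and GPUs can be sketched in a toy example. This is purely illustrative pure Python (real offloading goes through programming models such as CUDA or HIP, and the function names here are invented): the "kernel" applies the same simple arithmetic to every element, the data-parallel pattern GPUs accelerate, while the "orchestrator" does the branching and coordination work that stays on the CPU.

```python
# Toy sketch of the CPU-GPU division of labor described above.
# Illustrative only: function names are hypothetical, and real GPU
# offloading uses frameworks like CUDA or HIP, not plain Python.

def gpu_kernel(chunk):
    """Data-parallel work: the same operation applied to every element."""
    return [x * x for x in chunk]

def cpu_orchestrate(data, chunk_size=4):
    """Control-heavy work: split the problem, dispatch chunks to the
    'device', and reassemble the results."""
    results = []
    for i in range(0, len(data), chunk_size):
        results.extend(gpu_kernel(data[i:i + chunk_size]))
    return results

print(cpu_orchestrate(list(range(8))))  # squares of 0..7
```

In a real hybrid system, thousands of GPU threads would execute the kernel simultaneously, which is where the order-of-magnitude speedup comes from.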
“Titan was pretty disruptive,” said Doug Kothe, director of the DOE Exascale Computing Project and Associate Laboratory Director for Computing and Computational Sciences at ORNL. “Since then, we’ve gone to more and different types of GPUs.”
These improvements are clear in today’s exascale machines. Both Office of Science exascale systems, Frontier and Aurora, are hybrid CPU–GPU machines. Importantly, both systems will operate within 20 megawatts per exaflop. That efficiency meets DOE’s energy-efficiency goal for exascale and is well below the grim estimates of the early 2000s.
But efficiency wasn’t the only major challenge. Reaching exascale also required innovations in hardware, software, and infrastructure – new ways of organizing both to make these machines a reality.
DOE convened scientists and representatives from U.S. technology companies to figure out how to build an exascale computer from the chip up. To cross the finish line, DOE created the seven-year Exascale Computing Project (ECP). The nationwide project called on about 1,000 researchers and 15 national laboratories (with six core laboratories) to develop the next generation of scientific computing tools.
“Exascale has shown that six labs can come together and work together,” Helland said. “The fact that six labs met together for six to seven years to make sure every milestone has been met and the project is on track…I’ve never been associated with something like that.”
ECP teams developed 21 scientific computing applications that are foundational to DOE missions and Office of Science programs. Through early access to Frontier and Aurora, ECP teams are validating their applications. Then they’ll start work on solving the research objectives of their projects.
“You’re going to see eyebrow-raising, breakthrough science with these applications,” Kothe said. “You may see Nobel Prizes follow from the use of these applications.”
The Promise and Potential of Exascale
Office of Science exascale systems will enable scientists to tackle increasingly complex problems. To help meet a national goal of net-zero carbon emissions by 2050, DOE is advancing many technologies—including renewable energy, fusion energy, carbon storage and sequestration, and energy-efficient materials and manufacturing. Exascale computing can accelerate the development of these and other critical technologies.
“The impact of these systems on global competitiveness has long been embodied by the saying, ‘You must out-compute to out-compete’,” said Gina Tourassi, director of the National Center for Computational Sciences at ORNL, home of Frontier.
To develop energy-efficient technologies, researchers can use exascale supercomputers to model complex materials or chemical systems. Internal combustion engines are one example, according to Christine Chalk, ASCR Program Manager. “Combustion simulations at terascale [early 2000s supercomputers] were using simple laboratory flames. But to try and simulate the internal combustion engine with its many chemical reactions and complex geometry requires exascale.”
To advance carbon-free nuclear power, researchers can simulate operating scenarios for aging fission reactors. This information could help extend reactor lifetimes. Scientists can also use supercomputers to predict the performance of new small modular reactor designs. If companies deploy these designs, it could increase the availability of safe and economical fission energy.
Another form of nuclear energy – fusion – will continue its long history with scientific computing at exascale. Experimental fusion devices are complex and create extreme environments that are expensive to build and test. At exascale, simulations of whole fusion devices can advance the many fusion experiments going on around the world.
To understand and mitigate climate change, researchers are expanding climate models. They are tackling small-scale but powerful influencers like chemical particles in the atmosphere and local weather patterns. Researchers can also use exascale to better predict natural disasters like floods and earthquakes as well as their impacts. For example, exascale will allow scientists to simulate earthquakes in ways that are more useful for predicting damage to buildings and infrastructure.
In fundamental science, researchers can drive the development of the next generation of advanced instruments, such as quantum devices.
For these and other applications, scientists have been preparing computing codes for exascale systems for years.
“We expect Frontier will be oversubscribed because there is a tremendous need,” Tourassi said. “When it comes to climate modeling and being able to predict extreme conditions that threaten our livelihood and survival, Frontier will be front and center. When it comes to health—winning the war against cancer or Alzheimer’s—Frontier will open new scientific avenues and lead to transformational breakthroughs.”
“You can think of exascale as this transition from the traditional HPC workloads to the beginning of the future workloads,” Stevens said. “We need to be thinking of this not as an end but as a means.”
With the big data explosion of the twenty-first century, AI became mainstream in scientific computing. Although GPUs were introduced to supercomputers for their energy efficiency, they proved excellent tools for AI programs as well. Both Frontier and Aurora are built to excel at running AI.
Scientists will also use exascale to help develop a new paradigm of computing—quantum. Even as scientists continue to develop quantum computers, exascale computers can do quantum simulations and other types of quantum research. At the same time, scientists will continue to rely on non-exascale supercomputers as well. Many problems can run on traditional high-performance computers that complement the most powerful exascale ones.
The Future at Exascale
With this evolution in computing approaches has come a change in the types of scientists using and working in high-performance computing.
As a director of a national supercomputing facility, Tourassi has seen a shift in the supercomputing user community. “With data-intensive workflows and AI, we started seeing applications that were not traditional members of our user community,” she said. “We also see continued growth of our existing user community.”
As director of ECP, Kothe said that more diverse teams are shaking up the status quo and setting the future of scientific computing: “The teams have changed dramatically. These are diverse, interdisciplinary teams with diverse backgrounds and points of view.”
Reflecting on the buildup to exascale, Kothe said, “Every snapshot in HPC has been disruptive. Maybe we’ll even see more disruption [in future systems] than we’ll see through exascale.”
The Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, please visit the Office of Science website.