Today, the U.S. Department of Energy (DOE) announced the selection of the High Performance Data Facility (HPDF) hub, which will create a new scientific user facility specializing in advanced infrastructure for data-intensive science. The Thomas Jefferson National Accelerator Facility (JLab) will be the HPDF Hub Director, and the lead infrastructure will be located at JLab. The project to build the Hub will be a partnership between JLab and Lawrence Berkeley National Laboratory (LBNL), and the two labs will form a joint project team, led by JLab, charged with creating an integrated HPDF Hub design.
The High Performance Data Facility is envisioned as a national resource that will serve as the foundation for advancing DOE’s ambitious Integrated Research Infrastructure (IRI) program. The IRI aims to provide researchers with the ability to seamlessly meld DOE’s unique data resources, experimental user facilities, and advanced computing resources to accelerate the pace of discovery. The mission of the HPDF will be to enable and accelerate scientific discovery by delivering state-of-the-art data management infrastructure, capabilities, and tools. HPDF will provide leadership in the stewardship of the scientific data lifecycle and will advance DOE’s and the Biden Administration’s commitment to public access to scientific data and FAIR data principles (Findable, Accessible, Interoperable, and Reusable). The competition to lead the High Performance Data Facility was open to the Office of Science national laboratories.
“High quality research data is the rocket fuel of the AI era and all other forms of emerging technologies. At the same time, modern collaborative science demands linking distributed research resources. The High Performance Data Facility will play a central role in the operation and success of the IRI program which is designed to serve the data and analysis needs of our many DOE national laboratory user facilities and more,” said Geraldine Richmond, DOE’s Under Secretary for Science and Innovation. “With HPDF as the hub of the IRI, it will provide researchers with access to DOE’s comprehensive and well-curated data sets from experiments and simulations, integrating and simplifying access to DOE’s vast research infrastructure for researchers across the country. I congratulate JLab and LBNL on their outstanding proposals and look forward to taking the next step in this crucial project.”
“The challenges of our time call upon DOE and its national laboratories to be an open innovation ecosystem to accelerate discovery and innovation, democratize access to resources, draw new talent, and advance open science,” said Asmeret Asefaw Berhe, DOE’s Director of the Office of Science. “The High Performance Data Facility will be a cornerstone of DOE’s strategy to lead this new era of integrated science. It will be a beacon of leadership in data science, facilitating partnerships and connections across the country, and accelerating the timeline of scientific discoveries by leveraging our world class scientific user facilities, the largest amount of scientific data available globally, and the nation’s advanced computing resources.”
The HPDF will provide a crucial national resource for artificial intelligence (AI) research, opening new approaches for the nation’s researchers to attack fundamental problems in science and engineering that require nimble, shared access to large data sets, and real-time analysis of streamed data from experiments. DOE is the leading producer of scientific data in the world and the HPDF will deliver a platform for a broad spectrum of data-intensive research as we enter the era of exascale supercomputing and exascale data.
The HPDF will have a “hub-and-spoke” model in which JLab and LBNL will host mirrored centralized resources and also enable high-priority DOE mission applications at “spoke” sites by deploying and orchestrating distributed infrastructure at the spokes or other locations. Under JLab’s leadership, the JLab/LBNL partnership will assemble a world-class HPDF Hub project team to deliver a geographically resilient and innovative HPDF core infrastructure capable of meeting the needs of a wide diversity of users, institutions, and use cases. This JLab/LBNL partnership will itself provide the template for the first spoke partnerships and blaze new paths in institutional engagement and outreach in the emerging era of AI-enabled integrated science.
As identified in the Mission Need Statement for the High Performance Data Facility approved in August 2020, DOE anticipates that the Total Project Cost of the HPDF project, including the hub and spokes, will be between $300 million and $500 million in current and future year funds, subject to the availability of future year appropriations. As directed by the Lab Call, the awarded proposal used a planning assumption of the low end of this total project cost range ($300 million).