DOE Requirements and Guidance for Digital Research Data Management

The Department of Energy Public Access Plan (June 2023) describes how DOE-funded research and digital data will become more open and available to the public and how DOE will use persistent identifiers to help ensure scientific and research integrity. This sets the stage for increased innovation, commercial opportunities, and accelerated scientific breakthroughs, while maximizing delivery of Federally-funded research results and ensuring that transparent procedures maintain scientific and research integrity.

Principles

The following principles for the management of digital scientific research guide the DOE data management requirements:

  • Effective data management and sharing has the potential to increase the pace of scientific discovery, promote more efficient and effective use of government funding and resources, and create a more fair Federal scientific ecosystem through data sharing and preservation. Data management planning should be an integral part of research planning.
  • Sharing and preserving data are central to protecting the integrity of science by facilitating validation and replication of results and to advancing science by broadening the value of research data to disciplines other than the originating one and to society at large. To the greatest extent, with the fewest constraints possible, in a timely and fair manner, and consistent with the requirements and other principles stated in this Plan, data sharing should make digital research data available to and useful for the scientific community, industry, and the public.
  • Data management planning should maximize appropriate sharing of scientific data while preserving the balance between the relative value of long-term preservation and access and the associated cost and administrative burden.

Data Management and Sharing Plan Requirements

Requirements for public access to scientific data in digital formats apply to unclassified and otherwise unrestricted digital scientific data arising from research and development activities undertaken with DOE funds, whether in whole or in part, unless otherwise prohibited by law, regulation, or policy.

All DOE-funded research and development awards and contracts are subject to a DOE-approved Data Management and Sharing Plan (DMSP). The DMSP will address validation and replication of results, timely and fair access, data repository selection, data management resources, and data sharing limitations. A DMSP may include, but is not limited to, what data will be publicly shared, the data or metadata standards that will be used, any related tools, software, or code, how data will be shared and preserved, and any necessary data protections. While a DMSP is created by a funding applicant or recipient and is specific to their scope of work, it will be reviewed and may be updated, if and when appropriate, to maintain strategic DOE program alignment, respond to reviewer feedback, and/or reflect the progress of the supported research. If applicable, proposals may include the cost of implementing the DMSP in the proposed budget.

The DMSP requirements below, defined in the 2023 DOE Public Access Plan, will be integrated into solicitations and invitations for research proposals beginning October 1, 2025. Prior to that date, solicitations will continue to include data management requirements based on the 2014 DOE Public Access Plan.

Data Management and Sharing Plan requirements:

  1. Validation and replication of results

    The DMSP should describe how data generated in the course of the research project will be publicly shared and preserved in a timely and fair manner, without unnecessary limits or delays to access, that enables validation and replication of results. If data will not be publicly shared and preserved (see “Data sharing limitations”), the DMSP should describe how results could be validated and replicated.

  2. Timely and fair access

    The DMSP should provide a plan for making all scientific data displayed in peer-reviewed scholarly publications resulting from the proposed research open, machine-readable, and digitally accessible to the public at the time of publication. This includes data that are displayed in charts, figures, images, etc. In addition, the underlying digital scientific data used to generate peer-reviewed scholarly publications should be made freely available and publicly accessible at the time of publication, in accordance with the principles stated above. The published article should indicate how these data can be accessed. The DMSP should also provide a timeline for sharing digital scientific data produced under the DOE-funded R&D effort not associated with peer-reviewed scholarly publications.

  3. Data repository selection

    The DMSP should specify the use of digital repositories that align, to the extent practicable, with the National Science and Technology Council document entitled “Desirable Characteristics of Data Repositories for Federally Funded Research.” In general, DOE does not endorse or require sharing in any specific repository and encourages researchers to select the repository that is most appropriate for their data type and discipline, though individual sponsoring research offices may provide specific guidance or designate a specific repository.

  4. Data management and sharing resources

    The DMSP should describe the data management and sharing resources that may be available and used in the course of the proposed research. In particular, a DMSP that explicitly or implicitly commits data management and sharing resources at a facility beyond what is conventionally made available to approved users should be accompanied by written approval from that facility. In determining the resources available for data management and sharing at DOE scientific user facilities, researchers should consult the published description of data management resources and practices at that facility and reference it in the DMSP.

  5. Data sharing limitations

    The DMSP should address any limitations of data sharing to facilitate the protection of confidentiality, privacy, business confidential information, and/or security; avoid negative impact on intellectual property rights, innovation, program and operational improvements, and U.S. competitiveness; consider maximizing appropriate sharing through risk-mitigated limited access; preserve the balance between the relative value of long-term preservation and access and the associated cost and administrative burden; and otherwise be consistent with all applicable laws, regulations, and DOE orders and policies. Depending on the DOE funding agreement, a contractor or financial award recipient may have the right to assert copyright in, or to protect from public release, certain scientific data products. When contractors or award recipients assert copyright of scientific data, the DMSP should address licensing requirements and any limitations for sharing the copyrighted data. When contractors or award recipients assert data protection, the scientific data will not be shared with the public during the data protection period.

The DOE sponsoring research office or element may modify or add to the requirements above for Data Management and Sharing Plans for any program or project. Any such changes should be identified in the applicable solicitation or invitation for research funding for the projects impacted by the changes.

Reporting of Data Products

Scientific data that are shared publicly must be reported to DOE’s Office of Scientific and Technical Information (OSTI) and through any other applicable reporting requirements. DOE Order 241.1C, Scientific and Technical Information Management, establishes Scientific and Technical Information (STI) and DMSP requirements as well as roles and responsibilities for Federal staff and contractors.

To improve discoverability and attribution, DOE encourages the citation of publicly available datasets in the reference section of publications and the use of persistent identifiers, such as a Digital Object Identifier (DOI). If a DOI is not assigned by the repository hosting the data, OSTI will assign a DOI when the data record is reported.
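As an illustration of citing a dataset by its persistent identifier, the following minimal sketch turns a DOI into a resolvable link and builds a reference-list entry. The function names, the citation layout, and the example DOI used in testing are hypothetical and are not prescribed by DOE or OSTI; the only external fact relied on is that DOIs resolve through https://doi.org/.

```python
def doi_url(doi):
    """Return the https://doi.org/ resolver URL for a DOI string.

    Accepts a bare DOI ("10.xxxx/abc") or one prefixed with "doi:".
    """
    cleaned = doi.strip()
    if cleaned.lower().startswith("doi:"):
        cleaned = cleaned[4:].strip()
    # Every DOI begins with the directory indicator "10."
    if not cleaned.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return "https://doi.org/" + cleaned


def format_dataset_citation(creators, year, title, repository, doi):
    """Build a reference-list entry for a dataset (hypothetical layout)."""
    return f"{creators} ({year}). {title} [Data set]. {repository}. {doi_url(doi)}"
```

Researchers should follow the citation style of the target journal or community; the layout above is only a placeholder.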

Best Practices for Data Sharing

In alignment with the DOE Principles for scientific data management, researchers should maximize the appropriate sharing of scientific data, subject to the data sharing limitations requirements identified above, while preserving the balance between the relative value of long-term preservation and access and the associated cost and administrative burden. If no limitations are applicable, data should be shared openly, so that it can be freely used and re-used subject to minimal requirements beyond attribution, and in alignment with the findable, accessible, interoperable, and reusable (FAIR) data principles. Similarly, if no limitations are applicable, computer software should be shared openly, with source code distributed under a license in which the user is granted the right to use, copy, modify, and prepare derivative works thereof, without having to make royalty payments.

When limitations or cost and burden considerations are applicable, researchers are encouraged to share data in a manner that mitigates risk, such as providing reduced data sets, aggregated data, or tiered access to data in a way that maintains appropriate protections. Strategies used to enable maximal data sharing should be described in the DMSP.
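One of the risk-mitigating options named above, sharing aggregated data, can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout, the function name, and the minimum group size of 5 are hypothetical and are not DOE requirements. Appropriate thresholds and protections depend on the data and the applicable policies.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_group(records, group_key, value_key, min_group_size=5):
    """Summarize records per group, suppressing groups that are too small.

    records: iterable of dicts (hypothetical layout).
    min_group_size: groups with fewer records are omitted so that
    individual contributions are harder to infer (assumed threshold).
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record[value_key])

    summary = {}
    for name, values in groups.items():
        if len(values) < min_group_size:
            continue  # suppress small groups instead of publishing them
        summary[name] = {"count": len(values), "mean": mean(values)}
    return summary
```

The aggregated summary could be shared openly while the underlying records remain under tiered or restricted access, with the approach described in the DMSP.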

Researchers are encouraged to use the data standards, formats, and other common practices of their research community or scientific domain to ensure maximal interoperability and impact of data sharing. Specified metadata standards, repository selection, timelines for availability of data, and other DMSP elements may be evaluated against relevant community practices to ensure alignment.

Considerations for sharing and reuse permissions

A DMSP should clearly define the intended permissions for use and sharing of data and results and address any dependencies the outputs may have on reuse permissions of input sources or processes. If no limitations are applicable for data or software sharing, minimal requirements beyond attribution are encouraged. However, additional permissions or reuse requirements may be necessary to address appropriate limitations for sharing.

To ensure that sharing and reuse of the data is allowed as intended, it is important to identify each item, component, or process underlying the data acquired, collected, or generated, particularly when multiple data sets will be combined. It is best to obtain proper permissions in advance to avoid unintended impacts on the ability to share outputs as intended.

Before acquiring, collecting, generating, or sharing data, consider these questions:

  • What data do you want to acquire, access, collect, generate and/or share?
  • What is the source or who owns the data being acquired, accessed, collected, generated and/or shared?
  • What restrictions has the data owner placed on use and sharing of their data?
  • How do you plan to protect the data?
  • With whom do you plan to share the data?
  • What are your initial plans to use the data? What are the anticipated future uses?
  • With whom and how do you plan to share the results or product(s) created using the data?

Ideally, existing data or code, as well as any outputs of a DOE-funded project, will have metadata that documents provenance (from whom or where the data came) and what permissions were granted with the data or for generating data.
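A machine-readable record is one way to keep provenance and permissions attached to a dataset. The sketch below is illustrative only: the class names and fields are hypothetical, not a DOE or OSTI metadata schema, and researchers should prefer the metadata standards of their own community. It records each input source with its owner and license and flags inputs whose permissions have not yet been confirmed.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class InputSource:
    """An input dataset or code component (hypothetical fields)."""
    name: str
    owner: str
    license: str = ""  # empty means permissions are not yet confirmed


@dataclass
class DatasetRecord:
    """Provenance and permission metadata for a project output."""
    title: str
    creator: str
    license: str
    inputs: list = field(default_factory=list)

    def unresolved_inputs(self):
        """Names of inputs that have no stated license."""
        return [src.name for src in self.inputs if not src.license.strip()]

    def to_json(self):
        """Serialize the record, including nested inputs, as JSON."""
        return json.dumps(asdict(self), indent=2)
```

Checking `unresolved_inputs()` before combining data sets helps surface missing permissions early, while they can still be obtained.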

Repository selection and certifications

A DOE sponsoring research program or a specific solicitation may specify that a particular data repository should be used to share scientific data. If a repository is not specified, applicants are encouraged to consider whether any DOE-supported data repositories are suitable for use.

If an external data repository will be used to share data, the National Science and Technology Council (NSTC) report, “Desirable Characteristics of Data Repositories for Federally Funded Research,” provides guidance on selecting a repository that helps ensure research data are findable, accessible, interoperable, and reusable (FAIR) to the greatest extent possible, while integrating privacy, security, and other protections.

The NSTC Desirable Characteristics fall into three main categories, with additional considerations that may be applicable for repositories storing human data.

NSTC Desirable Characteristics of Data Repositories for Federally Funded Research

  • Organizational Infrastructure: Free and Easy Access; Clear Use Guidance; Risk Management; Retention Policy; Long-term Organizational Sustainability
  • Digital Object Management: Unique Persistent Identifiers; Metadata; Curation and Quality Assurance; Broad and Measured Reuse; Common Format; Provenance
  • Technology: Authentication; Long-term Technical Sustainability; Security and Integrity
  • Additional Considerations for Human Data: Fidelity to Consent; Security; Limited Use Compliant; Download Control; Request Review; Plan for Breach; Accountability

There is no central governing body that evaluates whether a data repository aligns with the NSTC Desirable Characteristics. However, community-based certifications may aid applicants in assessing the extent to which a repository aligns with the NSTC Desirable Characteristics.

Publishing and use of digital persistent identifiers

Researchers are encouraged to cite the data, code, and models shown in or underlying peer-reviewed publications, as citations and/or references with associated digital persistent identifiers (e.g., DOIs). This practice meets expectations and DOE requirements regarding persistent identifiers. Additional information about persistent identifiers, along with certain persistent identifier services, is available on the OSTI PIDs website.

References and resources

Writing a Data Management and Sharing Plan

Frequently Asked Questions

Glossary