October 31 – November 1, 2023 | Alexandria, VA

The U.S. Department of Energy (DOE) Solar Energy Technologies Office (SETO) hosted a two-day, in-person workshop on the solar applications of artificial intelligence (AI) and machine learning (ML).

Post-workshop Report

Goals of Workshop

  1. Provide a broader understanding of SETO-funded research that employs artificial intelligence (AI) and machine learning (ML) elements; 
  2. Establish links between projects using similar techniques or addressing similar problems;
  3. Assist SETO staff in formulating a future posture towards AI and ML technologies; and 
  4. Support a discussion between SETO, industry experts, and solar R&D stakeholders regarding the impact of the state-of-the-art AI and ML technologies on the operation of a funding agency. 

Workshop Topic Tracks

  1. The use of data, ML models, and AI methods as replacements of physical, analytical, or empirical models in various applications, such as forecasting, materials synthesis, data imputation, and power flow optimization. 
  2. The use of data and ML in classification-related problems, including anomaly detection in operations and cybersecurity. 
  3. The role of large datasets in AI and ML, including the challenges they present in R&D contexts. 
  4. The use of generative AI and Large Language Models, especially in the context of funding agency operations. 

Feedback Summary from Breakout Discussions

  1. ML/AI Models and Methods (Tracks 1 & 2):
    • Current Status:
      • AI, and especially ML, is being leveraged across many areas of solar research, but the level of adoption varies considerably, driven in part by the type and size of datasets available to the community. Researchers funded by SETO have adopted AI tools more rapidly in grid integration than in materials research.
      • ML approaches are being increasingly deployed in materials and device development to accelerate experimental campaigns through active learning.
    • Challenges Raised to SETO:
      • A large cost driver of utility-scale operations is labor, which is in short supply across the solar industry; this shortage cannot currently be directly mitigated by AI/ML adoption.
      • There is a need for more ML/AI experts within the solar sector, which is difficult to meet since there are already more lucrative opportunities for AI/ML talent in other industries.
      • AI/ML techniques may be used to improve the data quality of historical datasets and improve prediction at various time horizons. However, there are calls for transparency of methodologies, while the utility of data noise remains an openly debated question in dataset preparation.
    • Opportunities Suggested to SETO:
      • AI could aid in operations-and-maintenance (O&M) decision-making at utility-scale installations, such as when and where to dispatch field technicians. Augmented reality technology paired with AI could increase the efficiency and supply of labor.
      • There is a need to train more undergraduate and graduate students on these topics and to strengthen links between data science and the energy sector.
      • Engage with other DOE offices funding fundamental AI research, such as the Advanced Scientific Computing Research (ASCR) program, to help it appropriately respond and adapt to the rapidly changing capabilities and challenges of AI.
      • SETO could encourage the development of models that are accessible, trustworthy, and energy-efficient (e.g., avoid unnecessary computation when possible).
  2. Datasets (Track 3):
    • Current Status:
      • The solar industry does not have a data-first culture. Data collection and management are often a secondary priority in deployment activities.
      • The Open Energy Data Initiative (OEDI) offers a set of data lakes aggregated from the U.S. Department of Energy’s programs, offices, and national laboratories. This includes both measured and synthetic datasets. However, there is a demand for more high-quality datasets across a diverse range of solar-related topics.
      • For-profit organizations are wary of sharing proprietary data, which limits opportunities for data sharing and utilization across stakeholders.
    • Challenges Raised to SETO:
      • Data fidelity is a broad challenge to solar datasets with common issues arising from improper data labelling, limited time series, data gaps, and noise.
      • There is a large appetite for data from for-profit organizations, such as utilities, asset owners, and PV manufacturers. However, consideration must be given to ensuring useful information is provided while simultaneously protecting the business interests of the data owners.
      • Available datasets are often not regularly updated and quickly lose relevance. There is a need for findable, indexable, and accessible repositories that prioritize active data management and curation.
    • Opportunities Suggested to SETO:
      • Requests were raised for DOE/SETO to negotiate data-use agreements with awardees in lieu of non-disclosure agreements and encourage awardees to use open repositories for any developed models and generated datasets. 
      • Calls were made for SETO to reinforce the use of open access data lakes with awardees, advocate for their use by the wider community, and help maintain them to ensure the resources remain useful and up to date.
      • There is a desire for SETO/DOE to facilitate the aggregation, anonymization, validation, and sharing of data with consistent formatting, particularly by overseeing the proper and considerate transfer of proprietary data from companies to the research community.
  3. LLMs and Funding Agency Operations (Track 4):
    • Current Status:
      • On October 30, 2023, President Biden issued the Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence (14110), and the Office of Management and Budget has issued draft guidance on the use of AI by government agencies (OMB-2023-0020).
      • Large Language Models (LLMs), a form of Generative AI, have biases that arise from their training sources and may hallucinate when synthesizing text. Carefully curating the input data for an LLM can improve the accuracy of the generative output.
      • There is a wariness about using confidential or proprietary information with generative AI tools. Therefore, common current use cases involve public communication tasks that avoid disclosing sensitive information, such as generating marketing text or synthesizing an intermediate draft of text with the intention of manually revising prior to release.
    • Challenges Raised to SETO:
      • There was significant debate about whether a funding agency should use LLMs for evaluating proposals. Concerns were voiced about the lack of both transparency and trustworthiness of the outcomes.
      • Concerns were raised that existing biases in AI tools would be propagated into agency decision making about funding. 
      • There was debate about the use of LLMs by applicants to write funding proposals. Some argued it should be permitted with disclosure, whereas others argued it should be permitted without any need to disclose its use.
    • Opportunities Suggested to SETO:
      • Many called for SETO to offer more requests for information (RFIs), prizes, and callouts in FOAs on AI/ML topics in solar applications to encourage continued innovation and further engagement with AI experts who have not yet entered the solar application space.
      • There were requests that a policy be made available about the usage of generative AI in the application and selection processes of SETO funding opportunities.

Agenda, Presentations, & Posters

Day 1

Speaker(s)/Presentation Titles 
Opening Remarks Dr. Tassos Golnas, Department of Energy Solar Energy Technologies Office
Welcome Address Garrett Nilsen, Department of Energy Solar Energy Technologies Office
Welcome Address Ann Dunkin, Department of Energy Chief Information Officer
Keynote Prof. Le Xie, Texas A&M University - Energy System Digitization in the Era of AI: A Layered Approach Towards Carbon Neutrality 
Introduction to Track 1 Prof. Tonio Buonassisi, Massachusetts Institute of Technology - Learning by Doing: A Journey Toward Accelerating Experimental R&D Using Machine Learning 
Track 1 Presentations 
Introduction to Track 2 
Track 2 Presentations  
Breakout Discussions Facilitated discussion on Track 1 and Track 2 topics 
Concluding Remarks (Day 1) Dr. Larkin Sayre, Department of Energy Solar Energy Technologies Office 
Poster Presentations Review of posters from SETO-funded projects with research involving AI and ML. View the posters.


Day 2

Speaker(s)/Presentation Titles 
Opening Remarks (Day 2) Dr. Larkin Sayre, Department of Energy Solar Energy Technologies Office
Keynote Dr. Hendrik Hamann, IBM Research - Big Data, ML and AI for the Energy Transition 
Introduction to Track 3 Tim Boyle, Databricks - LLMs and AI on Lakehouse
Track 3 Presentations 
Breakout Discussion Facilitated discussion on Track 3 topics 
Introduction to Track 4 Dr. Svitlana Volkova, Aptima - Opportunities and Limitations of Large Pretrained Models for Energy Technologies
Track 4 Presentations 
Breakout Discussions Facilitated discussion on Track 4 topics 
Concluding Remarks (Day 2) Dr. Tassos Golnas, Department of Energy Solar Energy Technologies Office

Additional Information