The Building Performance Database (BPD) statistically analyzes the energy performance and the physical and operational characteristics of real commercial and residential buildings.

The Building Performance Database offers two primary methods for analyzing building performance data: “Explore,” which allows users to browse a single dataset within the BPD, and “Compare,” which allows users to compare multiple datasets within the BPD side by side.


The Explore tool allows users to browse a single dataset within the BPD. It offers three data displays: histogram, scatter plot, and table. Data can be filtered by more than 35 parameters, including building type, location, and floor area; the parameters are detailed below.

Users can filter the dataset based on five parameter categories, which can be combined to narrow the display to specific records of interest. The filtered dataset can also be saved, so users can retain the data they have been working with.

The parameter categories include the following:

  • Building Classification: Data can be filtered into residential or commercial categories, as well as subcategories such as retail type, office type, multifamily, and many others.
  • Location: Data can be filtered by climate zone, state, city, or ZIP code.
  • Building Information: Data can be filtered by year built, operating hours, number of people, occupant density, floor area, LEED score, LEED year, ENERGY STAR label, ENERGY STAR label year, ENERGY STAR rating, and ENERGY STAR rating year.
  • Building Systems: Data can be filtered by a variety of different building system types, including lighting, heating, heating fuel, cooling, window glass layers, window glass type, air flow control, wall insulation R-value, wall type, and roof and ceiling.
  • Energy Use Intensity: Data can be filtered by source EUI, fuel EUI, site EUI, or electric EUI, as well as by the year of the energy data.
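
The way filters from several categories combine can be illustrated with a minimal Python sketch. It is not the BPD's implementation or API: the record fields (`facility_type`, `state`, `floor_area`, and so on), the sample values, and the helper names `matches` and `filter_buildings` are all hypothetical. It also assumes that filters combine with a logical AND and that records missing a filtered field are excluded.

```python
# Hypothetical records; field names and values are illustrative, not the BPD schema.
buildings = [
    {"classification": "Commercial", "facility_type": "Office", "state": "CA",
     "climate_zone": "3C", "floor_area": 52000, "year_built": 1998, "site_eui": 61.0},
    {"classification": "Commercial", "facility_type": "Retail", "state": "CA",
     "climate_zone": "3B", "floor_area": 18000, "year_built": 2005, "site_eui": 74.5},
    {"classification": "Residential", "facility_type": "Multifamily", "state": "NY",
     "climate_zone": "4A", "floor_area": 95000, "year_built": 1964, "site_eui": 88.2},
    {"classification": "Commercial", "facility_type": "Office", "state": "NY",
     "climate_zone": "4A", "floor_area": 210000, "year_built": 1985, "site_eui": 97.3},
]


def matches(record, criteria):
    """Return True if the record satisfies every criterion (logical AND)."""
    for field, wanted in criteria.items():
        value = record.get(field)
        if value is None:
            return False  # assumption: records missing a filtered field are excluded
        if isinstance(wanted, tuple):
            low, high = wanted  # (low, high) inclusive numeric range
            if not (low <= value <= high):
                return False
        elif isinstance(wanted, (set, frozenset, list)):
            if value not in wanted:  # any one of several allowed values
                return False
        elif value != wanted:  # single exact value
            return False
    return True


def filter_buildings(records, **criteria):
    """Keep only the records that match all criteria."""
    return [r for r in records if matches(r, criteria)]


# Combine a classification filter, a location filter, and a floor-area range.
offices = filter_buildings(
    buildings,
    facility_type="Office",
    state={"CA", "NY"},
    floor_area=(50000, 250000),
)
saved_dataset = list(offices)  # keep a copy of the filtered dataset for later use
print([b["site_eui"] for b in offices])  # [61.0, 97.3]
```

Each additional criterion can only shrink the result, which is why combining categories narrows the display to the records of interest.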

The filtered data are displayed as a histogram, scatter plot, and table. Each shows the energy performance distribution of the selected buildings in terms of the selected parameter or parameters.

  • Histogram: Displays the data as a bar chart. Bars can show either the count of filtered records or the percentage of filtered records.
  • Scatter Plot: Displays the data as a scatter plot, with X and Y axes selected by the user.
  • Table: Displays summary statistical data for a given parameter, grouped by additional parameters. 
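
The three displays can be approximated with a short Python sketch built on the standard library. The record fields, sample values, bin width, and function names (`histogram`, `scatter_points`, `summary_table`) are assumptions made for illustration, and the statistics the BPD actually reports in its table may differ from the ones chosen here.

```python
from collections import Counter, defaultdict
from statistics import mean, median

# Hypothetical filtered records; field names and values are illustrative only.
records = [
    {"facility_type": "Office", "floor_area": 52000, "site_eui": 61.0},
    {"facility_type": "Retail", "floor_area": 18000, "site_eui": 74.5},
    {"facility_type": "Retail", "floor_area": 30000, "site_eui": 88.2},
    {"facility_type": "Office", "floor_area": 210000, "site_eui": 97.3},
    {"facility_type": "Office", "floor_area": 75000, "site_eui": 52.0},
    {"facility_type": "Retail", "floor_area": 12000, "site_eui": 110.4},
]


def histogram(values, bin_width, as_percentage=False):
    """Bin the values; return {bin_start: count} or {bin_start: percentage}."""
    counts = Counter(int(v // bin_width) * bin_width for v in values)
    if as_percentage:
        total = len(values)
        return {b: 100.0 * c / total for b, c in sorted(counts.items())}
    return dict(sorted(counts.items()))


def scatter_points(rows, x_field, y_field):
    """Return the (x, y) pairs for two user-selected axes."""
    return [(r[x_field], r[y_field]) for r in rows]


def summary_table(rows, value_field, group_field):
    """Summary statistics of one parameter, grouped by another parameter."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[group_field]].append(r[value_field])
    return {
        group: {
            "count": len(vals),
            "mean": mean(vals),
            "median": median(vals),
            "min": min(vals),
            "max": max(vals),
        }
        for group, vals in sorted(groups.items())
    }


eui = [r["site_eui"] for r in records]
print(histogram(eui, bin_width=25))                      # bars as counts
print(histogram(eui, bin_width=25, as_percentage=True))  # bars as percentages
print(scatter_points(records, "floor_area", "site_eui")[:2])
print(summary_table(records, "site_eui", "facility_type"))
```

The count and percentage views describe the same bins; the percentage view simply divides each count by the number of filtered records.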


Compare allows users to create two separate filtered datasets and compare them across a number of variables. This feature is useful, for example, for determining the difference between buildings in different climate zones or between buildings with different technologies.

  • Datasets can be filtered and compared using the same parameters as the Explore tool above.
  • Users can conduct portfolio-level analysis of the impact of particular variables, such as technology type or location.
  • Results are presented as a histogram and a scatter plot. The histogram shows the difference between the two datasets for the selected parameter, and the scatter plot displays both datasets.
  • Depending on data sparsity, the histogram may not account for the relative impact of other building characteristics that may be correlated. For example, buildings with more efficient lighting may also have more efficient HVAC systems, or buildings with more efficient systems may have more amenities and services that cause their energy use to be higher.
  • Histogram results can be calculated using one of two methods:
    • Actuarial: Samples the datasets pairwise and generates a distribution of differences. The horizontal axis shows the change in the value of the selected variable, while the vertical axis shows the percentage or count of the one-to-one comparisons that resulted in that level of change. In other words, the y-axis is equivalent to the probability of observing the change in the variable specified on the x-axis.
    • Regression: Fits a multiple linear regression model to the datasets, then uses the model coefficients to predict the distribution of differences.
    • Note that this analysis does not account for differences between buildings in characteristics outside the parameters used as filter settings.
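
The idea behind the actuarial method can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the BPD's algorithm: it enumerates every one-to-one pairing instead of sampling pairs, it takes the difference as the second dataset minus the first, and the function names, bin width, and EUI values are hypothetical.

```python
from collections import Counter
from itertools import product


def actuarial_differences(dataset_a, dataset_b):
    """Difference (b - a) for every one-to-one pairing of the two datasets.

    Assumption: all pairs are enumerated here; a real tool may sample pairs.
    """
    return [b - a for a, b in product(dataset_a, dataset_b)]


def difference_distribution(differences, bin_width):
    """Return {bin_start: fraction of pairwise comparisons falling in that bin}."""
    counts = Counter(int(d // bin_width) * bin_width for d in differences)
    total = len(differences)
    return {b: c / total for b, c in sorted(counts.items())}


# Hypothetical site EUI values for two filtered datasets.
baseline = [60, 80, 100]  # e.g. buildings with one lighting type
alternative = [50, 70]    # e.g. buildings with another lighting type

diffs = actuarial_differences(baseline, alternative)
dist = difference_distribution(diffs, bin_width=20)

# x-axis: change in the selected variable; y-axis: share of comparisons.
for bin_start, share in dist.items():
    print(f"{bin_start:>5} to {bin_start + 20:>5}: {share:.2f}")
```

Because each share is the fraction of one-to-one comparisons that landed in a bin, the shares sum to 1 and can be read as the probability of observing that level of change. The confounding caveat above still applies, since the pairing ignores every characteristic that was not used as a filter.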