Article info

Article history:
Received 17 July 2019
Revised 12 November 2019
Accepted 17 December 2019
Available online 19 December 2019

Keywords:
Energy performance
Energy benchmarking
Data-driven approaches
Comparative analysis

Abstract

A reasonable building energy efficiency benchmarking program plays an important role in energy consumption control and supervision. Previous studies have focused on the process of establishing a single benchmarking method, but few have compared the performances and outcomes of different methods. To fill this gap, this paper selects three benchmarking methods to develop benchmarking models: multiple linear regression (MLR) based on Energy Star, stochastic frontier analysis (SFA), and the descriptive statistics method (DSM) based on the national energy consumption standard in China. We demonstrate each method using data on the energy and building characteristics of 45 four- and five-star hotel buildings located in Chongqing, China. To compare the consistency, robustness and explanatory ability of the three methods, we first utilize the Spearman rank correlation analysis to test whether these methods have consistent energy efficiency ranks and then present Sankey diagrams to further reveal the interactions of the estimated energy efficiency grades obtained from the three methods. It is found that the results of DSM and SFA are most consistent, while MLR vs. SFA and MLR vs. DSM present significant differences in evaluating building energy performance. In addition, DSM is more robust for evaluating the ranks of sampled buildings, while SFA is more robust for evaluating energy efficiency grades. Furthermore, we discuss the explanatory ability of each method. In addition to the building characteristics, the design and operational characteristics of the HVAC system have great effects on building energy consumption. Finally, we present suggestions for policy-makers regarding the development and implementation of the building energy benchmarking program in Chongqing and for the management of buildings with different energy performances to further improve the energy efficiency.

© 2019 Elsevier B.V. All rights reserved.
https://doi.org/10.1016/j.enbuild.2019.109711
0378-7788/© 2019 Elsevier B.V. All rights reserved.
2 Y. Ding and X. Liu / Energy & Buildings 209 (2020) 109711
Some popular benchmarking tools have been proposed and adopted by developed countries, such as the Energy Star rating in the U.S. [5], VDI 3807 in Germany [6], and the Energy Smart tool in Singapore [7]. In China, the “Standard for Energy Consumption of Buildings” GB/T51161-2016 (shortened to Standard) [8] was recently released for evaluating the energy consumption of buildings, representing the first standard setting an energy usage quota for buildings at the national level [9]. The Standard notes the energy quotas for buildings located in different climate zones; however, differences in economic conditions, energy policies and architectural forms may all lead to differences in building energy usage levels within cities, which indicates that the energy quota values in the Standard may not be representative in certain cities. This characteristic increases the difficulties for local governments in conducting reasonable energy benchmarking programs at the city level. There is still no official operating energy usage or efficiency evaluation program for public buildings in Chongqing. To help policy-makers better understand how to choose the best benchmarking method and identify inefficient buildings, this paper collected a sample of hotel buildings in Chongqing, considered building designs and operational characteristics, applied three benchmarking methods to compare the obtained energy performance assessments and put forward potential suggestions for policy-makers.

1.2. Literature review

1.2.1. Current benchmarking methods

According to previous studies, current benchmarking methods can be divided into three main categories: i) physical models, ii) energy performance indicators, and iii) data-driven approaches. The physical model-based benchmarking method focuses on establishing a benchmarking model with known input parameters by using simulation tools and then comparing the actual energy performance to the results of the benchmarking model. To make the input parameters more representative of the dataset, researchers have selected average values as input parameters for simulations [10]. Although the simulation approach can quantify the influence of each parameter on energy consumption, there are time and effort costs, as well as a gap between the actual and simulated energy performance due to the significant uncertainty of building model inputs [11]. Another physically based model mentioned in some studies [12,13] proposes a method based on exergy analysis to define an exergy efficiency indicator for assessing the energy performances of buildings and accounting for thermodynamic energy flows. However, this method was developed based on detailed physical parameters and may not be feasible for use in evaluating the energy performances of a large number of buildings.

The energy performance indicator, or energy usage intensity (EUI), is simple to use for ranking sampled buildings and understanding energy usage levels according to descriptive statistics values, such as the mean and quantile values of EUI [14,15]. Commonly used EUI measures include the energy consumption per unit area, energy consumption per person, and energy consumption per unit room [16–18]. However, the main drawback of EUI is that it does not consider other important energy drivers, such as occupancy and the HVAC system, leading to unreliable benchmarking results. Based on such limitations, some studies have proposed a normalized EUI using the degree-day method [19], energy disaggregated indicators [20], or a detailed bottom-up indicator system [21]. However, some limitations still exist when these methods are applied. For example, normalizing an EUI using the degree-day method relies on the assumption that building energy consumption and heating/cooling degree-days have a linear correlation, energy disaggregated indicators do not consider the effects of occupancies, and a bottom-up indicator system may not be suitable for application to benchmark building energy performance on a city level.

To improve the reliability of a single energy benchmarking indicator and reduce the dimensions of the building characteristics of sampled buildings, a data-driven approach has been found to be a solution, and regression analysis is most widely used to assess building energy performance [16,22,23]. The regression analysis establishes a linear regression model using an EUI and its key explanatory variables; an energy efficiency ratio is then defined according to the actual EUI and predicted EUI. A recent study has improved the linear regression method by adopting quantile regression, which does not require normality or constant variance in a dataset [24]. Although regression analysis is easy to apply and accepted by engineers and policy-makers, there are still significant limitations. First, residuals from a regression model not only represent relatively inefficient items in buildings but also contain random items, which may include measurement errors, statistical noise and the effects of unexplained factors. Additionally, a large sample size and comprehensive explanatory variables are needed for linear regression analysis since the analysis is sensitive to outliers. Furthermore, linear regression models are not likely to capture the non-linear nature of building energy consumption and its drivers.

Unlike linear regression, machine learning (ML) methods such as artificial neural networks, random forests, and regression trees can learn non-linear relations and have achieved higher accuracies in predicting building energy consumption and in building energy optimization [25–27]; they have also been adopted in recent studies of building energy benchmarking [28–30]. Similar to the MLR method, the energy efficiency ratio is defined as the actual EUI divided by the model-predicted EUI and used to quantify the relative energy performance of a building. It should be noted that the accuracy of this approach relies heavily on a large sample size for training, testing and validating ML models to ensure their reliability. Another data-driven approach discussed in current studies is clustering analysis, which classifies buildings based on multiple dimensions of building features using the k-means clustering method [31]. The centroid of each cluster is considered the energy performance benchmarking reference. However, the main shortcoming of the clustering method is that we do not know how to classify new buildings that are not part of the samples used for clustering.

Some recent studies have also used frontier analysis to measure building energy efficiency by fitting frontiers and actual EUIs. Frontier analysis can be divided into data envelopment analysis (DEA) and stochastic frontier analysis (SFA). DEA uses a linear programming technique to measure the efficiencies of decision-making units [32–34], which is a useful approach because it considers the complex nature of the relations between multiple inputs and outputs. However, the main drawback of DEA is that it attributes residuals to inefficient items and ignores the impacts of random items. Moreover, it cannot provide a specific equation relating the input and output due to its nonparametric nature. To address these defects, the SFA approach has been introduced in some studies to evaluate energy efficiency performance [35–37]. SFA is a parametric frontier approach that assumes a function to give the maximum possible output as a function of certain inputs. The advantage of SFA compared with DEA and the multiple linear regression (MLR) approach is that this model can estimate inefficiency and data noise according to the deviations from the frontier and the actual energy consumption using the assumptions about the distribution of the measurement errors and the inefficiency terms [38]. That is, SFA acknowledges the presence of measurement errors and other sources of statistical noise and enables the separation of inefficiency from data noise, while MLR assumes that all residuals come from inefficiency. This method of
Table 1
Summary of comparisons of benchmarking methods in the references.

Reference | Applied benchmarking method | Validated method | Sample size | Differences
Chen et al. [41] | Lorenz curve | Practical EUIs and overall energy consumption | 212 buildings | Small
Gao and Malkawi [31] | K-means clustering | Energy Star scores | 1964 buildings | Validation based on 4 buildings. Half are almost the same, and half are greatly different.
Yang et al. [37] | Decision trees and stochastic frontier analysis | Energy Star scores and actual EUI | 10,153 buildings | The proposed method is more consistent with the actual EUI than with Energy Star scores.
Papadopoulos and Kontokosta [30] | Xgboost and k-means clustering | Energy Star scores | 7487 buildings | Significant
eliminating statistical noise from residuals contributes to a more accurate estimation of efficiency.

1.2.2. Comparisons of benchmarking methods

The aforementioned studies have a common goal, namely, developing an evaluation indicator of building energy performance while reducing the effects derived from the core energy-influencing factors. Most of the studies focus on proposing a new benchmarking method, and only a few validate the proposed method by comparison with other methods. Table 1 summarizes the comparisons mentioned in current literature, including the applied benchmarking method, validated method, sample size and differences. In each reference except reference [37], only one method was used to validate the reliability of the proposed method. No matter how small or significant the differences, it is difficult to define which methods are “accurate” since none of the results given by a certain benchmarking method can be considered as ground truth. Note that these references indicate that Energy Star seems to perform unstably regarding energy benchmarking results. In summary, there is a lack of a full study comparing different benchmarking methods. Such a comparison is worthwhile because, unlike comparisons in other research fields (e.g., building energy consumption prediction, which focus on the accuracy or time consumption of the applied methods [39,40]), we draw more attention to the consistency among different methods rather than absolute accuracy.

Based on the above reasoning, we present two main contributions to the performance and application of current commonly used energy benchmarking methods. We first selected three representative benchmarking methods (MLR based on Energy Star, SFA, and the DSM based on the Standard in China) to discuss whether they can provide consistent benchmarking results and which method has the most robust performance, and then we compared the three methods according to consistency, robustness and explanatory ability. Second, we present suggestions for policy-makers regarding how to select and apply benchmarking methods to manage buildings with different energy performances to further improve the energy efficiency. These three methods were selected for the following reasons. First, Energy Star is one of the most popular energy benchmarking tools and is commonly regarded as a reference for validation. Second, SFA is able to account for random effects and technical inefficiency, which is an advantage over the ordinary least squares estimation applied in MLR. Finally, DSM is the method that was adopted and recommended recently in a national standard in China; thus, we were concerned about the relationships between the corresponding evaluation results and those given by the other two popular benchmarking methods. We did not consider other data-driven methods such as machine learning or clustering analysis due to the limitation of the sample size in this study. This study provides a first and an important step in developing and implementing a building energy benchmarking program in Chongqing city, helping local governments enhance their understanding of how to choose an appropriate benchmarking method and formulate energy efficiency policies.

2. Data and methodology

2.1. Data acquisition

An effective energy benchmarking program should be based on a sufficient dataset. However, unlike in the U.S., where some publicly available databases, such as the EIA CBECS (Energy Information Administration’s Commercial Buildings Energy Consumption Survey) [42] and the DOE BPD (Department of Energy’s Building Performance Database) [43], have been established, there is no national or local database of building energy performance in China. Supported by the Chongqing Housing and Urban-Rural Committee, this study collected data from the energy audit reports of a total of 48 hotel buildings that were classified as energy retrofitting demonstration projects in Chongqing. The selected information contains data on building energy consumption and potential energy-influencing factors. Most current studies focus on the inherent design characteristics of buildings, such as the gross floor area and floor type, and seldom consider equipment and operation characteristics. According to one study [44], six key factors that influence real energy use are the building envelope, building equipment (energy systems), building operation and maintenance, weather, indoor comfort criteria, and occupant behaviors. Therefore, this study comprehensively considered potential energy-influencing factors, and the variables used in this study are summarized as follows:

(1) Energy consumption

The energy bills for 12–24 months were collected from energy audit reports. The energy bill data involved all types of monthly hotel energy consumption data, including electricity and natural gas, which were verified by electric power and gas companies to ensure their accuracy.

(2) Potential influencing factors
(a) Building characteristics: building age; hotel ratings; location; gross floor area; number of guest rooms; number of stories; floor area of guest rooms, dining, conference and entertainment zones; underground floor area; floor area with air conditioning; glazing ratio; performance index of external wall, roofs and windows.
(b) Occupancy: annual average occupancy rate.
(c) Equipment parameters: lighting power density of guest rooms; air conditioning system type; chiller type of air conditioning system; rated energy efficiency ratio (REER) of chiller.
(d) Operation: heating and cooling period.
(e) Weather: daily average outdoor temperature used to calculate heating- and cooling-degree days (the heating reference temperature is 18°C and the cooling reference temperature is 26°C).
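The weather variable in item (e) can be illustrated with a minimal sketch of a degree-day calculation. It assumes the common definition in which each day contributes the positive gap between the reference temperature and the daily average outdoor temperature; the source gives only the two reference temperatures (18°C and 26°C), so the function name, the exact formula and the sample temperatures are illustrative assumptions, not the authors' procedure.

```python
from typing import Iterable, Tuple

HEATING_BASE_C = 18.0  # heating reference temperature stated in the text
COOLING_BASE_C = 26.0  # cooling reference temperature stated in the text


def degree_days(daily_mean_temps_c: Iterable[float],
                heating_base: float = HEATING_BASE_C,
                cooling_base: float = COOLING_BASE_C) -> Tuple[float, float]:
    """Return (heating degree-days, cooling degree-days).

    Assumed definition: a day adds (base - T) to HDD when T is below the
    heating base, and (T - base) to CDD when T is above the cooling base.
    """
    hdd = 0.0
    cdd = 0.0
    for temp in daily_mean_temps_c:
        hdd += max(0.0, heating_base - temp)
        cdd += max(0.0, temp - cooling_base)
    return hdd, cdd


# Hypothetical daily averages in degrees Celsius, not data from the study.
sample_temps = [5.0, 12.0, 18.0, 22.0, 26.0, 30.0, 33.5]
hdd, cdd = degree_days(sample_temps)
print(f"HDD = {hdd}, CDD = {cdd}")
```

Days between the two reference temperatures contribute to neither total, which matches the idea that they require neither heating nor cooling.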
Table 2
Statistical summary of data collected from the surveyed hotels.
The 48 sampled buildings are distributed in multiple districts in Chongqing, as shown in Fig. 1. The city lies in China’s hot summer and cold winter zone, where the average temperature of the coldest month is from 4 to 13°C and that of the warmest month is from 22 to 36°C, making both cooling and heating systems important and complex. The main energy sources of the sampled hotels are electricity and natural gas. Electricity is mainly used for lighting, plug-in devices, water pumps, fans and terminals of the HVAC system, elevators, domestic hot water pumps, etc. Natural gas is mainly used for cooking, boilers of HVAC systems, domestic hot water boilers, etc.

The selected sample initially comprised 48 hotels, of which 45 were finally used. Three hotels were excluded because not

(1) Determination of the energy usage intensity

There are several types of EUIs, including the annual energy consumption per unit area and the annual energy consumption per room. To select the most appropriate type of EUI for the sampled buildings in this study, the Pearson correlation coefficient (r) was used to identify the energy-influencing factor that is most strongly correlated with total energy consumption. The degree of correlation between the building energy consumption and various influencing factors can be measured by Eq. (1).

$$ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{1} $$
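As a minimal sketch of the Pearson correlation coefficient in Eq. (1), the following uses only the standard library. The function name and the sample values (floor areas and annual energy use) are hypothetical and serve only to show the calculation.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: sum of products of deviations from the means.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator terms: sums of squared deviations.
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0.0 or s_yy == 0.0:
        raise ValueError("correlation is undefined for a constant sample")
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical values: gross floor area (m2) and annual energy use (MWh).
floor_area = [21000.0, 35000.0, 48000.0, 52000.0, 67000.0]
energy_use = [2900.0, 4100.0, 6300.0, 6100.0, 8800.0]
print(f"r = {pearson_r(floor_area, energy_use):.3f}")
```

Repeating this for each candidate factor and comparing the values of |r| gives the ranking of factors that the selection of the EUI type relies on.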
Table 3
Cooling and heating source of the samples.
Percentage (%) 62 13 25
Here, r ranges between −1 and +1. When r is less than 0, it indicates a negative correlation, and when r is greater than 0, there is a positive correlation. In addition, the larger the value of |r|, the stronger the correlation between the two variables; the smaller the value of |r|, the weaker the correlation.

(2) Development of a multiple linear regression model

This study used the stepwise regression method to develop a benchmarking model. This method uses an automatic procedure to determine the best explanatory variables among the candidate variables mentioned in Section 2.1 and minimizes the collinearity problems between the variables for use in a regression model. Each step introduces the most significant variable, checks whether or not the selected variables in the model are significant, and removes the insignificant variables from the candidate explanatory variables based on a specific criterion (the F-test was used in this study, and the criteria were as follows: probability of F-value to enter ≤0.05, probability of F-value to remove ≥0.10). In this way, the final stepwise regression model was able to include the best explanatory variables. The developed MLR model is given by the following equation:

$$ Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \varepsilon \tag{2} $$

and four input combinations were considered in this study due to the limited number of sample buildings and various potential independent variables, which may have resulted in overfitting of the model. The limited combinations considered also greatly reduced the computational effort. The commonly used Cobb-Douglas cost function was used in this study, as shown in Eq. (4).

$$ y_i = f(x_i; \beta) = e^{\beta_0} x_{1i}^{\beta_1} x_{2i}^{\beta_2} \cdots x_{ki}^{\beta_k}, \quad i = 1, 2, \ldots, n \tag{4} $$

The proposed SFA model for EUI is specifically as follows:

$$ y_i = f(x_i; \beta)\, e^{\varepsilon_i} = e^{\beta_0} x_{1i}^{\beta_1} x_{2i}^{\beta_2} \cdots x_{ki}^{\beta_k} e^{\varepsilon_i}, \quad i = 1, 2, \ldots, n \tag{5} $$

where y_i represents the actual EUI of a building and n is the number of buildings; x_i is the vector of energy-influencing factors affecting the building energy consumption and k is the number of factors; the β coefficients pertain to the potential factors; and ε_i includes the inefficiency u_i and random error ν_i and is written as ε_i = u_i + ν_i. After taking the log transformation of Eq. (5), the function is shown in Eq. (6).

$$ \ln y_i = \beta_0 + \sum_{j=1}^{k} \beta_j \ln x_{ji} + u_i + \nu_i, \quad i = 1, 2, \ldots, n \tag{6} $$
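The step from Eq. (5) to Eq. (6) can be checked numerically with a short sketch: the multiplicative Cobb-Douglas form with the composed error exp(u + ν) and its log-linear form should give the same value of ln y. All function names, coefficients and inputs below are hypothetical and chosen only for illustration; they are not the estimates reported in the paper.

```python
import math
from typing import Sequence


def frontier_eui(x: Sequence[float], beta0: float,
                 betas: Sequence[float]) -> float:
    """Deterministic frontier of Eq. (4): exp(beta0) * prod(x_j ** beta_j)."""
    return math.exp(beta0) * math.prod(xj ** bj for xj, bj in zip(x, betas))


def observed_eui(x: Sequence[float], beta0: float, betas: Sequence[float],
                 u: float, v: float) -> float:
    """Eq. (5): frontier multiplied by exp(u + v)."""
    return frontier_eui(x, beta0, betas) * math.exp(u + v)


def log_observed_eui(x: Sequence[float], beta0: float, betas: Sequence[float],
                     u: float, v: float) -> float:
    """Eq. (6): the same model written in log-linear form."""
    return beta0 + sum(bj * math.log(xj) for xj, bj in zip(x, betas)) + u + v


# Hypothetical inputs: two factors, an inefficiency term u and a noise term v.
x = [1.8, 12.0]
beta0 = 2.0
betas = [0.25, 0.15]
u, v = 0.30, -0.05

lhs = math.log(observed_eui(x, beta0, betas, u, v))
rhs = log_observed_eui(x, beta0, betas, u, v)
print("log forms agree:", math.isclose(lhs, rhs))
```

The log-linear form is what makes the model estimable with standard maximum-likelihood routines, since the coefficients enter linearly once the logarithm is taken.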
Table 4
Pearson correlations of potential factors for annual energy consumption.
Table 5
Results from multiple regression analysis of the EUI.
Table 6
Benchmarking rating table.

Percentile | EER | Score
10% | 0.607 | 90
20% | 0.717 | 80
30% | 0.804 | 70
40% | 0.884 | 60
50% | 0.963 | 50
60% | 1.047 | 40
70% | 1.142 | 30
80% | 1.261 | 20
90% | 1.438 | 10

Table 7
Coefficient estimates for the SFA model by the maximum-likelihood technique.

Variable | Coefficient value | Standard error | Probability
Intercept | 2.104 | 0.717 | 0.003∗∗∗
GFASER | 0.222 | 0.073 | 0.002∗∗∗
FLOOR | 0.161 | 0.091 | 0.074∗
HTCEWALL | 0.104 | 0.127 | 0.412
HVAC1 | 0.348 | 0.107 | 0.001∗∗∗

Significance codes: ∗∗∗ 0.01, ∗∗ 0.05, ∗ 0.1.

Table 8
Parameter estimates for the SFA model by the maximum-likelihood technique.

Parameter | Estimate
λ | 3.678
σu² | 0.348
σv² | 0.025
γ | 0.931
log[L(H0)] | −20.381
log[L(H1)] | −19.017
LR test | 0.049∗

Significance codes: ∗ 0.1.

the coefficients and parameters for the SFA model are shown in Tables 7 and 8.

The parameter gamma (γ) is close to 1 and shows that the inefficiency item accounts for 93% of the error item; that is, most of the variation in the actual EUI comes from inefficiency, with random error accounting for only 7% of the error item. An LR test was used to test whether there were differences between the sampled buildings in terms of energy efficiency. According to Table 8, the LR test provided a statistic of 2.73 (= −2 × (−20.381 − (−19.017))), which exceeded the 90% critical value of 2.71. Hence, this result demonstrates that there are significant differences in energy efficiency between the buildings.

The coefficients of the SFA model, including GFASER, FLOOR, HTCEWALL and HVAC1, are all positive and show a positive relationship between the corresponding variable and the EUI. This result is partly the same as the findings in [35]; that is, a growth in entertainment area in a building will lead to an increase in the building’s EUI. It is worth mentioning that the input variables are different in the final SFA and MLR models. One possible reason could be related to the different function forms used in the two models. Another possible reason could be the use of different methods of independent variable selection in the two models. In particular, the MLR model used a stepwise regression, while the SFA model determined independent variables according to the results of an LR test.

Fig. 3. Relation between the predicted frontier and the actual EUI. (For interpretation of the references to color in this figure, the reader is referred to the web version of this article.)

As shown in Fig. 3, the purple line represents the predicted frontier (the minimum EUI with the highest energy efficiency); a building on this line has an actual EUI equal to the frontier and is therefore efficient. The closer a point lies to the purple line, the more efficient the corresponding building is in energy utilization. However, several buildings were found to have an actual EUI lower than the frontier because of random error items. Overall, there is a clear gap between the actual EUI and the fitted frontier for sampled buildings according to the slope of the two lines in Fig. 3.

Fig. 4 shows the distribution of the energy efficiency scores obtained from the SFA model. In general, the scores estimated from the SFA model ranged from 0.26 to 0.93, with an average score of 0.66. The majority of the sampled buildings were evaluated between 0.5 and 0.89, which means that most of the buildings have relatively good energy performance.

3.3. Benchmarks based on the standard

Because all the sampled buildings have adopted central air conditioning systems to improve the indoor thermal environments, they are all considered to be Type B buildings. According to the requirements of the correction method in the Standard, the EUIs of five hotels need to be corrected since their annual average occupancy rate and percentage of guest room area exceed the standard values. The constraint value is 150 kWh/m², and the leading value is 108 kWh/m². This means that a building is regarded as an inefficient energy user if its EUI is higher than 150 kWh/m², while a building can be considered a leader in energy efficiency if it has an EUI lower than 108 kWh/m². When the EUI is between these two indicators, it indicates a normal performance in which there may still be some energy saving potential. Fig. 5 displays the probability density curves with kernel density estimation for the corrected EUI from the Standard. The area between the best and worst performers is narrow, which might exacerbate the difficulties in the benchmarking process for existing buildings since the requirement of the constraint value is higher than the value of the 75th percentile applied in another study [19]. However, it is beneficial in enhancing the motivation for the energy-efficient design of new buildings and in encouraging users to apply advanced technologies for efficiency improvements.

4. Comparative analysis of benchmarking methods

This section compares the energy performance assessments based on the above methods, including the MLR based on Energy Star, SFA and the DSM. Each of these data-driven benchmarking methods provided a set of definite benchmarking results for the sampled buildings. Evaluating the accuracy of each benchmarking method is a challenge for a benchmarking program since it is time-consuming and costly to acquire the practical energy efficiencies of individual buildings in large-scale building sampling, leading to an absence of ground truth to verify the obtained energy efficiency scores of each method. Hence, this section verifies the energy performance by comparing whether the benchmarking results of the three methods are consistent. In other words, we focus on the consistency and robustness of the benchmarking results instead of a comparison to the ground truth of energy performance. In addition, benchmarking does not merely put forward an accurate energy performance assessment or a ranking table. A better explanatory ability is another important property of benchmarking methods [37,52]. This requires benchmarking methods to provide suggestions for improving the energy performance of buildings or a better understanding of benchmarking programs.
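The likelihood-ratio check reported with Table 8 can be reproduced as a short sketch. It assumes the usual form of the statistic, −2 × (log L(H0) − log L(H1)), and the standard SFA variance share γ = σu² / (σu² + σv²); the source does not spell out either definition, so both are labeled assumptions here, and the function names are illustrative.

```python
def lr_statistic(loglik_h0: float, loglik_h1: float) -> float:
    """Likelihood-ratio statistic, assumed form: -2 * (logL(H0) - logL(H1))."""
    return -2.0 * (loglik_h0 - loglik_h1)


def gamma_share(sigma_u2: float, sigma_v2: float) -> float:
    """Share of the composed-error variance attributed to inefficiency
    (assumed definition: sigma_u^2 / (sigma_u^2 + sigma_v^2))."""
    return sigma_u2 / (sigma_u2 + sigma_v2)


# Values taken from Table 8.
stat = lr_statistic(loglik_h0=-20.381, loglik_h1=-19.017)
CRITICAL_VALUE_90 = 2.71  # 90% critical value quoted in the text

print(f"LR statistic = {stat:.3f}")
print("exceeds 90% critical value:", stat > CRITICAL_VALUE_90)
print(f"gamma from rounded variances = {gamma_share(0.348, 0.025):.3f}")
```

The variance share computed from the rounded entries of Table 8 comes out slightly above the tabulated γ of 0.931, which is presumably a rounding effect in the published estimates.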
Fig. 4. The frequency distribution of the energy efficiency scores based on SFA.

Table 9
Spearman correlation test of the ranking results based on the three benchmark methods.

Method | MLR | SFA | DSM
MLR | 1 | – | –
SFA | 0.439 (0.003) | 1 | –
DSM | –0.505 (0.000) | –0.797 (0.000) | 1
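A Spearman rank correlation of the kind summarized in Table 9 can be sketched with the standard library: each set of results is converted to ranks (ties receive the average rank) and the Pearson correlation of the two rank vectors is computed. The function names and the sample scores are hypothetical; the sketch does not reproduce the values in Table 9 and does not compute p-values.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(a: Sequence[float], b: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(ra, rb))
    var_a = sum((p - mean_a) ** 2 for p in ra)
    var_b = sum((q - mean_b) ** 2 for q in rb)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical efficiency scores of six buildings from two methods.
scores_method_a = [0.62, 0.81, 0.45, 0.90, 0.55, 0.73]
scores_method_b = [0.58, 0.77, 0.52, 0.88, 0.49, 0.80]
print(f"rho = {spearman_rho(scores_method_a, scores_method_b):.3f}")
```

Because only ranks enter the calculation, the coefficient is unaffected by the different scales on which the methods express their results.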
4.2. Explanatory ability

The MLR method based on Energy Star provides three independent variables, namely, the percentage of entertainment and leisure area and the cooling and heating periods, that significantly affect the EUIs in the sampled buildings. The variables of cooling and heating periods are related to the operational characteristics of the HVAC system. This result suggests that hotels in this sample should adopt measures that can reduce the operation time of HVAC systems to improve the energy performance. The SFA model in this paper indicates that almost all of the input variables are related to design parameters that seldom change during the building operation period, such as the number of building floors, the type of chiller units and the U-values of external walls. The MLR and SFA methods reveal that EUIs are determined in part by the design and operational characteristics of the HVAC systems, which suggests that energy retrofitting for these sampled buildings could depend on implementing effective measures in the HVAC systems. A similar result was obtained in an investigation of 68 retrofitted demonstration projects in Chongqing [54].

Although the MLR and SFA benchmarking methods can provide some potential suggestions for energy conservation measures for the sampled buildings, they are unlikely to provide detailed solutions for individual buildings in which there is high energy use or energy waste. Furthermore, the obtained energy efficiency scores can only reflect the relative energy performances among the sampled buildings. Therefore, on-site energy audits are still needed to further detect energy saving potentials for individual buildings. The best use case for these benchmarking methods is to determine which buildings are considered to be inefficient in a sample and try to understand the core drivers of energy consumption in the buildings. The explanatory ability of the benchmarking results allows different parties, from building users and energy service companies to governmental policy-makers, to discover deeper insights that can be used to make better energy efficiency decisions and policies.

Compared with the other two methods, the DSM has a lower explanatory ability, as reflected in two aspects. On one hand, the correction factors (annual average occupancy rate and percentage of guest room area) are inherent building characteristics that are related to the building function and business rather than to the energy performance. On the other hand, the correction method mentioned in the Standard does not provide a detailed explanation of how the correction formulas were derived.

5. Discussion

5.1. Policy implications

This study discussed how to apply three methods to evaluate the energy performances of sampled buildings in Chongqing and compared the performances of the methods based on consis-

energy efficiency scores estimated by SFA. The different colored bars suggest the measures that the government could adopt for different buildings. The yellow bars represent the 22% of buildings that are inefficient and may receive a financial penalty and be required to implement energy retrofitting. The blue bars correspond to the most efficient 4% of buildings that could be established as demonstration buildings to encourage relatively inefficient buildings to further improve their energy efficiency and provide an example of advanced energy efficiency technologies and management strategies. The largest gray bars with 0.5–0.89 efficiency scores refer to less efficient buildings, which could be encouraged through incentive schemes to conduct energy audits and retrofit to further improve their energy performance.

The interpretability of benchmarking methods indicates the building information that should be collected for benchmarking. Current studies focus on inherent building characteristics and have shown the importance of such information in influencing the building energy consumption. According to the MLR and SFA models, this study shows that the design and operational characteristics of HVAC systems also have significant effects on building energy consumption. Therefore, information on the HVAC system should also be considered in data collection for developing energy benchmarking programs, local or national energy databases and energy performance disclosure policies.

5.2. Limitations and future work

Similar to any data-driven method, the benchmarking methods we discussed in this study would give more accurate benchmarking results if a sufficient number of buildings were sampled and data on comprehensive energy-influencing factors could be acquired. Although this study collected various details of potential energy drivers, including the physical and operational characteristics and occupancy behaviors of buildings, its sample was still limited in size and far smaller than the samples in similar previous studies in other countries [24,35,37]. This situation increased the difficulties of developing an accurate energy benchmarking model in China. Another limitation is that the actual energy efficiency performances of the sampled buildings were not available to verify the estimated energy efficiency scores relative to the ground truth for the benchmarking methods in this study. In future work, we will first expand the number of sampled buildings. Specifically, we plan to use a larger available database (such as the CBECS database in the U.S.) to compare the performances of the discussed benchmarking methods, confirm whether the differences still occur and analyze the robustness of each method to further determine the most robust benchmarking method. Then, more data-driven methods including machine learning will be considered to make a comprehensive comparison. In addition, almost all of the sampled hotels are retrofitted demonstration projects in Chongqing, and we intend to obtain the energy bills from the post-retrofitting period to quantify the energy savings and evaluate the effects of energy
tency, robustness and explanatory ability. The results can assist retrofitting at the city level.
governmental policy-makers in developing a building energy
benchmarking program. 6. Conclusions
According to the comparative analysis, there is not sufficient
evidence to determine the best or worst benchmarking method, Data-driven methods are widely used to benchmark the energy
since there are significant differences in the estimated efficiency performances of buildings, but few studies have compared the
scores (or grades) between the three methods, and there is a lack performances or accuracies of these methods. In this paper, we se-
of ground truth of the energy performance for sampled buildings, lected the widely used MLR based on Energy Star, SFA and the DSM
as well. Hence, policy-makers are recommended to select the most based on the energy consumption standard in China to determine
robust benchmarking method or apply multiple benchmarking the performance of each benchmarking method. The energy and
methods at the same time to avoid misleading policy conclusions building characteristic information of 45 four- and five-star hotels
obtained by using only one unreliable benchmarking method. in Chongqing was collected for a case study. Since it is difficult
In particular, Fig. 6 presents that SFA has the best robustness. to obtain the actual energy efficiencies of individual buildings, we
As an example, Fig. 4 shows the frequency distribution of the focused on the consistency and robustness instead of the absolute
accuracy of the three benchmarking methods to quantify their performance via Spearman correlation analysis and Sankey diagrams.

Some interesting findings were obtained from the results of the comparative analysis of the three benchmarking methods. The results of DSM and SFA were found to be the most consistent, while those of MLR and SFA and of MLR and DSM showed significant differences in terms of the evaluated energy performances, as indicated by the low Spearman coefficients and the larger number of buildings receiving different energy efficiency grades. Furthermore, the DSM was more robust in evaluating the ranks of the sampled buildings, while SFA was more robust in evaluating the energy efficiency grades and could potentially be recommended to policy-makers for conducting an energy benchmarking program at the city level.

The intention of this paper was to compare the performance of data-driven benchmarking methods that are commonly used in current studies to determine the most appropriate benchmarking method that could be recommended to policy-makers. This paper confirmed that using only one benchmarking method may lead to distorted evaluations of the energy performances of sampled buildings. Policy-makers are recommended to utilize multiple benchmarking methods or select the most robust method to obtain assessment results that are likely to be closest to the ground truth. Moreover, policy-makers are advised to apply different strategies, such as mandatory schemes or incentive policies, to manage buildings with different energy performances. Finally, the explanatory power of the benchmarking methods indicates the influence of the design and operational characteristics of the HVAC system on the building energy consumption. These characteristics should also be collected to develop local or national databases and energy performance disclosure policies. In future work, we intend to expand the sample size by using a public database, such as the CBECS in the U.S., and to compare more data-driven methods to test the consistency and robustness of energy performance assessments based on different benchmarking methods.

Declaration of Competing Interest

None.

Acknowledgments

The authors would like to thank Dr. Travis Walter at the Lawrence Berkeley National Laboratory for his many useful and helpful suggestions for improving this study. The authors are thankful to the Chongqing Housing and Urban-Rural Committee and the energy service companies for granting access to relevant energy audit reports. This work is also supported by the project “Standard System for Post-assessment of Green Building Performance” (the National Key Projects in 13th Five-Year, project no. 2016TFC0700105), National Natural Science Foundation of China (Grant number 51978095), the project “A Big Data Platform of Supervision, Management and Evaluation for Institutional Buildings in Chongqing” and the China Scholarship Council (Student no. 201806050085).

Supplementary materials

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.enbuild.2019.109711.

References

[1] Building Energy Research Center in Tsinghua University, Annual Report on China Building Energy Efficiency (in Chinese), 2017, China Architecture & Building Press, Beijing, 2017.
[2] X. Yang, S. Zhang, W. Xu, Impact of zero energy buildings on medium-to-long term building energy consumption in China, Energy Policy 129 (2019) 574–586, doi:10.1016/j.enpol.2019.02.025.
[3] UNFCCC (United Nations Framework Convention on Climate Change), The Paris agreement, 2015, <http://unfccc.int/paris_agreement/items/9485.php> (Accessed 24 June 2019).
[4] W. Chung, Review of building energy-use performance benchmarking methodologies, Appl. Energy 88 (2011) 1470–1479, doi:10.1016/j.apenergy.2010.11.022.
[5] Energy Star, <http://www.energystar.gov/buildings> (Accessed 24 June 2019).
[6] The Association of German Engineers, Characteristic values of energy consumption in building heating and electricity, VDI 3807 part 2-2014, 2014, German.
[7] National University of Singapore and National Environment Agency, Technical guide towards energy smart office, Singapore, 2013.
[8] MoHURD (Ministry of Housing and Urban-Rural Construction Development, China), Standard For Energy Consumption of Buildings, China Architecture and Building Press, Beijing, 2016, GB/T51161-2016.
[9] D. Yan, T. Hong, C. Li, Q. Zhang, J. An, S. Hu, A thorough assessment of China’s standard for energy consumption of buildings, Energy Build. 143 (2017) 114–128, doi:10.1016/j.enbuild.2017.03.019.
[10] Y. Sheng, Z. Miao, J. Zhang, X. Lin, H. Ma, Energy consumption model and energy benchmarks of five-star hotels in China, Energy Build. 165 (2018) 286–292, doi:10.1016/j.enbuild.2018.01.031.
[11] D. Yan, W. O’Brien, T. Hong, X. Feng, H. Burak Gunay, F. Tahmasebi, A. Mahdavi, Occupant behavior modeling for building performance simulation: current state and future challenges, Energy Build. 107 (2015) 264–278, doi:10.1016/j.enbuild.2015.08.032.
[12] D.B.D.E. Santo, An energy and exergy analysis of a high-efficiency engine trigeneration system for a hospital: a case study methodology based on annual energy demand profiles, Energy Build. 76 (2014) 185–198, doi:10.1016/j.enbuild.2014.02.014.
[13] P. Gonçalves, A.R. Gaspar, M.G. Da Silva, Energy and exergy-based indicators for the energy performance assessment of a hotel building, Energy Build. 52 (2012) 181–188, doi:10.1016/j.enbuild.2012.06.011.
[14] A. Juaidi, F. AlFaris, F.G. Montoya, F. Manzano-Agugliaro, Energy benchmarking for shopping centers in Gulf coast region, Energy Policy 91 (2016) 247–255, doi:10.1016/j.enpol.2016.01.012.
[15] J. Zhao, Y. Xin, D. Tong, Energy consumption quota of public buildings based on statistical analysis, Energy Policy 43 (2012) 362–370, doi:10.1016/j.enpol.2012.01.015.
[16] Z. Wei, W. Xu, D. Wang, L. Li, L. Niu, W. Wang, B. Wang, Y. Song, A study of city-level building energy efficiency benchmarking system for China, Energy Build. 179 (2018) 1–14, doi:10.1016/j.enbuild.2018.08.038.
[17] Y.S. Kim, J. Srebric, Impact of occupancy rates on the building electricity consumption in commercial buildings, Energy Build. 138 (2017) 591–600, doi:10.1016/j.enbuild.2016.12.056.
[18] P.O. Oluseyi, O.M. Babatunde, O.A. Babatunde, Assessment of energy consumption and carbon footprint from the hotel sector within Lagos, Nigeria, Energy Build. 118 (2016) 106–113, doi:10.1016/j.enbuild.2016.02.046.
[19] Y. Xin, S. Lu, N. Zhu, W. Wu, Energy consumption quota of four and five star luxury hotel buildings in Hainan province, China, Energy Build. 45 (2012) 250–256, doi:10.1016/j.enbuild.2011.11.014.
[20] C. Yan, S. Wang, F. Xiao, A simplified energy performance assessment method for existing buildings based on energy bill disaggregation, Energy Build. 55 (2012) 563–574, doi:10.1016/j.enbuild.2012.09.043.
[21] H. Li, X. Li, Benchmarking energy performance for cooling in large commercial buildings, Energy Build. 176 (2018) 179–193, doi:10.1016/j.enbuild.2018.07.039.
[22] W. Xuchao, R. Priyadarsini, L. Siew Eang, Benchmarking energy use and greenhouse gas emissions in Singapore’s hotel industry, Energy Policy 38 (2010) 4520–4527, doi:10.1016/j.enpol.2010.04.006.
[23] Y. Ding, Z. Zhang, Q. Zhang, W. Lv, Z. Yang, N. Zhu, Benchmark analysis of electricity consumption for complex campus buildings in China, Appl. Therm. Eng. 131 (2018) 428–436, doi:10.1016/j.applthermaleng.2017.12.024.
[24] J. Roth, R. Rajagopal, Benchmarking building energy efficiency using quantile regression, Energy 152 (2018) 866–876, doi:10.1016/j.energy.2018.02.108.
[25] M.W. Ahmad, M. Mourshed, Y. Rezgui, Trees vs neurons: comparison between random forest and ANN for high-resolution prediction of building energy consumption, Energy Build. 147 (2017) 77–89, doi:10.1016/j.enbuild.2017.04.038.
[26] F. Smarra, A. Jain, T. de Rubeis, D. Ambrosini, A. D’Innocenzo, R. Mangharam, Data-driven model predictive control using random forests for building energy optimization and climate control, Appl. Energy 226 (2018) 1252–1272, doi:10.1016/j.apenergy.2018.02.126.
[27] A. Jain, F. Smarra, M. Behl, R. Mangharam, Data-driven model predictive control with regression trees-An application to building energy management, ACM Trans. Cyber Phys. Syst. 2 (1) (2018) 1–12, doi:10.1145/3127023.
[28] F. Khayatian, L. Sarto, G. Dall’O’, Application of neural networks for evaluating energy performance certificates of residential buildings, Energy Build. 125 (2016) 45–54, doi:10.1016/j.enbuild.2016.04.067.
[29] S. Papadopoulos, E. Azar, W.L. Woon, C.E. Kontokosta, Evaluation of tree-based ensemble learning algorithms for building energy performance estimation, J. Build. Perform. Simul. 11 (2017) 322–332, doi:10.1080/19401493.2017.1354919.
[30] S. Papadopoulos, C.E. Kontokosta, Grading buildings on energy performance using city benchmarking data, Appl. Energy 233-234 (2019) 244–253, doi:10.1016/j.apenergy.2018.10.053.
[31] X. Gao, A. Malkawi, A new methodology for building energy performance
benchmarking: an approach based on intelligent clustering algorithm, Energy Build. 84 (2014) 607–616, doi:10.1016/j.enbuild.2014.08.030.
[32] S. Önüt, S. Soner, Energy efficiency assessment for the Antalya region hotels in Turkey, Energy Build. 38 (2006) 964–971, doi:10.1016/j.enbuild.2005.11.006.
[33] W.S. Lee, Benchmarking the energy efficiency of government buildings with data envelopment analysis, Energy Build. 40 (2008) 891–895, doi:10.1016/j.enbuild.2007.07.001.
[34] J. Keirstead, Benchmarking urban energy efficiency in the UK, Energy Policy 63 (2013) 575–587, doi:10.1016/j.enpol.2013.08.063.
[35] J. Buck, D. Young, The potential for energy efficiency gains in the Canadian commercial building sector: a stochastic frontier study, Energy 32 (2007) 1769–1780, doi:10.1016/j.energy.2006.11.008.
[36] M. Filippini, L.C. Hunt, US residential energy demand and energy efficiency: a stochastic demand frontier approach, Energy Econ. 34 (2012) 1484–1491, doi:10.1016/j.eneco.2012.06.013.
[37] Z. Yang, J. Roth, R.K. Jain, DUE-B: data-driven urban energy benchmarking of buildings using recursive partitioning and stochastic frontier analysis, Energy Build. 163 (2018) 58–69, doi:10.1016/j.enbuild.2017.12.040.
[38] D. Aigner, K. Lovell, P. Schmidt, Upland development and prospects for the rural poor: experience in northern Thailand, J. Econom. 6 (1977) 21–37, doi:10.1016/0304-4076(77)90052-5.
[39] H.F. Deng, D. Fannon, M.J. Eckelman, Predictive modeling for US commercial building energy use: a comparison of existing statistical and machine learning algorithms using CBECS microdata, Energy Build. 163 (2018) 34–43, doi:10.1016/j.enbuild.2017.12.031.
[40] T. Ostergard, R.L. Jensen, S.E. Maagaard, A comparison of six metamodeling techniques applied to building performance simulations, Appl. Energy 211 (2018) 89–103, doi:10.1016/j.apenergy.2017.10.102.
[41] Y.B. Chen, H.W. Tan, U. Berardi, A data-driven approach for building energy benchmarking using the Lorenz curve, Energy Build. 169 (2018) 319–331, doi:10.1016/j.enbuild.2018.03.066.
[42] EIA 2019, Commercial Buildings Energy Consumption Survey (CBECS), <http://www.eia.gov/consumption/commercial/about.cfm> (Accessed 24 June 2019).
[43] United States Department of Energy, Building performance database, <https://www.energy.gov/eere/buildings/building-performance-database-bpd> (Accessed 24 June 2019).
[44] H. Yoshino, T. Hong, N. Nord, IEA EBC Annex 53: total energy use in buildings - Analysis and evaluation methods, Energy Build. 152 (2017) 124–136, doi:10.1016/j.enbuild.2017.07.038.
[45] Y. Jiang, X. Yang, The electricity equivalent calculation used in the energy analysis, Energy China 32 (2010) 5–11 (in Chinese).
[46] W.H. Greene, A gamma-distributed stochastic frontier model, J. Econom. 46 (1990) 141–163, doi:10.1016/0304-4076(90)90052-U.
[47] G.E. Battese, G.S. Corra, Estimation of a production frontier model: with application to the pastoral zone of eastern Australia, Aust. J. Agric. Econ. 21 (1977) 169–179, doi:10.1111/j.1467-8489.1977.tb00204.x.
[48] T. Coelli, A guide to frontier version 4.1: a computer program for stochastic frontier production and cost function estimation, Comput. Progr. Stoch. Front. 7 (1996) 1–33, doi:10.1007/BF00158774.
[49] J. Jondrow, K. Lovell, I.S. Materov, P. Schmidt, On the estimation of technical inefficiency in the stochastic frontier production function model, J. Econom. 19 (1982) 233–238, doi:10.1016/0304-4076(82)90004-5.
[50] J.C. Wang, A study on the energy performance of hotel buildings in Taiwan, Energy Build. 49 (2012) 268–275, doi:10.1016/j.enbuild.2012.02.016.
[51] T. Coelli, A. Henningsen, Frontier: stochastic frontier analysis, R package version 1.1–2, 2017, <https://cran.r-project.org/web/packages/frontier> (Accessed 24 June 2019).
[52] C. Koulamas, A.P. Kalogeras, R. Pacheco-Torres, J. Casillas, L. Ferrarini, Suitability analysis of modeling and assessment approaches in energy efficiency in buildings, Energy Build. 158 (2018) 1662–1682, doi:10.1016/j.enbuild.2017.12.002.
[53] Institute for Market Transformation, Coming to NYC: Building Energy Grades, 2018, <https://www.imt.org/coming-to-nyc-building-energy-grades/> (Accessed 24 June 2019).
[54] D. He, Y. Huang, Y. Ding, Systematically analysis and study on technical measures of energy savings for typical public buildings in Chongqing, J. Chongqing Univ. Sci. Technol. (Nat. Sci. Ed.) 16 (2014) 112–116.