(http://iopscience.iop.org/0031-9155/61/6/2514)
E-mail: sye@snu.ac.kr
Abstract
Discrepancies between planned and delivered movements of multi-leaf
collimators (MLCs) are an important source of errors in dose distributions
during radiotherapy. In this work we used machine learning techniques to
train models to predict these discrepancies, assessed the accuracy of the model
predictions, and examined the impact these errors have on quality assurance
(QA) procedures and dosimetry. Predictive leaf motion parameters for the
models, such as leaf position and velocity and whether the leaf was moving
towards or away from the isocenter of the MLC, were calculated from the plan
files. Differences in positions between synchronized DICOM-RT
planning files and DynaLog files reported during QA delivery were used
as a target response for training of the models. The final model is capable
of predicting MLC positions during delivery to a high degree of accuracy.
For moving MLC leaves, predicted positions were shown to be significantly
0031-9155/16/062514+18$33.00 © 2016 Institute of Physics and Engineering in Medicine Printed in the UK 2514
Phys. Med. Biol. 61 (2016) 2514 J N K Carlson et al
1. Introduction
The introduction of volumetric modulated arc therapy (VMAT) as a method for delivering
radiotherapy has decreased delivery time and monitor units (MU) as compared to conventional
intensity modulated radiation therapy (IMRT) (Otto 2008). However, due to the highly
choreographed nature of VMAT delivery, many potential sources of error arise, necessitating
patient specific quality assurance (QA) and dosimetric verification techniques. The complex
movement of the multi-leaf collimator (MLC) is one such source of errors between treatment
planning and delivery. MLC positional errors are differences between the planned and delivered
positions of the individual MLC leaves. These deviations can be studied by comparing
the leaf positions encoded in the planning DICOM-RT files, which contain the intended leaf
positions, to the machine reported DynaLog files, which contain the leaf positions during
delivery. Although the manufacturer specified accuracy of DynaLog files is not present in the
literature, DynaLog file reported MLC positions have been shown to be accurate through the
analysis of film (Zygmanski et al 2003), 2D diode array (Li et al 2003), and electronic portal
imaging device (Zeidan et al 2004) measurements.
Systematic shifts in leaf position and leaf gap have been shown to have detrimental effects
on the accuracy of the delivery of dose distributions for both IMRT (Rangel and Dunscombe
2009, Yan et al 2009, Bai et al 2013) and VMAT (Oliver et al 2010, Tatsumi et al 2011). Some
of the causes of leaf errors are known; for example, velocity of individual MLC leaves has
been shown to have an approximately linear relationship with positional errors (Ramsey et al
2001, Losasso 2008). Miura et al also showed that gamma passing rates are correlated with
MLC leaf velocity, indicating that errors in MLC positions due to large velocities may have
negative effects on dosimetric accuracy (Miura et al 2014b). Furthermore, it has been shown
that constraining the millimeters traveled per leaf per MU improves the delivery accuracy of
the treatment plan (Chen et al 2011, Miura et al 2014a).
Due to the negative impact of MLC positional errors on the delivery accuracy of radiotherapy
plans, it is advantageous to be able to predict how the errors will impact the delivery
accuracy. To this end, several modulation indices have been developed in attempts to
score the delivery accuracy of VMAT and IMRT plans (Li and Xing 2013, Masi et al 2013,
Park et al 2014a, 2014b) before they are delivered. However, these methods are correlational,
and appropriate thresholds for these values are difficult to define (Park et al 2014b).
Furthermore, these indices do not give the treatment planner any information as to how the
dose distribution as viewed in the treatment planning system (TPS) will be influenced by
the errors.
Therefore, in this study we focused on creating a method for predicting MLC positional
errors before delivery, and incorporated those errors into the dose distribution calculation to
enable the treatment planner to see a more accurate representation of the dose as it will be
delivered. To predict the errors, we first acquired planned and delivered MLC positions from
a series of VMAT plans, and calculated the differences between the two. Next, we calculated
leaf motion parameters of the plans which were hypothesized to lead to MLC errors. We then
built machine learning models using these parameters as inputs to predict the errors between
planned and delivered MLC positions. We then verified the accuracy of the predictions, and
assessed their impact on QA and patient dosimetry.
The final outcome of the study is a model capable of taking a planned set of MLC positions
in the form of a DICOM-RT file, and predicting the positions which will be delivered to a
high degree of accuracy. By including the predictions of the model into the TPS, we show that
it is possible to achieve a more accurate representation of the true locations of MLC leaves,
which allows treatment planners to see a realistic view of the dose that will be delivered to
the patient.
A retrospectively selected set of 74 VMAT plans was acquired from three separate institutions
for this study. The plans from Institution 1 were for head and neck (H&N) (N = 20),
and prostate (N = 20) cancer. The plans from Institution 2 were also for H&N (N = 6), and
prostate (N = 10). For Institution 3 there were H&N plans from various sites (N = 15) and
prostate plans (N = 3).
All plans were generated in the Eclipse system (Varian Medical Systems, Palo Alto, CA)
with the progressive resolution optimizer 3 (PRO3, ver.11.0.31, Varian Medical Systems, Palo
Alto, CA). Dose distributions were calculated using the anisotropic analytic algorithm (AAA,
ver.11.0.31, Varian Medical Systems, Palo Alto, CA) with a dose calculation grid of 2 mm.
Two full arcs were used in each plan, and optimized such that the angular separation between
control points (CPs) was 2.0341°, leading to 356 individual CPs per plan.
Each plan was delivered using a linear accelerator equipped with a Varian Millennium 120
MLC. All plans from each institution were delivered using a single linear accelerator, and
therefore a single MLC from the respective institution. The Millennium 120 MLC consists
of two banks of 60 MLC leaves, with the outer 20 and inner 40 on each side having widths
of 1 cm and 0.5 cm, respectively. Initial calibration of all MLCs was performed by a qualified
Varian engineer. In all institutions included in this study, TG-40 (Kutcher et al 1994) and
TG-142 (Klein et al 2009) protocols are followed for MLC QA.
The planned positions of each individual MLC leaf at each CP for every plan were extracted
from DICOM-RT files exported from the Eclipse system. Therefore, from each plan 42,720
leaf position data points were extracted (356 CPs for each of the 120 MLC leaves). The plans
were then delivered, and delivered locations of the individual MLC leaves were extracted from
the DynaLog files of the MLC.
After extracting the planned positions from the DICOM-RT files and the delivered positions
from the DynaLog files, the two sets of positions (planned and delivered) must be
synchronized before the difference between positions can be calculated. Synchronization
must take into account the differences between the sampling times of the DICOM-RT and
DynaLog files. DynaLog files record the position every 0.05 s in units of motor counts,
which were converted to millimeters according to manufacturer specifications. DICOM-RT
positions are recorded at each CP in units of millimeters. At a CP where there is to be more
than 4.238 MU delivered, the gantry slows to allow successful delivery (Park et al 2015),
changing the time between CPs. In this dataset there were no CPs in any of the plans at
which the planned MU was greater than this threshold, and therefore the time between
CPs was taken to be a constant 0.424 s, corresponding to the maximum gantry speed (Park et al
2015). Consequently, after synchronization, the maximum time difference between the plan file
and the DynaLog file is 0.025 s, that is, half of the sampling interval of the DynaLog
files.
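The synchronization described above can be illustrated with a minimal Python sketch. It uses the 0.05 s DynaLog sampling interval and the constant 0.424 s between CPs stated in the text. The function names and the choice of nearest-sample matching are assumptions made for illustration and are not the in-house implementation.

```python
DYNALOG_PERIOD_S = 0.05   # DynaLog sampling interval (from the text)
CP_INTERVAL_S = 0.424     # assumed constant time between control points (from the text)

def nearest_sample_index(cp_index, cp_interval=CP_INTERVAL_S, period=DYNALOG_PERIOD_S):
    """Index of the DynaLog sample closest in time to control point cp_index."""
    cp_time = cp_index * cp_interval
    return int(cp_time / period + 0.5)   # round to nearest; cp_time is non-negative

def sync_offsets(n_cps):
    """Absolute time offset (s) between each CP and its matched DynaLog sample."""
    offsets = []
    for cp in range(n_cps):
        k = nearest_sample_index(cp)
        offsets.append(abs(cp * CP_INTERVAL_S - k * DYNALOG_PERIOD_S))
    return offsets

# Illustration over 356 CPs: no matched sample is more than half a
# sampling interval (0.025 s) away from its control point.
max_offset = max(sync_offsets(356))
```

Under these assumptions the largest offset stays below 0.025 s, consistent with the bound quoted in the text.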
After synchronization, the differences between the positions present in the DICOM-RT file
(planned positions) and the positions reported by the DynaLog file (delivered positions) for
each leaf were calculated using an in-house program written in R (R Development Core Team
2010). The absolute value of this quantity for each leaf at each CP is the error magnitude.
A number of parameters characterising MLC leaf motion were derived from the planned leaf
positions. Each parameter was calculated for all MLC leaves at all CPs, thus each data point
represents a single MLC leaf at a single CP. The position of each leaf at the CP of interest, and
also at the previous and subsequent CPs was calculated. The instantaneous velocity for each
leaf at each CP was calculated as the leaf position minus the leaf position at the previous CP,
divided by the time between CPs, as described in equation (1):
$$\mathrm{Velocity}_{\mathrm{CP}} = \frac{\mathrm{Position}_{\mathrm{CP}} - \mathrm{Position}_{\mathrm{CP}-1}}{0.424\ \mathrm{s}} \qquad (1)$$
Acceleration for each leaf was calculated in a similar fashion. Velocity and acceleration for
each leaf were also calculated for the previous and subsequent CP. Velocity and acceleration
of both adjacent MLC leaves was calculated under the hypothesis that friction from adjacent
leaves may induce errors.
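Equation (1) and the analogous acceleration calculation can be sketched in Python as backward differences over the planned positions. The text only states that acceleration was calculated "in a similar fashion", so the acceleration formula below is an assumption, as are the function names and the example positions.

```python
CP_INTERVAL_S = 0.424  # constant time between control points (from the text)

def velocities(positions_mm, dt=CP_INTERVAL_S):
    """Backward-difference velocity (mm/s) per equation (1); None at the first CP."""
    out = [None]
    for cp in range(1, len(positions_mm)):
        out.append((positions_mm[cp] - positions_mm[cp - 1]) / dt)
    return out

def accelerations(velocity_mm_s, dt=CP_INTERVAL_S):
    """Backward difference of velocity (mm/s^2); an assumed analogue of equation (1)."""
    out = [None]
    for cp in range(1, len(velocity_mm_s)):
        prev, cur = velocity_mm_s[cp - 1], velocity_mm_s[cp]
        out.append(None if prev is None else (cur - prev) / dt)
    return out

planned = [0.0, 4.24, 8.48, 8.48]   # hypothetical planned positions of one leaf (mm)
v = velocities(planned)             # roughly [None, 10, 10, 0] mm/s
a = accelerations(v)                # undefined for the first two CPs
```

Values for the previous and subsequent CPs, or for adjacent leaves, follow by indexing the same lists at CP − 1 and CP + 1 or by applying the functions to the neighbouring leaf's positions.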
Movement of the MLC leaves was also sorted into several categories. One category
described the state of motion of the leaf at the given CP: ‘at rest’, ‘moving’,
‘coming to a stop’, or ‘moving for only a single CP’. An
MLC leaf was defined to be at rest if it did not move during the CPs before or after the CP
of interest. Leaf movement direction was categorized to differentiate whether the leaf was
moving towards, or away from the isocenter of the MLC. To further investigate the effect of
friction from adjacent MLC leaves on the movement of the leaf of interest, a category was
defined to classify the two adjacent leaves as ‘both moving in the same direction’, ‘both moving
in the opposite direction’, ‘one moving in the opposite direction’, or ‘both at rest’. The
CP at which the error occurred (i.e. 1 to 356), the arc number (i.e. ‘1’ or ‘2’), and the leaf
bank the leaf was a part of (i.e. ‘A’ or ‘B’) were also extracted. The extraction of the errors
between planning and delivery, and the calculation of predictive leaf motion parameters is
displayed in figure 1.
Figure 1. Workflow of the extraction of errors between DICOM-RT and DynaLog files,
and the extraction of leaf motion parameters from planned positions.
For each institution, the data was split into three separate datasets, termed training, validation,
and testing. Thus there were nine sets in total. Since there was no overlap of plans within
the datasets, each set may be considered as independent of the remaining sets from each
institution.
A single plan was randomly chosen from each institution to be the training set for that
institution (N = 1, 1, 1). The choice to use only a single plan for training from each institution was
based on two observations. First, the number of training data points available in each plan
(42,720) was found to be sufficient for training the machine learning algorithms
used in this study, as determined through cross validation with training sets of different sizes.
Second, the errors depend on the individual MLC rather than on the plan itself, so any sufficient
amount of data from each unique MLC would be appropriate to train a model. Therefore,
a model specific to the MLC of a given linear accelerator should be built.
A predictive model specific to each institution was fit using only the data from that
institution’s training plan. After each model was fit to the training plan, the accuracy of each model
was tested on a validation set consisting of two randomly selected plans from the model’s
respective institution. The purpose of the validation set was to find the optimal combination
of leaf motion parameters and to tune any parameter values the model may have. This tuning
process was done on the validation set, rather than the training set, to avoid both over-fitting
of the models and overly optimistic accuracy assessments which do not hold out of
sample. The leaf motion parameters and tunable model parameters were sequentially iterated
over to minimize the root mean square error (RMSE) between predicted and delivered
positions on the validation set, and the model with the lowest RMSE was chosen as the final
model.
A final validation of the models was performed using the remaining plans from each
institution, that is, the testing set (N = 37, 13, 15). Model performance on the testing set was then
assessed using mean absolute error (MAE) and RMSE between the predicted and delivered
positions. In this case, RMSE is used as an alternative to the more common standard deviation
(SD), as the distributions of the errors do not follow the normal distribution. It should be
noted, however, that the formulas for RMSE and SD coincide when the mean error is zero; the
difference is in interpretation. The statistics reported in this study are from the test set only.
This is an alternative, and preferable, method to using cross-validation, because the untouched
test set can be thought of as real-world data: it was not used during the training or validation
process. This process is shown in figure 2.
Figure 2. Workflow for training, validating, and finally reporting the statistics of
predictive models.
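The two accuracy metrics can be written out explicitly. The following Python sketch computes MAE and RMSE for a set of prediction errors and checks the general identity RMSE² = SD² + mean², which shows that RMSE equals the (population) SD only when the mean error is zero. The position values are hypothetical.

```python
import math

def mae(errors):
    """Mean absolute error."""
    return sum(abs(e) for e in errors) / len(errors)

def rmse(errors):
    """Root mean squared error."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def population_sd(errors):
    """Standard deviation about the mean (divisor n)."""
    mean = sum(errors) / len(errors)
    return math.sqrt(sum((e - mean) ** 2 for e in errors) / len(errors))

delivered = [10.0, 12.0, 15.0, 15.5]   # hypothetical delivered positions (mm)
predicted = [10.2, 11.7, 15.4, 15.5]   # hypothetical predicted positions (mm)
errors = [p - d for p, d in zip(predicted, delivered)]

mean_error = sum(errors) / len(errors)
identity_gap = rmse(errors) ** 2 - (population_sd(errors) ** 2 + mean_error ** 2)
```

Because large errors are squared before averaging, RMSE is never smaller than MAE for the same set of errors.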
To examine the importance of the choice of training plan on the quality of predictions,
analysis of the predictions was performed using a different randomly selected plan from each
institution for model training than was used initially. For the second model, the same model
parameters as those used to train the initial models were used.
Furthermore, to test whether models trained using a different MLC were able to make
accurate predictions for other MLCs, a model trained from each institution was used to
make predictions on testing plans from each of the other institutions. For example, a model
trained using a single plan from Institution 1 was used to make predictions on plans from both
Institutions 2 and 3, and the results compared to the predictions made using models trained
using data from Institutions 2 and 3, respectively.
Due to the non-normal distribution of error predictions, the Mann–Whitney U test was
performed to examine differences in model accuracy using alternative training plans. For each
test performed, the p-value and the 95% confidence interval of the difference in median
error prediction were reported.
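As an illustration of the statistic underlying this test, the Mann–Whitney U value can be obtained by counting, over all pairs, how often a value from one sample exceeds a value from the other, with ties counted as one half. The Python sketch below computes only U; the p-value and confidence interval would normally come from a statistical package. The function name and the error values are hypothetical.

```python
def mann_whitney_u(x, y):
    """U statistic for sample x versus sample y; ties count one half."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u

# Hypothetical prediction errors (mm) from models trained on two different plans.
errors_plan_a = [0.21, 0.35, 0.18, 0.40]
errors_plan_b = [0.25, 0.30, 0.22, 0.45, 0.19]

u_ab = mann_whitney_u(errors_plan_a, errors_plan_b)
u_ba = mann_whitney_u(errors_plan_b, errors_plan_a)
```

The two U values always sum to the product of the sample sizes, which provides a simple consistency check.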
Several different models were tested to find a model with the best predictive accuracy. The
models included a simple linear regression model, a multiple linear regression model, a model
based on the random forest algorithm, and a model based on the cubist algorithm (described
below). The inputs to the models were the leaf motion parameters described above, and the
target response for each model was the difference between the planned and delivered MLC
leaf positions.
Of the leaf motion parameters derived from the planned positions, two quantitative
parameters (leaf position and instantaneous velocity) and four qualitative parameters
(movement towards or away from the center, whether the leaf was at
rest/starting/stopping/moving for a single CP, the CP number, and the leaf bank)
were utilized in the final models. All other parameters failed to reduce the RMSE on
the validation set and were excluded.
The R programming language (R Development Core Team 2010) was used for all data
analysis and modeling.
2.6.1. Linear regressions. For the linear regression modeling, two models were built. The
first, LMV Only, was a simple linear regression of velocity against the target response (differ-
ence between planned and delivered MLC leaf positions). The second was a multiple linear
regression, regressing the parameters described in section 2.6 against the target response.
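The LMV Only model amounts to an ordinary least-squares line relating leaf velocity to the positional error. A minimal Python sketch is given below. The sign convention (error = planned − delivered), the function names, and the data are assumptions chosen for illustration; the study itself used R.

```python
def fit_simple_linear(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict_delivered(planned_mm, velocity_mm_s, slope, intercept):
    """Predicted delivered position, assuming error = planned - delivered."""
    return planned_mm - (slope * velocity_mm_s + intercept)

# Hypothetical training data: error grows linearly with velocity.
velocity = [0.0, 5.0, 10.0, 15.0, 20.0]    # mm/s
error = [0.05, 0.70, 1.35, 2.00, 2.65]     # planned - delivered (mm)

slope, intercept = fit_simple_linear(velocity, error)
delivered_estimate = predict_delivered(50.0, 10.0, slope, intercept)
```

The multiple regression extends the same idea to several predictors, with categorical parameters entered as indicator variables.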
2.6.2. Random forest. The random forest implementation used was based on Breiman and
Cutler’s algorithm (Breiman 2001) as implemented in the R package ‘randomForest’ (Liaw
and Wiener 2002). Random forests create a predictive model by first randomly selecting a
subset of a given number of features from the feature space. A sample of the training data is
then taken, and the selected features are used to create a decision tree which separates the data
such that the homogeneity of the samples at the terminal node of each branch is maximized.
This process is repeated many times, and each decision tree produced in this way is saved to
create a ‘forest’ of decision trees. To make predictions on new data, the new data point is fed
into each tree, and the tree offers a prediction which is the average of all the data points used
in training which follow the same path through the tree as the new data point. The prediction
of the algorithm is then the average of the predictions from each tree in the forest.
For the random forest model, the number of features randomly sampled as candidates for
each split of the decision tree was four, the value which minimized the RMSE on the validation
set. Increasing the number of trees beyond 100 or the sample size beyond 4000 was found to have
little impact on accuracy.
An example of a random forest as applied to the prediction of MLC positional errors is
as follows. First, the algorithm selects four leaf motion parameters, for example leaf velocity,
leaf position, whether the leaf is moving or resting, and whether the leaf is moving
towards or away from the isocenter. Then, a sample of 4000 errors (differences between
planned and delivered positions) and the associated leaf motion parameters for those errors
are extracted. From these, a tree is built with a number of terminal nodes with criteria
such as ‘if leaf velocity is greater than X cm per second, and the leaf is moving towards
the isocenter, the error is Y’. In this study, 100 such trees are built, each having up to 1000
terminal nodes.
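The procedure described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the ‘randomForest’ package: trees are shallow, features are sampled at each split, and the three features (velocity, a towards-isocenter flag, and a leaf-bank flag) and their relationship to the error are hypothetical.

```python
import random

def _mean(values):
    return sum(values) / len(values)

def _best_split(rows, targets, feature_ids):
    """Return the (feature, threshold) pair with the lowest squared error, or None."""
    best, best_sse = None, None
    for f in feature_ids:
        for t in sorted({r[f] for r in rows})[:-1]:   # every value except the largest
            left = [y for r, y in zip(rows, targets) if r[f] <= t]
            right = [y for r, y in zip(rows, targets) if r[f] > t]
            ml, mr = _mean(left), _mean(right)
            sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
            if best_sse is None or sse < best_sse:
                best, best_sse = (f, t), sse
    return best

def build_tree(rows, targets, n_try, rng, depth=0, max_depth=3):
    """Grow a regression tree; each split considers n_try randomly chosen features."""
    if depth >= max_depth or len(rows) < 2:
        return _mean(targets)                         # terminal node: mean error
    split = _best_split(rows, targets, rng.sample(range(len(rows[0])), n_try))
    if split is None:
        return _mean(targets)
    f, t = split
    children = []
    for keep_left in (True, False):
        idx = [i for i, r in enumerate(rows) if (r[f] <= t) == keep_left]
        children.append(build_tree([rows[i] for i in idx], [targets[i] for i in idx],
                                   n_try, rng, depth + 1, max_depth))
    return (f, t, children[0], children[1])

def predict_tree(node, row):
    while isinstance(node, tuple):                    # descend until a terminal node
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, targets, n_trees=50, n_try=2, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]   # bootstrap sample
        forest.append(build_tree([rows[i] for i in idx], [targets[i] for i in idx],
                                 n_try, rng))
    return forest

def predict_forest(forest, row):
    """Forest prediction: the average of the individual tree predictions."""
    return _mean([predict_tree(tree, row) for tree in forest])

# Hypothetical data: error rises with velocity and is larger when moving away.
rows = [[float(v), towards, bank]
        for v in range(0, 21, 2) for towards in (0, 1) for bank in (0, 1)]
targets = [0.1 * v + (0.0 if towards else 0.2) for v, towards, bank in rows]

forest = fit_forest(rows, targets, n_trees=50, n_try=2, seed=1)
slow = predict_forest(forest, [2.0, 0, 0])
fast = predict_forest(forest, [18.0, 0, 0])
```

With these synthetic data the forest predicts a larger error for the fast-moving leaf than for the slow one, mirroring the velocity dependence discussed in the text.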
For each plan from Institution 1, 2D dose distributions of each VMAT plan as delivered with
6 MV photons were acquired with a MapCHECK2 detector array (Sun Nuclear Corporation,
Melbourne, FL). The MapCHECK2 was inserted into a MapPHAN (Sun Nuclear Corporation,
Melbourne, FL) during delivery. Before delivery, the relative responses of each detector in the
MapCHECK2 array, as well as the absolute response of the detector to a known dose were
calibrated according to manufacturer specifications. The absolute dose of the Linac was also
calibrated according to the American Association of Physicists in Medicine Task Group 51
(AAPM TG51) protocol (Almond et al 1999).
A CT image of the device setup was imported into the Eclipse system and used for the
calculation of the 2D dose distributions of plans with either planned or predicted positions.
The distributions were calculated with a 2 mm calculation grid, PRO3 optimizer, and AAA
algorithm, as above.
After delivery and calculation, both global and local gamma evaluations were performed
with SNC patient software (ver. 6.1.2, Sun Nuclear Corporation, Melbourne, FL). Gamma
criteria of 3%/3 mm, 2%/2 mm, and 1%/2 mm were used with a 10% threshold for the ROI,
as frequently cited in the literature (Iftimia et al 2010, Heilemann et al 2013). The differences
between the passing rates using the planned MLC positions, and the passing rates using the
predicted MLC positions were compared using paired t-tests to assess the difference in mean
passing rates between the two.
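For readers unfamiliar with gamma evaluation, the following Python sketch computes a one-dimensional gamma passing rate with either global or local dose normalisation. It is a simplified, brute-force illustration and not the algorithm of the commercial software used in this study; treating the measured profile as the reference, the function name, and the example profiles are assumptions.

```python
def gamma_passing_rate(meas_pos, meas_dose, calc_pos, calc_dose,
                       dose_crit=0.03, dta_mm=3.0, threshold=0.10, local=False):
    """Percentage of measured points (above threshold) with gamma <= 1, in 1D."""
    max_dose = max(meas_dose)          # assumed positive
    passed = total = 0
    for xm, dm in zip(meas_pos, meas_dose):
        if dm < threshold * max_dose:
            continue                   # outside the region of interest
        norm = dm if local else max_dose   # local versus global normalisation
        gamma_sq = min(((xc - xm) / dta_mm) ** 2
                       + ((dc - dm) / (dose_crit * norm)) ** 2
                       for xc, dc in zip(calc_pos, calc_dose))
        total += 1
        if gamma_sq <= 1.0:
            passed += 1
    return 100.0 * passed / total

positions = [float(x) for x in range(11)]   # mm, hypothetical flat profile
measured = [1.0] * 11
rate_same = gamma_passing_rate(positions, measured, positions, measured)
rate_2pct = gamma_passing_rate(positions, measured, positions, [1.02] * 11)
rate_10pct = gamma_passing_rate(positions, measured, positions, [1.10] * 11)
```

On a flat profile the distance term cannot compensate for a dose offset, so a uniform 2% offset passes the 3%/3 mm criterion everywhere while a uniform 10% offset fails everywhere.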
For five H&N patients from Institution 1 for whom patient CT data was available, DICOM-RT
files were reconstructed with predicted MLC positions. These, along with the planned DICOM-RT
files, were imported into the Eclipse system. Dose distributions to the patient CT images were
calculated using the same parameters as above, with the exception of the calculation grid size, which
was reduced to 1 mm. For the target volume, clinically relevant dose-volumetric parameters such
as the dose received by 95% of the target volume (D95%), D5%, the minimum dose, the maximum
dose, and the mean dose were compared between planned and predicted VMAT plans. For organs
at risk (OARs) in the H&N plans, the volume of each parotid gland receiving 50% of the dose
(V50%), and mean dose to each parotid gland and each sub-mandibular gland (SMG) were
compared. Differences in the dose-volumetric parameters between calculations using planned MLC
positions versus delivered positions, and predicted positions versus delivered positions, were
compared using paired t-tests to assess the mean differences between the two.
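The dose-volumetric parameters named above can be illustrated with a short Python sketch operating on a list of voxel doses for one structure. It assumes equal-volume voxels and uses hypothetical dose values; the function names are illustrative, and clinical systems compute these quantities from the full dose-volume histogram.

```python
import math

def dose_at_volume(voxel_doses, volume_percent):
    """D_x%: the minimum dose received by the hottest x% of equal-volume voxels."""
    ordered = sorted(voxel_doses, reverse=True)
    n_voxels = max(1, math.ceil(len(ordered) * volume_percent / 100.0))
    return ordered[n_voxels - 1]

def volume_at_dose(voxel_doses, dose_level):
    """V_dose: percentage of voxels receiving at least dose_level."""
    return 100.0 * sum(1 for d in voxel_doses if d >= dose_level) / len(voxel_doses)

doses = [float(d) for d in range(50, 70)]   # hypothetical voxel doses (Gy), 20 voxels

d95 = dose_at_volume(doses, 95)             # dose covering 95% of the volume
d5 = dose_at_volume(doses, 5)               # dose covering the hottest 5%
v60 = volume_at_dose(doses, 60.0)           # percent volume receiving >= 60 Gy
mean_dose = sum(doses) / len(doses)
```

Computing the same quantities from dose distributions based on planned, predicted, and delivered MLC positions gives the paired values that are then compared.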
3. Results
Several leaf motion parameters were particularly important in increasing model accuracy.
The motion parameter which offered the most predictive ability was leaf velocity, which had
an approximately linear relationship with error magnitude (β = 0.129, CI = 0.128 to 0.130,
p < 0.001), with a coefficient of determination, R², of 0.902 (p < 0.001). This relationship is
shown in figure 3(A).
Whether the leaf was moving towards or away from the isocenter of the MLC also had a
statistically significant effect on the mean error magnitude, making this category an important
predictive motion parameter. The MAE of all leaves moving away from the center was
1.37 mm (RMSE = 0.99 mm), while the MAE of leaves moving towards the center was only
1.14 mm (RMSE = 0.97 mm). This is a difference in means of 0.235 mm, with a 95% confidence
interval (CI) of 0.231 to 0.238 mm (p < 0.001) by the Welch two sample t-test. A boxplot
expressing the difference between movement directions is shown in figure 3(B).
The model based on the cubist algorithm outperformed all other models. Planned, delivered,
and predicted leaf positions of a single MLC leaf for two representative sets of CPs are shown
in figure 4. The figure shows that in all cases the predicted positions more closely coincide
with the delivered positions than do the planned positions.
The MAE and RMSE between planned and delivered positions, and between predicted and
delivered positions, for moving, resting, and all MLC leaves from each institution are
summarized in tables 1–3, respectively. The considerably lower error between predicted and
delivered versus planned and delivered positions for the cubist model is shown in figure 5.
For Institution 1, the MAE between the planned leaf positions and the delivered leaf
positions of moving MLC leaves was 1.284 mm, with root mean squared error (RMSE) of
1.636 mm. The MAE between positions predicted by the Cubist model and the delivered
positions was 0.253 mm (RMSE = 0.371 mm). Therefore, the predictions were, on average,
more than 1 mm closer to the delivered positions than were the planned positions.
Institutions 2 and 3 showed similar tendencies. For Institution 2, the MAE between planned
and delivered positions of moving leaves was 1.409 (RMSE = 1.699) mm, and the difference
between predicted and delivered was 0.278 (0.387) mm. These values for Institution 3 were
1.145 (1.495) mm and 0.274 (0.426) mm.
The results of testing the dependence of the predictions on choice of training plan are shown
in table 4. For Institutions 1 and 2 there was no significant difference in the predictions made
using different plans for training the model. For Institution 3, there was a significant
difference, with p-value < 0.001. However, the confidence interval was from 0.021 to 0.029 mm,
indicating that the effect of using a different training plan was small.

Figure 4. Planned, delivered, and predicted (cubist) positions of a single MLC leaf
from an H&N plan over two sets of CPs. In plot (A), the leaf is planned to drop rapidly,
with delivered positions lagging until the leaf slows. Plot (B) shows a set of CPs where
heavy modulation is planned, but the delivered positions consistently fail to reach the
target. In all cases the predicted positions are closer to the delivered positions than are
the planned positions.

Table 1. Model performance in predicting delivered MLC positions for moving MLCs
from the test set (N = 65).

                              All plans               H&N plans       Prostate plans
Institution  Model            MAE (mm)^a  RMSE (mm)^b  MAE     RMSE    MAE     RMSE
1            Planned          1.284       1.636        1.358   1.354   1.086   1.489
             LMV Only^c       0.324       0.450        0.354   0.476   0.244   0.373
             LM^d             0.282       0.407        0.302   0.423   0.227   0.359
             Random forest    0.275       0.395        0.290   0.407   0.237   0.360
             Cubist           0.253       0.371        0.269   0.384   0.210   0.332
2            Planned          1.409       1.699        1.458   1.735   1.361   1.663
             LMV Only         0.313       0.409        0.315   0.408   0.311   0.410
             LM               0.286       0.372        0.291   0.375   0.281   0.369
             Random forest    0.284       0.384        0.290   0.387   0.279   0.381
             Cubist           0.278       0.387        0.285   0.393   0.272   0.380
3            Planned          1.145       1.495        1.153   1.504   1.075   1.412
             LMV Only         0.356       0.501        0.354   0.500   0.375   0.517
             LM               0.305       0.448        0.300   0.443   0.346   0.483
             Random forest    0.314       0.454        0.313   0.454   0.318   0.451
             Cubist           0.274       0.426        0.273   0.424   0.286   0.440

a Mean absolute error.
b Root mean squared error.
c Linear regression model using only leaf velocity.
d Linear regression model with all leaf motion parameters.

Table 2. Model performance in predicting delivered MLC positions for MLCs at rest
from the test set (N = 65).

                              All plans               H&N plans       Prostate plans
Institution  Model            MAE (mm)^a  RMSE (mm)^b  MAE     RMSE    MAE     RMSE
1            Planned          0.084       0.158        0.159   0.246   0.043   0.075
             LMV Only^c       0.085       0.158        0.160   0.246   0.044   0.075
             LM^d             0.052       0.085        0.097   0.121   0.027   0.057
             Random forest    0.039       0.074        0.061   0.106   0.028   0.048
             Cubist           0.027       0.054        0.056   0.088   0.012   0.017
2            Planned          0.037       0.109        0.051   0.132   0.033   0.100
             LMV Only         0.039       0.109        0.052   0.132   0.034   0.100
             LM               0.016       0.045        0.022   0.060   0.015   0.038
             Random forest    0.021       0.051        0.029   0.068   0.018   0.044
             Cubist           0.005       0.013        0.007   0.019   0.005   0.010
3            Planned          0.033       0.129        0.037   0.136   0.023   0.110
             LMV Only         0.034       0.129        0.038   0.136   0.024   0.110
             LM               0.025       0.087        0.026   0.082   0.022   0.099
             Random forest    0.022       0.084        0.023   0.080   0.019   0.095
             Cubist           0.009       0.040        0.009   0.033   0.010   0.053

a Mean absolute error.
b Root mean squared error.
c Linear regression model using only leaf velocity.
d Linear regression model with all leaf motion parameters.

Table 3. Model performance in predicting delivered MLC positions for moving and
resting MLCs from the test set (N = 65).

                              All plans               H&N plans       Prostate plans
Institution  Model            MAE (mm)^a  RMSE (mm)^b  MAE     RMSE    MAE     RMSE
1            Planned          0.513       0.987        0.802   1.247   0.240   0.651
             LMV Only^c       0.170       0.298        0.264   0.387   0.082   0.176
             LM^d             0.134       0.253        0.207   0.320   0.065   0.164
             Random forest    0.124       0.244        0.183   0.307   0.067   0.162
             Cubist           0.108       0.226        0.170   0.288   0.049   0.145
2            Planned          0.390       0.867        0.636   0.124   0.281   0.724
             LMV Only         0.109       0.228        0.162   0.282   0.086   0.199
             LM               0.086       0.193        0.134   0.246   0.064   0.163
             Random forest    0.089       0.200        0.138   0.255   0.067   0.169
             Cubist           0.075       0.196        0.122   0.254   0.055   0.164
3            Planned          0.576       1.048        0.647   1.115   0.292   0.720
             LMV Only         0.191       0.362        0.211   0.380   0.114   0.278
             LM               0.162       0.319        0.176   0.332   0.105   0.259
             Random forest    0.165       0.323        0.182   0.340   0.096   0.242
             Cubist           0.139       0.299        0.153   0.314   0.080   0.227

a Mean absolute error.
b Root mean squared error.
c Linear regression model using only leaf velocity.
d Linear regression model with all leaf motion parameters.

Figure 5. MAE between planned and delivered positions, and between predicted and
delivered positions for both resting and moving MLC leaves. In all cases the predicted
positions are much closer to the delivered positions than are the planned positions.

Table 4. Differences in the predictions made by models trained using different plans
from the same institution.

Institution  Difference in median (mm)  95% CI^a          p value
1            0.001                      −0.001 to 0.003   0.294
2            0.001                      −0.001 to 0.004   0.275
3            0.025                      0.021 to 0.029    <0.001

a 95% Confidence interval.

Figure 6. Boxplots showing the increase in passing rates through the utilization of
predicted MLC positions. Plots (A) and (B) show the increases in local and global
gamma passing rate when using predicted positions for H&N plans, respectively. Plots
(C) and (D) show the same information for prostate plans.

Table 6. The change in gamma passing rates due to the inclusion of predicted errors in
the plans. For all local criteria, and for all H&N plans, the passing rate of the
plan is improved when errors are predicted.

Local gamma passing rates                 Global gamma passing rates
PR change (%)^a   p value   95% CI^b      PR change (%)   p value   95% CI
Figure 7. Representative DVH curves showing the curves of (left to right) (A): left
parotid, right parotid, left SMG, right SMG, and (B): PTV 48 Gy, PTV 54 Gy, PTV 67.5 Gy.
In all cases the DVH curves calculated using the predicted positions are in closer
agreement with the delivered curves than are the planned curves.
Figure 8. Average percent differences in dose volumetric parameters for planned versus
delivered positions, and for predicted versus delivered positions. Plots (A)–(C) show the
percent changes for PTVs, parotids, and SMGs, respectively. Stars above the bars
indicate significance (* = p < 0.05, ** = p < 0.01, and *** = p < 0.001).
The results of the differences in predictions when using a model trained with data from a
different institution (and therefore a different MLC) are presented in table 5. Table 5 shows
that for all combinations of training institution and testing institution, there were significant
differences between the predictions made by the model trained using the same institution as
the testing data and the predictions made by the model trained using a different institution
from the testing data. However, the estimated differences in medians of the predictions were
all less than 0.1 mm.
The analysis of the improvements in gamma passing rates was separated into four categories,
local and global passing rates for both H&N and prostate plans. This data is summarized in
table 6 and figure 6. Table 6 presents the mean differences between the passing rates of the
plans utilizing the planned MLC positions, and plans utilizing predicted positions.
In all cases for H&N plans the passing rate is increased by calculating the dose plane with
the predicted positions. This indicates that the predicted positions better represent the reality
of delivery than do the planned positions. For prostate plans there was a similar trend; however,
since the global passing rates for prostate plans were often near 100%, the difference was
generally much smaller.
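To make the local versus global distinction concrete, the following is a minimal one-dimensional sketch of a gamma passing-rate calculation. It is not the analysis software used in this study: the function name, the 2%/2 mm criteria, the 10% low-dose cut-off and the sample profiles are all illustrative assumptions.

```python
import math

def gamma_pass_rate(ref, ev, spacing_mm, dose_tol=0.02, dta_mm=2.0,
                    local=False, low_dose_cut=0.10):
    """Percent of reference points with gamma <= 1 (1D profiles on one grid).

    All default criteria are assumptions chosen for illustration only.
    """
    d_max = max(ref)
    evaluated = 0
    passed = 0
    for i, d_ref in enumerate(ref):
        # Skip low-dose points, which would otherwise dominate the statistics.
        if d_ref < low_dose_cut * d_max:
            continue
        evaluated += 1
        # Global: tolerance relative to the maximum dose.
        # Local: tolerance relative to the dose at this point.
        dose_crit = dose_tol * (d_ref if local else d_max)
        best = math.inf
        for j, d_ev in enumerate(ev):
            dist = (j - i) * spacing_mm
            g2 = (dist / dta_mm) ** 2 + ((d_ev - d_ref) / dose_crit) ** 2
            best = min(best, g2)
        if best <= 1.0:
            passed += 1
    return 100.0 * passed / evaluated

# Synthetic profiles (arbitrary units); the second is the first scaled by 10%.
reference = [0.0, 10.0, 50.0, 100.0, 100.0, 50.0, 10.0, 0.0]
scaled = [1.10 * d for d in reference]

global_rate = gamma_pass_rate(reference, scaled, spacing_mm=1.0)
local_rate = gamma_pass_rate(reference, scaled, spacing_mm=1.0, local=True)
```

In this synthetic case the local rate is lower than the global rate, because the local criterion tightens the dose tolerance at low-dose points.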
Representative DVH curves for OARs and planning target volumes (PTVs) of the patient
dose distributions as calculated using the planned, predicted, and delivered MLC positions
are presented in figure 7. The average differences in dose volumetric parameters between
planned and delivered, and predicted and delivered, are shown in figure 8. In all cases the dose
volumetric parameters calculated with the predicted positions are in closer agreement with the
delivered parameters than are the planned parameters. Figure 8 shows that the largest differ-
ences are present in the OAR dose distributions. For instance, the average difference between
planning and delivery of the volume of the right parotid receiving 50% of the dose was 8.16%
(SD = 3.3%, p = 0.005), whereas the difference between the predicted and delivered was
not statistically significant (0.18%, SD = 0.96%, p = 0.7). The change in the PTVs was of the
same general magnitude as the changes in the OARs, but owing to the much larger dose
prescriptions they had smaller percent changes.
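As an illustration of the kind of dose volumetric parameter discussed above, the following sketch computes a cumulative DVH and the percentage of a structure receiving at least a given dose from a list of voxel doses. It assumes equal-volume voxels; the function names, the 20 Gy dose level and the voxel doses are hypothetical and are not taken from the study data.

```python
def volume_at_dose(voxel_doses_gy, dose_level_gy):
    """Percent of (equal-volume) voxels receiving at least dose_level_gy."""
    hits = sum(1 for d in voxel_doses_gy if d >= dose_level_gy)
    return 100.0 * hits / len(voxel_doses_gy)

def cumulative_dvh(voxel_doses_gy, bin_width_gy=1.0):
    """List of (dose, percent volume receiving at least that dose) pairs."""
    n_bins = int(max(voxel_doses_gy) / bin_width_gy) + 1
    return [(k * bin_width_gy, volume_at_dose(voxel_doses_gy, k * bin_width_gy))
            for k in range(n_bins + 1)]

# Hypothetical voxel doses for one structure under two sets of MLC positions.
planned = [5.0, 12.0, 19.0, 27.0, 33.0, 40.0]
delivered = [5.0, 13.0, 21.0, 28.0, 35.0, 41.0]

v_planned = volume_at_dose(planned, 20.0)      # 3 of 6 voxels
v_delivered = volume_at_dose(delivered, 20.0)  # 4 of 6 voxels
dvh = cumulative_dvh(planned)
```

A small shift in voxel doses near the chosen dose level can move several voxels across the threshold at once, which is one reason volumetric parameters of OARs can change by a large percentage.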
4. Discussion
A model capable of predicting errors for specific MLC leaves could help to inform better
optimization algorithms for creating plans capable of being delivered as intended. In this study
such a model was built and validated. First, it was shown that MLC errors are predictable to
a high degree of accuracy. Second, it was shown that such MLC errors have an appreciable impact
on the gamma passing rate of the plan, and that a plan corrected for the errors has higher passing rates.
Finally, dose-volume histograms (DVHs) recalculated with predicted positions incorporated
into the plan provide the treatment planner with a better representation of the deliverable dose
distributions for the PTV and OARs. In this study, several parameters which offer predictive
capability were established. Although much of the variance in the positional errors is captured
by leaf velocity, the linear model taking only velocity into account was outperformed by all
other models. This indicates that there are other patterns in the data not related to velocity
which may result in discrepancies between planned and delivered positions. The other patterns
were well predicted by the best performing cubist model.
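The velocity-only baseline can be pictured with a short sketch: an ordinary least-squares line of error against leaf velocity, scored by RMSE. The data below are synthetic, constructed so that a linear velocity term is overlaid with a small alternating offset that velocity cannot explain; the function names and numbers are assumptions for illustration, not values from this study.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed))
                     / len(observed))

# Synthetic data: error = 0.02 * velocity plus an offset of +/-0.05 mm.
velocity_mm_s = [-20.0, -10.0, 0.0, 10.0, 20.0]
error_mm = [-0.35, -0.25, 0.05, 0.15, 0.45]

slope, intercept = fit_line(velocity_mm_s, error_mm)
predicted_mm = [slope * v + intercept for v in velocity_mm_s]
residual_rmse = rmse(predicted_mm, error_mm)
```

The fit recovers the velocity slope, but the remaining RMSE reflects structure in the errors that a velocity-only model has no way to represent.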
The inclusion of a leaf motion parameter in the final model does not necessarily imply that
the parameter has a real effect on error magnitude. The inclusion of leaf bank is an example of
this, where the inclusion increases predictive accuracy not because one leaf bank is more error
prone. Rather, it allows the model to switch the direction of the predicted error when the ori-
entation of the coordinate system switches after the first treatment arc. In this study there was
no significant difference between the means of errors from leaf banks A and B by the Welch
two sample t-test (p = 0.15). This is in accordance with the findings of Kerns et al (2014), and
in contrast to those of Stell et al (2004).
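For reference, the Welch two sample t-test does not assume equal variances in the two groups. The sketch below computes the test statistic and the Welch-Satterthwaite degrees of freedom using only the Python standard library; the sample values are made up, and obtaining a p-value additionally requires the t-distribution, for example via `scipy.stats.ttest_ind(a, b, equal_var=False)`.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard errors of each sample mean (sample variance / n).
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df

# Made-up error samples standing in for leaf banks A and B.
bank_a = [1.0, 2.0, 3.0, 4.0, 5.0]
bank_b = [2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, dof = welch_t(bank_a, bank_b)
```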
Some of the leaf motion parameters which were hypothesized to be related to error magni-
tude did not have an appreciable effect on the models, for example, leaf acceleration, and the
movement of adjacent MLC leaves. Inclusion of either of these parameters led to over-fitting
of the training data, and consequently increased the RMSE of the models on the validation set,
hindering the generalizability of the models.
It has been posited that dose errors correlate with gap error, and not necessarily with indi-
vidual leaf position errors (Losasso 2008). In contrast to dynamic IMRT plans, where leaves
on opposing leaf banks move in the same direction, in VMAT plans the leaves move back and
forth in both directions. Therefore it is important to know how many of the errors assessed in
this study are gap errors (where the opposing leaves have errors in opposite directions, leading
to larger or smaller leaf gaps than intended), or shift errors (where opposing leaves are both
shifted in the same direction, with little change to the leaf gap). For H&N plans in this study,
the average proportion of errors which were gap errors was 31.77% (SD = 1.46%). Prostate
plans generally showed a lower proportion, with 15.70% (SD = 1.34%) of the errors being gap
errors. It was also found that, in general, when the errors of opposing leaves were in opposite
directions, the average change in leaf gap was 1.74 mm (SD = 0.44 mm), whereas for shift
errors the average change in leaf gap was 0.35 mm (SD = 0.20 mm). That is, although there are
fewer gap errors, the magnitude of gap errors is typically much larger than that of shift errors.
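The gap versus shift distinction can be written down directly. The sketch below assumes both leaf errors are expressed as delivered minus planned position along one common axis, with the bank B leaf at the larger coordinate so that the leaf gap is x_B - x_A; this sign convention, the function names and the handling of pairs where one error is exactly zero are assumptions, not details given in this study.

```python
def classify_pair(err_a_mm, err_b_mm):
    """Classify the errors of an opposing leaf pair and return the gap change.

    Errors are delivered minus planned, on a common axis (an assumption).
    """
    gap_change_mm = err_b_mm - err_a_mm
    product = err_a_mm * err_b_mm
    if product < 0:
        kind = "gap"      # opposite directions: gap widens or narrows
    elif product > 0:
        kind = "shift"    # same direction: gap moves with little change
    else:
        kind = "neither"  # at least one leaf has zero error
    return kind, gap_change_mm

def gap_error_percentage(pairs):
    """Percent of leaf-pair errors classified as gap errors."""
    kinds = [classify_pair(a, b)[0] for a, b in pairs]
    return 100.0 * kinds.count("gap") / len(kinds)

# Hypothetical (bank A error, bank B error) pairs in mm.
pairs = [(-0.5, 0.5), (0.3, 0.4), (0.2, -0.1), (0.1, 0.2)]
percent_gap = gap_error_percentage(pairs)
```

With this convention, opposite-signed errors add in magnitude when the gap change is formed, while same-signed errors partly cancel, which is consistent with the larger gap changes reported for gap errors.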
It was shown that the accuracy of predictions for a given MLC was independent of the
choice of training plan. However, using training data from a different MLC of the same model
led to discrepancies in predictions. It is therefore recommended that a model specific to each
MLC should be trained, and used for predictions only for that specific MLC.
MLC leaf position errors are potentially a contributing factor in radiotherapy treatment
plans failing to be delivered as intended. The prediction of leaf position errors could be used
as a component of a modulation index to predict the delivery accuracy of a plan pre-delivery.
For example, an MLC modulation index could be built as a linear combination or ratio of the
number of predicted errors above or below certain thresholds. Although methods such as these
may be able to predict deliverability, pre-treatment QA should continue to be an important
part of the treatment workflow.
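One possible form of such an index, as a linear combination of threshold counts, is sketched below. The thresholds, the weights and the function name are purely hypothetical placeholders; no specific index was defined or validated in this study.

```python
def error_modulation_index(predicted_errors_mm,
                           thresholds_mm=(0.5, 1.0, 2.0),
                           weights=(1.0, 2.0, 4.0)):
    """Weighted sum of the number of predicted errors above each threshold.

    Thresholds and weights are illustrative assumptions only.
    """
    index = 0.0
    for threshold, weight in zip(thresholds_mm, weights):
        count = sum(1 for e in predicted_errors_mm if abs(e) > threshold)
        index += weight * count
    return index

# Hypothetical predicted leaf position errors for one plan, in mm.
predicted = [0.1, -0.6, 1.5, -2.5]
index_value = error_modulation_index(predicted)
```

Dividing by the number of leaf positions in the plan would make the value comparable between plans of different length; that normalization is left out here for simplicity.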
Predictions may also be used to further investigate the dosimetric effects of MLC errors.
Dosimetric effects of random MLC errors have been studied in the past by sampling from
a Gaussian (Rangel and Dunscombe 2009, Oliver et al 2010) or from a uniform distribu-
tion (Mu et al 2007, Yan et al 2009, Bai et al 2013). However, neither of these distributions
accurately models a realistic error distribution, nor does either take into account the directional
dependence of leaf errors on leaf velocity. Therefore, the method for error prediction
described in this study may enable more accurate assessments of the dosimetric effects of MLC
errors. This study is limited in that it considered only Varian Millennium
120 MLCs; however, nothing precludes the methods from being adapted to other
MLCs, and this will be undertaken as future work.
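For comparison, the random-error approach of the earlier studies amounts to drawing an independent error for each leaf position from a fixed distribution, as in the sketch below. The function names, the standard deviation and the half-width are illustrative assumptions rather than values from the cited papers.

```python
import random

def sample_gaussian_errors(n, sd_mm, rng):
    """n zero-mean Gaussian leaf errors with standard deviation sd_mm."""
    return [rng.gauss(0.0, sd_mm) for _ in range(n)]

def sample_uniform_errors(n, half_width_mm, rng):
    """n leaf errors drawn uniformly from [-half_width_mm, +half_width_mm]."""
    return [rng.uniform(-half_width_mm, half_width_mm) for _ in range(n)]

def perturb(planned_mm, errors_mm):
    """Add one error to each planned leaf position."""
    return [p + e for p, e in zip(planned_mm, errors_mm)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
planned_positions = [10.0, 12.5, 15.0, 17.5]
gaussian = sample_gaussian_errors(len(planned_positions), 0.5, rng)
uniform = sample_uniform_errors(len(planned_positions), 1.0, rng)
perturbed = perturb(planned_positions, uniform)
```

Neither draw uses the direction or speed of the leaf, so both are symmetric about zero regardless of how the leaf is moving.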
It is important to note that this work is concerned with the internal representation of MLC
positions used to calculate the dose distributions within the TPS. If the positions sent to the
MLC controller were altered to be the predicted positions, there would still be positional errors.
5. Conclusions
In this study, it was shown that MLC leaf position errors can be predicted to a high degree of
accuracy by utilizing statistical learning techniques. Each model took only a single plan as
input, was simple to implement, and took approximately one second to train. By
utilizing the predicted positions, rather than the planned positions, to calculate dose distributions,
it was shown that gamma passing rates can be increased, and that errors in MLC positions that
impact dose volumetric parameters can be reduced. The methodology developed in this study
was shown to be generalizable to other institutions by assessing their own institutional data.
By incorporating and correcting for the predicted errors in MLC positions, optimization
routines for encoding MLC leaf positions may be improved, and would allow for more realis-
tic calculation of the dose distributions as truly delivered to the patient.
Acknowledgments
This work was supported in part by the National Research Foundation of Korea (490-20150036,
490-20140041 and 5267-20150100) grant funded by the Korea government. The
authors are grateful to the editor and associate editors for their valuable comments and review
of this paper.
References
Almond P R, Biggs P J, Coursey B M, Hanson W F, Huq M S, Nath R and Rogers D W O 1999 AAPM’s
TG-51 protocol for clinical reference dosimetry of high-energy photon and electron beams Med.
Phys. 26 1847–70
Bai S, Li G, Wang M, Jiang Q, Zhang Y and Wei Y 2013 Effect of MLC leaf position, collimator
rotation angle, and gantry rotation angle errors on intensity-modulated radiotherapy plans for
nasopharyngeal carcinoma Med. Dosimetry 38 143–7
Breiman L 2001 Random forests Mach. Learn. 45 5–32
Chen F, Rao M, Ye J-S, Shepard D M and Cao D 2011 Impact of leaf motion constraints on IMAT plan
quality, deliver accuracy, and efficiency Med. Phys. 38 6106–18
Heilemann G, Poppe B and Laub W 2013 On the sensitivity of common gamma-index evaluation
methods to MLC misalignments in Rapidarc quality assurance Med. Phys. 40 031702
Iftimia I, Cirino E T, Xiong L and Mower H W 2010 Quality assurance methodology for Varian RapidArc
treatment plans J. Appl. Clin. Med. Phys. 11 130–43
Kerns J R, Childress N and Kry S F 2014 A multi-institution evaluation of MLC log files and performance
in IMRT delivery Radiat. Oncol. 9 176
Klein E E et al 2009 Task group 142 report: quality assurance of medical accelerators Med. Phys.
36 4197–212
Kuhn M and Johnson K 2013 Applied Predictive Modeling (Berlin: Springer)
Kuhn M, Weston S, Keefer C and Coulter N 2014 C code for cubist by Ross Quinlan Cubist: Rule- and
Instance-Based Regression Modeling
Kutcher G J et al 1994 Comprehensive QA for radiation oncology: report of AAPM radiation therapy
committee task group 40 Med. Phys. 21 581
Li J G, Dempsey J F, Ding L, Liu C and Palta J R 2003 Validation of dynamic MLC-controller log files
using a 2D diode array Med. Phys. 30 799
Li R and Xing L 2013 An adaptive planning strategy for station parameter optimized radiation therapy
(SPORT): segmentally boosted VMAT Med. Phys. 40 050701
Liaw A and Wiener M 2002 Classification and regression by randomForest R News 2 18–22
Losasso T 2008 IMRT delivery performance with a varian multileaf collimator Int. J. Radiat. Oncol.
Biol. Phys. 71 S85–8
Masi L, Doro R, Favuzza V, Cipressi S and Livi L 2013 Impact of plan parameters on the dosimetric
accuracy of volumetric modulated arc therapy Med. Phys. 40 071718
Miura H, Tanooka M, Fujiwara M, Takada Y, Doi H, Odawara S, Kosaka K, Kamikonya N and Hirota S
2014a Predicting delivery error using a DICOM-RT plan for volumetric modulated arc therapy Int.
J. Med. Phys. Clin. Eng. Radiat. Oncol. 3 82–7
Miura H, Tanooka M, Inoue H, Fujiwara M, Kosaka K, Doi H, Takada Y, Odawara S, Kamikonya N and
Hirota S 2014b DICOM-RT plan complexity verification for volumetric modulated arc therapy Int.
J. Med. Phys. Clin. Eng. Radiat. Oncol. 3 117–24
Mu G, Ludlum E and Xia P 2007 Impact of MLC leaf position errors on simple and complex IMRT plans
for head and neck cancer Phys. Med. Biol. 53 77–88
Oliver M, Gagne I, Bush K, Zavgorodni S, Ansbacher W and Beckham W 2010 Clinical significance
of multi-leaf collimator positional errors for volumetric modulated arc therapy Radiother. Oncol.
97 554–60
Otto K 2008 Volumetric modulated arc therapy: IMRT in a single gantry arc Med. Phys. 35 310–7
Park J M, Park S-Y, Kim H, Kim J H, Carlson J and Ye S-J 2014a Modulation indices for volumetric
modulated arc therapy Phys. Med. Biol. 59 7315–40
Park S-Y, Kim I H, Ye S-J, Carlson J and Park J M 2014b Texture analysis on the fluence map to evaluate
the degree of modulation for volumetric modulated arc therapy Med. Phys. 41 111718
Park J M, Wu H G, Kim J H, Carlson J N K and Kim K 2015 The effect of MLC speed and acceleration
on the plan delivery accuracy of VMAT BJR 88 20140698
R Development Core Team 2010 R: a language and environment for statistical computing R Foundation
for Statistical Computing (http://www.R-project.org/)
Ramsey C R, Spencer K M, Alhakeem R and Oliver A L 2001 Leaf position error during conformal
dynamic arc and intensity modulated arc treatments Med. Phys. 28 67
Rangel A and Dunscombe P 2009 Tolerances on MLC leaf position accuracy for IMRT delivery with a
dynamic MLC Med. Phys. 36 3304
Stell A M, Li J G, Zeidan O A and Dempsey J F 2004 An extensive log-file analysis of step-and-shoot
intensity modulated radiation therapy segment delivery errors Med. Phys. 31 1593
Tatsumi D, Hosono M N, Nakada R, Ishii K, Tsutsumi S, Inoue M, Ichida T and Miki Y 2011 Direct
impact analysis of multi-leaf collimator leaf position errors on dose distributions in volumetric
modulated arc therapy: a pass rate calculation between measured planar doses with and without the
position errors Phys. Med. Biol. 56 N237–46
Yan G, Liu C, Simon T, Peng L-C, Fox C and Li J 2009 On the sensitivity of patient-specific IMRT QA
to MLC positioning errors J. Appl. Clin. Med. Phys. 10 120–8
Zeidan O A, Li J G, Ranade M, Stell A M and Dempsey J F 2004 Verification of step-and-shoot IMRT
delivery using a fast video-based electronic portal imaging device Med. Phys. 31 463
Zygmanski P, Kung J H, Jiang S B and Chin L 2003 Dependence of fluence errors in dynamic IMRT on
leaf-positional errors varying with time and leaf number Med. Phys. 30 2736