A machine learning approach to the accurate prediction of multi-leaf collimator positional errors

2016 Phys. Med. Biol. 61 2514

(http://iopscience.iop.org/0031-9155/61/6/2514)

Institute of Physics and Engineering in Medicine Physics in Medicine & Biology

Phys. Med. Biol. 61 (2016) 2514–2531 doi:10.1088/0031-9155/61/6/2514

A machine learning approach to the accurate prediction of multi-leaf collimator positional errors
Joel N K Carlson 1,2, Jong Min Park 2,3,4,5, So-Yeon Park 2,3,4,6,
Jong In Park 1,2, Yunseok Choi 6,7 and Sung-Joon Ye 1,2,3,4,5,6

1 Program in Biomedical Radiation Sciences, Department of Transdisciplinary
Studies, Graduate School of Convergence Science and Technology, Seoul National
University, Seoul 08826, Korea
2 Biomedical Research Institute, Seoul National University Hospital, Seoul 03080,
Korea
3 Department of Radiation Oncology, Seoul National University Hospital, Seoul
03080, Korea
4 Institute of Radiation Medicine, Seoul National University Medical Research
Center, Seoul 03080, Korea
5 Center for Convergence Research on Robotics, Advanced Institutes of Convergence
Technology, Suwon 16229, Korea
6 Interdisciplinary Program in Radiation Applied Life Science, Seoul National
University College of Medicine, Seoul 03080, Korea
7 Department of Radiation Oncology, Veterans Health Service Medical Center, Seoul
26465, Korea

E-mail: sye@snu.ac.kr

Received 11 September 2015, revised 20 January 2016
Accepted for publication 9 February 2016
Published 7 March 2016

Abstract
Discrepancies between planned and delivered movements of multi-leaf
collimators (MLCs) are an important source of errors in dose distributions
during radiotherapy. In this work we used machine learning techniques to
train models to predict these discrepancies, assessed the accuracy of the model
predictions, and examined the impact these errors have on quality assurance
(QA) procedures and dosimetry. Predictive leaf motion parameters for the
models were calculated from the plan files, such as leaf position and velocity,
whether the leaf was moving towards or away from the isocenter of the MLC,
and many others. Differences in positions between synchronized DICOM-RT
planning files and DynaLog files reported during QA delivery were used
as a target response for training of the models. The final model is capable
of predicting MLC positions during delivery to a high degree of accuracy.
For moving MLC leaves, predicted positions were shown to be significantly
closer to delivered positions than were planned positions. By incorporating
predicted positions into dose calculations in the TPS, increases were shown in
gamma passing rates against measured dose distributions recorded during QA
delivery. For instance, head and neck plans with 1%/2 mm gamma criteria had
an average increase in passing rate of 4.17% (SD  =  1.54%). This indicates
that the inclusion of predictions during dose calculation leads to a more
realistic representation of plan delivery. To assess impact on the patient, dose
volumetric histograms (DVH) using delivered positions were calculated for
comparison with planned and predicted DVHs. In all cases, predicted dose
volumetric parameters were in closer agreement to the delivered parameters
than were the planned parameters, particularly for organs at risk on the
periphery of the treatment area. By incorporating the predicted positions
into the TPS, the treatment planner is given a more realistic view of the dose
distribution as it will truly be delivered to the patient.

Keywords: machine learning, MLC position, quality assurance

(Some figures may appear in colour only in the online journal)

1. Introduction

The introduction of volumetric modulated arc therapy (VMAT) as a method for delivering
radiotherapy has decreased delivery time and monitor units (MU) as compared to conventional
intensity modulated radiation therapy (IMRT) (Otto 2008). However, due to the highly
choreographed nature of VMAT delivery, many potential sources of error arise, necessitating
patient-specific quality assurance (QA) and dosimetric verification techniques. The complex
movement of the multi-leaf collimator (MLC) is one such source of errors between treatment
planning and delivery. MLC positional errors are differences between the planned and
delivered positions of the individual MLC leaves. These deviations can be studied by comparing
the leaf positions encoded in the planning DICOM-RT files, which contain the intended leaf
positions, to the machine-reported DynaLog files, which contain the leaf positions during
delivery. Although the manufacturer-specified accuracy of DynaLog files is not present in the
literature, MLC positions reported in DynaLog files have been shown to be accurate through the
analysis of film (Zygmanski et al 2003), 2D diode array (Li et al 2003), and electronic portal
imaging device (Zeidan et al 2004) measurements.
Systematic shifts in leaf position and leaf gap have been shown to have detrimental effects
on the accuracy of the delivery of dose distributions for both IMRT (Rangel and Dunscombe
2009, Yan et al 2009, Bai et al 2013) and VMAT (Oliver et al 2010, Tatsumi et al 2011). Some
of the causes of leaf errors are known; for example, velocity of individual MLC leaves has
been shown to have an approximately linear relationship with positional errors (Ramsey et al
2001, Losasso 2008). Miura et al also showed that gamma passing rates are correlated with
MLC leaf velocity, indicating that errors in MLC positions due to large velocities may have
negative effects on dosimetric accuracy (Miura et al 2014b). Furthermore, it has been shown
that constraining the millimeters traveled per leaf per MU improves the delivery accuracy of
the treatment plan (Chen et al 2011, Miura et al 2014a).
Due to the negative impact of MLC positional errors on the delivery accuracy of radiotherapy
plans, it is advantageous to be able to predict how the errors will impact the delivery
accuracy. To this end, several modulation indices have been developed in attempts to
score the delivery accuracy of VMAT and IMRT plans (Li and Xing 2013, Masi et al 2013,
Park et al 2014a, 2014b) before they are delivered. However, these methods are correlational,
and appropriate thresholds for these values are difficult to define (Park et al 2014b).
Furthermore, these indices do not give the treatment planner any information as to how the
dose distribution as viewed in the treatment planning system (TPS) will be influenced by
the errors.
Therefore, in this study we focused on creating a method for predicting MLC positional
errors before delivery, and incorporated those errors into the dose distribution calculation to
enable the treatment planner to see a more accurate representation of the dose as it will be
delivered. To predict the errors, we first acquired planned and delivered MLC positions from
a series of VMAT plans, and calculated the differences between the two. Next, we calculated
leaf motion parameters of the plans which were hypothesized to lead to MLC errors. We then
built machine learning models using these parameters as inputs to predict the errors between
planned and delivered MLC positions. We then verified the accuracy of the predictions, and
assessed their impact on QA and patient dosimetry.
The final outcome of the study is a model capable of taking a planned set of MLC positions
in the form of a DICOM-RT file, and predicting the positions which will be delivered to a
high degree of accuracy. By including the predictions of the model into the TPS, we show that
it is possible to achieve a more accurate representation of the true locations of MLC leaves,
which allows treatment planners to see a realistic view of the dose that will be delivered to
the patient.

2.  Materials and methods

2.1.  VMAT plans

A retrospectively selected set of 74 VMAT plans was acquired from three separate institutions
for this study. The plans from Institution 1 were for head and neck (H&N) (N  =  20)
and prostate (N  =  20) cancer. The plans from Institution 2 were also for H&N (N  =  6) and
prostate (N  =  10) cancer. For Institution 3 there were H&N plans from various sites (N  =  15) and
prostate plans (N  =  3).
All plans were generated in the Eclipse system (Varian Medical Systems, Palo Alto, CA)
with the progressive resolution optimizer 3 (PRO3, ver.11.0.31, Varian Medical Systems, Palo
Alto, CA). Dose distributions were calculated using the anisotropic analytic algorithm (AAA,
ver.11.0.31, Varian Medical Systems, Palo Alto, CA) with a dose calculation grid of 2 mm.
Two full arcs were used in each plan, and optimized such that the angular separation between
control points (CPs) was 2.0341°, leading to 356 individual CPs per plan.
Each plan was delivered using a linear accelerator equipped with a Varian Millennium 120
MLC. All plans from each institution were delivered using a single linear accelerator, and
therefore a single MLC from the respective institution. The Millennium 120 MLC consists
of two banks of 60 MLC leaves, with the outer 20 and inner 40 on each side having widths
of 1 cm and 0.5 cm, respectively. Initial calibration of all MLCs was performed by a qualified
Varian engineer. In all institutions included in this study, TG-40 (Kutcher et al 1994) and
TG-142 (Klein et al 2009) protocols are followed for MLC QA.

2.2.  Determining MLC error magnitude

The planned positions of each individual MLC leaf at each CP for every plan were extracted
from DICOM-RT files exported from the Eclipse system. Therefore, from each plan 42,720
leaf position data points were extracted (356 CPs for each of the 120 MLC leaves). The plans
were then delivered, and the delivered positions of the individual MLC leaves were extracted from
the DynaLog files of the MLC.
After extracting the planned positions from the DICOM-RT files and the delivered positions
from the DynaLog files, the two sets of positions (planned and delivered) must be
synchronized before the difference between positions can be calculated. Synchronization
must take into account the differences between the sampling times of the DICOM-RT and
DynaLog files. DynaLog files record the position every 0.05 s in units of motor counts,
which were converted to millimeters according to manufacturer specifications. DICOM-RT
positions are recorded at each CP in units of millimeters. At a CP where more
than 4.238 MU is to be delivered, the gantry slows to allow successful delivery (Park et al 2015),
changing the time between CPs. In this dataset there were no CPs in any of the plans at
which the planned MU was greater than this threshold, and therefore the time between
CPs was taken to be a constant 0.424 s, the value corresponding to the maximum gantry
movement speed (Park et al 2015). Therefore, after synchronization, the maximum time
difference between the plan file and the DynaLog file is 0.025 s, that is, half of the sampling
time of the counts recorded in the DynaLog files.
After synchronization, the differences between the positions present in the DICOM-RT file
(planned positions) and the positions reported by the DynaLog file (delivered positions) for
each leaf were calculated using an in-house program written in R (R Development Core Team
2010). The absolute value of this quantity for each leaf at each CP is the error magnitude.
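
As an illustration of the synchronization and error-magnitude steps described in this section, a minimal sketch is given below. It is written in Python rather than the in-house R program used in the study, and it assumes that both records start at the same instant and that DynaLog positions have already been converted to millimeters; the function names and the example data are hypothetical.

```python
CP_INTERVAL_S = 0.424   # time between control points, from the text
LOG_INTERVAL_S = 0.05   # DynaLog sampling period, from the text

def nearest_log_index(cp_index):
    """Index of the DynaLog sample closest in time to control point cp_index.

    Assumes (for illustration only) that both records start at t = 0.
    """
    return round(cp_index * CP_INTERVAL_S / LOG_INTERVAL_S)

def error_magnitudes(planned_mm, log_mm):
    """Absolute planned-minus-delivered difference (mm) at each control point."""
    return [abs(planned - log_mm[nearest_log_index(cp)])
            for cp, planned in enumerate(planned_mm)]

# Hypothetical data: a leaf planned to advance 1 mm per CP while the
# logged positions advance 0.1 mm per 0.05 s sample.
planned_mm = [0.0, 1.0, 2.0]
log_mm = [0.1 * i for i in range(18)]
errors = error_magnitudes(planned_mm, log_mm)
```

Because each CP is matched to the nearest 0.05 s sample, the residual time offset in this sketch never exceeds half a sampling period, which is the 0.025 s bound quoted above.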

2.3.  Leaf motion characterisation

A number of parameters characterising MLC leaf motion were derived from the planned leaf
positions. Each parameter was calculated for all MLC leaves at all CPs, thus each data point
represents a single MLC leaf at a single CP. The position of each leaf at the CP of interest, and
also at the previous and subsequent CPs was calculated. The instantaneous velocity for each
leaf at each CP was calculated as the leaf position minus the leaf position at the previous CP,
divided by the time between CPs, as described in equation (1):
$$\mathrm{Velocity}_{\mathrm{CP}} = \frac{\mathrm{Position}_{\mathrm{CP}} - \mathrm{Position}_{\mathrm{CP}-1}}{0.424\ \mathrm{s}} \tag{1}$$
Acceleration for each leaf was calculated in a similar fashion. Velocity and acceleration for
each leaf were also calculated for the previous and subsequent CP. Velocity and acceleration
of both adjacent MLC leaves was calculated under the hypothesis that friction from adjacent
leaves may induce errors.
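
Equation (1) and the analogous acceleration calculation might be sketched as follows. The text states only that acceleration was calculated 'in a similar fashion', so the backward difference of velocities used here is an assumption, as are the function names and the choice to mark undefined values with None.

```python
CP_INTERVAL_S = 0.424  # time between control points, from the text

def velocities(positions_mm):
    """Backward-difference velocity (mm/s) per equation (1); None at the first CP."""
    result = [None]
    for previous, current in zip(positions_mm, positions_mm[1:]):
        result.append((current - previous) / CP_INTERVAL_S)
    return result

def accelerations(velocities_mm_s):
    """Backward difference of velocity (mm/s^2); an assumed analogue of equation (1)."""
    result = [None]
    for previous, current in zip(velocities_mm_s, velocities_mm_s[1:]):
        if previous is None or current is None:
            result.append(None)
        else:
            result.append((current - previous) / CP_INTERVAL_S)
    return result

# Hypothetical planned positions (mm) of one leaf over three CPs.
leaf_velocity = velocities([0.0, 4.24, 4.24])
leaf_acceleration = accelerations(leaf_velocity)
```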
Movement of the MLC leaves was also sorted into several categories. One category
described the state of motion of the leaf at the given CP: 'at
rest', 'moving', 'coming to a stop', or 'moving for only a single CP'. An
MLC leaf was defined to be at rest if it did not move during the CPs before or after the CP
of interest. Leaf movement direction was categorized to differentiate whether the leaf was
moving towards, or away from the isocenter of the MLC. To further investigate the effect of
friction from adjacent MLC leaves on the movement of the leaf of interest, a category was
defined to classify the two adjacent leaves as 'both moving in the same direction', 'both moving
in the opposite direction', 'one moving in the opposite direction', or 'both at rest'. The
CP at which the error occurred (i.e. 1 to 356), the arc number (i.e. '1' or '2'), and the leaf
bank the leaf was a part of (i.e. 'A' or 'B') were also extracted. The extraction of the errors
between planning and delivery, and the calculation of predictive leaf motion parameters, are
displayed in figure 1.
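
Two of the categories above can be sketched in a few lines. The at-rest rule follows the definition given in the text; the direction rule assumes, purely for illustration, that leaf position is expressed as a signed distance from the isocenter so that a shrinking absolute value means motion towards it. The function names and the tolerance are hypothetical.

```python
def is_at_rest(previous_mm, current_mm, next_mm, tolerance_mm=1e-6):
    """True if the leaf did not move in the CP before or the CP after."""
    return (abs(current_mm - previous_mm) <= tolerance_mm
            and abs(next_mm - current_mm) <= tolerance_mm)

def movement_direction(previous_mm, current_mm, tolerance_mm=1e-6):
    """'towards' or 'away' from the isocenter, or 'none' if the leaf did not move.

    Assumes positions are signed distances from the isocenter (an assumption,
    not a convention stated in the text).
    """
    change = abs(current_mm) - abs(previous_mm)
    if change < -tolerance_mm:
        return "towards"
    if change > tolerance_mm:
        return "away"
    return "none"
```

For example, under this assumed convention a leaf going from 10 mm to 8 mm is labelled 'towards', and one going from -10 mm to -12 mm is labelled 'away'.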

Figure 1.  Workflow of the extraction of errors between DICOM-RT and DynaLog files,
and the extraction of leaf motion parameters from planned positions.

2.4.  Model training, validation, and testing

For each institution, the data was split into three separate datasets, termed training, validation,
and testing. Thus there were nine sets in total. Since there was no overlap of plans within
the datasets, each set may be considered as independent of the remaining sets from each
institution.
A single plan was randomly chosen from each institution to be the training set for that
institution (N  =  1, 1, 1). The choice to use only a single plan for training from each institution was
based on two observations. First, the large number of training data available in each plan
(42,720) was determined to be a sufficient number for training of the machine learning algorithms
used in this study through cross validation with different sizes of training data sets. Second,
the errors are dependent on the individual MLC rather than the plan itself, thus any sufficient
amount of data from each unique MLC would be appropriate to train a model. Therefore,
a model specific to the MLC of a given linear accelerator should be built.
A predictive model specific to each institution was fit using only the data from that
institution's training plan. After each model was fit to the training plan, the accuracy of each model
was tested on a validation set consisting of two randomly selected plans from the model's
respective institution. The purpose of the validation set was to find the optimal combination
of leaf motion parameters and to tune any parameter values the model may have. This tuning
process is done on the validation set, rather than the training set, to avoid both over-fitting
of the models, as well as overly optimistic accuracy assessments which do not hold out of
sample. The leaf motion parameters and tunable model parameters were sequentially
iterated over to minimize the root mean square error (RMSE) between predicted and delivered
positions on the validation set, and the model with the lowest RMSE was chosen as the final
model.
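
The per-institution split described above (one training plan, two validation plans, the remainder for testing) can be sketched as follows. The plan counts come from section 2.1; the function name, the integer plan identifiers, and the random seeds are hypothetical.

```python
import random

def split_plans(plan_ids, n_training=1, n_validation=2, seed=0):
    """Randomly assign whole plans to training, validation, and testing sets."""
    shuffled = list(plan_ids)
    random.Random(seed).shuffle(shuffled)
    return {
        "training": shuffled[:n_training],
        "validation": shuffled[n_training:n_training + n_validation],
        "testing": shuffled[n_training + n_validation:],
    }

# Number of plans per institution, from section 2.1 (40 + 16 + 18 = 74).
plans_per_institution = {1: 40, 2: 16, 3: 18}
splits = {inst: split_plans(range(n), seed=inst)
          for inst, n in plans_per_institution.items()}
test_sizes = [len(splits[inst]["testing"]) for inst in (1, 2, 3)]
```

Splitting by plan rather than by individual leaf data point keeps every data point of a given plan in a single set, which is what allows the sets to be treated as independent.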
A final validation of the models was performed using the remaining plans from each
institution, that is, the testing set (N  =  37, 13, 15). Model performance on the testing set was then
assessed using mean absolute error (MAE) and RMSE between the predicted and delivered
positions. In this case, RMSE is used as an alternative to the more common standard
deviation (SD), as the distributions of the errors do not follow the normal distribution. It should be
noted, however, that the formulas for RMSE and SD have the same form and coincide when the
mean error is zero; the difference is in interpretation. The statistics reported in this study are
from the test set only; this is an alternative and
preferable method to using cross-validation, because the untouched test set can be thought of
as real-world data, as the test data was not used during the training or validation process. This
process is shown in figure 2.

Figure 2. Workflow for training, validating, and finally reporting the statistics of
predictive models.
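
The two accuracy metrics, and their relationship to the SD, can be written out as in the sketch below. The function names and the example errors are hypothetical; the SD shown is the population form, which matches the RMSE exactly only when the errors have zero mean.

```python
import math

def mae(errors):
    """Mean absolute error."""
    return sum(abs(e) for e in errors) / len(errors)

def rmse(errors):
    """Root mean squared error (deviation measured from zero)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def population_sd(errors):
    """Population standard deviation (deviation measured from the mean)."""
    mean = sum(errors) / len(errors)
    return math.sqrt(sum((e - mean) ** 2 for e in errors) / len(errors))

zero_mean_errors = [-0.3, 0.1, 0.2]   # hypothetical errors (mm), mean of zero
biased_errors = [0.1, 0.3, 0.5]       # hypothetical errors (mm), mean of 0.3
```

With the zero-mean example the RMSE and SD agree, whereas for the biased example the RMSE is the larger of the two because it also counts the systematic offset.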

2.5.  Testing independence of predictions from training plan

To examine the importance of the choice of training plan on the quality of predictions,
analysis of the predictions was performed using a different randomly selected plan from each
institution for model training than was used initially. For the second model, identical model
parameters as used to train the initial models were used.
Furthermore, to test whether models trained using a different MLC were able to make
accurate predictions for other MLCs, a model trained from each institution was used to
make predictions on testing plans from each of the other institutions. For example, a model
trained using a single plan from Institution 1 was used to make predictions on plans from both
Institutions 2 and 3, and the results compared to the predictions made using models trained
using data from Institutions 2 and 3, respectively.
Due to the non-normal distribution of error predictions, the Mann–Whitney U test was
performed to examine differences in model accuracy using alternative training plans. For each
test performed, the p-value along with the 95% confidence interval of the difference in median
error prediction were reported.
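
For readers unfamiliar with the Mann–Whitney U test, a minimal pure-Python sketch is given below. It uses the normal approximation without a tie correction for the two-sided p-value, which is a simplification and not necessarily what the R routine used in the study computes; the confidence interval of the difference in medians is not reproduced here. All names are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """U statistic of sample x and an approximate two-sided p-value."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (u1 - mean_u) / sd_u
    return u1, math.erfc(abs(z) / math.sqrt(2))
```

In this sketch two identical samples give U equal to its expected value and a p-value of 1, while two completely separated samples give U = 0.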

2.6.  Predictive model parameters and types

Several different models were tested to find a model with the best predictive accuracy. The
models included a simple linear regression model, a multiple linear regression model, a model
based on the random forest algorithm, and a model based on the cubist algorithm (described
below). The inputs to the models were the leaf motion parameters described above, and the
target response for each model was the difference between the planned and delivered MLC
leaf positions.
Of the leaf motion parameters derived from the planned positions, a set of two quantitative
parameters (leaf position and instantaneous velocity) and four qualitative parameters (movement
towards or away from the center, whether the leaf was at rest/starting/stopping/moving for a
single CP, the CP number, and the leaf bank)
were utilized in the final models. None of the other parameters was found to decrease the RMSE on
the validation set.
The R programming language (R Development Core Team 2010) was used for all data
analysis and modeling.

2.6.1. Linear regressions.  For the linear regression modeling, two models were built. The
first, LMV Only, was a simple linear regression of the target response (difference
between planned and delivered MLC leaf positions) on velocity. The second was a multiple linear
regression of the target response on the parameters described in section 2.6.
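
A simple linear regression of the kind used for the LMV Only model can be fitted by ordinary least squares, as in the sketch below. This is a generic closed-form fit in Python, not the R code used in the study, and the velocities and errors shown are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical leaf velocities (mm/s) and positional errors (mm).
velocity = [0.0, 10.0, 20.0]
error = [0.1, 1.1, 2.1]
slope, intercept = fit_line(velocity, error)
predicted_error = slope * 15.0 + intercept  # prediction for a new velocity
```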

2.6.2.  Random forest.  The random forest implementation used was based on Breiman and
Cutler's algorithm (Breiman 2001) as implemented in the R package 'randomForest' (Liaw
and Wiener 2002). Random forests create a predictive model by first randomly selecting a
subset of a given number of features from the feature space. A sample of the training data is
then taken, and the selected features are used to create a decision tree which separates the data
such that the homogeneity of the samples at the terminal node of each branch is maximized.
This process is repeated many times, and each decision tree produced in this way is saved to
create a ‘forest’ of decision trees. To make predictions on new data, the new data point is fed
into each tree, and the tree offers a prediction which is the average of all the data points used
in training which follow the same path through the tree as the new data point. The prediction
of the algorithm is then the average of the predictions from each tree in the forest.
For the random forest model, the number of features randomly sampled as candidates for
each split of the decision tree was four, the value which minimized the RMSE on the
validation set. Any number of trees above 100, and any sample size above 4000 were found to have
little impact on accuracy.
An example of a random forest as applied to the prediction of MLC positional errors is
as follows. First, the algorithm selects four leaf motion parameters, for example leaf
velocity, leaf position, whether the leaf is moving or resting, and whether the leaf is moving
towards or away from the isocenter. Then, a sample of 4000 errors (differences between
planned and delivered positions) and the associated leaf motion parameters for those errors
are extracted. From these, a tree is built with a number of terminal nodes with criteria
such as ‘if leaf velocity is greater than X cm per second, and the leaf is moving towards
the isocenter, the error is Y’. In this study, 100 such trees are built, each having up to 1000
terminal nodes.
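
The sample, build, and average cycle described above can be illustrated with a heavily simplified sketch. Each 'tree' here is a single-split regression stump rather than a tree with up to 1000 terminal nodes, the feature subset is drawn once per tree, and the code is plain Python rather than the 'randomForest' package; all names and data are hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def fit_stump(rows, targets, features):
    """Best single split (feature, threshold, left mean, right mean) by squared error."""
    best = None
    for f in features:
        for threshold in sorted({r[f] for r in rows})[1:]:
            left = [t for r, t in zip(rows, targets) if r[f] < threshold]
            right = [t for r, t in zip(rows, targets) if r[f] >= threshold]
            sse = (sum((t - mean(left)) ** 2 for t in left)
                   + sum((t - mean(right)) ** 2 for t in right))
            if best is None or sse < best[0]:
                best = (sse, f, threshold, mean(left), mean(right))
    if best is None:  # no split possible: predict the sample mean everywhere
        return features[0], float("inf"), mean(targets), mean(targets)
    return best[1:]

def fit_forest(rows, targets, n_trees=100, n_features=1, seed=0):
    """Fit each stump to a bootstrap sample using a random feature subset."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in range(len(rows))]
        feats = rng.sample(range(len(rows[0])), n_features)
        forest.append(fit_stump([rows[i] for i in idx],
                                [targets[i] for i in idx], feats))
    return forest

def predict(forest, row):
    """Average of the individual stump predictions."""
    return mean([left if row[f] < thr else right for f, thr, left, right in forest])

# Hypothetical data: (velocity in mm/s, fast-leaf flag) -> error in mm.
rows = [(v, 1 if v >= 10 else 0) for v in range(20)]
targets = [0.1 * v for v in range(20)]
forest = fit_forest(rows, targets)
```

Averaging over many such weak predictors is what smooths the step-like output of any individual tree.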

2.6.3. Cubist.  Cubist is a rule-based model consisting of several different methodologies.
The operation of the cubist algorithm is similar to that of the random forest; however, there are several
optimizations. One such optimization is that while a random forest makes predictions using an
average of the training points within the terminal node of a given branch, the cubist algorithm
builds a linear regression model at each terminal node. There are several other optimizations,
and the algorithm in its entirety is described in detail in Kuhn and Johnson (2013). Tunable
parameters of the cubist algorithm include committees and neighbors. Committees are
somewhat analogous to the number of decision trees used to contribute their predictions to the
final prediction, and neighbors represent a number of neighboring training points which
can be used to aid in prediction. The values of committees and neighbors in the final model
were 100 and 0, respectively. The R implementation of the cubist algorithm from the 'Cubist'
package was used (Kuhn et al 2014).
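
The difference between averaging within a terminal node and fitting a linear model within it can be seen in the small sketch below. It illustrates only that single idea and is not an implementation of the cubist algorithm; the node contents and function names are hypothetical.

```python
def node_mean_prediction(node_errors):
    """Random-forest style: predict the mean of the training errors in the node."""
    return sum(node_errors) / len(node_errors)

def node_linear_prediction(node_velocities, node_errors, new_velocity):
    """Model-tree style: fit a least-squares line within the node, then evaluate it."""
    n = len(node_velocities)
    mean_v = sum(node_velocities) / n
    mean_e = sum(node_errors) / n
    sxx = sum((v - mean_v) ** 2 for v in node_velocities)
    sxy = sum((v - mean_v) * (e - mean_e)
              for v, e in zip(node_velocities, node_errors))
    slope = sxy / sxx
    return mean_e + slope * (new_velocity - mean_v)

# Hypothetical training points that fell into one terminal node.
node_velocities = [10.0, 12.0, 14.0]   # mm/s
node_errors = [1.0, 1.2, 1.4]          # mm
by_mean = node_mean_prediction(node_errors)
by_line = node_linear_prediction(node_velocities, node_errors, 14.0)
```

For a new point at the edge of the node (14 mm/s), the node mean gives 1.2 mm whereas the within-node line follows the local trend and gives 1.4 mm.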

2.7.  Integration with treatment plan for gamma analysis

For each plan from Institution 1, 2D dose distributions of each VMAT plan as delivered with
6 MV photons were acquired with a MapCHECK2 detector array (Sun Nuclear Corporation,
Melbourne, FL). The MapCHECK2 was inserted into a MapPHAN (Sun Nuclear Corporation,
Melbourne, FL) during delivery. Before delivery, the relative responses of each detector in the
MapCHECK2 array, as well as the absolute response of the detector to a known dose were
calibrated according to manufacturer specifications. The absolute dose of the linac was also
calibrated according to the American Association of Physicists in Medicine Task Group 51
(AAPM TG-51) protocol (Almond et al 1999).
A CT image of the device setup was imported into the Eclipse system and used for the
calculation of the 2D dose distributions of plans with either planned or predicted positions.
The distributions were calculated with a 2 mm calculation grid, PRO3 optimizer, and AAA
algorithm, as above.
After delivery and calculation, both global and local gamma evaluations were performed
with SNC Patient software (ver. 6.1.2, Sun Nuclear Corporation, Melbourne, FL). Gamma
criteria of 3%/3 mm, 2%/2 mm, and 1%/2 mm were used with a 10% threshold for the ROI,
as frequently cited in the literature (Iftimia et al 2010, Heilemann et al 2013). The differences
between the passing rates using the planned MLC positions, and the passing rates using the
predicted MLC positions were compared using paired t-tests to assess the difference in mean
passing rates between the two.
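
To make the gamma criteria concrete, a minimal one-dimensional global gamma evaluation is sketched below. It is not the algorithm of the commercial software used in the study: it works on a 1D profile, searches only grid points without interpolation, and normalises the dose criterion and the 10% threshold to the maximum measured dose, all of which are assumptions made for illustration.

```python
def gamma_pass_rate(measured, calculated, spacing_mm,
                    dose_pct, dta_mm, threshold_pct=10.0):
    """Percentage of measured points above threshold with gamma <= 1 (1D, global)."""
    max_dose = max(measured)
    dose_tolerance = dose_pct / 100.0 * max_dose
    cutoff = threshold_pct / 100.0 * max_dose
    passed = total = 0
    for i, d_meas in enumerate(measured):
        if d_meas < cutoff:
            continue  # point lies outside the evaluated region
        total += 1
        # Squared gamma: smallest combined distance/dose disagreement
        # over all calculated grid points.
        gamma_sq = min(((j - i) * spacing_mm / dta_mm) ** 2
                       + ((d_calc - d_meas) / dose_tolerance) ** 2
                       for j, d_calc in enumerate(calculated))
        if gamma_sq <= 1.0:
            passed += 1
    return 100.0 * passed / total

# Hypothetical profiles: the calculation is shifted by one grid point.
measured = [0.0, 50.0, 100.0, 50.0, 0.0]
calculated = [0.0, 0.0, 50.0, 100.0, 50.0]
rate_fine_grid = gamma_pass_rate(measured, calculated, 1.0, 3.0, 3.0)
rate_coarse_grid = gamma_pass_rate(measured, calculated, 5.0, 3.0, 3.0)
```

A 1 mm shift is within the 3 mm distance-to-agreement criterion, so every point passes; the same one-point shift on a 5 mm grid exceeds it and every evaluated point fails.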

2.8.  Integration with treatment plan for DVH analysis

For five H&N patients from Institution 1 for whom patient CT data was available, DICOM-RT
files were reconstructed with predicted MLC positions. These, along with the planned DICOM-RT
files, were imported into the Eclipse system. Dose distributions to the patient CT images were
calculated using the same parameters as above, with the exception of the calculation grid size, which
was reduced to 1 mm. For the target volume, clinically relevant dose-volumetric parameters such
as the dose received by 95% of the target volume (D95%), D5%, the minimum dose, the maximum
dose, and the mean dose were compared between planned and predicted VMAT plans. For organs
at risk (OARs) in the H&N plans, the volume of each parotid gland receiving 50% of the dose
(V50%), and mean dose to each parotid gland and each sub-mandibular gland (SMG) were
compared. Differences in the dose volumetric parameters between calculations using planned MLC
positions versus delivered positions, and predicted positions versus delivered positions were
compared using paired t-tests to assess the mean differences between the two.
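
The dose-volumetric parameters named above can be read from a list of voxel doses as in the sketch below. It assumes equal-volume voxels and does no interpolation between voxels, so it is only an approximation of what a TPS reports; the function names and the dose values are hypothetical.

```python
import math

def dose_at_volume(voxel_doses, volume_pct):
    """D_x%: the minimum dose received by the hottest x% of the volume."""
    ranked = sorted(voxel_doses, reverse=True)
    n_voxels = max(math.ceil(volume_pct * len(ranked) / 100), 1)
    return ranked[n_voxels - 1]

def volume_at_dose(voxel_doses, dose_level):
    """V_d: percentage of the volume receiving at least dose_level."""
    return 100.0 * sum(d >= dose_level for d in voxel_doses) / len(voxel_doses)

# Hypothetical structure with 100 equal-volume voxels (doses 1 to 100, arbitrary units).
voxel_doses = list(range(1, 101))
d95 = dose_at_volume(voxel_doses, 95)
d5 = dose_at_volume(voxel_doses, 5)
v50 = volume_at_dose(voxel_doses, 50)
mean_dose = sum(voxel_doses) / len(voxel_doses)
```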

3. Results

3.1.  Predictive leaf motion parameters

Several leaf motion parameters were particularly important in increasing model accuracy.
The motion parameter which offered the most predictive ability was leaf velocity, which had
an approximately linear relationship with error magnitude (β  =  0.129, CI  =  0.128 to 0.130,
p  <  0.001), with coefficient of determination, R^2, of 0.902 (p  <  0.001). This relationship is
shown in figure 3(A).
Whether the leaf was moving towards or away from the isocenter of the MLC also had a
statistically significant effect on the mean error magnitude, making this category an important
predictive motion parameter. The MAE of all leaves moving away from the center was
1.37 mm (RMSE  =  0.99 mm), while the MAE of leaves moving towards the center was only
1.14 mm (RMSE  =  0.97 mm), a difference in means of 0.235 mm with a 95% confidence interval
(CI) of 0.231 to 0.238 mm (p  <  0.001) by the Welch two-sample t-test. A boxplot expressing
the difference between movement directions is shown in figure 3(B).

Figure 3. Predictive value of leaf motion characteristics. Error magnitude (the
difference between planned and delivered positions) versus individual leaf velocity
on 10 000 randomly sampled errors from all institutions is shown in plot (A), with a
linear regression of the sample in blue. β represents the slope of the line (0.129 mm of
error for every increase in velocity of 1 mm/s), and R^2 represents the coefficient of
determination. Plot (B) shows the difference in median error magnitudes between MLC
leaves moving toward or away from the isocenter of the MLC.

3.2.  Predictive accuracy

The model based on the cubist algorithm outperformed all other models. Planned, delivered,
and predicted leaf positions of a single MLC leaf for two representative sets of CPs are shown
in figure 4. The figure shows that in all cases the predicted positions more closely coincide
with the delivered positions than do the planned positions.
The MAE and root mean squared error (RMSE) between planned and delivered positions, and
between predicted and delivered positions, for moving, resting, and all MLC leaves from each
institution are summarized in tables 1–3, respectively. The considerably lower error between
predicted and delivered versus planned and delivered positions for the cubist model is shown
in figure 5.
For Institution 1, the MAE between the planned leaf positions and the delivered leaf
positions of moving MLC leaves was 1.284 mm, with an RMSE of
1.636 mm. The MAE between positions predicted by the cubist model and the delivered
positions was 0.253 mm (RMSE  =  0.371 mm). Therefore, the predictions were, on average,
more than 1 mm closer to the delivered positions than were the planned positions.
Institutions 2 and 3 showed similar tendencies. For Institution 2, the MAE between planned
and delivered positions of moving leaves was 1.409 (RMSE  =  1.699) mm, and the difference
between predicted and delivered was 0.278 (0.387) mm. These values for Institution 3 were
1.145 (1.495) mm and 0.274 (0.426) mm.

3.3.  Independence of predictions from choice of training plan

The results of testing the dependence of the predictions on choice of training plan are shown
in table 4. For Institutions 1 and 2 there was no significant difference in the predictions made

Figure 4.  Planned, delivered, and predicted (cubist) positions of a single MLC leaf
from an H&N plan over two sets of CPs. In plot (A), the leaf is planned to drop rapidly,
with delivered positions lagging until the leaf slows. Plot (B) shows a set of CPs where
heavy modulation is planned, but the delivered positions consistently fail to reach the
target. In all cases the predicted positions are closer to the delivered positions than are
the planned positions.

Table 1.  Model performance in predicting delivered MLC positions for moving MLCs
from the test set (N  =  65). All values are in mm.

                              All plans          H&N plans        Prostate plans
Institution  Model            MAE(a)   RMSE(b)   MAE     RMSE     MAE     RMSE
1            Planned          1.284    1.636     1.358   1.354    1.086   1.489
             LMV Only(c)      0.324    0.45      0.354   0.476    0.244   0.373
             LM(d)            0.282    0.407     0.302   0.423    0.227   0.359
             Random forest    0.275    0.395     0.29    0.407    0.237   0.36
             Cubist           0.253    0.371     0.269   0.384    0.21    0.332
2            Planned          1.409    1.699     1.458   1.735    1.361   1.663
             LMV Only         0.313    0.409     0.315   0.408    0.311   0.41
             LM               0.286    0.372     0.291   0.375    0.281   0.369
             Random forest    0.284    0.384     0.29    0.387    0.279   0.381
             Cubist           0.278    0.387     0.285   0.393    0.272   0.38
3            Planned          1.145    1.495     1.153   1.504    1.075   1.412
             LMV Only         0.356    0.501     0.354   0.5      0.375   0.517
             LM               0.305    0.448     0.3     0.443    0.346   0.483
             Random forest    0.314    0.454     0.313   0.454    0.318   0.451
             Cubist           0.274    0.426     0.273   0.424    0.286   0.44

(a) Mean absolute error.
(b) Root mean squared error.
(c) Linear regression model using only leaf velocity.
(d) Linear regression model with all leaf motion parameters.

Table 2.  Model performance in predicting delivered MLC positions for MLCs at rest
from the test set (N  =  65). All values are in mm.

                              All plans          H&N plans        Prostate plans
Institution  Model            MAE(a)   RMSE(b)   MAE     RMSE     MAE     RMSE
1            Planned          0.084    0.158     0.159   0.246    0.043   0.075
             LMV Only(c)      0.085    0.158     0.16    0.246    0.044   0.075
             LM(d)            0.052    0.085     0.097   0.121    0.027   0.057
             Random forest    0.039    0.074     0.061   0.106    0.028   0.048
             Cubist           0.027    0.054     0.056   0.088    0.012   0.017
2            Planned          0.037    0.109     0.051   0.132    0.033   0.1
             LMV Only         0.039    0.109     0.052   0.132    0.034   0.1
             LM               0.016    0.045     0.022   0.06     0.015   0.038
             Random forest    0.021    0.051     0.029   0.068    0.018   0.044
             Cubist           0.005    0.013     0.007   0.019    0.005   0.01
3            Planned          0.033    0.129     0.037   0.136    0.023   0.11
             LMV Only         0.034    0.129     0.038   0.136    0.024   0.11
             LM               0.025    0.087     0.026   0.082    0.022   0.099
             Random forest    0.022    0.084     0.023   0.08     0.019   0.095
             Cubist           0.009    0.04      0.009   0.033    0.01    0.053

(a) Mean absolute error.
(b) Root mean squared error.
(c) Linear regression model using only leaf velocity.
(d) Linear regression model with all leaf motion parameters.

Table 3.  Model performance in predicting delivered MLC positions for moving and
resting MLCs from the test set (N = 65).

                                All plans                   H&N plans        Prostate plans
Institution  Model              MAE (mm) [a]  RMSE (mm) [b]  MAE     RMSE    MAE     RMSE
1            Planned            0.513         0.987          0.802   1.247   0.24    0.651
             LMV Only [c]       0.17          0.298          0.264   0.387   0.082   0.176
             LM [d]             0.134         0.253          0.207   0.32    0.065   0.164
             Random forest      0.124         0.244          0.183   0.307   0.067   0.162
             Cubist             0.108         0.226          0.17    0.288   0.049   0.145
2            Planned            0.39          0.867          0.636   0.124   0.281   0.724
             LMV Only           0.109         0.228          0.162   0.282   0.086   0.199
             LM                 0.086         0.193          0.134   0.246   0.064   0.163
             Random forest      0.089         0.2            0.138   0.255   0.067   0.169
             Cubist             0.075         0.196          0.122   0.254   0.055   0.164
3            Planned            0.576         1.048          0.647   1.115   0.292   0.72
             LMV Only           0.191         0.362          0.211   0.38    0.114   0.278
             LM                 0.162         0.319          0.176   0.332   0.105   0.259
             Random forest      0.165         0.323          0.182   0.34    0.096   0.242
             Cubist             0.139         0.299          0.153   0.314   0.08    0.227

[a] Mean absolute error.
[b] Root mean squared error.
[c] Linear regression model using only leaf velocity.
[d] Linear regression model with all leaf motion parameters.

Figure 5.  MAE between planned and delivered positions, and between predicted and
delivered positions for both resting and moving MLC leaves. In all cases the predicted
positions are much closer to the delivered positions than are the planned positions.

Table 4.  Differences in the predictions made by models trained using different plans
from the same institution.

Institution  Difference in median (mm)  95% CI [a]        p value
1            0.001                      −0.001 to 0.003   0.294
2            0.001                      −0.001 to 0.004   0.275
3            0.025                      0.021 to 0.029    <0.001

[a] 95% confidence interval.

Table 5.  Differences in the predictions made by models trained using different
institutions than the testing data.

Model            Testing          Difference in
institution [a]  institution [b]  median (mm)    95% CI [c]    p value
1                2                0.048          0.045–0.051   <0.001
                 3                0.080          0.076–0.084   <0.001
2                1                0.058          0.055–0.060   <0.001
                 3                0.018          0.014–0.022   <0.001
3                1                0.064          0.062–0.066   <0.001
                 2                0.007          0.004–0.010   <0.001

[a] The institution whose data were used to train the model.
[b] The institution whose data were used to test the model.
[c] 95% confidence interval.

Figure 6.  Boxplots showing the increase in passing rates through the utilization of
predicted MLC positions. Plots (A) and (B) show the increases in local and global
gamma passing rate when using predicted positions for H&N plans, respectively. Plots
(C) and (D) show the same information for prostate plans.

Table 6.  The change in gamma passing rates due to the inclusion of predicted errors in
the plans is shown. For all local criteria, and for all H&N plans the passing rate of the
plan is improved when errors are predicted.

                      Local gamma passing rates              Global gamma passing rates
Plans     Criterion   PR change (%) [a]  p value  95% CI [b]   PR change (%)  p value  95% CI
H&N       1%/2 mm     4.17               <0.001   3.32–5.03    3.53           <0.001   3.00–4.06
          2%/2 mm     3.6                <0.001   2.85–4.35    1.47           <0.001   1.07–1.86
          3%/3 mm     1.83               <0.001   1.42–2.24    0.41           0.002    0.18–0.64
Prostate  1%/2 mm     –                  0.08     –            –              0.50     –
          2%/2 mm     0.83               0.005    0.29–1.34    −0.16          0.02     −0.29–0.03
          3%/3 mm     0.64               <0.001   0.35–0.92    −0.09          0.03     −0.17–0.01

[a] Passing rate change from using planned positions to predicted positions.
[b] 95% confidence interval.

using different plans for training the model. For Institution 3, there was a significant
difference, with p-value < 0.001. However, the confidence interval was from 0.021 to 0.029 mm,
indicating that the effect of using a different training plan was small.

Figure 7.  Representative DVH curves showing the curves of (left to right) (A): left
parotid, right parotid, left SMG, right SMG, and (B): PTV 48 Gy, PTV 54 Gy, PTV 67.5 Gy.
In all cases the DVH curves calculated using the predicted positions are in closer
agreement with the delivered curves than are the planned curves.

Figure 8.  Average percent differences in dose volumetric parameters for planned versus
delivered positions, and for predicted versus delivered positions. Plots (A)–(C) show the
percent changes for PTVs, parotids, and SMGs, respectively. Stars above the bars
indicate significance (* = p < 0.05, ** = p < 0.01, and *** = p < 0.001).

The differences in predictions when using a model trained with data from a different
institution (and therefore a different MLC) are presented in table 5. For all combinations
of training institution and testing institution, there were significant differences between
the predictions made by the model trained on data from the same institution as the testing
data and the predictions made by the model trained on data from a different institution.
However, the estimated differences in medians of the predictions were all less than 0.1 mm.

3.4.  Gamma analysis

The analysis of the improvements in gamma passing rates was separated into four categories:
local and global passing rates for both H&N and prostate plans. These data are summarized in
table 6 and figure 6. Table 6 presents the mean differences between the passing rates of the
plans utilizing the planned MLC positions, and plans utilizing predicted positions.
In all cases for H&N plans the passing rate is increased by calculating the dose plane with
the predicted positions. This indicates that the predicted positions better represent the reality
of delivery than do the planned positions. For prostate plans there was a similar trend; however,
since the global passing rates for prostate plans were often near 100%, the difference was
generally much smaller.
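
The distinction between local and global passing rates can be illustrated with a simplified one-dimensional sketch. This is not the analysis pipeline used in the study: clinical gamma analysis operates on 2D or 3D dose distributions, usually with interpolation and a low-dose threshold. The function name, the grid spacing, and the dose profiles below are assumptions made for illustration only.

```python
import math

def gamma_passing_rate(reference, evaluated, spacing_mm, dose_pct, dta_mm, local=False):
    """Return the percentage of reference points with gamma <= 1 (1D profiles).

    reference, evaluated: dose samples on the same regular grid (same units).
    dose_pct / dta_mm: criterion such as 2 (%) / 2 (mm).
    local=False normalizes the dose tolerance to the maximum reference dose
    (global gamma); local=True normalizes to the dose at each reference point.
    """
    if len(reference) != len(evaluated) or not reference:
        raise ValueError("profiles must be non-empty and of equal length")
    max_ref = max(reference)
    passed = counted = 0
    for i, d_ref in enumerate(reference):
        norm = d_ref if local else max_ref
        if norm <= 0:
            continue  # no meaningful dose tolerance at this point
        counted += 1
        tol = dose_pct / 100.0 * norm
        # Gamma is the minimum combined distance/dose metric over all evaluated points.
        gamma = min(
            math.hypot((j - i) * spacing_mm / dta_mm, (d_eval - d_ref) / tol)
            for j, d_eval in enumerate(evaluated)
        )
        if gamma <= 1.0:
            passed += 1
    if counted == 0:
        raise ValueError("no reference points with positive dose")
    return 100.0 * passed / counted

# Hypothetical 1D dose profiles in Gy on a 1 mm grid (illustrative values only).
measured = [0.2, 0.5, 1.0, 1.6, 2.0, 1.6, 1.0, 0.5, 0.2]
calculated = [0.2, 0.55, 1.1, 1.7, 2.0, 1.5, 0.9, 0.45, 0.2]

global_rate = gamma_passing_rate(measured, calculated, 1.0, 2.0, 2.0)
local_rate = gamma_passing_rate(measured, calculated, 1.0, 2.0, 2.0, local=True)
print(f"global 2%/2 mm: {global_rate:.1f}%  local 2%/2 mm: {local_rate:.1f}%")
```

Since the local dose tolerance at any point cannot exceed the global tolerance, the local criterion is the stricter of the two, which is consistent with global passing rates sitting closer to 100% and leaving less room for improvement.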

3.5.  DVH analysis

Representative DVH curves for OARs and planning target volumes (PTVs) of the patient
dose distributions as calculated using the planned, predicted, and delivered MLC positions
are presented in figure 7. The average differences in dose volumetric parameters between
planned and delivered, and predicted and delivered, are shown in figure 8. In all cases the dose
volumetric parameters calculated with the predicted positions are in closer agreement with the
delivered parameters than are the planned parameters. Figure 8 shows that the largest
differences are present in the OAR dose distributions. For instance, the average difference between
planning and delivery of the volume of the right parotid receiving 50% of the dose was 8.16%
(SD = 3.3%, p = 0.005), whereas the difference between the predicted and delivered was
not statistically significant (0.18%, SD = 0.96%, p = 0.7). The change in the PTVs was of the
same general magnitude as the changes in the OARs, but owing to the much larger dose
prescriptions they had smaller percent changes.

4. Discussion

A model capable of predicting errors for specific MLC leaves could help to inform better
optimization algorithms for creating plans capable of being delivered as intended. In this study
such a model was built and validated. First, it was shown that MLC errors are predictable to
a high degree of accuracy. Second, it was shown that such MLC errors have an appreciable
impact on the gamma passing rate of the plan, and that a new plan corrected for the errors
raises the passing rates. Finally, dose-volume histograms (DVHs) recalculated with predicted
positions incorporated into the plan provide the treatment planner with a better representation
of the deliverable dose distributions for the PTV and OARs. In this study, several parameters which offer predictive
capability were established. Although much of the variance in the positional errors is captured
by leaf velocity, the linear model taking only velocity into account was outperformed by all
other models. This indicates that there are other patterns in the data not related to velocity
which may result in discrepancies between planned and delivered positions. The other patterns
were well predicted by the best performing cubist model.
The inclusion of a leaf motion parameter in the final model does not necessarily imply that
the parameter has a real effect on error magnitude. Leaf bank is an example of this: its
inclusion increases predictive accuracy not because one leaf bank is more error prone, but
because it allows the model to switch the direction of the predicted error when the
orientation of the coordinate system switches after the first treatment arc. In this study there was
no significant difference between the means of errors from leaf banks A and B by the Welch
two sample t-test (p = 0.15). This is in accordance with the findings of Kerns et al (2014), and
in opposition to Stell et al (2004).
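
As a rough illustration of the Welch two sample comparison mentioned above, the sketch below computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom using only the Python standard library. The function name and the error values for the two leaf banks are hypothetical; the study itself reports using R for its statistics.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for Welch's unequal-variance t-test."""
    na, nb = len(sample_a), len(sample_b)
    if na < 2 or nb < 2:
        raise ValueError("each sample needs at least two observations")
    # Squared standard errors of the two sample means.
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    if va + vb == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical signed leaf errors in mm for each bank (illustrative values only).
bank_a_errors = [0.31, -0.12, 0.45, 0.08, -0.27, 0.19]
bank_b_errors = [0.22, -0.05, 0.38, 0.15, -0.31, 0.11]

t_stat, dof = welch_t(bank_a_errors, bank_b_errors)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

Turning the statistic into a p value requires the cumulative distribution of Student's t, which the standard library does not provide; a statistics package would normally be used for that step.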
Some of the leaf motion parameters which were hypothesized to be related to error
magnitude did not have an appreciable effect on the models, for example, leaf acceleration, and the
movement of adjacent MLC leaves. Inclusion of either of these parameters led to over-fitting
of the training data, and consequently increased the RMSE of the models on the validation set,
hindering the generalizability of the models.
It has been posited that dose errors correlate with gap error, and not necessarily with
individual leaf position errors (Losasso 2008). In contrast to dynamic IMRT plans, where leaves
on opposing leaf banks move in the same direction, in VMAT plans the leaves move back and
forth in both directions. Therefore it is important to know how many of the errors assessed in
this study are gap errors (where the opposing leaves have errors in opposite directions, leading
to larger or smaller leaf gaps than intended), and how many are shift errors (where opposing
leaves are both shifted in the same direction, with little change to the leaf gap). For H&N plans
in this study, the average proportion of errors which were gap errors was 31.77% (SD = 1.46%).
Prostate plans generally showed a lower proportion, with 15.70% (SD = 1.34%) of the errors being gap
errors. It was also found that, in general, when the errors of opposing leaves were in opposite
directions, the average change in leaf gap was 1.74 mm (SD = 0.44 mm), whereas for shift
errors the average change in leaf gap was 0.35 mm (SD = 0.20 mm). That is, although there are
fewer gap errors, the magnitude of gap errors is typically much larger than that of shift errors.
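
The distinction between gap and shift errors can be sketched as follows. The sketch assumes that the errors of both opposing leaves are expressed as delivered minus planned position along one shared axis, and that a pair in which either error is exactly zero is counted as a shift; both conventions, as well as the function names and values, are assumptions for illustration rather than the procedure used in the study.

```python
def classify_pair(err_a, err_b):
    """Classify one opposing leaf pair and return (kind, change in leaf gap in mm).

    err_a, err_b: delivered minus planned position (mm) of the two opposing
    leaves, both measured along the same axis (assumed convention).
    """
    gap_change = abs(err_b - err_a)
    # Opposite signs open or close the gap; equal signs move it sideways.
    kind = "gap" if err_a * err_b < 0 else "shift"
    return kind, gap_change

def summarize(pairs):
    """Return the fraction of gap errors and the mean gap change per error kind."""
    if not pairs:
        raise ValueError("at least one leaf pair is required")
    results = [classify_pair(a, b) for a, b in pairs]
    summary = {"gap_fraction": sum(1 for k, _ in results if k == "gap") / len(results)}
    for kind in ("gap", "shift"):
        changes = [c for k, c in results if k == kind]
        summary[f"mean_{kind}_change_mm"] = sum(changes) / len(changes) if changes else None
    return summary

# Hypothetical error pairs in mm (illustrative values only, not study data).
pairs = [(-0.8, 0.9), (0.4, 0.6), (-0.3, -0.1), (0.5, -0.5)]
print(summarize(pairs))
```

With errors in opposite directions the two magnitudes add in the gap change, whereas for same-direction errors they largely cancel, which is why gap errors produce the larger changes in leaf gap.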
It was shown that the accuracy of predictions for a given MLC was independent of the
choice of training plan. However, using training data from a different MLC of the same model
led to discrepancies in predictions. It is therefore recommended that a model specific to each
MLC should be trained, and used for predictions only for that specific MLC.
MLC leaf position errors are potentially a contributing factor in radiotherapy treatment
plans failing to be delivered as intended. The prediction of leaf position errors could be used
as a component of a modulation index to predict the delivery accuracy of a plan pre-delivery.
For example, an MLC modulation index could be built as a linear combination or ratio of the
number of predicted errors above or below certain thresholds. Although methods such as these
may be able to predict deliverability, pre-treatment QA should continue to be an important
part of the treatment workflow.
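
One possible form of such an index is sketched below, purely as an illustration. The function names, thresholds, and weights are hypothetical; the text does not specify a particular definition, and any practical index would need to be validated against measured delivery accuracy.

```python
def error_ratio_index(predicted_errors_mm, threshold_mm):
    """Fraction of predicted leaf errors whose magnitude exceeds threshold_mm."""
    if not predicted_errors_mm:
        raise ValueError("at least one predicted error is required")
    above = sum(1 for e in predicted_errors_mm if abs(e) > threshold_mm)
    return above / len(predicted_errors_mm)

def weighted_index(predicted_errors_mm, thresholds_mm, weights):
    """Linear combination of the error fractions above several thresholds."""
    if len(thresholds_mm) != len(weights):
        raise ValueError("one weight is required per threshold")
    return sum(
        w * error_ratio_index(predicted_errors_mm, t)
        for t, w in zip(thresholds_mm, weights)
    )

# Hypothetical predicted errors (mm), thresholds (mm), and weights.
predicted_errors = [0.1, -0.4, 0.6, -1.2, 2.0]
index = weighted_index(predicted_errors, thresholds_mm=[0.5, 1.0], weights=[0.5, 0.5])
print(f"modulation index sketch: {index:.2f}")
```

Higher values of this sketch index would indicate a larger share of control points with sizeable predicted errors, and could be compared across plans before delivery.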
Predictions may also be used to further investigate the dosimetric effects of MLC errors.
Dosimetric effects of random MLC errors have been studied in the past by sampling from
a Gaussian (Rangel and Dunscombe 2009, Oliver et al 2010) or from a uniform distribution
(Mu et al 2007, Yan et al 2009, Bai et al 2013). However, neither of these distributions
accurately models a realistic error distribution, nor do they take into account the directional
dependence of leaf errors on leaf velocity. Therefore, by utilizing the method for error
prediction described in this study, more accurate assessments of the dosimetric effects of MLC
errors may be made. This study is limited in that it only considered Varian Millennium
120 MLCs. However, there is nothing precluding the methods from being adapted to other
MLCs, and this will be undertaken as future work.

It is important to note that this work is concerned with the internal representation of MLC
positions used to calculate the dose distributions within the TPS. If the positions sent to the
MLC controller were altered to be the predicted positions, there would still be positional errors.

5. Conclusions

In this study, it was shown that MLC leaf position errors can be predicted to a high degree of
accuracy by utilizing statistical learning techniques. All models took only a single plan as an
input, are simple to implement, and take approximately one second to train. By utilizing the
predicted positions, rather than the planned positions, to calculate dose distributions,
it was shown that gamma passing rates can be increased, and that errors in MLC positions that
impact dose volumetric parameters can be reduced. The methodology developed in this study
was shown to be generalizable to other institutions by assessing their own institutional data.
By incorporating and correcting for the predicted errors in MLC positions, optimization
routines for encoding MLC leaf positions may be improved, and would allow for more
realistic calculation of the dose distributions as truly delivered to the patient.

Acknowledgments

This work was in part supported by the National Research Foundation of Korea (490-
20150036, 490-20140041 and 5267-20150100) grant funded by the Korea government. The
authors are grateful to the editor and associate editors for their valuable comments and review
of this paper.

References

Almond P R, Biggs P J, Coursey B M, Hanson W F, Huq M S, Nath R and Rogers D W O 1999 AAPM’s
TG-51 protocol for clinical reference dosimetry of high-energy photon and electron beams Med.
Phys. 26 1847–70
Bai S, Li G, Wang M, Jiang Q, Zhang Y and Wei Y 2013 Effect of MLC leaf position, collimator
rotation angle, and gantry rotation angle errors on intensity-modulated radiotherapy plans for
nasopharyngeal carcinoma Med. Dosimetry 38 143–7
Breiman L 2001 Random forests Mach. Learn. 45 5–32
Chen F, Rao M, Ye J-S, Shepard D M and Cao D 2011 Impact of leaf motion constraints on IMAT plan
quality, deliver accuracy, and efficiency Med. Phys. 38 6106–18
Heilemann G, Poppe B and Laub W 2013 On the sensitivity of common gamma-index evaluation
methods to MLC misalignments in Rapidarc quality assurance Med. Phys. 40 031702
Iftimia I, Cirino E T, Xiong L and Mower H W 2010 Quality assurance methodology for Varian RapidArc
treatment plans J. Appl. Clin. Med. Phys. 11 130–43
Kerns J R, Childress N and Kry S F 2014 A multi-institution evaluation of MLC log files and performance
in IMRT delivery Radiat. Oncol. 9 176
Klein E E et al 2009 Task group 142 report: quality assurance of medical accelerators Med. Phys.
36 4197–212
Kuhn M and Johnson K 2013 Applied Predictive Modeling (Berlin: Springer)
Kuhn M, Weston S, Keefer C and Coulter N 2014 C code for cubist by Ross Quinlan Cubist: Rule- and
Instance-Based Regression Modeling
Kutcher G J et al 1994 Comprehensive QA for radiation oncology: report of AAPM radiation therapy
committee task group 40 Med. Phys. 21 581
Li J G, Dempsey J F, Ding L, Liu C and Palta J R 2003 Validation of dynamic MLC-controller log files
using a 2D diode array Med. Phys. 30 799
Li R and Xing L 2013 An adaptive planning strategy for station parameter optimized radiation therapy
(SPORT): segmentally boosted VMAT Med. Phys. 40 050701
Liaw A and Wiener M 2002 Classification and regression by randomForest R News 2 18–22
Losasso T 2008 IMRT delivery performance with a Varian multileaf collimator Int. J. Radiat. Oncol.
Biol. Phys. 71 S85–8
Masi L, Doro R, Favuzza V, Cipressi S and Livi L 2013 Impact of plan parameters on the dosimetric
accuracy of volumetric modulated arc therapy Med. Phys. 40 071718
Miura H, Tanooka M, Fujiwara M, Takada Y, Doi H, Odawara S, Kosaka K, Kamikonya N and Hirota S
2014a Predicting delivery error using a DICOM-RT plan for volumetric modulated arc therapy Int.
J. Med. Phys. Clin. Eng. Radiat. Oncol. 3 82–7
Miura H, Tanooka M, Inoue H, Fujiwara M, Kosaka K, Doi H, Takada Y, Odawara S, Kamikonya N and
Hirota S 2014b DICOM-RT plan complexity verification for volumetric modulated arc therapy Int.
J. Med. Phys. Clin. Eng. Radiat. Oncol. 3 117–24
Mu G, Ludlum E and Xia P 2007 Impact of MLC leaf position errors on simple and complex IMRT plans
for head and neck cancer Phys. Med. Biol. 53 77–88
Oliver M, Gagne I, Bush K, Zavgorodni S, Ansbacher W and Beckham W 2010 Clinical significance
of multi-leaf collimator positional errors for volumetric modulated arc therapy Radiother. Oncol.
97 554–60
Otto K 2008 Volumetric modulated arc therapy: IMRT in a single gantry arc Med. Phys. 35 310–7
Park J M, Park S-Y, Kim H, Kim J H, Carlson J and Ye S-J 2014a Modulation indices for volumetric
modulated arc therapy Phys. Med. Biol. 59 7315–40
Park S-Y, Kim I H, Ye S-J, Carlson J and Park J M 2014b Texture analysis on the fluence map to evaluate
the degree of modulation for volumetric modulated arc therapy Med. Phys. 41 111718
Park J M, Wu H G, Kim J H, Carlson J N K and Kim K 2015 The effect of MLC speed and acceleration
on the plan delivery accuracy of VMAT BJR 88 20140698
R Development Core Team 2010 R: a language and environment for statistical computing R Foundation
for Statistical Computing (http://www.R-project.org/)
Ramsey C R, Spencer K M, Alhakeem R and Oliver A L 2001 Leaf position error during conformal
dynamic arc and intensity modulated arc treatments Med. Phys. 28 67
Rangel A and Dunscombe P 2009 Tolerances on MLC leaf position accuracy for IMRT delivery with a
dynamic MLC Med. Phys. 36 3304
Stell A M, Li J G, Zeidan O A and Dempsey J F 2004 An extensive log-file analysis of step-and-shoot
intensity modulated radiation therapy segment delivery errors Med. Phys. 31 1593
Tatsumi D, Hosono M N, Nakada R, Ishii K, Tsutsumi S, Inoue M, Ichida T and Miki Y 2011 Direct
impact analysis of multi-leaf collimator leaf position errors on dose distributions in volumetric
modulated arc therapy: a pass rate calculation between measured planar doses with and without the
position errors Phys. Med. Biol. 56 N237–46
Yan G, Liu C, Simon T, Peng L-C, Fox C and Li J 2009 On the sensitivity of patient-specific IMRT QA
to MLC positioning errors J. Appl. Clin. Med. Phys. 10 120–8
Zeidan O A, Li J G, Ranade M, Stell A M and Dempsey J F 2004 Verification of step-and-shoot IMRT
delivery using a fast video-based electronic portal imaging device Med. Phys. 31 463
Zygmanski P, Kung J H, Jiang S B and Chin L 2003 Dependence of fluence errors in dynamic IMRT on
leaf-positional errors varying with time and leaf number Med. Phys. 30 2736
