
Learning to Set Up Numerical Optimizations of Engineering Designs
by M. Schwabacher, T. Ellman, and H. Hirsh
Reviewed by Jack Hall for Design Optimization and Automation, Fall 2011

Introduction
While numerical optimization is an invaluable aid to complex design problems, setting up the optimization requires skill in itself. The harder the problem, the more knobs there are to tune on the algorithm, and the harder that tuning becomes. The work by Schwabacher et al. aims to alleviate this problem by using inductive learning algorithms to automate the tuning of various knobs in numerical optimization algorithms. Decision trees were used to automate, in separate experiments:
- the choice or generation of initial prototypes before optimization
- the selection of active constraints, in place of an active-set scheme or penalty functions
- the prediction of design feasibility given a goal and constraints

The authors use two design problems as examples: the design of a supersonic aircraft and the design of the hull of a racing yacht. In both cases evaluating the objective function is very expensive, involving time-consuming simulations. Because the research is intended to be exploratory and conceptual, the dimensionality of each problem is reduced. This reduction allows for more trial and error on a wider range of learning applications. This review will begin with a discussion of decision trees, the class of machine-learning algorithms used in the paper. I will then describe each of the problem domains Schwabacher et al. use as examples. The rest of the paper will consist of a discussion of each application of inductive learning. These sections will present specific methods followed by the corresponding results and commentary about both methods and results.

Inductive Decision Trees
Decision trees are a form of machine learning that predicts an outcome from a set of attributes. Starting at the root node with a set of outcomes, the algorithm chooses the attribute that represents the largest information gain, as defined by Shannon's information theory. The algorithm then creates a test and as many child nodes as there are results of the test. In a discrete space, a test on a ternary attribute would result in three child nodes. The child nodes are leaf nodes for now, because they do not themselves have any children. Each leaf node has an outcome. At the next step, the learner checks each attribute of each leaf node for the attribute that will give the largest information gain, as in the first step. At that node, the algorithm creates another test and the child nodes to go with it. This node is no longer a leaf node, but its children are. The algorithm continues to iterate until the correct outcome can be predicted perfectly by traversing the tree from the root node to a leaf given a set of attribute values.

The concept of information gain merits some further explanation. The information entropy of a set of outcomes is the number of bits required to represent the possibilities in the data. For instance, the set of outcomes from a coin flip can be represented by one bit. Each node of the decision tree has an information entropy that describes the degree of randomness in its data. The learner branches from a node by applying a test on the one attribute that will result in the largest decrease in information entropy for the child nodes, in other words the largest information gain. Done repeatedly, this makes building an inductive decision tree similar to finding the principal components of a set of data. Just as PCA attempts to minimize error by adding more regressions to describe the next largest component of the error, a decision tree minimizes the information entropy by testing the attribute that accounts for the largest part of the remaining entropy.
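The attribute-selection step described above can be sketched in a few lines of Python. This is a toy illustration of entropy and information gain, not the C4.5 implementation; the data and attribute layout are invented:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete outcomes."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(rows, labels, attr_index):
    """Entropy decrease from splitting on one discrete attribute."""
    base = entropy(labels)
    # Partition the labels by the attribute's value.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr_index], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return base - remainder

# Toy data: attribute 0 perfectly predicts the outcome, attribute 1 does not.
rows = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
labels = ["yes", "yes", "no", "no"]
print(information_gain(rows, labels, 0))  # 1.0 bit
print(information_gain(rows, labels, 1))  # 0.0 bits
```

The learner would branch on attribute 0 here, since that test removes all of the remaining entropy.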
Two different inductive learners were used in this research, depending on whether the particular application was continuous or discrete. C4.5 was used for discrete learning and CART was used for continuous tasks; both are standard solutions in their respective domains. When decision trees are applied to continuous problems, the algorithm changes slightly: when a node branches, the attribute is usually split by an inequality, and the outcomes at each new leaf are averaged. This is not very different from constant-value regression.

Two other important variations on inductive decision trees are pruning and gradient boosting. Pruning can be very important to counteract the tendency of decision trees to overtrain, that is, to learn details specific to the training data rather than patterns that generalize to other data sets. Pruning can replace whole branches with leaf nodes, or remove intermediate nodes while collapsing child branches to close the resulting gap. Gradient boosting is used in continuous learners to improve on the constant-value regression at each leaf node.
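The continuous split criterion can be illustrated in one dimension: every candidate threshold divides the data by an inequality, each side is fit with its mean (the constant-value regression mentioned above), and the threshold with the smallest squared error wins. This sketch is illustrative only; CART recurses over many attributes, and the data here is invented:

```python
def best_split(xs, ys):
    """Find the inequality threshold on one continuous attribute that
    minimizes the summed squared error of a constant (mean) fit on
    each side of the split."""
    def sse(vals):
        if not vals:
            return 0.0
        mean = sum(vals) / len(vals)
        return sum((v - mean) ** 2 for v in vals)

    pairs = sorted(zip(xs, ys))
    best = (None, sse(ys))  # baseline: no split at all
    for i in range(1, len(pairs)):
        t = (pairs[i - 1][0] + pairs[i][0]) / 2  # midpoint threshold
        left = [y for x, y in pairs if x <= t]
        right = [y for x, y in pairs if x > t]
        err = sse(left) + sse(right)
        if err < best[1]:
            best = (t, err)
    return best

# Two clusters of outcomes; the best split should separate them.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [5.0, 5.2, 4.8, 20.0, 20.4, 19.6]
threshold, error = best_split(xs, ys)
print(threshold)  # 6.5
```

Each resulting leaf would predict the mean of its side, exactly the constant-value regression the text describes.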

The Design Problems and Formulations
Inductive learning was used to augment numerical optimization for two design problems. Both problems featured a very high time cost for candidate evaluation, and in both cases Schwabacher et al. reduced the dimensionality of the problem to make it more tractable for their exploratory research. While this sacrifice is fine for exploration, it prevents the work from acting as a proof of concept. The authors' main motivation was to reduce the computation and time required to iterate designs in high-dimensional and difficult-to-evaluate problems. But by reducing the size of the problems they address in the paper, they solve an easier problem than the one they are really interested in. Inductive learning may not be able to deal with the more complicated, real problems, or it may require too much training data. On the other hand, the work may serve to inform future efforts in this area. It is safe to say that if inductive learning does not work on a small scale, it will not work on a larger one.

Racing Yachts
The first problem concerns the design of yacht hulls. Races are held in varying but known circumstances; the wind speed and heading change for each race, but the conditions are known beforehand. It is therefore advantageous to design hulls specific to each race. The set of hull design parameters is reduced here to eight: three to scale the hull in the Cartesian directions, three to streamline the hull in those directions, and two to specify the keel. This is much simpler than the original problem formulation, which the authors did not discuss aside from mentioning that the surfaces were represented by B-splines. The solution is constrained in ways not described. For an optimizer they used a greedy hill-climber altered to ignore small bumps in the objective surface.

Such a simple algorithm may seem an inferior choice, but one of the ways the authors used inductive learning was to choose a good starting point for it. If the optimization starts near the global minimum, a greedy search is the most efficient method. They could have compared the combined inductive-learner-plus-greedy approach against a sophisticated standalone optimizer, but they did not. Designs were evaluated by simulation; the authors tried their methods with two different simulators. One was relatively expensive and high-fidelity, while the other was cheaper and less accurate.
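The paper does not describe the modified hill-climber in detail, so the following is only a guess at the idea: a greedy coordinate search that also accepts moves that worsen the objective by less than a small tolerance, letting it step over shallow bumps. The objective function, step size, and tolerance here are all invented:

```python
def hill_climb(f, x0, step=0.5, bump_tol=0.05, max_iter=100):
    """Greedy coordinate search that tolerates small 'bumps': a move is
    accepted if it improves the objective or worsens it by less than
    bump_tol. The best point seen is tracked and returned, so brief
    uphill excursions cannot hurt the final answer."""
    x, fx = list(x0), f(x0)
    best_x, best_f = list(x), fx
    for _ in range(max_iter):
        moved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                ft = f(trial)
                if ft < fx + bump_tol:  # improvement, or a small bump
                    x, fx = trial, ft
                    if fx < best_f:
                        best_x, best_f = list(x), fx
                    moved = True
                    break
            if moved:
                break
        if not moved:
            break
    return best_x, best_f

# A 1-D objective with one shallow bump on the way to the minimum at x = 3.
def f(x):
    bump = 0.78 if 2.25 < x[0] < 2.75 else 0.0
    return (x[0] - 3.0) ** 2 + bump

best_x, best_f = hill_climb(f, [0.0])
print(best_x, best_f)  # prints [3.0] 0.0; a strictly greedy search would stall at x = 2.0
```

The tolerance lets the search cross the bump at x = 2.5 that would trap a strictly downhill climber.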

Supersonic Aircraft
The goal of the second design problem is to generate concept prototypes for supersonic transport jets. Since many concepts may need to be generated in a given design cycle, the use of inductive learning to streamline repeated optimization runs is justified. The authors minimized takeoff mass (apparently a common metric at this stage of the design process) with an SQP algorithm. They again reduced the problem to eight design variables:
- engine size
- wing area
- wing aspect ratio
- fuselage taper ratio
- structural thickness
- wing sweep
- wing taper ratio
- fuel annulus width

The problem is constrained by bounds on four of the variables. The goals specified for each optimization varied mission details such as the distance, Mach number, and percentage of the mission flown over land. The SQP algorithm they used had trouble finding feasible solutions, especially for randomly generated mission goals. These issues motivated a study in predicting the existence of feasible solutions for a given goal, as discussed in the section on predicting achievable goals below.

Learning to Select and Generate Candidates
One of the biggest knobs on any serial optimization algorithm is the starting candidate. This research explored two methods for automating this knob by learning its relationship to the design goal. A continuous learner generated candidates for the aircraft domain, and a discrete learner chose existing yacht candidates from a database. The correct design is the one that leads the optimizer to the global minimum. Error rates are taken to mean the probability that a given method generates or chooses an initial candidate that does not lead to the global minimum.
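As a stand-in for the discrete case, a nearest-neighbor lookup shows the role the learner plays: mapping a new design goal to a candidate from a database of past designs. The paper actually uses a C4.5 decision tree for this, and the goal attributes and hull names below are invented:

```python
def choose_start(goal, history):
    """Pick the starting design whose recorded goal is nearest the new
    goal. `history` is a list of (goal_vector, design) pairs from past
    optimizations. (Illustrative stand-in only: the paper's selector is
    a C4.5 decision tree, but it fills the same role of mapping a goal
    to a database design.)"""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(history, key=lambda pair: dist2(pair[0], goal))[1]

# (wind speed, heading) -> hull that previously led to the global minimum
history = [
    ((8.0, 90.0), "hull_A"),
    ((20.0, 45.0), "hull_B"),
    ((14.0, 180.0), "hull_C"),
]
print(choose_start((18.0, 50.0), history))  # hull_B
```

The point of learning a selector rather than evaluating every candidate is that this lookup costs nothing compared to running the simulator.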

Discrete Candidate Selection for Yachts
The C4.5 discrete inductive learner was used to predict which design out of an existing database would, if used as a starting guess, allow a hill-climbing algorithm to reach the known global minimum for that design goal. The work attempted to predict the proper starting point for both high- and low-fidelity simulators. The results are below, presented as they are in the paper. AHVPP is the high-fidelity simulation, and RUVPP is the lower-fidelity simulation.

The course-time increase measures the severity of each method's average error as the average increase in the time needed to run the race course. The inductive learner fares well but is outperformed in the simpler simulation by the best initial evaluation method. The authors did not say exactly what this method is, but I surmise it involves testing each candidate on a shortened version of the simulation. This method does very well with the simpler simulation because simpler evaluation reduces the modality of the objective space: starting as close as possible works well when there are fewer local minima to get stuck in. Conversely, the inductive learner significantly outperformed all other methods in the more complicated objective space created by the high-fidelity simulation. The learner was evidently able to learn which side of an objective-space ridge to start from for a particular goal.

Continuous Candidate Generation for Aircraft
The authors used the CART continuous learner to synthesize a design candidate that would allow the SQP optimization to reach the known global minimum for each test. The results are compared to simpler ways of generating design candidates in Tables 8 and 9. Again, the tables are taken directly from the paper.

In the first table, the mean method consists of starting from the average of all existing prototypes. The random methods consist of running the optimization from each of several random starting points and taking the best solution found. Predictably, the random methods require many more function evaluations. The inductive learner performed as well as any method, but did so with markedly fewer function evaluations. The surprising result is how successful the mean-candidate method is; the error space must be relatively smooth and change little with different goals. I question their random candidate selection: if the average worked so well, it seems possible that they were selecting random prototypes from parts of the design space not spanned by existing prototypes. They did not specify the exact method of random generation. The second table compares the candidate synthesized by the learner to the mean candidate. The error is measured from the known optimal solution. Low numbers indicate an improvement in prediction compared to the mean candidate, meaning that the learner starts the optimization closer to the goal. The learner started closer to the goal in all dimensions but one, and in several dimensions the difference was more than an order of magnitude.

Learning to Select Active Constraints
One major problem in optimization is figuring out which inequality constraints should be active. Constraints known to be active can be folded away to reduce the dimensionality of the problem, while improperly activated constraints can exclude the optimal solution or even any feasible solution. For each problem domain studied, Schwabacher et al. used symbolic algebra software to formulate all constraints independently. There were three possible constraints in the yacht domain and eight in the aircraft domain. A discrete decision tree (C4.5) was set up to predict the state of each constraint at the goal: active, inactive, or violated.
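A small sketch of how the three training labels could be derived for one inequality constraint g(x) <= 0 evaluated at a known solution. The tolerance is an invented detail; the paper does not give its labeling procedure:

```python
def constraint_state(g_value, tol=1e-6):
    """Classify one inequality constraint g(x) <= 0 at a solution point:
    'active' if it holds with (near) equality, 'inactive' if it has
    slack, 'violated' otherwise. Labels of this kind are what the C4.5
    tree is trained to predict from the goal alone."""
    if g_value > tol:
        return "violated"
    if g_value > -tol:
        return "active"
    return "inactive"

print(constraint_state(0.0))   # active
print(constraint_state(-0.5))  # inactive
print(constraint_state(0.3))   # violated
```

Constraints predicted active can then be treated as equalities and folded away; constraints predicted inactive can be dropped from the problem.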

Table 12 shows the success of inductive constraint selection in the yacht domain, and Table 14 shows the same in the aircraft domain. The middle columns indicate the quality of the constrained optimization, and the right columns compare the time required for the constrained optimization to that required for an unconstrained optimization. Two rows in Table 14 instead compare the inductive learner to the most frequent constraint set. Using the inductive learner to choose active constraints saved a significant amount of time in both domains, and in the yacht domain it actually improved the solution. In the aircraft domain, only the omniscient case managed to outperform the learner with respect to both time and success. In the yacht domain, the most frequent configuration came very close to matching the inductive learner. The yacht experiments were performed with the low-fidelity simulator, which yields a simpler objective function with fewer local minima. It would be interesting to see the results of this experiment with the more sophisticated simulator; as with initial candidate selection, the more complicated objective function may give the inductive learner an edge.

Learning to Predict Which Goals are Achievable
In the aircraft domain, the authors found that the SQP optimizer had difficulty finding feasible solutions for many of the randomly generated goals. To save the time it takes to run the failed optimizations, the authors set up a decision tree to predict whether or not a given goal is achievable. Each outcome in the tree is a binary value representing the existence of a feasible solution, and the attributes are the goal conditions. The inductive learner predicted the outcome correctly for all but 4% of the goals presented to it. The most frequent outcome was wrong for 24% of the goals, and random guessing yielded the expected error of 50%. It would have been nice to see a comparison of the inductive learner to some more intelligent predictor of feasibility, but this experiment was a side note in the research; Schwabacher et al. did not spend much time discussing it.
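The comparison being made can be sketched as follows; the goal attributes, labels, and the single-split rule are all invented for illustration:

```python
def error_rate(predict, goals, labels):
    """Fraction of goals for which a predictor disagrees with the
    recorded feasibility outcome."""
    wrong = sum(1 for g, y in zip(goals, labels) if predict(g) != y)
    return wrong / len(labels)

# Toy goals: (range, Mach number, fraction flown over land) -> feasible?
goals = [(4000, 1.6, 0.2), (6000, 2.4, 0.9), (4500, 1.8, 0.3),
         (7000, 2.2, 0.8), (4200, 1.7, 0.1), (6500, 2.5, 0.7)]
labels = [True, False, True, False, True, False]

# A hypothetical single-split rule of the kind a decision tree might
# learn: missions beyond a range threshold are infeasible.
learned = lambda g: g[0] < 5000
most_frequent = lambda g: True  # baseline: always predict the majority class

print(error_rate(learned, goals, labels))        # 0.0
print(error_rate(most_frequent, goals, labels))  # 0.5
```

A goal predicted infeasible can be rejected immediately instead of paying for a doomed optimization run.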

Conclusions
The research presented in Learning to Set Up Numerical Optimizations of Engineering Design was solid within its limited scope. As discussed in the problem formulation section, the authors reduced the scale of the design problems to allow a broader exploration of the uses of inductive learning. This makes the research much less convincing as a proof of concept, but the success of some of the reduced experiments is promising for future work. Although this is also outside the scope of the research, it would have been nice to see inductive learning applied to different classes of problem. Large combinatorial optimizations may also benefit from augmentation by inductive decision trees, as might path planning. Neither problem type was discussed. The scope of the research also excludes any sort of stochastic optimization. Inductive decision trees are likely to need much more training data to deal with stochastic problems, and elaborate pruning mechanisms may be necessary to prevent overtraining. Overall, there is reason to doubt the usefulness of decision trees in problems involving a high degree of uncertainty. Perhaps future work will address more problem types.

The main problem with this research is that machine learning tends to suffer greatly from the curse of dimensionality. By doubling the number of dimensions, you may quintuple the amount of training data needed. In situations involving repeated, related optimizations (like this research), the optimizations needed to generate training data could present a significant time overhead, potentially enough to render inductive learning useless. In the case of constraint selection, the authors also failed to take into account the time required to formulate each constraint independently, which could be nontrivial. In some cases, independent formulation may be impossible. The authors should have done more to show that optimizations augmented with inductive learning are feasible in general.

References
Schwabacher, M., Ellman, T., & Hirsh, H. (1998). Learning to Set Up Numerical Optimizations of Engineering Design. Artificial Intelligence for Engineering Design, Analysis and Manufacturing, 12, 173–192.
Rokach, L., & Maimon, O. (2005). Top-down induction of decision trees classifiers: a survey. IEEE Transactions on Systems, Man, and Cybernetics, Part C, 35(4), 476–487.
