ESWA-D-08-00701.pdf


Manuscript Draft

Title: An Ordinal Optimization Theory Based Algorithm for a Class of Simulation Optimization Problems and Application

Article Type: Full Length Article

Keywords: ordinal optimization, stochastic simulation optimization, artificial neural network, genetic algorithm, wafer probe testing

Corresponding Author: Assistant Professor Shih-Cheng Horng, Ph.D.

Corresponding Author's Institution: Chaoyang University of Technology

First Author: Shih-Cheng Horng, Ph.D.

Order of Authors: Shih-Cheng Horng, Ph.D.; Shieh-Shing Lin, Ph.D.

Cover Letter

Dear Editor,

We would like to submit the enclosed manuscript entitled "An Ordinal Optimization Theory Based Algorithm for a Class of Simulation Optimization Problems and Application," which we wish to be considered for publication in Expert Systems with Applications.

Correspondence and phone calls about the paper should be directed to Shih-Cheng Horng at the following address, phone and fax number, and e-mail address:

Chaoyang University of Technology

Address: 168 Jifong E. Rd., Wufong Township, Taichung County, 41349, Taiwan, R.O.C.

Phone: +886-4-23323000 ext 7633

Fax: +886-4-23742375

e-mail: schong.ece90g@nctu.edu.tw

Thank you very much for considering our manuscript for potential publication. I am looking forward to hearing from you soon.

Sincerely yours,

Shih-Cheng Horng


An Ordinal Optimization Theory Based Algorithm for a Class of Simulation Optimization Problems and Application

Shih-Cheng Horng
schong@cyut.edu.tw

Shieh-Shing Lin
sslin@mail.sju.edu.tw

Submitted to Expert Systems with Applications as a REGULAR PAPER

Correspondent: Assistant Professor Shih-Cheng Horng

Institute: Department of Computer Science & Information Engineering

Chaoyang University of Technology

Address: 168 Jifong E. Rd., Wufong Township, Taichung County, 41349,

Taiwan, R.O.C.

Phone: +886-4-23323000 ext 7801

Fax: +886-4-23742375

e-mail: schong@cyut.edu.tw

Shih-Cheng Horng is currently an assistant professor of the Department of Computer Science and

Information Engineering at Chaoyang University of Technology, Taiwan, R.O.C. Shieh-Shing Lin is

now a professor of the Department of Electrical Engineering at St. John's University, Taiwan, R.O.C.

This work was partially supported by the National Science Council, Taiwan, R.O.C., under Grant NSC96-2622-E-129-005-CC3.

Abstract

In this paper, we propose an ordinal optimization theory based two-stage algorithm to solve for a good enough solution of the stochastic simulation optimization problem with a huge input-variable space Θ. In the first stage, we construct a crude but effective model for the considered problem based on an artificial neural network. This crude model is then used as a fitness function evaluation tool in a genetic algorithm to select N excellent settings from Θ. In the second stage, starting from the selected N excellent settings, we proceed with the existing goal softening searching procedures to search for a good enough solution of the considered problem.

We applied the proposed algorithm to the reduction of overkills and retests in a wafer probe

testing process, which is formulated as a stochastic simulation optimization problem that

consists of a huge input-variable space formed by the vector of threshold values in the testing

process. The vector of good enough threshold values obtained by the proposed algorithm is

promising in the aspects of solution quality and computational efficiency. We have also

justified the performance of the proposed algorithm in a wafer probe testing process based on

the ordinal optimization theory.

Key Words: ordinal optimization, stochastic simulation optimization, artificial neural

network, genetic algorithm, wafer probe testing.

1. Introduction

Stochastic simulation optimization considers systems whose outputs can only be evaluated by simulations (Fu et al., 2005). Thus, the objective of

simulation optimization is to find the optimal settings of the input variables to the simulated

system that bring the output variables to their best or optimal conditions. Various methods have been developed for this purpose, such as the Gradient Search based methods (Nocedal &

Wright, 2006; Kim, 2006), the Stochastic Approximation methods (Theiler & Alper, 2006;

Spall, 2003), the Sample Path methods (Hunt, 2005), the Response Surface methods (Myers

et al., 2004), and Heuristic search methods. These methods have been thoroughly discussed in

(April et al., 2003). Among them, the Heuristic search methods including the Genetic

Algorithm (GA) (Haupt & Haupt, 2004), the Simulated Annealing (SA) method (Suman &

Kumar, 2006), and the Tabu Search (TS) method (Hedar & Fukushima, 2006) are frequently

used in simulation optimization (Blum & Roli, 2003; Tekin & Sabuncuoglu, 2004). According

to an empirical comparison of these algorithms (Lacksonen, 2001), GA showed the capacity

to robustly solve large problems and performed well over the others in solving a wide variety

of simulation problems. Despite the success of several applications of the above heuristic

methods (Ahmed, 2007; Fattahi et al., 2007), many technical hurdles and barriers to broader

application remain, as indicated in (Dréo et al., 2006). Chief among these is speed, because using the simulation to evaluate the output variables for a given setting of the input variables is already computationally expensive, not to mention searching for the best setting when the input-variable space is huge. Furthermore, simulation often faces situations where

variability is an integral part of the problem. Thus, stochastic noise further complicates the

simulation optimization problem. The purpose of this paper is to resolve this challenging

stochastic simulation optimization problem effectively.

The considered stochastic simulation optimization problem is stated in the following:

min_{θ∈Θ} J(θ)    (1)

where Θ denotes the huge discrete input-variable space, θ denotes a setting of the input variables, and the objective function J(θ) is the expected output or a function of expected outputs of the simulated system. To cope with the

computational complexity of this problem, we will employ the Ordinal Optimization (OO)

theory based goal softening strategy (Lau & Ho, 1997; Ho, 1999), which seeks a good enough

solution with high probability instead of searching the best for sure based on the expectation

that the performance order of the input-variable settings is likely to be preserved even when evaluated by a crude model that contains modeling noise. From here on, we will use the word setting to represent the setting of input variables.

The basic idea of the OO theory based goal softening strategy is to reduce the searching

space gradually, and its existing searching procedures can be summarized in the following

(Lau & Ho, 1997): (i) Uniformly select N, say 1000, settings from Θ. (ii) Evaluate and order the N settings using a crude model of the considered problem, then pick the top s, say 50, settings to form the Selected Subset (SS), which is the estimated Good Enough Subset (GS). A Good Enough Subset is defined as the subset consisting of the top n% solutions in the input-variable space. (iii) Evaluate and order all the s settings in SS using the exact model, then pick the top k (≥ 1) settings. In OO theory (Lau & Ho, 1997), the model noise is used to describe the degree of roughness of the crude model. The OO theory has shown that for N = 1000 in (i) and a crude model with significant noise in (ii), the top setting (i.e., k = 1) selected from (iii) with s = 50 must belong to the GS with probability 0.95, where GS represents a collection of the top 5% actually good enough settings among the N. This means the actual top setting in SS selected from (iii) is among the actual top 5% of the N settings with probability 0.95. However, the good enough solution of problem (1) that we are searching for should be a good enough setting in Θ instead of the N settings, unless Θ is as small as N (Chen et al., 1999; Ho et al., 2007). As indicated in a recent paper by Lin and Ho (Lin &

Ho, 2002), under a moderate modeling noise, the top 3.5% of the uniformly selected N settings will be among the top 5% settings of a huge Θ with a very high probability (≥ 0.99), and in the best case they can be among the top 3.5% settings of Θ, provided that there is no modeling error. However, for Θ with a size of 10^30, a top 3.5% setting is a setting among the top 3.5 × 10^28 ones. This certainly does not seem to be a good enough solution in the sense of practical optimization; it is acceptable only when Θ consists of lots of good settings, so that even if the performance order of the selected setting is not practically good enough, the corresponding objective value is. As a matter of fact, most practical stochastic simulation optimization problems do not have lots of good settings; otherwise, finding a good enough solution would not be difficult. Therefore, to apply the existing goal softening searching procedures, we need to develop a new scheme to select N excellent settings from Θ to replace (i), so as to ensure the final selected setting is a good enough solution of (1) from the practical viewpoint.

Heuristic methods for obtaining N excellent settings may depend on how good one's knowledge of the considered system is. For instance, in the optimal power flow problems with discrete control variables, Lin et al. proposed an algorithm based on the OO theory and engineering intuition to select N excellent discrete control vectors (Lin et al., 2004).

However, the engineering intuition may work only for specific systems. Thus, in this paper,

we will propose an OO theory based systematic approach to select N excellent settings from Θ and combine it with the existing goal softening searching procedures to find a good enough solution of (1). The presentation of this OO theory based two-stage algorithm to solve (1) for a good enough solution is a novel approach in the area of simulation optimization and is one of the contributions of this paper.

Reducing overkills and retests is an important issue in the semiconductor wafer probe testing process. Taking the chip demand into account, we have formulated this problem as a

stochastic simulation optimization problem, which possesses a huge input-variable space and

is most suitable for demonstrating the validity of the proposed OO theory based two-stage

algorithm. This novel formulation as well as the novel solution methodology for this

important and practical stochastic optimization problem is another contribution of this paper.

We organize our paper in the following manner. In Section 2, we will describe the OO

theory based two-stage approach and present the proposed two-stage algorithm. In Section 3,

we will introduce the stochastic optimization problem of reducing overkills and retests in

semiconductor wafer probe testing process and present the application of the proposed

algorithm. In Section 4, we will show the test results of applying the proposed algorithm on a

real case and demonstrate the solution quality and the computational efficiency by comparing

with a vast number of randomly generated solutions and competing methods, respectively.

We have also justified the performance of the proposed algorithm in a wafer probe testing

process based on the ordinal optimization theory. Finally, we will make a conclusion in

Section 5.

2. The OO Theory Based Two-Stage Approach

We consider the stochastic simulation optimization problem (1) with a huge discrete input-variable space Θ. However, to evaluate the true objective value of a setting θ, we would need to perform a stochastic simulation of infinite test samples for that θ. Although infinite test samples would make the objective value of (1) stable, this is practically impossible. Thus, sufficiently large numbers of test samples are utilized in place of infinite test samples to make the objective value of (1), J(θ), sufficiently stable.

The proposed OO theory based approach consists of two stages to solve (1) for a good

enough setting. The first stage is an exploration stage. In this stage, we will employ a Genetic

Algorithm (GA) to search through Θ, using an off-line trained Artificial Neural Network (ANN) as a crude model for fitness evaluation, and select N (= 1024) excellent settings. The heuristic generation of N (= 1024) is based on the OO theory (Lau & Ho, 1997). The second stage is an exploitation stage to find a good enough setting from the N settings obtained in the first stage with more refined crude models. A more refined crude model is defined as a model

that is tolerant of a small modeling noise. If we used the exact model to evaluate all the N settings, we could obtain the best setting among the N, but at the cost of too much computation time, which is against our objective. Therefore, we will divide the second stage into multiple subphases. The more refined crude models for estimating J(θ) of a setting θ employed in these subphases are stochastic simulations of various lengths, ranging from very short (crude model) to very long (exact model). The candidate solution set in each subphase (i.e., the estimated good enough subset resulting from the previous subphase) will be reduced gradually. In the last subphase, we will use the exact model to evaluate all the settings in the most updated candidate solution set, and the one with the smallest J(θ) is the good enough setting that we seek. Therefore, the computational complexity can be drastically decreased, because the size of the candidate solution set has been largely reduced by the time the crude model becomes more refined.

refined. In the following, we will present the details of the OO theory based two-stage

approach.

2.1 The First Stage Approach

Since the order of settings is relatively immune to the effects of estimation noise, the performance order of the settings is likely to be preserved even when evaluated using a crude model. Thus, to select N excellent settings from Θ without consuming much computation time, we need to construct a crude but effective model to evaluate the objective value J(θ) for a given setting θ, and use a selection scheme to select N excellent settings. Our crude model is constructed based on an ANN (Graupe, 2007), and our selection scheme is a GA (Haupt & Haupt, 2004).

2.1.1 The Artificial Neural Network (ANN)

The ANN is considered to be a universal function approximator due to its generic and convenient property of modeling complicated nonlinear input-output relationships. Considering the inputs and outputs as the settings θ and the corresponding objective values J(θ),

respectively, we can use an ANN to implement the mapping from the inputs to the outputs

(Graupe, 2007). To construct such an ANN, first of all, we will select a training data set by

randomly sampling M settings without replacement from Θ. The formula to calculate the number of random samples (RS) for a given input-variable space Θ is as follows (Moore & McCabe, 1999):

RS = [p(1 − p)z²/CI²] / {1 + [p(1 − p)z²/CI² − 1]/|Θ|}    (2)

where z is 1.96 and 2.57 for the 95% and 99% confidence levels, respectively; p is the percentage picking a choice, which is taken as 0.5 for calculating the sample size; CI is the confidence interval expressed as a decimal; and |Θ| denotes the size of the input-variable space. The confidence level is the estimated probability that a population estimate lies within a given margin of error. The confidence interval measures the precision with which an estimate from a single sample approximates the population value. Considering an input-variable space with |Θ| = 10^30, the number of random samples determined by (2) is 16641 for a 99% confidence level and a 1% confidence interval.
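As a minimal sketch of how (2) behaves, the following computes the sample size with its finite-population correction; the function name and variables are illustrative. Note that z = 2.58 is used here because it reproduces the quoted sample size of 16641 (the rounded value 2.57 gives 16512), and that for a space as large as 10^30 the correction term is negligible.

```python
# Sketch of formula (2) with its finite-population correction.
# Function name and variables are illustrative, not from the paper.

def random_sample_size(z, ci, space_size, p=0.5):
    """Random samples needed for confidence score z and interval ci (decimal)."""
    ss = p * (1.0 - p) * z ** 2 / ci ** 2       # uncorrected sample size
    return round(ss / (1.0 + (ss - 1.0) / space_size))

# 99% confidence level, 1% confidence interval, |Theta| = 10**30:
print(random_sample_size(2.58, 0.01, 1e30))    # prints 16641
```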

Then we will evaluate the objective values of these M = 16641 settings using an exact model, which can be a stochastic simulation with a sufficiently large number of test samples, as indicated in (Chen et al., 1999). These collected M input-output pairs (θ, J(θ)) will be used to train the ANN to adjust its arc weights. Once this ANN is trained, we can input any setting θ to obtain an estimate of the corresponding J(θ) from the output of the ANN; in this manner, we can avoid an accurate but lengthy stochastic simulation to evaluate J(θ) for a

given setting θ. The effectiveness of this crude model is justified by the OO theory as mentioned above, because what we care about here is the relative order of the θ's, not the values of the J(θ)'s.
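The crude model can be sketched as a small feed-forward network trained on (θ, J(θ)) pairs. The following is a minimal numpy sketch, not the paper's actual network: the one-hidden-layer architecture, learning rate, and the toy quadratic objective standing in for J(θ) are all illustrative assumptions.

```python
import numpy as np

# Minimal sketch of the crude model: a one-hidden-layer tanh network trained
# on (theta, J(theta)) pairs by full-batch gradient descent.  Architecture,
# hyperparameters, and the toy objective are illustrative assumptions.

rng = np.random.default_rng(0)

def train_surrogate(thetas, costs, hidden=16, lr=0.05, epochs=3000):
    """Fit a tiny MLP mapping settings theta to estimated objective values."""
    X = np.asarray(thetas, float)
    y = np.asarray(costs, float).reshape(-1, 1)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 1)); b2 = np.zeros(1)
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)              # hidden-layer activations
        err = (h @ W2 + b2) - y               # prediction error
        gW2 = h.T @ err / len(X); gb2 = err.mean(0)
        dh = (err @ W2.T) * (1.0 - h ** 2)    # backprop through tanh
        gW1 = X.T @ dh / len(X); gb1 = dh.mean(0)
        W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2
    return lambda t: (np.tanh(np.atleast_2d(t) @ W1 + b1) @ W2 + b2).ravel()

thetas = rng.uniform(-1.0, 1.0, (200, 2))     # sampled settings
costs = (thetas ** 2).sum(axis=1)             # toy J(theta): a quadratic bowl
f = train_surrogate(thetas, costs)            # crude model, fast to evaluate
```

Only the relative order of the predictions matters for the OO selection, which is why a roughly fitted surrogate of this kind suffices.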

2.1.2 The Genetic Algorithm (GA)

GA is a stochastic search algorithm based on the mechanism of natural selection and natural

genetics. By the aid of the above effective objective value (or the so-called fitness value in

GA terminology) evaluation model, we can select N excellent settings from Θ using the GA,

which is briefly described as follows. Assuming an initial random population produced and

evaluated, genetic evolution takes place by means of three basic genetic operators: (a) parent

selection; (b) crossover; (c) mutation. The chromosome in GA terminology represents a

setting in our problem, and each chromosome is encoded as a string of 0s and 1s. Parent selection is a simple procedure whereby two chromosomes are selected from the parent population based on their fitness values. Solutions with high fitness values have a high

our approach is a simple roulette-wheel selection. Crossover is an extremely important

operator for the GA. It is responsible for the structure recombination (information exchange

between mating chromosomes) and the convergence speed of the GA and is usually applied

with relatively high probability, say 0.7. The chromosomes of the two parents selected are

combined to form new chromosomes that inherit segments of information stored in parent

chromosomes. There are many crossover schemes; we employ the single-point crossover in our

approach. While crossover is the main genetic operator exploring the information included in

the current generation, it does not produce new information. Mutation is the operator

responsible for the injection of new information. With a small probability, random bits of the

offspring chromosomes flip from 0 to 1 or vice versa, giving new characteristics that do

not exist in the parent chromosome. In our approach, the mutation operator is applied with a

relatively small probability 0.02 to every bit of the chromosome.

There are two criteria for the convergence of the GA. One is when the fitness value of the best chromosome does not improve from the previous generation, and the other is when enough generations have evolved. The initial population of the GA employed in our first stage approach consists of I, say 5000, randomly selected settings from Θ. After the applied GA converges, we

rank the final generation of these I chromosomes based on their fitness values and pick the

top N chromosomes, which form the N excellent settings that we look for.
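The first-stage GA just described can be sketched as follows, with roulette-wheel parent selection, single-point crossover applied with probability pc = 0.7, and bitwise mutation with probability pm = 0.02 as in the text. The bit-counting fitness is a toy stand-in for the ANN-based evaluation model, and the population sizes are illustrative.

```python
import random

# Minimal sketch of the first-stage GA: roulette-wheel selection,
# single-point crossover (pc = 0.7), bitwise mutation (pm = 0.02).
# The bit-counting fitness is a toy stand-in for the ANN model.

random.seed(1)
BITS, POP, PC, PM = 12, 40, 0.7, 0.02

def fitness(chrom):
    return sum(chrom) + 1e-9          # toy fitness: number of 1-bits

def roulette(pop, fits):
    """Pick one chromosome with probability proportional to its fitness."""
    r = random.uniform(0.0, sum(fits)); acc = 0.0
    for chrom, fit in zip(pop, fits):
        acc += fit
        if acc >= r:
            return chrom
    return pop[-1]

def evolve(pop, generations=60):
    for _ in range(generations):
        fits = [fitness(c) for c in pop]
        nxt = []
        while len(nxt) < len(pop):
            a, b = roulette(pop, fits), roulette(pop, fits)
            if random.random() < PC:                      # single-point crossover
                cut = random.randrange(1, BITS)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            for child in (a, b):                          # bitwise mutation
                nxt.append([bit ^ (random.random() < PM) for bit in child])
        pop = nxt[:POP]
    return sorted(pop, key=fitness, reverse=True)         # ranked final generation

population = [[random.randint(0, 1) for _ in range(BITS)] for _ in range(POP)]
best = evolve(population)[0]
```

In the paper's setting, the final ranked generation would supply the top N chromosomes; here the ranking of the final population plays that role.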

2.2 The Second Stage Approach

Starting from the selected N excellent settings, in the second stage, we will proceed

directly with step (ii) of the existing goal softening searching procedures described in Section

1. In this stage, we will evaluate the objective value of each setting using a more refined model than the crude one employed in the first stage. This more refined model uses stochastic simulations of various lengths (i.e., numbers of test samples) L. We let Ls = 100000 represent a sufficiently large L. In the sequel, we define the exact model of (1) as the case when the simulation length L = Ls. For the sake of simplicity in expression, we let Js(θ) denote the objective value of a setting θ computed by the exact model, i.e., L = Ls.

First, we define a basic simulation length L0 = 500. We set the simulation length of subphase i, denoted by Li, to be Li = kLi−1 (or Li = k^i L0), i = 1, 2, ..., where the positive integer k (≥ 2) denotes the parameter controlling the simulation length Li. We let N1 = N and set the size of the selected estimated good enough subset in subphase i to be Ni = Ni−1/k (or Ni = N1/k^(i−1)), i = 2, 3, .... We denote nk as the total number of subphases, and nk is determined by

nk = arg min{ nk : L0 k^(nk−1) < Ls ≤ L0 k^(nk), or 1 ≤ N_nk ≤ 10 },

where Ls = 100000. The above formula determines nk to be the minimum of the following: (i) the nk such that the simulation length L0 k^(nk) exceeds the length of the exact model, Ls, and (ii) the nk such that the size of the selected estimated good enough subset resulting in subphase nk is small enough, i.e., 1 ≤ N_nk ≤ 10. Once nk is determined, we set L_nk = Ls, which implies that in the last subphase (i.e., subphase nk), the crude model is in fact the exact model of (1), and the setting with the smallest J(θ) is the good enough setting that we seek. Suppose k were very large, such that L1 = kL0 ≥ Ls; then there would be only one subphase, and each of the N settings would be evaluated by the exact model, which would consume too much computation time even though the resulting setting is exactly the best among the N. However, it is not easy to quantify the tradeoff between the computation time and the goodness of the obtained good enough setting in an analytical formula. In fact, the best k is really problem dependent, because some problems may care more about computation time and others about the goodness of the obtained solution. Therefore, we will show the computation time and the goodness of the obtained good enough solution of our problem for various k in Section 3.
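The subphase schedule above can be sketched as follows. With the paper's values N = 1024, k = 2, L0 = 500, and Ls = 100000, this yields nk = 8 subphases, the last of which evaluates 8 candidates with the exact model; the function name is illustrative.

```python
# Sketch of the second-stage schedule: subphase i uses simulation length
# L_i = k**i * L0 on N_i = N // k**(i-1) candidates; the last subphase n_k
# (triggered by L_i >= Ls or N_i <= 10) uses the exact-model length Ls.

def subphase_schedule(N=1024, k=2, L0=500, Ls=100_000):
    """Return a list of (simulation length, candidate count) per subphase."""
    schedule, i = [], 1
    while True:
        L_i, N_i = k ** i * L0, N // k ** (i - 1)
        if L_i >= Ls or N_i <= 10:        # last subphase: switch to exact model
            schedule.append((Ls, N_i))
            return schedule
        schedule.append((L_i, N_i))
        i += 1

for length, candidates in subphase_schedule():
    print(length, candidates)
```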

2.3 The Two-Stage Algorithm

Now, our OO theory based two-stage algorithm can be stated as follows.

Step 1: Randomly select M settings θ from Θ. Compute the corresponding Js(θ) for each θ using simulation length Ls. Train an ANN by adjusting its vector of arc weights w using the obtained M input-output pairs, i.e., the M pairs of (θ, Js(θ))s. Let f(θ, w) denote the functional output of the trained ANN.

Step 2: Randomly select I settings from Θ as the initial population. Apply a GA with the following setup: simple roulette-wheel selection scheme, single-point crossover scheme with probability pc, and mutation probability pm, to these chromosomes by the aid of the fitness-value evaluation model 1/f(θ, w). After the algorithm converges, we rank all the final I chromosomes based on their fitness values and select the best N chromosomes (i.e., θ's).

Step 3: Use the stochastic simulation with simulation length Li = k^i L0 to estimate the J(θ) of the candidate N/k^(i−1) θ's, i = 1, ..., nk − 1; rank the candidate N/k^(i−1) θ's based on their estimated J(θ) and select the best N/k^i θ's as the candidate solution set for subphase i + 1.

Step 4: Use the stochastic simulation with simulation length Ls to compute the Js(θ) of the candidate N/k^(nk−1) θ's. The θ with the smallest Js(θ) is the good enough θ that we look for.

Steps 3 and 4 represent the procedures of the second stage approach.

3. Application to Reducing Overkills and Retests in Wafer Probe Testing

3.1 The Wafer Probe Testing Process

The wafer fabrication process is a sequence of hundreds of different process steps, which

results in an unavoidable variability accumulated from the small variations of each process

step. Chips are tested multiple times throughout the design and manufacturing process to

ensure the integrity of the chip design and the quality of the manufacturing process. Thus, to

avoid incurring the significant expense of assembling and packaging chips that do not meet

specifications, the wafer probing in the manufacturing process becomes an essential step to

identify flaws early. The primary components of a wafer probe testing system include probes,

probe card, probe station, and test equipment. Wafer probing establishes a temporary

electrical contact between test equipment and each individual die (or chip) on a wafer to

determine the goodness of a die. In general, an 8-inch wafer may consist of 500 to 15000

dies and each die is a chip of integrated circuits. Although there exist techniques such as the

statistical methods and machine learning methods (Chen et al., 2003; Barnett et al., 2005) for

monitoring the operations of the wafer probes, probing errors may still occur in many aspects and cause some good dies to be overkilled; consequently, the profit is diminished.

Figure 1 shows the Cause-and-Effect diagram of overkills.

Thus, reducing the number of overkills is always one of the main objectives in wafer probe

testing process. The key tool to identify or save overkills is retest, which is an additional wafer

probing. However, retest is a major factor for decreasing the throughput. Thus, the overkill and

the retest possess inherent conflicting factors, because reducing the former can gain more

profit, however, at the expense of increasing the latter, which will degrade the throughput and

increase the cost. This implies that drawing a fine line for deciding whether to go for a

retest to save possible overkills is an important research issue in this optimization problem of

the wafer probe testing process. Considering the economic situation regarding throughput

requirement, it would be most beneficial for us to use the trade-off method (Collette & Siarry,

2003) to solve the current problem. That is, we minimize the overkills subject to a tolerable level of retests provided by the decision maker.

[Figure 1: Cause-and-Effect diagram of overkills. Factor categories include Probe Station, Tester, Setup, Test Program, Probes, Probe card, Method, Material, Operator, Device, Customer request, Eng. mistake, and Others.]

There may be different testing procedures in different chip manufacturers. After the wafer

probing, a bin number is used to label each bad die of the wafer. A bin number denotes a

classification of circuitry-defect failure in a die. The bin number goes from 1 to a certain

number as defined by engineers. But, no matter what testing procedures are used, the decision

for carrying out the retest should be based on whether the number of good dies and the


number of bins in a wafer exceed the corresponding threshold values. Thus, determining these

threshold values so as to minimize the overkills under a tolerable level of retests is the main theme of

the optimization problem considered here. Furthermore, since the goodness of a die and the

probing errors are of stochastic nature, the considered problem becomes a stochastic simulation

optimization problem. Thus, this computationally intractable problem is most suitable for the

application of our OO theory based two-stage algorithm to seek good enough threshold

values.

3.2 Problem Statements and Mathematical Formulation

In this section, we employ typical testing procedures used in a renowned wafer foundry in

Taiwan, which is briefly described in the following.

For every wafer, the wafer probing is performed twice. The second probing applies only to

those dies failed in the first one. A die is considered to be good if it is good in either probing.

We let wi ( wi ) denote the number of good (bad) dies in wafer i , and let Bij denote the

number of bin j in wafer i . Assume there are J types of bins in a wafer, then

J

wi Bij and wi TDi wi , where TDi denotes the total number of dies in wafer i .

j 1

Following the two times of wafer probing, a two-stage checking on the number of good dies

is performed to determine the necessity of carrying out a retest, i.e. an additional wafer

probing. We let Wmin denote the threshold value of the number of good dies in a wafer to

determine whether to pass or hold the wafer; we let bjmax, j = 1, ..., J, denote the threshold value of the number of dies of bin j in the held wafer to determine whether to perform a

retest. The mechanism of the two-stage checking can be summarized below. If wi ≥ Wmin, we pass wafer i; otherwise, we will hold this wafer and check its bins. For those held wafers, if Bij > bjmax, we will perform retests for all dies of bin j to check whether there are probing errors that cause overkills. This particular class of policies for deciding retest based on the threshold values is commonly practiced in wafer fabrication processes. Thus, the relationship between the inputs and the outputs of the considered problem can be described in Figure 2, in

which Wmin and bjmax, j = 1, ..., J, are the input variables, V̄ and R̄ are the output variables, and the tested wafers are part of the testing procedures. V̄ = (1/L) Σ_{i=1}^{L} Vi and R̄ = (1/L) Σ_{i=1}^{L} Ri represent the average overkills and retests per wafer, respectively, in which Vi and Ri denote the number of overkills and retests in wafer i, respectively, and L denotes the total number of tested wafers as shown in Figure 2.

[Figure 2 diagram: input variables Wmin and bjmax, j = 1, ..., J → wafer probe testing procedures over L tested wafers → output variables V̄, R̄.]

Figure 2: Relationship between the inputs and the outputs of wafer probe testing procedures.

Details of the testing procedures for a wafer are shown in the flow chart of Figure 3, in

which the calculations of the number of overkills and retests are also included. For the

purpose of simulations, we randomly generate Bij based on a Poisson probability distribution with mean λj to represent the results of the two times of wafer probing, which are not performed in the current computer simulation and are thus shown in the dashed-line square in Figure 3. Once Bij is generated, we can randomly generate the number of overkills in Bij, denoted by vij^o, based on a Poisson probability distribution with mean γj·Bij, where γj is the proportional coefficient for bin j. The number of overkills in a bin is, in general, proportional to the number of dies of that bin; that means the former will be smaller provided that the latter is smaller. The values of λj and γj can be found from the real manufacturing data.

[Figure 3: Flow chart of the testing procedures for wafer i. Generate Bij for the J types of bins; calculate w̄i = Σ_{j=1}^{J} Bij and wi = TDi − w̄i. If wi ≥ Wmin, pass wafer i with Vi = Σ_{j=1}^{J} vij^o and Ri = 0. Otherwise, check each bin j = 1, ..., J: if Bij ≤ bjmax, pass bin j with vij = vij^o and rij = 0; if Bij > bjmax, perform retests on all dies of bin j with vij = 0 and rij = Bij. Finally, Vi = Σ_{j=1}^{J} vij and Ri = Σ_{j=1}^{J} rij.]

In contrast to vij^o, we let vij denote the number of overkills for bin j of wafer i after completing the testing procedures, and let rij denote the corresponding number of retests. In these testing procedures, although we may pass the wafer when the threshold-value test is a success, there may still be overkills. As indicated in Figure 3, for the passed wafer i, the number of overkills Vi = Σ_{j=1}^{J} vij^o and the number of retests Ri = 0. The same logic applies to the passed bin j of the held wafer i, for which vij = vij^o and rij = 0. However, for any retested bin, the probability of any unidentified overkill is extremely small, because the dies have been probed three times, including the two times of wafer probing before any retest. Thus, for any retested bin j, we have vij = 0 and rij = Bij, as indicated in Figure 3. The resulting values of Vi and Ri of wafer i shown in Figure 3 will be used to calculate V̄ and R̄.

From Figure 3, we see that if we increase Wmin while decreasing bjmax, there will be more retests and fewer overkills. Thus, to reduce overkills under a tolerable level of retests, we will set minimizing the average number of overkills per wafer, V̄, as our objective while keeping the average number of retests per wafer, R̄, under a satisfactory level. Thus, using the trade-off method (Collette & Siarry, 2003), this optimization problem can be formulated as the following constrained stochastic simulation optimization problem:

min V

xX

1 L

Vi

L i1

subject to R

1 L

Ri rT ,

L i 1

(3)

where x [Wmin , b j max , j 1,..., J ] denotes the vector of threshold values, that is the vector of

input variables; X

16

Remark 1: The value of r_T is determined by the decision maker based on the economic situation. When chip demand is weak, throughput is generally not critical in the manufacturing process; we can therefore allow a larger r_T so as to avoid more overkills and gain more profit. On the other hand, if chip demand is strong, throughput matters more, so r_T should be set smaller. Taking chip demand into account is a distinguishing feature of our formulation.

This constrained stochastic simulation optimization problem (3) is to find an optimal vector of threshold values, x, that minimizes V̄ subject to the employed testing procedures and the constraint on R̄. We can therefore use a penalty function to transform (3) into the following unconstrained stochastic simulation optimization problem:

    min_{x ∈ X}  F(x) = V̄ + P(R̄ − r_T)(R̄ − r_T),                (4)

where P(R̄ − r_T) denotes a continuous penalty weight for the constraint R̄ ≤ r_T, such that P(R̄ − r_T) ≈ 0 for R̄ ≤ r_T and P(R̄ − r_T) > 0 for R̄ > r_T.
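A sketch of the penalized objective, using a sigmoid-type weight of the kind adopted later in Section 4.1; the `scale` argument stands in for the magnitude-balancing constant and is an assumption of this sketch:

```python
import math

def penalty_weight(r_bar, r_T, scale=1.0):
    # Sigmoid-type weight: ~0 when r_bar <= r_T, approaching `scale` above it.
    return scale / (1.0 + math.exp(-(r_bar - r_T)))

def F(v_bar, r_bar, r_T, scale=1.0):
    # Penalized objective of (4): average overkills plus weighted retest excess.
    return v_bar + penalty_weight(r_bar, r_T, scale) * (r_bar - r_T)

print(F(5.0, 10.0, r_T=40.0))  # constraint satisfied: F is essentially V-bar
print(F(5.0, 70.0, r_T=40.0))  # constraint violated: roughly 30 is added
```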

3.3 Application of the Two-Stage Algorithm

The stochastic simulation optimization problem (4) has the same form as (1), with x playing the role of the decision variable and X the decision space. Note that the input-variable space X is huge; for example, an 8-inch wafer typically consists of 588 dies, so the possible ranges of the integer values W_min and b_j,max are [1, 588] and [1, 588], respectively. Consequently, for a typical number of bin types J = 10, the size of X exceeds 2.9 × 10^30. Thus, this stochastic simulation optimization problem (4) is most suitable for the application of our two-stage algorithm.
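The count follows directly from the eleven integer variables:

```python
# W_min plus ten b_j,max values: 11 integer variables, each with 588 choices
# for a 588-die 8-inch wafer, so |X| = 588^11.
size_X = 588 ** 11
print(f"|X| = {size_X:.2e}")  # on the order of 10^30
```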

3.3.1 Applying Step 1

To apply Step 1 of the two-stage algorithm to problem (4), we first need to construct the ANN-based crude model, which consists of two parts: (A) collecting the training data set, and (B) training the ANNs. We employ two three-layer feed-forward back-propagation ANNs (Graupe, 2007). Assuming there are J types of bins in a wafer, J+1, 2(J+1), and 1 neurons are used in the input, hidden, and output layers, respectively. The activation functions of the neurons in the hidden and output layers are the hyperbolic tangent sigmoid and linear functions, respectively. The inputs of both ANNs are x ∈ X; for the outputs, one ANN produces the corresponding V̄ and the other R̄. We obtain the training data set for the two ANNs in the following two steps. (a) Narrow down the input-variable space X by excluding irrational threshold values, and denote the reduced input-variable space by X′. In general, the yield rate and the statistical distribution of the count of each bin for typical products can be collected from a wafer foundry; thus the threshold values W_min and b_j,max should lie in a reasonable range determined by the corresponding mean values of w_i and B_ij, respectively. (b) Randomly select M = 16641 vectors from X′ and compute the corresponding outputs V̄ and R̄ using a stochastic simulation over a large number of test wafers (Chen et al., 1999), that is, by performing the simulations of the testing procedures shown in Figure 3 for L_s = 100000 wafers. This constitutes part (A) of constructing the crude model.

We denote the M training pairs obtained above by (x_i, V̄_i, R̄_i), i = 1,...,M, where V̄_i and R̄_i are the simulated outputs for input vector x_i. The training problems for adjusting the arc weights of the above two ANNs are:

    min_{c_1}  Σ_{i=1}^{M} [V̄_i − f_1(x_i | c_1)]^2,                (5)

and

    min_{c_2}  Σ_{i=1}^{M} [R̄_i − f_2(x_i | c_2)]^2,                (6)

where c_1 and c_2 denote the vectors of the arc weights of the ANN for V̄ and the ANN for R̄, respectively, and f_1(x_i | c_1) and f_2(x_i | c_2) denote the outputs of the corresponding ANNs when the input vector is x_i. Thus, the training problems are trying to

adjust the vectors of arc weights c_1 and c_2 so as to make the actual outputs f_1(x_i | c_1) and f_2(x_i | c_2) as close to the desired outputs V̄_i and R̄_i as possible. To speed up the convergence of the back-propagation training, we employed the BFGS quasi-Newton method (Gill et al., 1981; Stanevski & Tsvetkov, 2004) and the one-step secant method (Battiti, 1992; Fiore et al., 2004) to solve (5) and (6), respectively. Each training algorithm stops when either of the following two conditions occurs: (i) the sum of the mean squared errors, i.e., the objective value of the training problem, falls below 10^-3, or (ii) the number of epochs exceeds 300. This constitutes part (B) of constructing the crude model. Once the two ANNs are trained, we can feed any vector x to them to estimate the corresponding V̄ and R̄, which are then used to estimate F(x). This forms our crude but effective model for estimating F(x) for a given input vector x.
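A minimal sketch of such a surrogate, with synthetic training data in place of the simulated (x_i, V̄_i) pairs, and plain gradient descent in place of the quasi-Newton and one-step secant updates used in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the M simulated training pairs (x_i, target):
# real targets would be V-bar or R-bar from the wafer-test simulation.
J, M = 10, 200
X = rng.uniform(0.0, 1.0, size=(M, J + 1))
y = np.sin(X.sum(axis=1))                  # placeholder target function

# Three-layer network as in the text: J+1 inputs, 2(J+1) tanh hidden
# neurons, one linear output neuron.
H = 2 * (J + 1)
W1 = rng.normal(0.0, 0.3, size=(J + 1, H)); b1 = np.zeros(H)
W2 = rng.normal(0.0, 0.3, size=(H, 1));     b2 = np.zeros(1)

def forward(X):
    h = np.tanh(X @ W1 + b1)
    return h, (h @ W2 + b2).ravel()

def sse(X, y):
    return float(((forward(X)[1] - y) ** 2).sum())

# Plain batch gradient descent for brevity; the paper instead uses BFGS
# quasi-Newton and one-step secant methods to speed up back-propagation.
lr, loss0 = 0.01, sse(X, y)
for epoch in range(300):
    h, out = forward(X)
    err = (out - y).reshape(-1, 1)          # d(SSE)/d(out) = 2 * err
    gW2, gb2 = 2 * h.T @ err, 2 * err.sum(axis=0)
    dh = (2 * err @ W2.T) * (1 - h ** 2)    # back-prop through tanh
    gW1, gb1 = X.T @ dh, dh.sum(axis=0)
    W1 -= lr * gW1 / M; b1 -= lr * gb1 / M
    W2 -= lr * gW2 / M; b2 -= lr * gb2 / M
    if sse(X, y) < 1e-3:                    # stopping criterion (i)
        break

print(f"SSE: {loss0:.2f} -> {sse(X, y):.2f}")
```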

3.3.2 Applying Step 2

With the above crude but effective objective-value (or, in GA terminology, fitness-value) evaluation model, we are ready to apply Step 2 of the two-stage algorithm to select N (= 1024) excellent input vectors from X′. The encoding employed for the vectors in X′ is straightforward, because each component of the vector x is an integer. We start from I (= 5000) randomly selected vectors from X′ as our initial population. The fitness value of each vector is calculated from F(x) based on the outputs of the two ANNs. We apply a GA with a simple roulette-wheel selection scheme and a mutation probability p_m = 0.02 to these chromosomes. After the GA evolves for 20 generations, we rank the final generation of I (= 5000) chromosomes by fitness value and pick the top N (= 1024) vectors.
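The GA step above can be sketched as follows. The fitness here is a toy stand-in for the ANN-based F(x), the population size is shrunk for illustration, and the single-point crossover is an assumed detail (the text specifies only roulette-wheel selection and p_m = 0.02):

```python
import random

random.seed(1)

# Toy stand-in for the ANN-based fitness F(x); lower is better.
def fitness(x):
    return sum((xi - 100) ** 2 for xi in x)

LO, HI, DIM = 1, 206, 11              # integer thresholds, 206-die product
I, GENERATIONS, P_M = 200, 20, 0.02   # population shrunk from 5000 for sketch

pop = [[random.randint(LO, HI) for _ in range(DIM)] for _ in range(I)]

for _ in range(GENERATIONS):
    costs = [fitness(x) for x in pop]
    worst = max(costs)
    # Roulette wheel needs "larger is better" weights; invert the costs.
    weights = [worst - c + 1.0 for c in costs]
    parents = random.choices(pop, weights=weights, k=I)
    # Single-point crossover on consecutive parent pairs (assumed detail).
    children = []
    for a, b in zip(parents[::2], parents[1::2]):
        cut = random.randint(1, DIM - 1)
        children += [a[:cut] + b[cut:], b[:cut] + a[cut:]]
    # Integer mutation with probability p_m per gene.
    for child in children:
        for g in range(DIM):
            if random.random() < P_M:
                child[g] = random.randint(LO, HI)
    pop = children

top = sorted(pop, key=fitness)[:20]   # keep the best N vectors (N = 20 here)
print(fitness(top[0]))
```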


3.3.3 Applying Step 3

Starting from the N (= 1024) input vectors obtained in Step 2, we compute F(x) for each input vector using a model more refined than the ANNs, namely a stochastic simulation with a varying number of test wafers. The basic number of test wafers is L_0 = 500, and the number of subphases n_k is determined by n_k = arg min_n { L_0 k^{n-1} ≤ L_s ≤ L_0 k^n and 1 ≤ N_n ≤ 10 }, where L_s = 100000 and N_n = N/k^{n-1} is the number of candidates remaining in subphase n. From subphase 1 to subphase n_k, in each subphase i we use a stochastic simulation of length L_i = k^i L_0 to estimate F(x) for the candidate N/k^{i-1} x's, rank these x's based on their estimated F(x), and select the best N/k^i x's as the candidate solution set for subphase i+1.

3.3.4 Applying Step 4

In this step, we compute the objective value of (4) for each of the N_{n_k} input vectors obtained in Step 3 using the exact model, that is, a stochastic simulation with a sufficiently large number of test wafers (i.e., L_s = 100000) to make the estimated objective value sufficiently stable. The input vector among the N_{n_k} candidates with the smallest F(x) is the good enough solution that we seek.
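The subphase schedule implied by these rules can be computed directly; the following sketch reproduces the halving of candidates and doubling of simulation length (as in Table 1) for N = 1024, k = 2, L_0 = 500, L_s = 100000:

```python
# Subphase schedule of the second stage: candidates are divided by k each
# subphase while the simulation length is multiplied by k, until the final
# candidate count falls between 1 and 10.
N, k, L0, Ls = 1024, 2, 500, 100_000

# n_k: smallest n with L0 * k**n >= Ls, so that L0*k**(n-1) <= Ls <= L0*k**n.
n_k = 1
while L0 * k ** n_k < Ls:
    n_k += 1

schedule = [(N // k ** (i - 1), k ** i * L0) for i in range(1, n_k + 1)]
for i, (cand, length) in enumerate(schedule, start=1):
    print(f"subphase {i}: {cand} candidates, {length} test wafers")
```

With these constants the loop yields n_k = 8 subphases, ending with 8 candidates, matching the N_{n_2} = 8 used in Section 4.1.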

4.1 Test Results and Comparisons

Our simulations are based on the following data, collected for a practical product of a renowned wafer foundry in Taiwan. The product is made on 6-inch wafers, each consisting of 206 dies. There are 10 bins in the wafers of this product, and their means μ_j, j = 1,...,10, are respectively the following 10 positive real numbers: 0.5, 0.5, 1.1, 1.3, 0.8, 3.7, 3.5, 40, 45, and 13. The yield rate of this product is 46.6%. The mean of the overkills occurring in bin j is 0.03 B_ij for j = 1,...,10; that is, λ_j = 0.03 for all j. The input-variable space is X = { x = [W_min, b_j,max, j = 1,...,10] | W_min ∈ [1,206], b_j,max ∈ [1,206], j = 1,...,10 }.

We used a sigmoid-type function as our penalty function P(R̄ − r_T) in (4), i.e.,

    P(R̄ − r_T) = ( max_{i ∈ {1,...,M}} V̄_i / max_{i ∈ {1,...,M}} R̄_i ) × 1/(1 + e^{−(R̄ − r_T)}),

where the ratio of the largest sampled V̄_i to the largest sampled R̄_i over the M training vectors scales the penalty to the magnitude of the overkill objective.

Remark 2: The reason we use a 6-inch wafer product is the easier identification of the bins and overkills in experiments. In fact, our results apply to any size of wafer.

Specific data for applying the two-stage algorithm to this product are given in the following. In Step 1 of the first-stage approach, (a) X is narrowed down rationally yet conservatively to X′ = { x = [W_min, b_j,max, j = 1,...,10] | W_min ∈ [50,206], b_j,max ∈ [1, 6μ_j], j = 1,...,10 }, and (b) M = 16641 and L_s = 100000 wafers. In Step 2, I = 5000, N = 1024, and the convergence criterion employed for our GA is that the number of generations exceeds 20. It should be noted that all the test results shown in this section were simulated on a Pentium IV PC.

In the second-stage approach, we test the computation time and the quality of the obtained good enough solution for various k in order to choose a suitable value. Figure 4 shows the F(x_g) (vertical axis) of the good enough solutions x_g and the corresponding CPU time (horizontal axis) consumed by our algorithm with k = 2, 3, 4, 5, 6 and 200 for the case r_T = 10. In general, smaller k corresponds to less CPU time because of fewer simulation replications. However, there is no guarantee that larger k leads to smaller F(x_g). Nonetheless, for sufficiently large k, such as k = 200, the corresponding F(x_g) is the smallest and the CPU time the longest among all tested values, as expected. Therefore, the choice of k is problem dependent, reflecting how fast one needs the solution and how much one cares about its quality. As observed in Figure 4, the CPU time consumed for k = 2 in this test is within 2 minutes, so k = 2 is a suitable choice for real-time application. The parameters in the second stage of our algorithm are therefore set as follows: k = 2, L_0 = 500, L_n = 2^n L_0, n_2 = 8, and N_n = 1024/2^{n-1}.

Figure 4: The F(x_g) obtained and the corresponding CPU time consumed by our algorithm with k = 2, 3, 4, 5, 6 and 200 for the case of r_T = 10.

Table 1 shows the simulation length and the candidate solution set in each subphase of the second stage. In the last subphase, we use a stochastic simulation of length L_s = 100000 to compute F(x) for the N_{n_2} = 8 candidate solutions. The x with the smallest F(x) is the good enough vector of threshold values x_g that we look for.

Table 1: Number of candidate solutions and simulation length in each subphase of the second stage (L_n for subphases 5-8 follows from L_n = 2^n L_0).

subphase :    1     2     3     4      5      6      7       8
N_n      : 1024   512   256   128     64     32     16       8
L_n      : 1000  2000  4000  8000  16000  32000  64000  128000

The good enough vectors of threshold values and the average overkill percentages obtained from the two-stage algorithm for the three cases r_T = 10, 40 and 80 are shown in Table 2. From this table, we observe that as r_T increases, the value of W_min increases (row 2), while the values of the leading b_j,max, j = 8 and 9, which account for most of the retests, decrease (rows 10 and 11). This indicates that if we allow more retests, that is, increase r_T, we can set more stringent threshold values (increase W_min and decrease the leading b_j,max) so as to avoid more overkills, that is, decrease the average overkill percentage, as indicated in the last row of Table 2. This also demonstrates the conflicting nature of the two objectives of reducing overkills and reducing retests. We used 590 real test wafers, whose bins B_ij and overkills before retest v_ij^o are known, to test the performance of the vectors of threshold values obtained by our algorithm for the three cases shown in Table 2. The corresponding results, pairs of the average overkills per wafer, V̄ (= (1/590) Σ_{i=1}^{590} V_i), and the average retests per wafer, R̄ (= (1/590) Σ_{i=1}^{590} R_i), over these 590 test wafers, are shown as three marked points in Figure 5, with the corresponding r_T indicated in the top right corner of the figure. We also used 2000 randomly selected vectors of threshold values to test the same 590 wafers; the resulting pairs of V̄ and R̄ are likewise plotted in Figure 5.

Table 2: The good enough vector of threshold values x_g and the average overkill percentage for three different r_T's.

                   r_T = 10   r_T = 40   r_T = 80
W_min                 132        146        157
b_1,max                 2          2          3
b_2,max                 1          1          1
b_3,max                 5          2          3
b_4,max                 5          4          4
b_5,max                 5          5          3
b_6,max                 3          8          5
b_7,max                 6          4          4
b_8,max                64         55         14
b_9,max                78         62         52
b_10,max               29         11         13
(V̄/206) × 100%      1.36%      0.85%      0.23%

Figure 5: The resulting pairs of (V̄, R̄) obtained by our algorithm and by the randomly generated vectors of threshold values.

We have also used a typical GA and a simulated annealing (SA) algorithm to solve (4) for the case r_T = 40. As indicated at the beginning of Section 1, global search techniques are computationally expensive for solving (4). We stopped the GA and SA after they had consumed 30 times the CPU time consumed by the two-stage algorithm; the objective values of (4) they obtained were still 11.8% and 19.9% larger, respectively, than the final objective value obtained by the two-stage algorithm. Using the threshold values they obtained to test the 590 wafers, the resulting (V̄, R̄) pairs from GA and SA are also marked in Figure 5. In other words, the two-stage algorithm saves 11.8% and 19.9% more overkills than the GA and SA, respectively, for R̄ ≤ 40. In addition, neither the GA nor SA generated the optimal solution: the best-so-far solutions they obtained after one hour of CPU time were still far from the optimal solution of (4).

We see that for R̄ ≤ 40, the V̄ produced by the good enough vector of threshold values obtained by our algorithm is almost the minimum compared with the randomly selected vectors of threshold values. Similar conclusions can be drawn for the cases r_T = 10 and 80. From Figure 5, the results obtained for r_T = 10, 40 and 80 lie almost on the boundary of the region formed by the randomly generated vectors of threshold values; this implicit boundary represents the (V̄, R̄) pairs produced by the optimal vectors of threshold values. The above result implies that our algorithm not only controls the level of retests but also obtains a near-optimal solution.

4.2 Performance Evaluation

It is interesting to address how excellent the N selected vectors are over various types of input-variable space X, so as to demonstrate the validity of our first-stage approach. Although in-depth analyses exist of the approximation error of ANNs for continuous functions, the accuracy of approximating the input-output relationship of a discrete-event simulated system is usually addressed with empirical results. Thus, it is not surprising that we have no analytical result for the quality of the N vectors selected in our first-stage approach. Since the input-variable space for the test product is X = { x = [W_min, b_j,max, j = 1,...,10] | W_min ∈ [1,206], b_j,max ∈ [1,206], j = 1,...,10 }, the size of the input-variable space is |X| = 206^11.

The methodology for our performance evaluation is a simulation based on Ordered Performance Curves (OPCs) (Lau & Ho, 1997) and the employed crude model. The OPC of all the ordered vectors x_1, x_2,..., x_|X| in X is determined by the spread of the ordered performances F_[1], F_[2],..., F_[|X|], where F_[i] denotes F(x_i). Without loss of generality, the F_[i]'s can be normalized into the range [0,1]; the ordered vectors are likewise mapped, equally spaced, into [0,1] such that z(x_i) = z_[i] = (i-1)/(|X|-1) for i = 1, 2,..., |X|. There are five broad categories of OPC models: (i) lots of good vectors, (ii) lots of intermediate but few good and bad vectors, (iii) equally distributed good, bad and intermediate vectors, (iv) lots of good and lots of bad but few intermediate vectors, and (v) lots of bad vectors. Figure 6 shows a graphical expression of these five types of OPCs. More precisely, a standardized OPC can be described by a two-parameter smooth curve B^{-1}(z | α, β), the inverse of the Incomplete Beta function B(z | α, β) with parameters (α, β). In general, α < 1, β > 1 corresponds to the OPC of type (i); α > 1, β > 1 to type (ii); α = 1, β = 1 to type (iii); α < 1, β < 1 to type (iv); and α > 1, β < 1 to type (v). As indicated in Section 1, we need not consider the types of X consisting of lots of good vectors in this evaluation; thus we take only the three OPC types (ii), (iii) and (v) into account.

The roughness of the ANN model can be described by adding uniform noise to the normalized performances y_i (Lau & Ho, 1997; Ho, 1999). That is, the ANN model can be represented by the noisy model y_i + ε_i, where the random noise ε_i, representing a large modeling noise, is drawn from a uniform distribution. We assume various magnitudes of uniformly distributed modeling noise to represent the approximation errors of the proposed ANN-based model and perform the following simple experiments to compare the quality of the N vectors selected by the GA based on the ANN model with that of N vectors selected at random from the solution space. We let U[-0.1, 0.1] denote the uniform distribution of a random noise ranging from -0.1 to 0.1 to be added to the normalized performance, i.e., the normalized objective value, of the exact model. The normalized performances of all solutions in a solution space are equally spaced from 0 to 1, with 0 being the top performance.
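A toy version of this selection-under-noise experiment (with a much smaller solution space, a linear OPC, and plain top-N truncation standing in for the GA) illustrates why small modeling noise barely hurts the true quality of the selected subset:

```python
import random

random.seed(7)

# True normalized performances equally spaced on [0, 1] (0 = best); the crude
# model observes them plus uniform noise, and we select the top N by the noisy
# values. |X| and N are shrunk from the paper's scales for illustration.
SIZE_X, N, NOISE = 10_000, 100, 0.01

true_perf = [i / (SIZE_X - 1) for i in range(SIZE_X)]
noisy = [(p + random.uniform(-NOISE, NOISE), i) for i, p in enumerate(true_perf)]
selected = sorted(noisy)[:N]                   # indices picked via crude model
worst_true_rank = max(i for _, i in selected)  # how deep the selection reaches

print(worst_true_rank)
```

With this spacing, an item can only displace another if the noise bridges their true gap, so the selected subset's true ranks stay within a narrow band around the true top N.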

We studied a total of 28 OPCs distributed uniformly among the three broadly generic types (ii), (iii) and (v), formed from the parameters α = 1.0, 2.0, 4.0, 5.0 and β = 0.2, 0.4, 0.8, 1.0, 2.0, 4.0, 5.0. We carried out a Monte Carlo study over a vast number of OPCs, similar to that in (Lau & Ho, 1997), for an assumed noise distribution, picking the top N vectors using the GA. In all of our Monte Carlo calculations, we simulated 10000 realizations of noisy OPCs. Considering three modeling-noise distributions, U[-0.01, 0.01], U[-0.05, 0.05] and U[-0.1, 0.1], the top 5% of the N solutions selected by the GA are at least a top 2.37 × 10^-6 %, top 8.95 × 10^-4 %, and top 1.19 × 10^-3 % solution in X with probability 0.95, respectively. However, the top 5% of N solutions selected at random are at best (i.e., with no modeling error) top 5% solutions in X. Therefore, we have greatly improved the quality of the N vectors by replacing the uniform random selection procedure.

Remark 3: Though we did not investigate the actual order of the N vectors for OPC types (i) and (iv), our first-stage approach can still be applied to problems with these two types of OPCs. Even if the order of the obtained N vectors for these two OPC types is not as good as for the other three, owing to the sharp sensitivity of the performance to noise in these two types, their actual objective values will still be good enough because of the abundance of good vectors. That is, in OPC types (i) and (iv) there can be a big difference in the order of good vectors, but the differences in their objective values are very small. Thus, no matter which type of OPC we face, our first-stage approach works the same way.

5. Conclusions

To cope with computationally intractable stochastic simulation optimization problems, we have proposed an ordinal optimization theory based two-stage algorithm that finds a good enough solution in reasonable computational time. To demonstrate the applicability of the proposed algorithm, we used it to find a vector of good enough threshold values to reduce overkills and retests in the wafer probe testing process of a wafer foundry. We tested the performance of the obtained solution on real data and found that the resulting average numbers of overkills and retests per wafer lie almost on the boundary formed by the optimal vectors of threshold values of the considered stochastic optimization problem. This indicates that the proposed algorithm not only controls the tolerable level of retests, by taking varying chip demand into account, but also provides a near-optimal vector of threshold values. We demonstrated the computational efficiency of the proposed algorithm by comparing it with a genetic algorithm and a simulated annealing method: even after the latter two methods had consumed more than 30 times the CPU time consumed by the proposed algorithm, the best-so-far objective values they obtained were still no better than that obtained by the proposed algorithm. We have also justified the performance of the proposed algorithm in the wafer probe testing process based on ordinal optimization theory.

References

Ahmed, M.A. (2007). A modification of the simulated annealing algorithm for discrete

stochastic optimization. Engineering Optimization, 39(6), 701-714.

April, J., Glover, F., Kelly, J.P. & Laguna, M. (2003). Practical introduction to simulation

optimization. In: Proceedings of the 2003 Winter Simulation Conference, vol.1 (pp.71-78).

New Orleans, LA.

Barnett, T.S., Grady, M., Purdy, K. & Singh, A.D. (2005). Exploiting prediction defect

clustering for yield and reliability. IEE Proceedings-Computers and Digital Techniques,

152(4), 407-413.

Battiti, R. (1992). First and second order methods for learning: Between steepest descent and

Newton's method. Neural Computation, 4(3), 141-166.

Blum, C. & Roli, A. (2003). Metaheuristics in combinatorial optimization: overview and

conceptual comparison. ACM Computing Surveys, 35(4), 268-308.

Chen, C.-H., Wu, S.D. & Dai, L. (1999). Ordinal comparison of heuristic algorithms using

stochastic optimization. IEEE Transactions on Robotics and Automation, 15(1), 44-56.

Chen, F.L., Lin, S.C., Doong, Y.Y. & Young, K.L. (2003). LOGIC product yield analysis by wafer bin map pattern recognition supervised neural network. In: Proceedings of the 2003 …, CA.

Collette, Y. & Siarry, P. (2003). Multiobjective optimization: principles and case studies. New

York: Springer-Verlag.

Dréo, J., Pétrowski, A., Siarry, P. & Taillard, E. (2006). Metaheuristics for hard optimization:

methods and case studies. Berlin: Springer-Verlag.

Fattahi, P., Mehrabad, M.S. & Jolai, F. (2007). Mathematical modeling and heuristic

approaches to flexible job shop scheduling problems. Journal of Intelligent Manufacturing,

18(4), 331-342.

Fiore, C.D., Fanelli, S. & Zellini, P. (2004). An efficient generalization of Battiti-Shanno's

Quasi-Newton algorithm for learning in MLP-networks. In: Proceedings of ICONIP 2004,

Lecture Notes in Computer Science, Vol. 3316 (pp. 483-488). Berlin: Springer.

Fu, M.C., Glover, F.W. & April, J. (2005). Simulation optimization: a review, new

developments, and applications. In: Proceedings of the 2005 Winter Simulation

Conference (pp. 83-95). Orlando, FL.

Gill, P.E., Murray, W. & Wright, M.H. (1981). Practical optimization. New York: Academic

Press.

Graupe, D. (2007). Principles of artificial neural networks. 2nd ed. Hackensack, NJ: World Scientific.

Haupt, R.L. & Haupt, S.E. (2004). Practical genetic algorithms. 2nd ed. New York : John

Wiley.

Hedar, A.R. & Fukushima, M. (2006). Tabu Search directed by direct search methods for

nonlinear global optimization. European Journal of Operational Research, 170(3),

329-349.

Ho, Y.C. (1999). An explanation of ordinal optimization: Soft computing for hard problems.

Information Sciences, 113(3-4), 169-192.


Ho, Y.C., Zhao, Q.C. & Jia, Q.S. (2007). Ordinal optimization: Soft optimization for hard

problems. New York: Springer-Verlag.

Hunt, F.Y. (2005). Sample path optimality for a Markov optimization problem. Stochastic

Processes and Their Applications, 115(6), 769-779.

Kim, S. (2006). Gradient-based simulation optimization. In: Proceedings of the 2006 Winter

Simulation Conference (pp. 159-167). Monterey, CA.

Lacksonen, T. (2001). Empirical comparison of search algorithms for discrete event

simulation. Computers & Industrial Engineering, 40(1-2), 133-148.

Lau, T.W.E. & Ho, Y.C. (1997). Universal alignment probability and subset selection for

ordinal optimization. Journal of Optimization Theory and Applications, 93(3), 455-489.

Lin, S.-Y. & Ho, Y.C. (2002). Universal alignment probability revisited. Journal of

Optimization Theory and Applications, 113(3), 399-407.

Lin, S.-Y., Ho, Y.C. & Lin, C.-H. (2004). An ordinal optimization theory based algorithm for

solving the optimal power flow problem with discrete control variables. IEEE

Transactions on Power Systems, 19(1), 276-286.

Moore, D. & McCabe, G. (1999). Introduction to the practice of statistics. 3rd ed. New York:

W.H. Freeman and Company.

Myers, R.H., Montgomery, D.C., Vining, G.G., Borror, C.M. & Kowalski, S.M. (2004).

Response surface methodology: A retrospective and literature survey. Journal of Quality

Technology, 36(1), 53-77.

Nocedal, J. & Wright, S.J. (2006). Numerical Optimization. 2nd ed. New York: Springer

Verlag.

Spall, J.C. (2003). Introduction to stochastic search and optimization estimation, simulation,

and control. New Jersey: John Wiley & Sons.

Suman, B. & Kumar, P. (2006). A survey of simulated annealing as a tool for single and multiobjective optimization. Journal of the Operational Research Society, 57(10), 1143-1160.

Tekin, E. & Sabuncuoglu, I. (2004). Simulation optimization: A comprehensive review on

theory and applications. IIE Transactions, 36(11), 1067-1081.

Theiler, J. & Alper, J. (2006). On the choice of random directions for stochastic

approximation algorithms. IEEE Transactions on Automatic Control, 51(4), 476-481.

Stanevski, N. & Tsvetkov, D. (2004). On the quasi-Newton training method for feed-forward

neural networks. In: Proceedings of the International Conference on Computer Systems

and Technologies (pp. II.12-1-5). Rousse, Bulgaria.


List of Figures

Figure 2: Relationship between the inputs and the outputs of wafer probe testing procedures.
Figure 3: Flow chart of the wafer probe testing procedures.
Figure 4: The F(x_g) obtained and the corresponding CPU time consumed by our algorithm with k = 2, 3, 4, 5, 6 and 200 for the case of r_T = 10.
Figure 5: The resulting pairs of (V̄, R̄) obtained by our algorithm and by the randomly generated vectors of threshold values.
Figure 6: Five types of standardized OPCs.

List of Tables

Table 1: Number of candidate solutions and simulation length in each subphase of the second stage.
Table 2: The good enough vector of threshold values and the average overkill percentage for three different r_T's.
