
David Pisinger

Dept. of Computer Science, University of Copenhagen,

Universitetsparken 1, DK-2100 Copenhagen, Denmark.

May, 1994

Abstract

The Multiple-Choice Knapsack Problem is defined as a 0-1 Knapsack Problem with the addition of disjoint multiple-choice constraints. As for other knapsack problems, most of the computational effort in the solution of these problems is used for sorting and reduction. But although O(n) algorithms which solve the linear Multiple-Choice Knapsack Problem without sorting have been known for more than a decade, such techniques have not been used in enumerative algorithms.

In this paper we present a simple O(n) partitioning algorithm for deriving the optimal linear solution, and show how it may be incorporated in a dynamic programming algorithm such that a minimal number of classes are enumerated, sorted and reduced. Computational experiments indicate that this approach leads to a very efficient algorithm which outperforms any known algorithm for the problem.

Keywords: Packing; Knapsack Problem; Dynamic Programming; Reduction.

1 Introduction

Given $k$ classes $N_1, \ldots, N_k$ of items to be packed in a knapsack of capacity $c$, each item $j \in N_i$ has a profit $p_{ij}$ and a weight $w_{ij}$, and the problem is to choose one item from each class such that the profit sum is maximized without the weight sum exceeding $c$. The Multiple-Choice Knapsack Problem (MCKP) may thus be formulated as:

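\[
\begin{aligned}
\text{maximize} \quad & z = \sum_{i=1}^{k} \sum_{j \in N_i} p_{ij} x_{ij} \\
\text{subject to} \quad & \sum_{i=1}^{k} \sum_{j \in N_i} w_{ij} x_{ij} \le c, \\
& \sum_{j \in N_i} x_{ij} = 1, \quad i = 1, \ldots, k, \\
& x_{ij} \in \{0, 1\}, \quad i = 1, \ldots, k, \; j \in N_i.
\end{aligned}
\tag{1}
\]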
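As a sanity check on formulation (1), a minimal brute-force sketch can enumerate one item per class. It assumes each class is a Python list of (weight, profit) pairs; the function name and the small instance are illustrative and not taken from the paper. The enumeration is exponential in $k$, so it is only meant for verifying tiny instances.

```python
from itertools import product

def solve_mckp_bruteforce(classes, c):
    """Try every way of picking one item per class; keep the best feasible one.

    classes: list of classes, each a list of (weight, profit) pairs (assumed layout).
    Returns (best_profit, choice) or (None, None) if no choice fits in capacity c.
    """
    best_profit, best_choice = None, None
    for choice in product(*(range(len(cls)) for cls in classes)):
        weight = sum(classes[i][j][0] for i, j in enumerate(choice))
        profit = sum(classes[i][j][1] for i, j in enumerate(choice))
        if weight <= c and (best_profit is None or profit > best_profit):
            best_profit, best_choice = profit, choice
    return best_profit, best_choice

if __name__ == "__main__":
    example = [[(1, 1), (3, 4)], [(2, 2), (4, 5)], [(1, 1), (2, 3)]]
    print(solve_mckp_bruteforce(example, 7))
```

Such a reference solver is mainly useful for cross-checking faster algorithms on small random instances.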

All coefficients $p_{ij}$, $w_{ij}$, and $c$ are positive integers, and the classes $N_1, \ldots, N_k$ are mutually disjoint, class $N_i$ having size $n_i$. The total number of items is $n = \sum_{i=1}^{k} n_i$.

Negative coefficients $p_{ij}$, $w_{ij}$ in (1) may be handled by adding a sufficiently large constant to all items in the corresponding class as well as to $c$. To avoid unsolvable or trivial situations we assume that

\[
\sum_{i=1}^{k} \min_{j \in N_i} w_{ij} \;\le\; c \;<\; \sum_{i=1}^{k} \max_{j \in N_i} w_{ij}.
\tag{2}
\]
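Assumption (2) is cheap to verify before any solver is started. The following sketch assumes the same illustrative layout as elsewhere in these examples, a list of classes holding (weight, profit) pairs; the function name is hypothetical.

```python
def satisfies_assumption_2(classes, c):
    """Check (2): the lightest choice fits, the heaviest choice does not."""
    min_sum = sum(min(w for w, _ in cls) for cls in classes)
    max_sum = sum(max(w for w, _ in cls) for cls in classes)
    return min_sum <= c < max_sum
```

If the left inequality fails the instance is infeasible, and if the right one fails the heaviest items can simply be compared by profit without any knapsack reasoning on the capacity.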

If the integrality constraint $x_{ij} \in \{0, 1\}$ in (1) is relaxed to $0 \le x_{ij} \le 1$, we obtain the Linear Multiple-Choice Knapsack Problem (LMCKP). If each class has two items, where $(p_{i1}, w_{i1}) = (0, 0)$, $i = 1, \ldots, k$, the problem (1) corresponds to the 0-1 Knapsack Problem (KP). The linear version of KP will be denoted by LKP.

MCKP is NP-hard as it contains KP as a special case, but it can be solved in pseudo-polynomial time through dynamic programming (Dudzinski and Walukiewicz 1987). The problem has a large range of applications: capital budgeting (Nauss 1978), menu planning (Sinha and Zoltners 1979), transforming nonlinear KP to MCKP (Nauss 1978), determining which components should be linked in series in order to maximize fault tolerance (Sinha and Zoltners 1979), and accelerating ordinary LP/GUB problems by the dual simplex algorithm (Witzgal 1977). Moreover MCKP appears in the Lagrangian relaxation of several integer programming problems (Fisher 1981).

Several algorithms for MCKP have been presented during the last two decades, e.g. Nauss (1978), Sinha and Zoltners (1979), Dudzinski and Walukiewicz (1984), and Dyer, Kayal and Walker (1984). Most of these algorithms start by solving LMCKP in order to obtain an upper bound for the problem. LMCKP is solved in two steps: 1) The LP-dominated items are reduced by sorting the items in each class according to nondecreasing weights, and then applying some dominance criteria to delete unpromising items. 2) The reduced LMCKP is solved by a greedy algorithm.

After these two initial steps, upper bound tests may be used to fix several variables in

each class to their optimal value. The reduced MCKP problem is then solved to optimality

through enumeration (Dudzinski and Walukiewicz 1987).

The development in KP however indicates that MCKP may be solved more easily: Balas and Zemel (1980) proposed for KP to consider only a small subset of the items, the so-called core, where there was a large probability of finding an optimal solution. Such a core can be found in O(n) time through a partitioning algorithm, and since the restricted KP defined on the core items is easy to solve for several classes of data instances, many instances may be solved in linear time (Martello and Toth 1988, Pisinger 1994a).

Although O(n) algorithms for LMCKP have been known for a decade (Zemel 1984, Dyer 1984), making it possible to derive a core reasonably easily, a similar technique has not been used for MCKP. It has been claimed (Martello and Toth 1990) that the reduction of LP-dominated items was in any case necessary in order to derive upper bounds in a branch-and-bound algorithm. The current paper refutes this conjecture, but several questions had to be treated:

- Which items or classes should be included in the core?

- How should we derive upper bounds in a branch-and-bound algorithm when LP-dominated items have not been deleted?

- How should a core be derived? Zemel (1984) and Dyer (1984) only give algorithms for solving LMCKP, so some modification is necessary.

- The partitioning algorithms by Zemel and Dyer operate on the dual to LMCKP, making them complicated to implement for practical purposes. Some simplifications, like those by Martello and Toth (1988) or Pisinger (1994a) for KP, seem necessary.

The present paper is a counterpart to a minimal algorithm for KP by Pisinger (1993): A simple algorithm is used for solving LMCKP, and for deriving an initial feasible solution to MCKP. Starting from this initial solution we use dynamic programming to solve MCKP, adding new classes to the core as needed. By this technique we are able to show that a minimal number of classes are considered in order to solve MCKP to optimality.

This paper is organized in the following way: First, Section 2 gives some basic definitions and shows fundamental properties of MCKP, while Section 3 presents a simple partitioning algorithm for the solution of LMCKP. Next, Section 4 shows how gradients may be used in an expanding core, and presents some logical tests which may be used to fix some variables at their optimal value before a class is added to the core. Section 5 gives a description of the dynamic programming algorithm, and Section 6 shows how we keep track of the solution vector in dynamic programming. Finally Section 7 gives the main algorithm and proves its minimality, and Section 8 reports computational experiments.

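2 Fundamental properties

Definition 1 If two items $r$ and $s$ in the same class $N_i$ satisfy
\[
w_{ir} \le w_{is} \quad\text{and}\quad p_{ir} \ge p_{is},
\tag{3}
\]
then we say that item $r$ dominates item $s$.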

Proposition 1 Given two items $r, s \in N_i$. If item $r$ dominates item $s$ then an optimal solution to MCKP with $x_{is} = 0$ exists.

Proof Let $x^*$ be an optimal solution to (1) with $x^*_{is} = 1$. Then a solution $x'$ equal to $x^*$ except that $x'_{ir} = 1$, $x'_{is} = 0$ will be feasible, and it will have an objective value at least as good as $x^*$ due to (3).
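Proposition 1 translates directly into a filtering pass over a class. The sketch below is one possible way of applying criterion (3), assuming items are (weight, profit) pairs; the function name is illustrative. After sorting by increasing weight (and decreasing profit among equal weights), an item survives only if its profit strictly exceeds that of the last surviving item.

```python
def remove_dominated(items):
    """Keep only items not dominated in the sense of (3).

    items: list of (weight, profit) pairs (assumed layout).
    Returns the surviving items ordered by increasing weight and profit.
    """
    kept = []
    for w, p in sorted(items, key=lambda it: (it[0], -it[1])):
        # Every kept item is no heavier than (w, p), so (w, p) is dominated
        # exactly when its profit does not exceed the last kept profit.
        if not kept or p > kept[-1][1]:
            kept.append((w, p))
    return kept
```

Identical items are collapsed to a single representative by this rule.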

[Figure 1: items r, s and t of a class N_i, with s LP-dominated by r and t.]

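Definition 2 If items $r, s, t \in N_i$, with
\[
w_{ir} \le w_{is} \le w_{it} \quad\text{and}\quad p_{ir} \le p_{is} \le p_{it},
\tag{4}
\]
and the projection of the vector $(w_{is} - w_{ir},\, p_{is} - p_{ir})$ on the normal $(-(p_{it} - p_{ir}),\, w_{it} - w_{ir})$ to $(w_{it} - w_{ir},\, p_{it} - p_{ir})$ is not positive, i.e. if
\[
\det(w_{is} - w_{ir},\, p_{is} - p_{ir},\, w_{it} - w_{ir},\, p_{it} - p_{ir}) = (w_{is} - w_{ir})(p_{it} - p_{ir}) - (p_{is} - p_{ir})(w_{it} - w_{ir}) \ge 0,
\tag{5}
\]
then we say that item $s$ is LP-dominated by items $r$ and $t$. See Figure 1.

Proposition 2 Given three items $r, s, t \in N_i$. If item $s$ is LP-dominated by items $r$ and $t$, then an optimal solution to LMCKP with $x_{is} = 0$ exists.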

Proof See Sinha and Zoltners (1979).
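The LP-dominance test of Definition 2 needs only integer arithmetic. The sketch below assumes the sign convention that item $s$ is LP-dominated when it lies on or below the line segment from $r$ to $t$ in the (weight, profit) plane, and that $\det(a, b, c, d) = ad - bc$; both function names and the tuple layout are illustrative.

```python
def det(a, b, c, d):
    """2x2 determinant of the rows (a, b) and (c, d)."""
    return a * d - b * c

def is_lp_dominated(r, s, t):
    """True if item s lies on or below the segment from r to t.

    r, s, t: (weight, profit) pairs of three distinct items in one class.
    """
    (wr, pr), (ws, ps), (wt, pt) = r, s, t
    if not (wr <= ws <= wt and pr <= ps <= pt):
        return False  # ordering (4) does not hold
    return det(ws - wr, ps - pr, wt - wr, pt - pr) >= 0
```

Working with the determinant rather than with slopes avoids divisions and therefore any rounding issues.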

As a consequence, we only have to consider the LP-undominated items $R_i$ in the solution of LMCKP. Note that these items form the upper convex boundary of the set $N_i$, as illustrated in Figure 2. The set of LP-undominated items may be found by ordering the items in each class $N_i$ according to increasing weights, and successively testing the items according to criteria (3) and (5). If two items have the same weight and profit, choose one of them arbitrarily. Now LMCKP may be solved by using the greedy algorithm:

Algorithm 1 Greedy.

1 Find the LP-undominated classes $R_i$ (ordered by increasing weights) for all classes $N_i$, $i = 1, \ldots, k$.

2 Choose the lightest item from each class (i.e. set $x_{i1} = 1$, $x_{ij} = 0$ for $j = 2, \ldots, |R_i|$, $i = 1, \ldots, k$) and define the chosen weight and profit sum as $W = \sum_{i=1}^{k} w_{i1}$, resp. $P = \sum_{i=1}^{k} p_{i1}$.

3 For all items $j \ne 1$ define the slope $\lambda_{ij}$ as
\[
\lambda_{ij} = \frac{p_{ij} - p_{i,j-1}}{w_{ij} - w_{i,j-1}}, \quad i = 1, \ldots, k, \; j = 2, \ldots, |R_i|.
\tag{6}
\]
This slope is a measure of the profit-to-weight ratio obtained by choosing item $j$ instead of item $j - 1$ in class $R_i$ (Zemel 1980). Clearly a greedy algorithm should choose the most profitable changes first, therefore order the slopes $\{\lambda_{ij}\}$ in nonincreasing order.

4 Take the next slope $\lambda_{ij}$ from $\{\lambda_{ij}\}$. If $W + w_{ij} - w_{i,j-1} > c$ goto step 5. Otherwise set $x_{ij} = 1$, $x_{i,j-1} = 0$ and update the sums $W = W + w_{ij} - w_{i,j-1}$, $P = P + p_{ij} - p_{i,j-1}$. Repeat step 4.

5 If $W = c$ we have an integer solution and the optimal objective value to LMCKP (and MCKP) is $z = P$. Otherwise let $\lambda_{ij}$ be the next slope in the list. We have two fractional variables $x_{ij} = \frac{c - W}{w_{ij} - w_{i,j-1}}$ respectively $x_{i,j-1} = 1 - x_{ij}$, which both belong to the same class. The optimal objective value is
\[
z = P + (c - W)\lambda_{ij}.
\tag{7}
\]
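The five steps of Algorithm 1 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the author's implementation: items are (weight, profit) pairs, exact rational slopes are kept with `fractions.Fraction`, and the helper names `lp_undominated` and `greedy_lmckp` are hypothetical. The returned `choice` indexes into the reduced classes $R_i$, not into the original classes.

```python
from fractions import Fraction

def lp_undominated(items):
    """Upper convex boundary R_i of a class, ordered by increasing weight."""
    hull = []
    for w, p in sorted(items, key=lambda it: (it[0], -it[1])):
        if hull and p <= hull[-1][1]:
            continue  # dominated in the sense of (3)
        while len(hull) >= 2:
            (w1, p1), (w2, p2) = hull[-2], hull[-1]
            # hull[-1] lies on or below the segment hull[-2] -> (w, p)
            if (w2 - w1) * (p - p1) - (p2 - p1) * (w - w1) >= 0:
                hull.pop()
            else:
                break
        hull.append((w, p))
    return hull

def greedy_lmckp(classes, c):
    """Return (z, W, P, choice, fractional) for the linear relaxation."""
    hulls = [lp_undominated(cls) for cls in classes]          # step 1
    choice = [0] * len(hulls)                                 # step 2
    W = sum(h[0][0] for h in hulls)
    P = sum(h[0][1] for h in hulls)
    if W > c:
        raise ValueError("even the lightest items exceed the capacity")
    steps = []                                                # step 3
    for i, h in enumerate(hulls):
        for j in range(1, len(h)):
            dw, dp = h[j][0] - h[j - 1][0], h[j][1] - h[j - 1][1]
            steps.append((Fraction(dp, dw), i, j, dw, dp))
    steps.sort(key=lambda s: (-s[0], s[1], s[2]))             # largest slope first
    for slope, i, j, dw, dp in steps:                         # step 4
        if W == c:
            break
        if W + dw > c:                                        # step 5
            return P + (c - W) * slope, W, P, choice, (i, j)
        W, P, choice[i] = W + dw, P + dp, j
    return Fraction(P), W, P, choice, None
```

When `fractional` is `None` the break solution is integral; otherwise it names the class and the heavier of the two fractional items.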

Although several orderings of $\{\lambda_{ij}\}$ exist in step 3 when more items have the same slope, we will assume in the following definitions that one specific ordering has been chosen. The LP-optimal choices $b_i$ obtained by Algorithm 1 are those variables where $x_{ib_i} = 1$. The class containing two fractional variables in step 5 will be denoted the fractional class $N_a$, and the fractional variables are $x_{ab_a}$, $x_{ab'_a}$ (possibly with $x_{ab'_a} = 0$). An initial feasible solution to MCKP may be constructed by choosing the LP-optimal variables, i.e. setting $x_{ib_i} = 1$ for $i = 1, \ldots, k$ and $x_{ij} = 0$ for $i = 1, \ldots, k$, $j \ne b_i$. The solution will be denoted the break solution, and the corresponding weight and profit sums are $W$ resp. $P$.

Proposition 3 An optimal solution $x^*$ to LMCKP satisfies the following: 1) $x^*$ has at most two fractional variables $x_{ab_a}$ and $x_{ab'_a}$. 2) If $x^*$ has two fractional variables they must be adjacent variables within the same class $N_a$. 3) If $x^*$ has no fractional variables, then the break solution is an optimal solution to MCKP.

Figure 3: Gradients $\lambda_i^+$, $\lambda_i^-$ in class $N_i$.

The presented greedy algorithm demands $O(\sum_{i=1}^{k} n_i \log n_i)$ time for the sorting and determination of LP-undominated classes, which gives the complexity $O(n \log \max_{i=1,\ldots,k} n_i)$. Next, the ordering of slopes is done in $O(n \log n)$ time, with $n = \sum_{i=1}^{k} |R_i|$. Thus the overall complexity is $O(n \log n)$. It should be mentioned that when the classes form a KP, Algorithm 1 is exactly the greedy algorithm for LKP, and the objective value (7) corresponds to the Dantzig upper bound for KP (Dantzig 1957).

An optimal solution to MCKP generally corresponds to the break solution, except for a few classes where items other than the LP-optimal choices have been chosen. This property may be illustrated in the following way: Define the positive and negative gradient $\lambda_i^+$ and $\lambda_i^-$ for each class $N_i$, $i \ne a$ as (see Figure 3)
\[
\lambda_i^+ = \max_{j \in N_i,\; w_{ij} > w_{ib_i}} \frac{p_{ij} - p_{ib_i}}{w_{ij} - w_{ib_i}}, \quad i = 1, \ldots, k, \; i \ne a,
\tag{8}
\]
\[
\lambda_i^- = \min_{j \in N_i,\; w_{ij} < w_{ib_i}} \frac{p_{ib_i} - p_{ij}}{w_{ib_i} - w_{ij}}, \quad i = 1, \ldots, k, \; i \ne a,
\tag{9}
\]
and we set $\lambda_i^+ = 0$ (resp. $\lambda_i^- = \infty$) if the set we are maximizing (resp. minimizing) over is empty. The gradients are a measure of the expected gain (resp. loss) per weight unit by choosing a heavier (resp. lighter) item from $N_i$ instead of the LP-optimal choice $b_i$. The gradient of the fractional class $N_a$ is defined as
\[
\lambda = \frac{p_{ab'_a} - p_{ab_a}}{w_{ab'_a} - w_{ab_a}}.
\tag{10}
\]
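The gradients (8) and (9) of a single class can be computed in one pass over its items. The sketch below assumes items are (weight, profit) pairs and that `b` is the index of the LP-optimal choice; the function name is illustrative. Exact fractions are used so that gradients from different classes can be compared without rounding.

```python
import math
from fractions import Fraction

def gradients(items, b):
    """Return (lambda_plus, lambda_minus) of a class relative to item b.

    lambda_plus is 0 and lambda_minus is math.inf when the corresponding
    set of heavier (resp. lighter) items is empty, as in the text.
    """
    wb, pb = items[b]
    heavier = [Fraction(p - pb, w - wb) for w, p in items if w > wb]
    lighter = [Fraction(pb - p, wb - w) for w, p in items if w < wb]
    lam_plus = max(heavier) if heavier else Fraction(0)
    lam_minus = min(lighter) if lighter else math.inf
    return lam_plus, lam_minus
```

Note that no sorting or LP-dominance reduction of the class is required, only the index of the LP-optimal item.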

In Figure 4 the classes are ordered according to their gradients $\lambda_i^+$, and we show how often the IP-optimal solution to MCKP differs from the LP-optimal choice in each class $N_i$. The figure is the result of 5000 randomly generated data instances ($k = 100$, $n_i = 10$), where we have measured how often the IP-optimal choice $j$ (satisfying $w_{ij} > w_{ib_i}$ since we are considering forward gradients) differs from the LP-optimal choice $b_i$ in each class $N_i$. It is seen that when $\lambda_i^+$ is decreasing, so is the probability that $b_i$ is not the IP-optimal choice. Figure 5 shows the corresponding frequencies for the gradient $\lambda_i^-$.

Figure 4: Frequency of classes $N_i$ where the IP-optimal choice differs from the LP-optimal choice, compared to gradient $\lambda_i^+$.

This observation motivates considering only a small number of the classes $N_i$, namely those classes where $\lambda_i^+$ or $\lambda_i^-$ are sufficiently close to $\lambda$. Thus at any stage the core is simply a set of classes $\{N_{r_1}, \ldots, N_{r_m}\}$ where $r_1, \ldots, r_m \in \{1, \ldots, k\}$. Initially the core consists of the break set $N_a$ and we expand the core as needed, alternately including the class with the largest $\lambda_i^+$ or the smallest $\lambda_i^-$.

Since a complete enumeration of the core demands considering up to $n_{r_1} \cdot n_{r_2} \cdots n_{r_m}$ states, care should be taken before including a new class in the core. We use a simple upper bound test to fix as many variables at their optimal value as possible in the class before it is included in the core. If only one item remains, the class may be fathomed. Otherwise we order the remaining variables by nondecreasing weight and use test (3) to delete dominated items. The remaining class is added to the core and the new choices are enumerated through dynamic programming.

Figure 5: Frequency of classes $N_i$ where the IP-optimal choice differs from the LP-optimal choice, compared to gradient $\lambda_i^-$.

Dyer (1984) and Zemel (1984) independently developed O(n) algorithms for LMCKP. Both algorithms are based on the convexity of the LP-dual problem to (1), which makes it possible to pair the dual line segments, so that at each iteration at least 1/6 of the line segments are deleted. When the classes form a KP the algorithms reduce to that of Balas and Zemel (1980) for LKP. As Martello and Toth (1988) modified the Balas and Zemel algorithm for LKP to a primal approach which is easier to implement, we will now modify the Dyer and Zemel algorithms for LMCKP in a similar way.

Assume that $N_a$ is the fractional class and that items $b_a$ and $b'_a$ are the fractional variables in $N_a$, such that $x_{ab_a} + x_{ab'_a} = 1$, possibly with $x_{ab'_a} = 0$. Moreover let $b_i$ be the LP-optimal choice in class $N_i$, $i = 1, \ldots, k$, $i \ne a$. Due to the properties of LMCKP given in Proposition 3, LMCKP may be reformulated as finding the slope
\[
\alpha = \frac{\Delta p}{\Delta w} = \frac{p_{ab'_a} - p_{ab_a}}{w_{ab'_a} - w_{ab_a}},
\tag{11}
\]
such that the weight sums satisfy
\[
\sum_{i \ne a} w_{ib_i} + w_{ab_a} \;\le\; c \;<\; \sum_{i \ne a} w_{ib_i} + w_{ab'_a},
\tag{12}
\]
and
\[
\det(w_{ij}, p_{ij}, \Delta w, \Delta p) \ge \det(w_{ib_i}, p_{ib_i}, \Delta w, \Delta p), \quad i = 1, \ldots, k, \; j = 1, \ldots, n_i.
\tag{13}
\]
Here (12) ensures that $N_a$ is the fractional class, and (13) ensures that each item $b_i \in N_i$ is at the upper convex boundary of the set.

The formulation (11)-(13) allows us to use a partitioning algorithm for finding the optimal slope $\alpha$. In the following algorithm we assume that the classes of items $N_i$ are represented as a list $[N_1, \ldots, N_k]$ and that the items in each class are also represented as a list $[j_1, \ldots, j_{n_i}]$. Elements may be deleted from a list by swapping the deleted element to the end of the list, and subsequently decreasing the list's length. Thus at any step $k$ and $n_i$ refer to the current number of elements in the list. The partitioning algorithm looks like this:

Algorithm 2 Partition.

0 Preprocess. For all classes $i = 1, \ldots, k$ let $\alpha_i$ and $\beta_i$ be indices to the items having minimal weight (resp. maximal profit) in $N_i$ (see Figure 6). In case of several items satisfying the criterion, choose the item having largest profit for $\alpha_i$ and smallest weight for $\beta_i$.

If $\sum_{i=1}^{k} w_{i\alpha_i} > c$ no solution exists, so stop. If $\sum_{i=1}^{k} w_{i\beta_i} \le c$ we have a trivial solution consisting of the items $\beta_i$, so stop.

Set $W = P = 0$, and remove those items $j \ne \beta_i$ which have $w_{ij} \ge w_{i\beta_i}$ and $p_{ij} \le p_{i\beta_i}$, since these are dominated by item $\beta_i$. If the class $N_i$ has only one item left, save the LP-optimal choice $b_i = \alpha_i$ and set $W = W + w_{ib_i}$, $P = P + p_{ib_i}$, then delete class $N_i$.

[Figure 6: items $\alpha_i$ and $\beta_i$ in class $N_i$.]

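1 Choose median. For $M$ randomly chosen classes $N_i$ define the corresponding slope $\lambda_i = \frac{\Delta p_i}{\Delta w_i} = \frac{p_{i\beta_i} - p_{i\alpha_i}}{w_{i\beta_i} - w_{i\alpha_i}}$. Using an O(n) median algorithm (Aho et al. 1974) let $\alpha = \frac{\Delta p}{\Delta w}$ be the median of these $M$ slopes.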

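2 Find the conclusion. For each class $N_i$ find the items which maximize the projection on the normal $(-\Delta p, \Delta w)$ to $(\Delta w, \Delta p)$, i.e. which maximize $p_{ij}\Delta w - w_{ij}\Delta p$, the negative of the determinant
\[
\det(w_{ij}, p_{ij}, \Delta w, \Delta p) = w_{ij}\,\Delta p - p_{ij}\,\Delta w.
\tag{14}
\]
See Figure 7. We swap these items to the beginning of the list such that they have indices $\{1, \ldots, \ell_i\}$ in class $N_i$.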

3 Determine weight sum of conclusion. Let $g_i$, $h_i$ be the lightest (resp. heaviest) item among $\{1, \ldots, \ell_i\}$ in class $N_i$, and let $W'$ and $W''$ be the corresponding weight sums. Thus $W' = W + \sum_{i=1}^{k} w_{ig_i}$ and $W'' = W + \sum_{i=1}^{k} w_{ih_i}$.

4 Check for optimal partitioning. If $W' \le c \le W''$ the partitioning at $\alpha$ is optimal. First, choose the lightest items from each class by setting $b_i = g_i$, $W = W + w_{ib_i}$, $P = P + p_{ib_i}$. Then while $W - w_{ig_i} + w_{ih_i} \le c$ run through the classes where $\ell_i \ne 1$ and choose the heaviest item by setting $b_i = h_i$, $W = W - w_{ig_i} + w_{ib_i}$, $P = P - p_{ig_i} + p_{ib_i}$. The first class where $W - w_{ig_i} + w_{ih_i} > c$ is the fractional class $N_a$, and an optimal objective value to LMCKP is $z_{LMCKP} = P + (c - W)\alpha$. If no fractional class is defined, the LP-solution is also the optimal IP-solution. Stop.

Figure 7: Conclusion of $N_i$.

5 Partition. We have one of the following two cases: 1) If $W' > c$ then the slope $\alpha$ was too small (see Figure 8). For each class $N_i$ choose $\beta_i$ as the lightest item in $\{1, \ldots, \ell_i\}$ and delete items $j \ne \beta_i$ with $w_{ij} \ge w_{i\beta_i}$. 2) If $W'' < c$ then the slope $\alpha = \frac{\Delta p}{\Delta w}$ was too large. For each class $N_i$ choose $\alpha_i$ as the heaviest item in $\{1, \ldots, \ell_i\}$ and delete items $j \ne \alpha_i$ with $p_{ij} \le p_{i\alpha_i}$ (items $j$ with $w_{ij} \le w_{i\alpha_i}$ are too light, and items with $w_{ij} > w_{i\alpha_i}$, $p_{ij} \le p_{i\alpha_i}$ are dominated). If the class $N_i$ has only one item left, save the LP-optimal choice $b_i = \alpha_i$ and set $W = W + w_{ib_i}$, $P = P + p_{ib_i}$, then delete class $N_i$. Goto Step 1.
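The core of each iteration, steps 2 to 5, is the classification of a trial slope. The sketch below covers only that decision, not the deletion of items or the median selection. It assumes items are (weight, profit) pairs, that the slope is given as an integer pair `(dw, dp)` with `dw > 0`, and that the tangent items of a class are those maximizing $p\,\Delta w - w\,\Delta p$; all names are illustrative.

```python
def tangent_items(items, dw, dp):
    """Indices of the items maximizing p*dw - w*dp (step 2)."""
    best = max(p * dw - w * dp for w, p in items)
    return [j for j, (w, p) in enumerate(items) if p * dw - w * dp == best]

def classify_slope(classes, c, dw, dp):
    """Decide how the trial slope dp/dw relates to the optimal slope.

    Returns "too small", "too large" or "optimal" following steps 3 to 5.
    """
    w_light = w_heavy = 0
    for items in classes:
        weights = [items[j][0] for j in tangent_items(items, dw, dp)]
        w_light += min(weights)   # contribution to W'
        w_heavy += max(weights)   # contribution to W''
    if w_light > c:
        return "too small"
    if w_heavy < c:
        return "too large"
    return "optimal"
```

Because only products of integers are compared, the test is exact; a full implementation would additionally delete the items ruled out in step 5 so that the work shrinks from one iteration to the next.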

The above algorithm may be further improved by introducing LP-dominance reductions: Each time $\alpha_i$ or $\beta_i$ is changed in a class $N_i$, we use criterion (5) to test whether any items are LP-dominated by $\alpha_i$ and $\beta_i$. In this way each class will only consist of the items close to the upper convex boundary. Computational experiments do however indicate that Algorithm 2 does not become considerably faster by the addition of LP-dominance criteria.

Depending on the choice of $M$ in step 1, we obtain different behavior of the algorithm. The best performance is obtained by choosing $\alpha$ as the median of all slopes $\lambda_i$, $i = 1, \ldots, k$ (i.e. choosing $M = k$), but for practical purposes $M = 15$ works well. Note that in the KP case, Algorithm 2 becomes the partitioning algorithm of Balas and Zemel (1980).

Proposition 4 If we choose $\alpha = \frac{\Delta p}{\Delta w}$ as the exact median of $M$ different slopes $\lambda_i = \frac{\Delta p_i}{\Delta w_i}$ in step 1 of Algorithm 2, at least $M/2$ items are deleted at each iteration.

Proof Since $\alpha$ is the median of the $M$ slopes, we have $\lambda_i \le \alpha$ for $M/2$ classes, so for these classes at least one item $j \ne \beta_i$ exists which attains the maximum in (14). Similarly we have $\lambda_i \ge \alpha$ for $M/2$ classes, so for these classes at least one item $j \ne \alpha_i$ exists which attains the maximum in (14). If $W' > c$ in step 5, at least $M/2$ items $\{\beta_i\}$ will be deleted. Otherwise if $W'' < c$, at least $M/2$ items $\{\alpha_i\}$ will be deleted.

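For a fixed $M$ this only guarantees a constant number of deletions per iteration, so up to O(n) iterations of O(n) work each may be needed, yielding a complexity of $O(n^2)$.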

Proposition 5 If $M = k$ and the size of each class $n_i$ is bounded by a constant $C$, Algorithm 2 runs in O(n).

Proof Due to Proposition 4 at least $k/2$ items are deleted at each iteration. Since $n_i$ is bounded by $C$ it means that at least $\frac{1}{2C} n$ items are deleted at each iteration. The work per iteration is linear in the number of remaining items, which thus decreases geometrically, yielding the complexity.

4 Expanding core

Balas and Zemel (1980) proposed for KP to consider only a small number of the items, the so-called core, where there was a large probability of finding an optimal solution. However the core cannot be identified a priori, implying that in some cases optimality of the core solution cannot be proved.

Pisinger (1994a) noted that even though the core cannot be identified before KP is solved, it can be identified while the problem is solved by using an expanding core. Moreover Pisinger developed an algorithm which always uses a minimal core (Pisinger 1993).

We will use the same concept for MCKP, but now the core consists of the smallest possible number of classes $N_i$, such that an optimal solution may be determined and proved. Where the core for KP naturally consists of items having profit-to-weight ratio close to that of the break item, there is no natural way of ordering the classes in MCKP. Instead we use the gradients to identify a core: Define the positive and negative gradient $\lambda_i^+$ and $\lambda_i^-$ for each class $N_i$, $i \ne a$ by (8) and (9). Due to (13) we have that
\[
\lambda_i^+ \le \frac{\Delta p}{\Delta w} \le \lambda_i^-.
\tag{15}
\]
We order the set $L^+ = \{\lambda_i^+\}$ according to nonincreasing values, and $L^- = \{\lambda_i^-\}$ according to nondecreasing values. Initially the core $C$ only consists of the fractional class $N_a$, and then we repeatedly add classes $N_i$ corresponding to the next gradient from the ordered sets $L^+$ and $L^-$. Since each class occurs twice (once in each set $L^+$ and $L^-$), we always check whether the class has already been added to the core.

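4.1 Class reduction

Before a class $N_i$ is added to the core, we try to exclude as many items as possible from the class. We check whether each item $j \in N_i$ has an upper bound larger than the currently best solution $z$. For this purpose we use the weak upper bound, obtained by relaxing the constraint in (1) on the fractional variables $b_a, b'_a \in N_a$ from $x_{ab_a}, x_{ab'_a} \in \{0, 1\}$ to $x_{ab_a}, x_{ab'_a} \in \mathbb{R}$. The upper bound on item $j \in N_i$ is then
\[
u_{ij} = P - p_{ib_i} + p_{ij} + (c - W + w_{ib_i} - w_{ij})\,\lambda,
\tag{16}
\]
where $\lambda$ is the gradient (10) of the fractional class, and if $u_{ij} < z + 1$ we may fix $x_{ij}$ to 0. Since the bound (16) is evaluated in constant time, the complexity of reducing class $N_i$ is $O(n_i)$.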
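The reduction by bound (16) amounts to one linear scan of the class. The sketch below assumes items are (weight, profit) pairs, that `P` and `W` are the profit and weight sums of the break solution, that `lam` is the gradient of the fractional class, and that `z` is the incumbent objective value; the function name and argument order are illustrative.

```python
from fractions import Fraction

def reduce_class(items, b, P, W, c, lam, z):
    """Return the indices of the items that survive the bound test (16).

    items: (weight, profit) pairs of the class; b: index of its LP-optimal item.
    An item j is fixed at zero when its upper bound u_ij is below z + 1.
    """
    wb, pb = items[b]
    kept = []
    for j, (w, p) in enumerate(items):
        u = P - pb + p + (c - W + wb - w) * lam
        if u >= z + 1:
            kept.append(j)
    return kept

if __name__ == "__main__":
    # Illustrative numbers only: break solution with P = 6, W = 5, capacity 6.
    print(reduce_class([(1, 0), (2, 3)], 1, 6, 5, 6, Fraction(3, 2), 6))
```

Only the surviving items need to be sorted afterwards, which is where the savings reported for the reduction come from.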

If the reduced set $N_i$ has only one item left, we fathom the class, since no choices have to be made. Otherwise we order the items in $N_i$ according to nondecreasing weights and delete dominated items by applying (3). The computational effort is concentrated on the sorting, yielding a complexity of $O(n_i \log n_i)$ where $n_i$ is the size of $N_i$. In Section 8 it will be demonstrated that a large majority of the items may be fixed at their optimal value by the reduction (16), thus significantly decreasing the number of items which need to be sorted.

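The core is a set of currently included classes $C = \{N_{r_1}, \ldots, N_{r_m}\}$, so the set of partial vectors in $C$ is given by
\[
Y_C = \{ (y_1, \ldots, y_m) \mid y_j \in \{1, \ldots, n_{r_j}\}, \; j = 1, \ldots, m \},
\tag{17}
\]
where each variable $y_j$ determines that variable $x_{r_j y_j} = 1$ while the remaining binary variables in $N_{r_j}$ are set to zero. The weight and profit sum of a vector $y_i = (y_1, \ldots, y_m) \in Y_C$ corresponds to the weight and profit sum of the chosen variables $y_j$ for the classes $N_{r_j} \in C$, and to the LP-optimal choices $b_h$ for the classes $N_h \notin C$. Thus
\[
\mu_i = \sum_{N_{r_j} \in C} w_{r_j y_j} + \sum_{N_h \notin C} w_{h b_h},
\tag{18}
\]
\[
\pi_i = \sum_{N_{r_j} \in C} p_{r_j y_j} + \sum_{N_h \notin C} p_{h b_h}.
\tag{19}
\]
Each vector is represented as a state $(\mu_i, \pi_i, v_i)$, where $\mu_i$ and $\pi_i$ are the weight and profit sums given above, and $v_i$ is a (not necessarily complete) representation of $y_i$. According to the principle of optimality (Ibaraki 1987) we may fathom some of these states: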

Definition 3 Given two states $(\mu_i, \pi_i, v_i)$ and $(\mu_j, \pi_j, v_j)$. The state $i$ is said to dominate the state $j$ if $\mu_i \le \mu_j$ and $\pi_i \ge \pi_j$.

Proposition 6 If a state $i$ dominates another state $j$ we may fathom the dominated state $j$.

Proof Similar to Proposition 1.

We will keep the set $Y_C = \{(\mu_1, \pi_1, v_1), \ldots, (\mu_m, \pi_m, v_m)\}$ ordered according to increasing weight and profit sums ($\mu_i < \mu_{i+1}$ and $\pi_i < \pi_{i+1}$) in order to easily fathom dominated states.

When a new class $N$ is added to the core $C$, we must enumerate the new choices and delete dominated states. A clever way of doing this is by using a divide and conquer algorithm (Pisinger 1994b), which takes advantage of the ordering of $Y_C$ and $N$.

The idea is to divide $N$ recursively in two equally sized parts $N_A$ and $N_B$ until the hereby obtained sets have size 1. A set $N_A$ of size 1 is trivially multiplied with the set $Y$ simply by adding the remaining item $(w, p)$ to each state in $Y$, and the product set $Y_A$ will still be ordered. Finally the sets $Y_A$ and $Y_B$ are merged two by two, and the dominated states are removed.

Algorithm 3
procedure divide(Y, N, var Y');
{ Multiply the sets Y and N. The product is Y' = {(μ'_1, π'_1, v'_1), ..., (μ'_m', π'_m', v'_m')}, and }
{ the multipliers are Y = {(μ_1, π_1, v_1), ..., (μ_m, π_m, v_m)} resp. N = {(w_f, p_f), ..., (w_l, p_l)}. }
if (f = l) then
  for i := 1 to m do (μ'_i, π'_i, v'_i) := (μ_i − w_b + w_f, π_i − p_b + p_f, v_i ∪ {f}); rof;
  m' := m;
else
  d := ⌊(f + l)/2⌋;
  N_A := {(w_f, p_f), ..., (w_d, p_d)};
  N_B := {(w_d+1, p_d+1), ..., (w_l, p_l)};
  divide(Y, N_A, Y_A); divide(Y, N_B, Y_B); conquer(Y_A, Y_B, Y');
fi;

procedure conquer(Y', Y'', var Y);
{ Merge the sets Y' = {(μ'_1, π'_1, v'_1), ..., (μ'_m', π'_m', v'_m')} and }
{ Y'' = {(μ''_1, π''_1, v''_1), ..., (μ''_m'', π''_m'', v''_m'')} into Y = {(μ_1, π_1, v_1), ..., (μ_m, π_m, v_m)}. }
{ Sentinels with infinite weight terminate both input lists. }
(μ'_m'+1, π'_m'+1, v'_m'+1) := (∞, ∞, {}); (μ''_m''+1, π''_m''+1, v''_m''+1) := (∞, ∞, {});
i := 1; j := 1; k := 1;
if (μ'_1 ≤ μ''_1) then (μ_1, π_1, v_1) := (μ'_1, π'_1, v'_1); i := 2; else (μ_1, π_1, v_1) := (μ''_1, π''_1, v''_1); j := 2; fi;
repeat
  if (μ'_i ≤ μ''_j) then { Choose smallest weight to ensure ordering. }
    if (μ'_i, π'_i, v'_i) is not dominated by (μ_k, π_k, v_k) then
      if (μ'_i, π'_i, v'_i) does not dominate (μ_k, π_k, v_k) then k := k + 1; fi;
      (μ_k, π_k, v_k) := (μ'_i, π'_i, v'_i);
    fi; i := i + 1;
  else
    if (μ''_j, π''_j, v''_j) is not dominated by (μ_k, π_k, v_k) then
      if (μ''_j, π''_j, v''_j) does not dominate (μ_k, π_k, v_k) then k := k + 1; fi;
      (μ_k, π_k, v_k) := (μ''_j, π''_j, v''_j);
    fi; j := j + 1;
  fi;
until (i = m' + 1) and (j = m'' + 1);
m := k;
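The divide and conquer scheme can be sketched in a few lines. This is an illustration under stated assumptions, not the author's code: states are tuples `(mu, pi, v)` kept in strictly increasing weight and profit order, `v` is simplified to a tuple of chosen item indices rather than the bounded change list used in the paper, items are (weight, profit) pairs, and `b` is the index of the LP-optimal item whose weight and profit are already included in every state.

```python
def conquer(ya, yb):
    """Merge two weight-ordered state lists and drop dominated states."""
    merged = []
    i = j = 0
    while i < len(ya) or j < len(yb):
        # Take the lighter head state to keep the result ordered by weight.
        if j == len(yb) or (i < len(ya) and ya[i][0] <= yb[j][0]):
            state, i = ya[i], i + 1
        else:
            state, j = yb[j], j + 1
        if merged and state[1] <= merged[-1][1]:
            continue                      # dominated by the last kept state
        if merged and state[0] == merged[-1][0]:
            merged[-1] = state            # same weight, strictly more profit
        else:
            merged.append(state)
    return merged

def divide(states, items, b, first, last):
    """Multiply the states with items[first..last] of a new class."""
    if first == last:
        (w, p), (wb, pb) = items[first], items[b]
        return [(mu - wb + w, pi - pb + p, v + (first,)) for mu, pi, v in states]
    mid = (first + last) // 2
    return conquer(divide(states, items, b, first, mid),
                   divide(states, items, b, mid + 1, last))
```

A call such as `divide(states, items, b, 0, len(items) - 1)` would then extend the core by one class; dominated states are already discarded at the lowest merge levels, which is the effect the text attributes to the structure of the algorithm.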

To extend the core $C$ with a new class $N_i$ simply call divide($Y_C$, $N_i$, $Y_{C \cup \{N_i\}}$) to obtain the new set of states $Y_{C \cup \{N_i\}}$. The algorithm has time complexity $O(m n_i \log_2 n_i)$ where $m$ is the size of $Y_C$ and $n_i$ the size of $N_i$. For most data instances a great majority of the new states are deleted by dominance, such that far fewer than the expected $m n_i$ states are generated. The structure of Algorithm 3 implies that many of the dominated states may be deleted already when the first sets are merged, leading to a considerably faster computation.

5.1 Reduction of states

Although the number of states in $Y_C$ at any time is bounded by the capacity $c$ due to the dominance criterion (3), the enumeration may be considerably improved by adding some upper bound tests in order to delete unpromising states.

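Assume that the core $C$ is obtained by adding classes corresponding to the first $m$ gradients from $L^-$ and $L^+$ and that $N_s$ and $N_t$ are the next classes to be added from each set. Thus the gradients satisfy
\[
\min_{N_i \notin C} \lambda_i^- \ge \lambda_s^-,
\tag{20}
\]
\[
\max_{N_i \notin C} \lambda_i^+ \le \lambda_t^+.
\tag{21}
\]
An upper bound on a state $i$ is then given by
\[
u(i) =
\begin{cases}
\pi_i + (c - \mu_i)\,\lambda_t^+ & \text{if } \mu_i \le c, \\
\pi_i + (c - \mu_i)\,\lambda_s^- & \text{if } \mu_i > c.
\end{cases}
\tag{22}
\]
For convenience we set $\lambda_t^+ = 0$ if the set $L^+$ is empty, and $\lambda_s^- = \infty$ if $L^-$ is empty, ensuring that states which cannot be improved further are fathomed.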
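Bound (22) and the resulting state reduction can be sketched as below. The threshold `z + 1` for fathoming mirrors the test used with (16) and is an assumption here, as are the function names and the state layout `(mu, pi, v)`. An overfilled state with no remaining class in $L^-$ receives the bound minus infinity, so it is always fathomed.

```python
import math
from fractions import Fraction

def state_upper_bound(mu, pi, c, lam_plus_t, lam_minus_s):
    """Evaluate bound (22) for a state with weight sum mu and profit sum pi."""
    if mu <= c:
        return pi + (c - mu) * lam_plus_t
    if lam_minus_s == math.inf:
        return -math.inf      # overfilled and nothing left to make it lighter
    return pi + (c - mu) * lam_minus_s

def reduce_set(states, c, z, lam_plus_t, lam_minus_s):
    """Keep only the states whose bound can still improve on z."""
    return [s for s in states
            if state_upper_bound(s[0], s[1], c, lam_plus_t, lam_minus_s) >= z + 1]
```

Since both gradients are fixed while a set of states is scanned, the reduction costs constant time per state.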

The bound (22) may also be used for deriving a global upper bound on MCKP. Since any optimal solution must follow a branch in $Y_C$, the global upper bound corresponds to the upper bound of the most promising branch in $Y_C$. Therefore a global upper bound on MCKP is given by
\[
u_{MCKP} = \max_{y_i \in Y_C} u(i).
\tag{23}
\]
Since the gradient $\lambda_t^+$ will be decreasing during the solution process, and the gradient $\lambda_s^-$ will be increasing, $u_{MCKP}$ will become tighter and tighter as the core is expanded. For a complete core $C = \{N_1, \ldots, N_k\}$ we get $u_{MCKP} = z^*$ for the optimal solution $z^*$.

After the enumeration, the optimal solution vector $x^*$ may be found by backtracking through the sets of states, implying that all sets of states should be saved during the solution process. In the computational experiments it is demonstrated that the number of states may be half a million in each iteration, and since the number of classes may be large ($k = 10000$) we would need to store billions of states. Pisinger (1993) proposed to save only the last $a$ changes in the solution vector in each state $(\mu, \pi, v)$. If this information is not sufficient for reconstructing the solution vector, we simply solve a new MCKP problem with a reduced number of variables. This is repeated until the solution vector is completely defined. More precisely we do the following:

Whenever an improved solution is found during the construction of $Y_C$, we save the corresponding state $(\mu, \pi, v)$. After termination of the algorithm, the solution vector is reconstructed as far as possible. First, all variables are set to the break solution, meaning that $x_{ib_i} = 1$ for $i = 1, \ldots, k$ and $x_{ij} = 0$ for $i = 1, \ldots, k$, $j \ne b_i$. Then we run through the vector $v$ in the following way:

Algorithm 4
procedure definesolution(μ, π, v);
for h := 1 to a do { v = {v_1, ..., v_a} are the last a changes in the solution vector. }
  i := v_h.i; { i is the class corresponding to v_h. }
  j := v_h.j; { j is the variable corresponding to v_h. }
  μ := μ − w_ij + w_ib_i; π := π − p_ij + p_ib_i; x_ij := 1; x_ib_i := 0;
rof;

If the backtracked weight and profit sums $\mu$, $\pi$ correspond to the weight and profit sums $W$, $P$ of the break solution, we know that the constructed vector is correct. Otherwise we solve a new MCKP, this time with capacity $c = \mu$, lower bound $z = \pi - 1$, and global upper bound $u = \pi$. The process is repeated until the solution vector $x$ is completely defined. This technique has proved very efficient, since generally only a few iterations are needed. With $a = 10$, a maximum of 4 iterations has been observed for large data instances, but generally the optimal solution vector is found after the first iteration.
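The backtracking of Algorithm 4 together with the completeness check can be sketched as follows. The data layout is an assumption made for illustration: `classes` holds (weight, profit) pairs, `b` lists the LP-optimal item of each class, and `v` is a list of `(class, item)` changes stored in the state; the function name is hypothetical.

```python
def define_solution(mu, pi, v, classes, b):
    """Undo the recorded changes of a state and report what remains.

    Returns (x, mu, pi) where x[i] is the chosen item of class i and
    (mu, pi) are the weight and profit sums still unexplained.
    """
    x = list(b)                      # start from the break solution
    for i, j in v:
        w_new, p_new = classes[i][j]
        w_old, p_old = classes[i][b[i]]
        mu += w_old - w_new          # mu := mu - w_ij + w_ib_i
        pi += p_old - p_new          # pi := pi - p_ij + p_ib_i
        x[i] = j
    return x, mu, pi

if __name__ == "__main__":
    classes = [[(1, 1), (3, 4)], [(2, 2), (4, 5)], [(1, 1), (2, 3)]]
    b, W, P = [0, 0, 0], 4, 4
    x, mu, pi = define_solution(7, 9, [(1, 1), (2, 1)], classes, b)
    print(x, (mu, pi) == (W, P))
```

If the returned sums differ from `(W, P)`, some changes were not recorded in `v`, and the residual problem with capacity `mu` described in the text has to be solved.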

7 Main algorithm

Algorithm 5
procedure mcknap;
Solve LMCKP through a partitioning algorithm.
Determine gradients L+ = {λ_i^+} and L− = {λ_i^−} for i = 1, ..., k, i ≠ a.
Partially sort L+ in decreasing order and L− in increasing order.
z := 0; s := 1; t := 1; C := {N_a}; Y_C := reduceclass(N_a);
repeat
  reduceset(Y_C); if (Y_C = ∅) then break; fi;
  N_i := L−_s; s := s + 1; { Choose next class to be included. }
  if (N_i is not used) then
    R_i := reduceclass(N_i);
    if (|R_i| > 1) then add(Y_C, R_i); fi;
  fi;
  N_i := L+_t; t := t + 1; { Choose next class to be included. }
  if (N_i is not used) then
    R_i := reduceclass(N_i);
    if (|R_i| > 1) then add(Y_C, R_i); fi;
  fi;
forever;
Find the solution vector.

The first step of the algorithm is to solve the LMCKP as sketched in Section 3. Hereby we obtain the fractional class $N_a$, the break solution $\{b_i\}$ as well as the corresponding weight and profit sums $W$ and $P$.

The gradients $\lambda_i^+$ and $\lambda_i^-$ are determined and the sets $L^+$ and $L^-$ are ordered. Since we initially do not need a complete ordering, we use a partial ordering as presented in Pisinger (1994a): Using the quicksort algorithm for sorting (Hoare 1962), we always choose the interval containing the largest values (resp. smallest for $L^-$) for further partitioning, while the other interval is pushed onto a stack. In this way we continue until the largest (resp. smallest) values have been determined. Later in Algorithm 5, if more values are needed, we simply pop the next interval from the stack as needed and partition it further.
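The lazy ordering can be illustrated with a generator that partitions only as far as the consumer asks. This sketch is a simplification: it uses a three-way partition into new lists instead of Hoare's in-place scheme, and the function name is illustrative. Intervals with smaller values are deferred on a stack, exactly as described above for $L^+$.

```python
from itertools import islice

def lazy_descending(values):
    """Yield the values in nonincreasing order, sorting only on demand."""
    stack = [list(values)]
    while stack:
        interval = stack.pop()
        if len(interval) <= 1:
            yield from interval
            continue
        pivot = interval[len(interval) // 2]
        larger = [v for v in interval if v > pivot]
        equal = [v for v in interval if v == pivot]
        smaller = [v for v in interval if v < pivot]
        if not larger and not smaller:
            yield from equal          # all values identical, nothing to split
            continue
        # Defer the smaller values; refine the largest values first.
        if smaller:
            stack.append(smaller)
        stack.append(equal)
        if larger:
            stack.append(larger)

if __name__ == "__main__":
    gradients = [3, 1, 4, 1, 5, 9, 2, 6]
    print(list(islice(lazy_descending(gradients), 2)))
```

For $L^-$ the same idea applies with the roles of the larger and smaller parts exchanged. If the algorithm terminates after a few classes, most of the gradients are never sorted at all.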

Our initial core is the fractional class $N_a$, which is reduced by procedure reduceclass. Here we apply criterion (16) to fix as many variables as possible at their optimal value. If the reduced class has more than one item left, we sort the items according to increasing weight, and then apply criterion (3) to remove dominated items. Hereby we obtain the reduced class $R_a$ which is the current set of states $Y_C$.

The set of states $Y_C$ is reduced by procedure reduceset, which applies criterion (22) to fathom unpromising states. Moreover the procedure checks whether any feasible state ($\mu \le c$) has improved the lower bound $z$, and updates the current best solution in that case.

Now we alternately include classes from $L^+$ and $L^-$, each time reducing the class to see if it must be added to the core. The reduced class $R_i$ is added to the set of states $Y_C$ by using Algorithm 3, indicated by procedure add above.

The iteration stops when no more states are feasible, meaning that no improvements can occur. Recall that $\lambda_t^+ = 0$ when $L^+$ is empty, and $\lambda_s^- = \infty$ when $L^-$ is empty, meaning that the iteration will in any case stop when all classes have been considered.

7.1

Minimality

We claim Algorithm 5 solves MCKP to optimality with a minimal core and with minimal

effort for sorting and reduction. More precisely we have:

Definition 4 Given a core $C$ and the corresponding set of states $Y_C$, we say that the core problem has been solved to optimality if one (or both) of the following cases occur:

1. $Y_C = \emptyset$, meaning that all states were fathomed due to an upper bound test.


Definition 5 MCKP has been solved with a minimal core if the following invariant holds: A class $N_s$ (resp. $N_t$) is only added to the core $C$ if the corresponding core problem could not be solved to optimality, and the set $N_s$ (resp. $N_t$) has the smallest gradient $\lambda_s^-$ (resp. largest gradient $\lambda_t^+$).

The definition states that a class $N_s$ (resp. $N_t$) should be added to the core only if it cannot be avoided by any upper bound test, and if all classes with smaller (resp. larger) gradients have been considered. The definition ensures that if MCKP has been solved to optimality with a minimal core $C$, no smaller subset core $C' \subset C$ exists. A smaller-sized core $C'$ may still exist if $C' \not\subseteq C$ and $C \not\subseteq C'$, but according to our definition such cores are not comparable. Algorithm 5 finds the minimal core (of several possible) which is symmetric around $N_a$.

Definition 6 The effort used for reduction has been minimal if a class $N_i$ is reduced only when the core $C$ could not be solved to optimality, and $N_i$ is the next class to be included according to the rule in Definition 5.

Definition 7 The sorting effort has been minimal if 1) a class $N_i$ is sorted only when the current core $C$ could not be solved to optimality, 2) $N_i$ is the next class to be included according to Definition 5, and 3) only items which have passed the reduction criterion (22) are sorted.

Proposition 7 The presented algorithm solves MCKP with a minimal core, using minimal reduction and sorting effort.

Proof A consequence of Section 4 (minimal core), and Algorithm 5 (minimal reduction,

sorting).

Computational experiments

The presented algorithm has been implemented in ANSI-C, and a complete listing is

available from the author on request. The following results have been achieved on an HP9000/730 computer.

We will consider how the algorithm behaves for different problem sizes, test instances, and data-ranges. Five types of randomly generated data instances are considered, each instance tested with data-range $R_1 = 1000$ or $R_2 = 10000$ for different numbers of classes $k$ and sizes $n_i$:

Uncorrelated data instances (uc): In each class we generate $n_i$ items by choosing $w_{ij}$ and $p_{ij}$ randomly in $[1, R]$.

Weakly correlated data instances (wc): In each class, $w_{ij}$ is randomly distributed in $[1, R]$ and $p_{ij}$ is randomly distributed in $[w_{ij} - 10, w_{ij} + 10]$, such that $p_{ij} \geq 1$.


Strongly correlated data instances (sc): For KP these instances are generated as $w_j$ randomly distributed in $[1, R]$ and $p_j = w_j + 10$, which are very hard indeed. Such instances are trivial for MCKP, since they degenerate to subset-sum data instances, but hard instances for MCKP may be constructed by cumulating strongly correlated KP-instances: For each class generate $n_i$ items $(w_j, p_j)$ as for KP, and order these by increasing weight. The data instance for MCKP is then $w_{ij} = \sum_{h=1}^{j} w_h$, $p_{ij} = \sum_{h=1}^{j} p_h$, [...] upper convex set.

Subset-sum data instances (ss): $w_{ij}$ randomly distributed in $[1, R]$ and $p_{ij} = w_{ij}$. Such instances are hard since any upper bound will yield $u_{ij} = c$.

Sinha and Zoltners (sz): Sinha and Zoltners (1979) constructed their instances in a special way. For each class construct $n_i$ items as $(w_j, p_j)$ randomly distributed in $[1, R]$. Order the profits and weights in increasing order, and set $w_{ij} = w_j$, $p_{ij} = p_j$, $j = 1, \ldots, n_i$. Note that such data instances have no dominated items.

The constant $M$ in Algorithm 2 is chosen as $M = 15$, and for each data instance the capacity $c$ is

$$c = \frac{1}{2} \sum_{i=1}^{k} \left( \min_{j \in N_i} w_{ij} + \max_{j \in N_i} w_{ij} \right). \tag{24}$$

We construct and solve 100 different data instances for each problem type, size and range.

The presented results are average values or extreme values.
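The five instance types and the capacity (24) can be sketched in Python as follows. This is an illustration, not the generator used for the experiments: the function names are hypothetical, the condition $p_{ij} \geq 1$ for weakly correlated instances is enforced by clamping the lower end of the interval (an assumption), and the capacity is rounded down to an integer (also an assumption).

```python
import random


def generate_class(kind, n, R, rng):
    """Return n items (weight, profit) of one class for the given instance type."""
    if kind == "uc":  # uncorrelated
        return [(rng.randint(1, R), rng.randint(1, R)) for _ in range(n)]
    if kind == "wc":  # weakly correlated; lower end clamped so that p >= 1
        items = []
        for _ in range(n):
            w = rng.randint(1, R)
            items.append((w, rng.randint(max(1, w - 10), w + 10)))
        return items
    if kind == "sc":  # cumulated strongly correlated KP items
        base = sorted((w, w + 10) for w in (rng.randint(1, R) for _ in range(n)))
        items, weight_sum, profit_sum = [], 0, 0
        for w, p in base:
            weight_sum += w
            profit_sum += p
            items.append((weight_sum, profit_sum))
        return items
    if kind == "ss":  # subset-sum
        return [(w, w) for w in (rng.randint(1, R) for _ in range(n))]
    if kind == "sz":  # Sinha and Zoltners: sorted weights paired with sorted profits
        weights = sorted(rng.randint(1, R) for _ in range(n))
        profits = sorted(rng.randint(1, R) for _ in range(n))
        return list(zip(weights, profits))
    raise ValueError("unknown instance type: " + kind)


def capacity(classes):
    """Capacity (24): half the sum of smallest plus largest weight per class."""
    total = sum(min(w for w, _ in cl) + max(w for w, _ in cl) for cl in classes)
    return total // 2  # rounding down is an assumption
```

A full instance with $k$ classes is then obtained as `[generate_class(kind, n, R, rng) for _ in range(k)]` with `rng = random.Random(seed)`.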

First Table I shows the average core size (measured in classes) for solving MCKP to

optimality. For most instances only a few classes need to be considered in the dynamic

programming. The strongly correlated data instances however demand that almost all

classes are considered.

                 UC          WC          SC          SS          SZ
     k   n_i    R1    R2    R1    R2    R1    R2    R1    R2    R1    R2
    10    10     2     2     8     8     8     9     2     4     6     5
   100    10     8     9    11    16    85    84     2     4    17    17
  1000    10    15    20     7    12   791   775     0     2    18    33
 10000    10    10    28     1    10  7563  7800     0     0    11    33
    10   100     2     3     4     5     8     8     1     2     7     8
   100   100     7    10     3     6    84    95     0     1    15    34
  1000   100     6    17     1     4   839   915     0     0    11    41
    10  1000     1     2     2     2     4     8     0     1     6     9
   100  1000     1     6     0     2    25    82     0     0     9    30

Table I: Average core size (measured in classes).

Table II shows how many classes have been tested by criterion (16). It is seen that when many classes are present, only a few percent of the classes are reduced, meaning that we may solve the problem to optimality without even considering a large majority of the classes. The strongly correlated data instances again demonstrate that almost all classes must be considered.

                 UC          WC          SC          SS          SZ
     k   n_i    R1    R2    R1    R2    R1    R2    R1    R2    R1    R2
    10    10    52    55    87    88    85    88    23    37    83    82
   100    10    46    63    14    19    87    86     2     4    68    80
  1000    10    20    52     1     1    82    82     0     0    18    70
 10000    10     0    26     0     0    78    80     0     0     0    19
    10   100    42    60    43    55    81    81    10    19    80    90
   100   100    23    56     3     6    84    95     0     1    23    82
  1000   100     1    29     0     0    84    92     0     0     2    21
    10  1000    10    48    16    20    45    79     0    11    66    97
   100  1000     1    20     0     2    25    82     0     0    11    39

Table II: Percentage of all classes which have been tested by weak upper bound. Average of 100 instances.

The efficiency of the weak upper bound (16) is given in Table III. The entries show what percentage of the tested items are reduced. Generally a large majority of the variables are fixed to their optimal value this way.

To illustrate the hardness of the dynamic programming, we measure the largest set of states $Y_C$ for each data instance in Table IV. It is seen that strongly correlated data instances may result in more than half a million states. Still this is far less than the $c$ states, which is the guaranteed maximum.

Finally, Table V gives the average computational times. Easy data instances are solved in a fraction of a second. Only the strongly correlated instances demand more computational effort, but are still solved within 30 minutes.

The above results indicate that the presented algorithm outperforms any algorithm for MCKP (see Sinha and Zoltners 1979, Armstrong et al. 1983, Dyer, Kayal and Walker 1984), implying that the stated minimal properties cause drastic reductions in the computational times.

                 UC          WC          SC          SS          SZ
     k   n_i    R1    R2    R1    R2    R1    R2    R1    R2    R1    R2
    10    10    83    84    48    27    45    34     0     0    70    73
   100    10    88    88    62    56    51    51     0     0    86    86
  1000    10    89    90    68    49    53    54     0     0    88    89
 10000    10    86    90    80    68    50    52     0     0    72    90
    10   100    98    98    75    61    84    79     0     0    86    85
   100   100    99    99    87    68    85    85     0     0    93    97
  1000   100    98    99    94    86    84    85     0     0    94    98
    10  1000   100   100    87    58    50    94     0     0    89    90
   100  1000   100   100    94    85    50    94     0     0    93    96

Table III: Percentage of tested items which are reduced. Average of 100 instances.


                 UC          WC          SC          SS          SZ
     k   n_i    R1    R2    R1    R2    R1    R2    R1    R2    R1    R2
    10    10     0     0     1    10     3    24     4    47     0     0
   100    10     0     0     4    52     7    68     4    38     0     0
  1000    10     1     0     4    39    20   194     0    28     2     3
 10000    10     1     4     5    46    84   572     0     0     4    12
    10   100     0     0     4    40     1    10     3    28     1     1
   100   100     0     0     4    40     4    26     3    28     3     3
  1000   100     0     1     3    43    10   106     0     0     4     8
    10  1000     0     0     3    35     3     4     0    30     3    10
   100  1000     0     0     3    36    25     9     0    20     4    31

Table IV: Largest set of states in dynamic programming (in thousands). Maximum of 100 instances.

                     UC                WC                SC                SS                SZ
     k   n_i       R1       R2       R1       R2       R1       R2       R1       R2       R1       R2
    10    10     0.00     0.00     0.01     0.05     0.01     0.09     0.01     0.17     0.00     0.00
   100    10     0.00     0.00     0.02     0.28     0.37     5.16     0.01     0.11     0.01     0.01
  1000    10     0.03     0.03     0.03     0.23     7.30    92.46     0.01     0.09     0.04     0.05
 10000    10     0.25     0.31     0.24     0.42   169.94  1628.57     0.17     0.17     0.33     0.41
    10   100     0.00     0.00     0.03     0.58     0.02     0.19     0.06     1.05     0.01     0.02
   100   100     0.02     0.02     0.03     0.55     0.33     6.93     0.01     0.68     0.05     0.07
  1000   100     0.14     0.17     0.16     0.43     9.57   195.75     0.13     0.13     0.24     0.32
    10  1000     0.02     0.03     0.12     2.75     1.64     0.14     0.02    12.55     0.19     0.74
   100  1000     0.12     0.15     0.18     1.11   173.69     2.97     0.13     0.15     0.41     2.66

Table V: Average computational times (seconds).

Conclusions

We have presented a complete algorithm for the exact solution of the Multiple-Choice

Knapsack Problem. To our knowledge, it is the first enumerative algorithm which makes

use of the partitioning algorithms by Dyer (1984) and Zemel (1984). In order to do this,

it has been necessary to derive new upper bounds based on the positive and negative

gradients, as well as choosing a strategy for which classes should be added to the core.

The algorithm satisfies some minimality constraints as defined in Section 7.1: It solves MCKP with a minimal core, since variables are added to the core only if the current core could not be solved to optimality, and the effort used for sorting and reduction is also minimal according to the stated definitions.

Computational experiments document that the presented algorithm is indeed very

efficient. Even very large data instances are solved in a fraction of a second; only strongly

correlated data instances demand more computational effort. It is our hope that the

appearance of this algorithm will promote the use of the MCKP model, since it is far

more flexible than e.g. a simple KP model.


                    R1                         R2
      k  minknap   mcknap  preproc  minknap   mcknap  preproc
    100     0.00     0.00     0.00     0.00     0.00     0.00
   1000     0.00     0.02     0.01     0.00     0.02     0.01
  10000     0.01     0.13     0.12     0.02     0.21     0.12
 100000     0.10     1.37     1.36     0.16     1.53     1.34

Table VI: Total computing time in seconds for solving 0-1 Knapsack Problems. Uncorrelated data instances. Average of 100 instances.

The developed algorithm may equally well be used for solving 0-1 Knapsack Problems, but this will naturally yield some overhead compared to specialized algorithms for the 0-1 Knapsack Problem.

Table VI compares the running times of mcknap with those of minknap (Pisinger 1994c). It is seen that mcknap generally spends about 10 times more computational time on the solution than minknap. However, column preproc shows that most of the overhead is spent on the preprocessing (sorting and removal of dominated items), where minknap is able to use a faster algorithm for these steps, as there are no dominated items in the classes of a 0-1 Knapsack Problem.

In spite of the higher computational times for mcknap, the developed algorithm shows a stable behavior, even in this extreme case.

References

Aho, A.V., J.E. Hopcroft and J.D. Ullman, The design and analysis of Computer

algorithms, Addison-Wesley, Reading, MA, 1974.

Armstrong, R.D., D.S. Kung, P. Sinha and A.A. Zoltners, A Computational

Study of a Multiple-Choice Knapsack Algorithm, ACM Transactions on Mathematical

Software, 2 (1983), 184-198.

Balas, E. and E. Zemel, An Algorithm for Large Zero-One Knapsack Problems,

Operations Research, 28 (1980), 1130-1154.

Bellman, R.E., Dynamic Programming, Princeton University Press, Princeton, N.J.,

(1957).

Dantzig, G.B., Discrete Variable Extremum Problems, Operations Research, 5 (1957),

266-277.

Dudzinski K. and S. Walukiewicz, A fast algorithm for the linear multiple-choice


Dudzinski K. and S. Walukiewicz, Exact methods for the knapsack problem and

its generalizations, European Journal of Operational Research, 28 (1987) 3-21.

Dyer M.E., An O(n) algorithm for the multiple-choice knapsack linear program, Mathematical Programming, 29 (1984) 57-63.

Dyer M.E., N. Kayal and J. Walker, A branch and bound algorithm for solving

the multiple choice knapsack problem, Journal of Computational and Applied Mathematics 11 (1984) 231-249.

Fisher, M.L., The Lagrangian Relaxation Method for Solving Integer Programming

Problems, Management Science, 27 (1981), 1-18.

Hoare, C.A.R., Quicksort, Computer Journal, 5, 1 (1962), 10-15.

Ibaraki, T., Enumerative Approaches to Combinatorial Optimization - Part 2, Annals

of Operations Research, 11 (1987).

Martello, S. and P. Toth, A New Algorithm for the 0-1 Knapsack Problem, Management Science, 34 (1988), 633-644.

Martello, S. and P. Toth, Knapsack Problems: Algorithms and Computer Implementations, Wiley, England, 1990.

Nauss, R.M., The 0-1 knapsack problem with multiple choice constraint, European

Journal of Operational Research, 2 (1978), 125-131.

Pisinger, D., On the solution of 0-1 knapsack problems with minimal preprocessing,

Proceedings NOAS93, Trondheim, Norway, June 11-12. (1993).

Pisinger, D., An expanding-core algorithm for the exact 0-1 Knapsack Problem, To

appear in European Journal of Operational Research (1994a).

Pisinger, D., Solving hard knapsack problems, DIKU, University of Copenhagen,

Denmark, Report 94/24 (1994b).

Pisinger, D., A minimal algorithm for the 0-1 Knapsack Problem, DIKU, University

of Copenhagen, Denmark, Report 94/23 (1994c).

Sinha, A. and A.A. Zoltners, The multiple-choice knapsack problem, Operations

Research 27 (1979) 503-515.


Bureau of Standards (1977).

Zemel, E., The linear multiple choice knapsack problem, Operations Research 28

(1980) 1412-1423.

Zemel, E., An O(n) algorithm for the linear multiple choice knapsack problem and

related problems, Information Processing Letters, 18 (1984) 123-128.
