
Draft as of September 17, 2012

Choose Blocks for Angelic Nondeterministic Macro Processing


Richard J. Cichelli, Robert J. Harwick, William Bader
Software Consulting Services, LLC.

Abstract: Choose is a new block-oriented control structure for programming backtracking searches. We built support for choose into the general-purpose, language-neutral Spice Macro PreProcessor (SMPP). SMPP is a recently developed component of the Software Consulting Services, LLC [1] tool set designed to support generative programming [2]. An example 8-Queens program [3], [4], [5], [6], [7] generator is presented to illustrate both how choose works and the design goals for it and for the SMPP.

Introduction:

"Oh, you can't always get what you want
You can't always get what you want
You can't always get what you want
But if you try sometimes you just might find
You get what you need."
The Rolling Stones
Mick Jagger, Keith Richards

It has been nearly forty years since Floyd, Dijkstra and Wirth explored programming paradigms. They often illustrated their thinking with solutions for the ubiquitous 8-queens problem. Many of the issues they considered then remain unresolved even today. What is the best framework for producing programs? Perhaps the answer lies in tools for managing and manipulating code.

Current generative programming (GP) research and practice approaches writing programs that write programs using deterministic processes. Higher level abstractions are defined and mapped deterministically into lower level ones. Providing tools for nondeterminism [4] in generative programming opens a new GP model, one that allows trial and error via search. It is only through trial and error that software can be composed using self-organizing methods [7]. Automated, heuristically-guided search can complement and sometimes replace human problem solving. With suitable tools it can do so for generative programming [8].

The problems that can be solved by nondeterministic methods are often far more interesting and much more useful than those solved by strictly deterministic methods [9]. At SCS we use search in determining where classified [10] and display ads [11] should go in newspaper editions. We also compute copy fitting and press impositions [12] via search. One example in the GP literature is SQL query optimization. Other GP examples are rare, due, I believe, to how hard it is to conceive and implement nondeterminism using current GP tools [13] [14].

We implemented choose, a block-oriented control structure that specifies depth-first searches within a general purpose macro processor. We overcame all the complex well-known bookkeeping issues involved in doing so. With choose, we have a necessary (but not necessarily sufficient) piece of the nondeterministic generative programming infrastructure. With choose we expect to get not just what we want, but what we need.

Review of research

Macro processors: Historically, macro processors have been used in everyday programming. When certain programming tasks were expected to be coded in low level assembly language, the assemblers' macro processors allowed programmers to raise the level of abstraction of assembly language programming. Some programmers even defined structured programming control structures for the assembly language they worked with. In the early 70's external macro processors were used to add block-oriented control structures to languages like FORTRAN. Ratfor and Flex were implemented using the general purpose m4 macro processor [15]. M4 offers these facilities:

a free-form syntax, rather than line based syntax
a high degree of macro expansion (arguments get expanded during scan and again during interpolation)
text replacement
parameter substitution
file inclusion
string manipulation
conditional evaluation
arithmetic expression
system interface
programmer diagnostics
programming language independence
human language independence
programming language capabilities

Unlike most earlier macro processors, m4 does not target any particular computer or human language; historically, however, it was originally developed to support the Ratfor dialect of Fortran. Unlike some other macro processors, m4 is Turing-complete as well as a practical programming language.

Many high level programming languages have their own macro processor built in. Bliss [16] is well known for its own preprocessor, as is C [17]. However, since the C preprocessor lacks features of other preprocessors, such as recursive macros, selective expansion according to quoting, string evaluation in conditionals, and Turing completeness, it is very limited in comparison to a more modern, true GPP (general purpose preprocessor) such as m4. For instance, the inability to define macros using other macros requires code to be broken into more sections than would be required otherwise.

There is a hierarchy of functionality to these tools. First there is text inclusion and/or substitution, then parameterized text substitution and then conditional text substitution [18].

Macro processors and generative programming: It is more important than ever to build systems that build systems. Take, for example, the task of supporting a web site that supports application level functionality. Application servers prepare dynamic content in this paradigm. The server application interacts with users usually through web browsers. The dynamic content is in a version of HTML, CSS and JavaScript, i.e., code. Today there is no more urgent need for programs that write programs than for web-based applications.

As an aside, one cannot read of recommendations for developing web-enabled applications without reading of the model, view, controller (MVC) paradigm [19]. Yet much application code written with current popular tools such as JSP or ASP typically lacks separation of concerns. The rules of good MVC programming are ignored as PHP, HTML, CSS and JavaScript are mixed poorly.
A general purpose macro processor provides a good tool for keeping together that which belongs together and apart that which should be separate. Macro processors can be flexible enough to handle cross cutting concerns well. Using a macro processor also allows a top-down approach to generative programming, starting with patterns and templates and populating them to build variants. An early such tool is Bassett Frames [20]. Bassett Frames allow the programmer to automate and systematize the copy-paste-edit-integrate way programming is often done manually. The process is top down because the resulting program (or text) is deterministically derived from its specification frame: "A specification frame (SPC) is an entire assembly's topmost, hence most context-sensitive frame." [21].

XVCL [22] from the University of Singapore implements Bassett Frames with XML notation. XVCL helped me fully understand the semantics of Bassett's Frame commands. Our macro processor implements those semantics using a syntax that naturally expands a parameterized macro call into a block that indicates how the macro is to be adapted by inserting or deleting text at designated points within the macro definition.

Programming in general is becoming more and more about building software that writes software. Our business is writing applications for newspapers. Our software composes text and combines it with images to make pages. Modern typography is based on PostScript output. PostScript is itself a programming language. We find it natural to manipulate programs as text and do so in numerous applications. Besides PostScript, we generate text for ad copy and page furniture like folios. Our generators compose entire web sites, code for drawing programs, high level language programs that compile into PostScript, and macro processor input for building proposals, contracts, RFP responses, invoices, etc. We even use it to generate source code for our fully algorithmic application programming language.

Nondeterminism in a macro processor: You might not think of a macro processor as needing to support nondeterminism. None that I know of do so. It seems to be in their nature that they process input deterministically. As filter programs, they take in input, process it and emit output. Recall, however, that our goal is to automate the process of systematically searching through a potentially large space of alternatives in order to find (or construct) one or more correct programs. Since our macro processor is the tool that programmers use for generative programming, it is natural to add nondeterminism to it in order to achieve that goal.

Implementing backtracking in a macro processor presents a unique challenge.
In a procedural programming language, we can implement backtracking using a recursive procedure. Here's a program that solves the eight queens problem, where the goal is to determine all the ways in which eight queens can be placed on a chess board in such a way that no queen attacks another:

procedure queens;
var
  board: array [0..7] of integer;
  col: array [0..7] of Boolean;
  up: array [0..14] of Boolean;
  down: array [-7 .. +7] of Boolean;

  procedure initialize;
  var i: integer;
  begin
    for i := 0 to 7 do col[i] := true;
    for i := 0 to 14 do up[i] := true;
    for i := -7 to +7 do down[i] := true;
  end { initialize };

  procedure generate(r: integer);
  var c: integer;

    procedure printboard;
    var r: integer;
    begin
      for r := 0 to 7 do
        writes(stdout, ' ' // f$string(board[r], 2));
      writeln(stdout);
    end { printboard };

    procedure setsquare( r, c: integer; val: Boolean);
    begin
      col[c] := val;
      up[r+c] := val;
      down[r-c] := val;
    end { setsquare };

  begin { generate }
    for c := 0 to 7 do
      if col[c] and up[r + c] and down[r-c] then
      begin { square free }
        board[r] := c;
        setsquare(r, c, false);
        if r = 7 then { board full }
          printboard
        else
          generate(r + 1);
        setsquare(r, c, true);
      end;
  end { generate };

begin { queens }
  initialize;
  generate(0);
end { queens };

$ $TOOLS/pcleval queens2 | head
  0  4  7  5  2  6  1  3
  0  5  7  2  6  3  1  4
  0  6  3  5  7  1  4  2
  0  6  4  7  1  3  5  2
  1  3  5  7  2  0  6  4
  1  4  6  0  2  7  5  3
  1  4  6  3  0  7  5  2
  1  5  0  6  3  7  2  4
  1  5  7  2  0  3  6  4
  1  6  2  5  7  4  0  3

This version is written in PCL (Programming Command Language), a Pascal-like, domain specific language (DSL) engineered by SCS to be an alternative to operating system platform dependent job control languages. SCS software has run (and continues to run) on many different platforms.

Recursive procedures thus make specifying backtracking searches easy. Macro processors, however, are typically more declarative than procedural. As a result, a program specified using a macro processor avoids procedure calls. The problem is that implementing backtracking without recursive procedures requires bookkeeping that is complex and intellectually challenging, as is easily seen by examining the Fortran-based solutions proposed by Floyd and Cohen [23]. The program needs to maintain stacks not only to hold the states of computations, but also to maintain the state of the input and output streams. What is needed is a construct that allows the programmer to implement backtracking without explicitly writing the required bookkeeping.

References [24], [3], [23] and [25] propose the use of an assignment statement that assigns to a variable the result of a "choice" function, which automatically generates a new value each time it is called. Each of those values constitutes a new choice to be tried. In general, the program tests each value against the current processing state in order to determine how to handle that choice (possibly by doing nothing; that is, by moving immediately to the next choice).

For example, in Helsgaun's implementation of the eight queens problem, the "choice" function is called once for each row, successively returning values 1-8 to indicate the column in that row where a queen might be placed. If the square given by the row/column pair is attacked by a queen that has been placed in a previous row, the program simply calls "choice" again, which immediately returns the number of the next column. When it reaches the last column (column 8), the choice function automatically backtracks to the previous row.
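The choice-function style described above can be imitated in a few lines of Python. The sketch below is not the implementation from any of the cited references: it assumes a deterministic program and simply re-runs it from the start for each new sequence of choices, which avoids saving and restoring program state at the cost of repeated work. The names run_with_choice, Fail and queens are hypothetical.

```python
class Fail(Exception):
    """Raised by a program to reject the current sequence of choices."""


def run_with_choice(program):
    """Yield each result of program(choice), trying choices depth-first.

    Replay-based sketch: `path` records [value, upper_bound] for every
    choice point reached so far, and the program is re-run for each path.
    """
    path = []
    while True:
        depth = 0

        def choice(n):
            # Return the recorded value for this choice point, or start a
            # new choice point at value 1 (values run from 1 to n).
            nonlocal depth
            if depth == len(path):
                path.append([1, n])
            value = path[depth][0]
            depth += 1
            return value

        try:
            yield program(choice)
        except Fail:
            pass
        # Backtrack: discard exhausted choice points, then advance the
        # most recent one that still has an untried value.
        while path and path[-1][0] == path[-1][1]:
            path.pop()
        if not path:
            return
        path[-1][0] += 1


def queens(choice, n=8):
    """Place one queen per row; the column for each row comes from choice()."""
    cols = []
    for row in range(n):
        c = choice(n)
        for r, placed in enumerate(cols):
            if placed == c or abs(placed - c) == row - r:
                raise Fail  # attacked by a queen in an earlier row
        cols.append(c)
    return cols


if __name__ == "__main__":
    for solution in run_with_choice(lambda ch: queens(ch, 4)):
        print(solution)
```

Note that queens contains no backtracking logic of its own; all of the bookkeeping lives in run_with_choice, which is the property the choice-function proposals aim for.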

The use of an assignment statement to generate a choice is counterintuitive in one way. One thinks of assignment statements as modifying the state of the computation. An assignment from a choice function, however, does more than that. In effect, it not only changes the state, but also controls the order in which statements are executed. In traditional structured programming, control over processing is done through block-oriented control structures. For this reason, a "choose" in our macro processor is specified using a block-structured control statement, which more closely aligns with its role as a statement controller. A block structure also allows declarative syntax for expressing a set of choices. Ideally, a program should simply indicate the choices, how each is handled, and the conditions under which each is handled. A properly designed choose block thus specifies a backtracking search at a higher level than does a recursive procedure. Further, using a choose block makes the nesting of choose statements easy to express. There is no reason that there shouldn't be backtracking within backtracking. Progressive deepening in chess programs relies on this concept.

Choose defined: Below is the EBNF definition of choose blocks. TagStart is the string <| and TagEnd is |/>. These assure that SMPP commands are distinguishable from other copy, thus preserving host language independence.
ChooseBlock = ChooseDirective ChooseBlockContents EndChooseDirective .
ChooseDirective = TagStart "#choose" [ ChooseName ] TagEnd .
EndChooseDirective = TagStart "#endchoose" TagEnd .
ChooseName = ExpressionReference .
ChooseBlockContents = [ ChooseVarsBlock ] [ Iterators ] [ InitializeBlock ] Choices .
ChooseVarsBlock = ChooseVarsDirective { VariableDeclaration } EndChooseVarsDirective .
ChooseVarsDirective = TagStart "#choosevars" TagEnd .
EndChooseVarsDirective = TagStart "#endchoosevars" TagEnd .
Iterators = { IteratorDirective } .
IteratorDirective = TagStart "#iterator" VarName [ "over" ] IntegerRangeCall TagEnd .
IntegerRangeCall = call to the built-in "IntegerRange" macro .
InitializeBlock = InitializeDirective InitializeBlockContents EndInitializeDirective .
InitializeDirective = TagStart "#initialize" TagEnd .
EndInitializeDirective = TagStart "#endinitialize" TagEnd .
InitializeBlockContents = { AssignCall } .
AssignCall = call to the built-in "Assign" macro .
Choices = ChoiceBlock { ChoiceBlock } .
ChoiceBlock = ChoiceDirective ResolvableContent EndChoiceDirective .
ChoiceDirective = TagStart "#choice" [ ExpressionReference ] TagEnd .
EndChoiceDirective = TagStart "#endchoice" TagEnd .
NextChoiceDirective = TagStart "#nextchoice" [ ExpressionReference ] TagEnd .
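To illustrate how the TagStart and TagEnd delimiters keep directives apart from host-language copy, here is a rough Python sketch that splits input into directive and copy pieces. It is only an illustration under simplifying assumptions: it is not how SMPP scans its input, it ignores quoting and the #{(...)} output syntax, and the name split_directives is hypothetical.

```python
import re

# A directive is anything between TagStart "<|" and TagEnd "|/>".
TAG = re.compile(r"<\|\s*(.*?)\s*\|/>", re.DOTALL)


def split_directives(text):
    """Return ("directive", body) and ("copy", text) pieces in input order."""
    pieces = []
    pos = 0
    for match in TAG.finditer(text):
        if match.start() > pos:
            pieces.append(("copy", text[pos:match.start()]))
        pieces.append(("directive", match.group(1)))
        pos = match.end()
    if pos < len(text):
        pieces.append(("copy", text[pos:]))
    return pieces


if __name__ == "__main__":
    sample = '<| #choose "Entree" |/>block contents<| #endchoose |/>'
    for kind, body in split_directives(sample):
        print(kind, repr(body))
```

Because the delimiters are unlikely to occur in ordinary program text, the surrounding copy can be in any host language without being mistaken for a directive.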

Choose Blocks: The choose control structure is a mechanism for "parallel" evaluation among alternatives. It specifies a set of choices at what is called a "choice point". It may look like a case statement, but unlike a case statement, which selects one among a number of alternatives, choose allows the programmer to specify the execution of them all. The effect is as if you forked all the choices from the choice point. As a matter of fact, some implementations of choose could indeed fork a separate process for each choice. It could be implemented by making "n" copies of the current state and proceeding with them all in parallel. One can imagine that in this day of abundant multi-processors such an implementation might be very practical.

Choose block syntax: A choose block has four parts: choose variable declarations, iterators, initializations, and choice blocks. All four parts are used to define the sequence of choices to be tried. The first three parts are used to define one or more variables and the values those variables may have. For example, here is how you would specify variables x and y so that x can be 4,7,10,13 and y can be red, green, blue:

<| #define ChooseNumberAndColor |/>
<| #choose |/>
<| #choosevars |/>
x: integer;
yindex: integer;
y: string[3];
<| #endchoosevars |/>
<| #iterator x #IntegerRange(4,13,3) |/>
<| #iterator yindex #IntegerRange(1,3,1) |/>
<| #initialize |/>
<| #assign(y[1] := "red") |/>
<| #assign(y[2] := "green") |/>
<| #assign(y[3] := "blue") |/>
<| #endinitialize |/>
<| #choice |/>
The number is #{(x)} and the color is #{(y[yindex])}.
<| #endchoice |/>
<| #endchoose |/>
<| #enddefine |/>

In this example, the choose block is used as the definition of a macro named ChooseNumberAndColor. Within the choose block, the iterators establish 4 values for the variable x and 3 values for the variable yindex. This establishes a sequence of 12 choices, one for each combination of x and yindex values. The initialize block maps the three values for yindex into string values, allowing your choice content to output string values. Consider the line of text above:
The number is #{(x)} and the color is #{(y[yindex])}.

The syntax #{(expression)} means to write the value of the expression to the output stream. Thus these two expressions write the current value of the variable x and the value of y at the position given by the current value of yindex. The first time this choose block is encountered, it executes the choice block with the first value for each iterator. Thus the output the first time is:
The number is 4 and the color is red.

When the preprocessor backtracks to this choose block, it executes the choice block again. It iterates the variable for the last iterator that has not yet reached its last value. So this time, yindex is 2, and the output is:
The number is 4 and the color is green.

When the last value of an iterator is reached, that iterator's variable is set back to its first value and the previous iterator's variable is iterated. Thus a choose block allows the programmer to specify a sequence of choices. The preprocessor chooses the first choice in the sequence and then moves on to content after the block. When the preprocessor reaches the end of its input, it flushes its output and then automatically backtracks to the choose block, where it tries the next choice in the sequence and then moves on. If there are no more choices, the preprocessor is finished.

When the preprocessor backtracks to do the next choice in a choose block, it restores its state to what it was when it first encountered the choose block. The state includes such items as the location in the input stream, the location in the output stream, the stack of macro calls, the location in the currently called macro, the stack of structured statements such as if and while blocks and the current location in those blocks. Nested includes and quoted text are also tracked.

While this example is useful for illustrating the processing flow of a choose block, it does not demonstrate the full power of a choose-based program. The only variables that are modified and maintained in this example are the choose variables x and yindex. More interesting and powerful uses of choose maintain a complex global context that is modified by each choice and automatically restored to its previous state before trying the next choice. We will demonstrate a more powerful example, using choose to implement the eight queens problem, after some more discussion of the features of choose.

Multiple choice blocks: A choose block may have more than one #choice block. In this case, the sequence of choice blocks works like an additional iterator. For example:
<| #choose |/>
<| #choosevars |/>
x: integer;
yindex: integer;
y: string[3];
<| #endchoosevars |/>
<| #iterator x #IntegerRange(4,13,3) |/>
<| #iterator yindex #IntegerRange(1,3,1) |/>
<| #initialize |/>
<| #assign(y[1] := "red") |/>
<| #assign(y[2] := "green") |/>
<| #assign(y[3] := "blue") |/>
<| #endinitialize |/>
<| #choice |/>
The number is #{(x)} and the color is #{(y[yindex])}.
<| #endchoice |/>
<| #choice |/>
#{(x)}:#{(y[yindex])}
<| #endchoice |/>
<| #endchoose |/>

The first time the choose block is encountered, the output is the output of the first choice block using the first value from each iterator. In this case, that is:
The number is 4 and the color is red.


When the preprocessor backtracks to try the next choice, since there is a second choice block, it tries that block with the same variable values as were used for the first choice. Thus the next line output is the value of #{(x)}:#{(y[yindex])}, still with values x=4 and yindex=1; that is,
4:red

After all choice blocks have been output for a given set of variable values, the variable for the last iterator is iterated and the first choice block is output for that new set of variable values. Thus the next line output is the first choice block with x=4 and yindex=2; that is,
The number is 4 and the color is green.

Thus multiple choice blocks effectively add another level of iteration. For each combination of choose variable values, the choose block iterates through its choice blocks. In fact, a choose block need not have any choose variables at all. In that case, the choose block just iterates once through its choice blocks.

Conditional choice blocks: The #choice directive has an optional Boolean expression that is tested before the preprocessor outputs the content of the block. If the expression is false, the preprocessor skips that choice and automatically moves to the next choice or set of iterated values. For example:
<| #choice not ( ( x = 10 ) & ( yindex = 2 ) ) |/>
The number is #{(x)} and the color is #{(y[yindex])}.
<| #endchoice |/>

With this condition, the following line would not be output:


The number is 10 and the color is green.
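The order in which a single choose block offers its choices can be modeled in Python as nested iteration: iterators outermost (the last iterator varying fastest), then the choice blocks, with each condition acting as a filter. This is a sketch of the ordering only; it does not model flushing output or restoring state between choices. It assumes #IntegerRange(first,last,step) has inclusive bounds, and the names integer_range and choice_sequence are hypothetical.

```python
from itertools import product


def integer_range(first, last, step):
    """Assumed reading of #IntegerRange(first,last,step): inclusive bounds."""
    return range(first, last + 1, step)


def choice_sequence(iterators, choice_blocks):
    """Yield the rendered content of each choice, in the order it is tried.

    iterators:     list of (variable_name, values) pairs
    choice_blocks: list of (condition, render) pairs of functions of env
    """
    names = [name for name, _ in iterators]
    # product() varies the last iterator fastest, as described in the text.
    for values in product(*(vals for _, vals in iterators)):
        env = dict(zip(names, values))
        for condition, render in choice_blocks:
            if condition(env):  # a false condition skips this choice
                yield render(env)


colors = {1: "red", 2: "green", 3: "blue"}
iterators = [("x", integer_range(4, 13, 3)), ("yindex", integer_range(1, 3, 1))]
choice_blocks = [
    (lambda e: not (e["x"] == 10 and e["yindex"] == 2),
     lambda e: f"The number is {e['x']} and the color is {colors[e['yindex']]}."),
    (lambda e: True,
     lambda e: f"{e['x']}:{colors[e['yindex']]}"),
]
lines = list(choice_sequence(iterators, choice_blocks))

if __name__ == "__main__":
    for line in lines:
        print(line)
```

With two choice blocks and twelve combinations of x and yindex, this model yields 23 lines: the conditional first block is skipped once, for x=10 and yindex=2, while the unconditional second block still produces 10:green.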

Multiple choose blocks: The input to the preprocessor may have more than one choose block. This is useful for establishing multiple points where a choice can be made. When the preprocessor reaches the end of input and there are multiple choose blocks, it backtracks to the last choose block that still has untried choices in its defined choice sequence. Once it has tried all of the choices in the sequence for a choose block, it backtracks to the previously encountered choose block. When all choices for all encountered choose blocks have been exhausted, the preprocessor finishes processing.

Backtracking: As indicated above, backtracking occurs when the preprocessor reaches the end of its input and has at least one choose block whose choices have not been exhausted. A programmer can explicitly specify backtracking at any time using the nextchoice directive. By default, the nextchoice directive backtracks to the last choose block whose choices have not yet been exhausted, causing the preprocessor to try the next choice in that block. To allow backtracking to the next choice in a choose block other than the last one, the nextchoice directive has an optional string expression argument to indicate the name of the choose block whose next choice is to be tried. This, of course, implies that choose blocks can be named. Here is an example:
<| #choose "Entree" |/>
block contents
<| #endchoose |/>
<| #choose "Dessert" |/>
block contents
<| #endchoose |/>

Suppose the program issues:


<| #nextchoice "Entree" |/>

If the last choose block encountered is the "Dessert" block, that block is automatically ended, even if not all of its choices have been tried. The preprocessor backtracks to the "Entree" block and tries its next choice. If the preprocessor moves on from there and encounters the "Dessert" block again, it starts over in that block with its first choice.

Expressions and variables: In our examples, we have seen two uses of expressions. Embedded within output text, the syntax #{(x)} means to output the value of the expression x. A choice directive has an optional Boolean expression that indicates that the choice is tried only if the expression evaluates to true. The expression evaluator in the macro preprocessor is capable of accessing the following classes of variables.

1. The expression language allows expressions to declare variables that are local to the expression.
2. Choose blocks may define choose variables, which may be accessed in #assign expressions within the #initialize block, or in the Boolean condition in a choice directive, or in the content of a choose block.
3. Variables declared within a macro definition are local to the macro. Local variables may be accessed in expressions that are evaluated while the macro is being resolved.
4. Variables declared outside an expression, choose block, or macro definition are global variables that may be accessed anywhere.

There are two types of global variables: automatic variables and static variables. When a choose block backtracks to try its next choice, it automatically restores the values of all automatic variables (and all variables local to a macro) to the values they had when the previous choice was tried. The values of static variables, however, are not restored when backtracking. Local variables, automatic variables, and static variables are declared using a syntax similar to the syntax above for declaring choose variables. The difference is that instead of choosevars/endchoosevars, local variables use localvars/endlocalvars, automatic variables use autovars/endautovars, and static variables use staticvars/endstaticvars.

Naming expressions: Instead of writing an expression directly in a choice directive or other context, the preprocessor allows for the directive to refer to a previously declared expression by name. For example, we could write:
<| #expression ChoiceOK |/>
not ( ( x = 10 ) & ( yindex = 2 ) )
<| #endexpression |/>

Then within the choose block:


<| #choice @ChoiceOK |/>
The number is #{(x)} and the color is #{(y[yindex])}.
<| #endchoice |/>

This is especially useful for very long, complex expressions. It not only avoids a very long choice directive, but it also has a performance advantage for expressions that may be evaluated many times. The evaluation of an expression involves two steps: parsing the expression, then evaluating the result of the parse using the current values of variables. For declared expressions, the preprocessor does the parse step only once, when it encounters the declaration. Only the evaluation step needs to be done each time the expression is referenced by name.
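The same parse-once, evaluate-many split exists in Python, which makes for a convenient analogy. The sketch below uses Python's built-in compile and eval as stand-ins for the two steps; it says nothing about how the SMPP expression evaluator is built, and the helper name evaluate is hypothetical. The expression is the ChoiceOK condition rewritten in Python syntax.

```python
# Parse step: done once, when the named expression is declared.
source = "not (x == 10 and yindex == 2)"
choice_ok = compile(source, "<ChoiceOK>", "eval")


def evaluate(code, **variables):
    """Evaluation step: run already-parsed code against current values."""
    return eval(code, {"__builtins__": {}}, variables)


if __name__ == "__main__":
    for x in (4, 7, 10, 13):
        for yindex in (1, 2, 3):
            # Each reference reuses the parsed form; only evaluation repeats.
            print(x, yindex, evaluate(choice_ok, x=x, yindex=yindex))
```

In a search that tests the same condition thousands of times, skipping the repeated parse is where the saving comes from.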


THE EIGHT QUEENS PROBLEM: The main body of the eight queens is a while loop that places each queen in a different column from all previous queens. Once the column is selected, a choose block is used to try that queen in each square of the selected column. The main body of the algorithm is:
<| #define q8 |/>
    printf( "<| ## This comment avoids a newline after the quote |/>
<| #localvars |/>
QueenNumber: integer;
PlacementCol: integer;
<| #endlocalvars |/>
<| #assign(QueenNumber := 1) |/>
<| #while QueenNumber <= 8 |/>
<| #assign(@ColumnToTry) |/>
<| #choose |/>
<| #choosevars |/>
PlacementRow: integer;
<| #endchoosevars |/>
<| #iterator PlacementRow #IntegerRange(1,8,1) |/>
<| #choice @PlacementOK |/>
  #{(PlacementCol)},#{(PlacementRow)} <| ## |/>
<| #endchoice |/>
<| #endchoose |/>
<| #assign(QueenNumber := QueenNumber + 1) |/>
<| #endwhile |/>
\n" );
<| #enddefine |/>
<| #q8 |/>

This program has no code to manage backtracking: no recursive calls and no stacks. All it does is establish eight choose blocks, one for each queen, as a result of the while loop that has one iteration for each of the eight queens. The queen for each iteration is placed in a column determined by the ColumnToTry expression. Each choose block simply specifies that there are eight choices for each column, as a result of iterating PlacementRow from 1 to 8. The choice block for each choose block uses the PlacementOK expression to determine whether the PlacementCol, PlacementRow pair is a valid placement for a queen. If so, the content of the choice block displays that placement as two blanks followed by the column number followed by a comma followed by the row number.

All that is needed to finish the program is to write the ColumnToTry and PlacementOK expressions. In the next sections, we will show a variety of ways of implementing those expressions to provide various levels of optimization of the search. Before we do that, what follows is an explanation of how the preprocessor executes this macro.

Since the choice of a square in each column is controlled by iterating PlacementRow within the choose block for the column, when we run out of rows to try, the preprocessor automatically backtracks to the previous column and tries the next row in that choose block.

Since PlacementCol is a local variable, whenever the preprocessor backtracks to a column, the value of PlacementCol is restored to the value it had when the instance of the choose block for that column was encountered. Thus once ColumnToTry selects a PlacementCol for a choose block, the value of PlacementCol never changes for that block.

When we have a placement for all eight queens, the preprocessor flushes what is written for each placement, thus flushing:

the four blanks followed by printf( " at the top of the macro
the eight column-row pairs, blank-separated
the \n" ); at the bottom of the macro

Since the printf precedes the choose block, it was put into the output buffer and never removed as a result of backtracking. The content of the choice block, however, is removed from the output buffer when backtracking to the column. Thus if a placement is written for a column and the preprocessor backtracks to that column and finds the next valid placement, that placement is written where the old placement used to be. That's why each flushed output line has exactly eight placements. When PlacementRow exceeds 8 for all columns, all choose blocks are finished, so the program terminates.

Using letters for columns: A more traditional way to indicate a square on a chess board is to use letters a-h for the columns instead of numbers 1-8, without putting a comma between the column letter and row number. That is, instead of 5,2 we write e2. To do this, we declare a static array that maps 1-8 to a-h and use that array value in the content of the choice block.
<| #staticvars |/>
ColumnLetter: string[8];
<| #endstaticvars |/>
<| #assign(ColumnLetter[1] := "a") |/>
<| #assign(ColumnLetter[2] := "b") |/>
<| #assign(ColumnLetter[3] := "c") |/>
<| #assign(ColumnLetter[4] := "d") |/>
<| #assign(ColumnLetter[5] := "e") |/>
<| #assign(ColumnLetter[6] := "f") |/>
<| #assign(ColumnLetter[7] := "g") |/>
<| #assign(ColumnLetter[8] := "h") |/>

The choice block is then:


<| #choice @PlacementOK |/>
  #{(ColumnLetter[ PlacementCol ])}#{(PlacementRow)} <| ## |/>
<| #endchoice |/>


A more terse way to do the same thing is to replace the ColumnLetter array with calls to the Expression Evaluator's character functions:
#{(F$chr(F$ord("a")+PlacementCol-1))}#{(PlacementRow)}<| ## |/>

Simple version: In this section, we demonstrate a simple implementation of the ColumnToTry and PlacementOK expressions. The purpose of ColumnToTry is to select a value for PlacementCol for the current queen in the while loop. That value must be different from the value for any queens already placed by previous iterations. The simplest way to do that is to put the first queen in the first column, the second queen in the second column, and so on. Thus ColumnToTry is simply:

{ PlacementCol := QueenNumber }

For PlacementOK, we need a way to represent the state of the chess board so that PlacementOK can test if a queen can be placed and, if so, update the state given that placement. The classic way to do this is to recognize that a placed queen attacks the column, row, and both diagonals running through the square where the queen lies. There's no need to use a variable to keep track of which columns are attacked, because ColumnToTry assures that we never put two queens in the same column. We can keep track of the rows, up diagonals, and down diagonals that are attacked by the queens that have been placed so far using three arrays, one that has a Boolean for each row, one that has a Boolean for each up diagonal, and one that has a Boolean for each down diagonal. Since a chess board has 8 rows, 15 up diagonals, and 15 down diagonals, we declare:
<| #autovars |/>
RowIsAvailable: Boolean[8];
UpDiagonalIsAvailable: Boolean[15];
DnDiagonalIsAvailable: Boolean[15];
<| #endautovars |/>

Since these variables represent the state of the board, which is changed depending on which choice is made, they are declared as automatic variables. That means that when we backtrack to try a new choice, the state of the board is restored to what it was before the previous choice changed it; that is, the state resulting from queens already placed in previous columns.

Initially, every row and diagonal is available, so before calling q8, we initialize these arrays to true, using a built-in macro called FillArray.
<| #FillArray(RowIsAvailable,true) |/>
<| #FillArray(UpDiagonalIsAvailable,true) |/>
<| #FillArray(DnDiagonalIsAvailable,true) |/>

To write PlacementOK, we need to determine which index of each array to use to test and update the available state for placing a queen on the square given by PlacementCol, PlacementRow. The index to RowIsAvailable is simply PlacementRow. For the up diagonal, note that every square on a given up diagonal has the same difference between PlacementCol and PlacementRow; for example, 5,2 and 6,3 are on the same up diagonal and have a difference of 3. These differences range from 1-8=-7 to 8-1=7. Since our indices range from 1 to 15, we can just add 8 to the difference to get the index. Thus the index to UpDiagonalIsAvailable is PlacementCol - PlacementRow + 8. For the down diagonal, note that every square on a given down diagonal has the same sum of PlacementCol and PlacementRow; for example, 5,2 and 6,1 are on the same down diagonal and have a sum of 7. These sums range from 1+1=2 to 8+8=16. Since our indices range from 1 to 15, we can just subtract 1 from the sum to get the index. Thus the index to DnDiagonalIsAvailable is PlacementCol + PlacementRow - 1. Here is the PlacementOK expression:
<| #expression PlacementOK |/>
{
  real OK;
  real LocalC; (* local copy of PlacementCol *)
  real LocalR; (* local copy of PlacementRow *)
  LocalC := PlacementCol;
  LocalR := PlacementRow;
  OK := 1;
  if RowIsAvailable[ LocalR ] then
    { RowIsAvailable[ LocalR ] := false }
  else
    { OK := 0 };
  if OK & UpDiagonalIsAvailable[ LocalC - LocalR + 8 ] then
    { UpDiagonalIsAvailable[ LocalC - LocalR + 8 ] := false }
  else
    { OK := 0 };
  if OK & DnDiagonalIsAvailable[ LocalC + LocalR - 1 ] then
    { DnDiagonalIsAvailable[ LocalC + LocalR - 1 ] := false }
  else
    { OK := 0 };
  ( OK = 1 )
}
<| #endexpression |/>
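For readers without access to SMPP, the row and diagonal bookkeeping used by this expression can be sketched in Python. This is an illustrative analogue under stated assumptions, not the SMPP mechanism: the names up_index, dn_index and solve_simple are hypothetical, and the sketch undoes its own state changes explicitly, whereas SMPP restores automatic variables itself on backtracking.

```python
from typing import List


def up_index(col: int, row: int) -> int:
    """Index 1..15 of the up diagonal through (col, row)."""
    return col - row + 8


def dn_index(col: int, row: int) -> int:
    """Index 1..15 of the down diagonal through (col, row)."""
    return col + row - 1


def solve_simple() -> List[List[int]]:
    """Return every solution as a list of rows, one per column 1..8."""
    row_free = [True] * 9   # index 0 unused
    up_free = [True] * 16   # index 0 unused
    dn_free = [True] * 16   # index 0 unused
    placement: List[int] = []
    solutions: List[List[int]] = []

    def place(col: int) -> None:
        if col > 8:
            solutions.append(placement.copy())
            return
        for row in range(1, 9):
            u, d = up_index(col, row), dn_index(col, row)
            if row_free[row] and up_free[u] and dn_free[d]:
                # Mark the row and both diagonals as attacked.
                row_free[row] = up_free[u] = dn_free[d] = False
                placement.append(row)
                place(col + 1)
                # Undo by hand what SMPP automatic variables undo for us.
                placement.pop()
                row_free[row] = up_free[u] = dn_free[d] = True

    place(1)
    return solutions


print(len(solve_simple()))  # 92
```

The explicit undo step after the recursive call is the bookkeeping that the choose block is designed to take off the programmer's hands.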

We keep local copies of PlacementCol and PlacementRow because the expression evaluator can access its own declared variables faster than it can access variables belonging to the preprocessor. OK is 0 if the row, the up diagonal, or the down diagonal is unavailable for the given square. The expression sets the array entries for the given square to false to indicate that they are no longer available once a queen is placed on that square. If OK turns out to be 0, we won't place the queen on that square. But suppose that the up diagonal is available, but the down diagonal is not. In that case, by the time the expression finds that the down diagonal is not available, it has already marked the row and the up diagonal unavailable, even though the queen won't be placed there. Thus it is necessary to restore the values that the arrays had before the expression evaluation changed them. It is not necessary, however, for the expression to do that, because when a choice condition evaluates to false, the preprocessor automatically restores the automatic variables to the values they had before evaluation began.

Forward Pruning: The efficiency of a backtracking search can often be improved using a technique called forward pruning, in which a choice at a given point can be made based on looking ahead to the consequences of the choice on choices at later points. Helsgaun [25] applies this technique to the eight queens problem by testing whether a queen about to be placed will attack all of the squares in some column that does not yet have a queen. In that case, we can avoid placing a queen on that square, knowing that it cannot possibly result in a solution. To look ahead, our PlacementOK expression needs to do more work. But the trade-off is that we avoid many fruitless solution paths.

For our forward pruning implementation, we replace our three board state arrays with two arrays. SquareIsAvailable has one Boolean for each of the 64 squares on the board, indicating whether that square is available for placing a queen (i.e., is not attacked by any queen already on the board). NumSquaresAvailable has one integer per column, indicating the number of squares in that column that are available (not attacked).

<| #autovars |/>
SquareIsAvailable: Boolean[64];
NumSquaresAvailable: integer[8];
<| #endautovars |/>

The strategy when determining whether a queen can be placed at a given square is to set SquareIsAvailable to false for each square that would be attacked by that queen. If doing that reduces NumSquaresAvailable to 0 for any column that does not yet have a queen, PlacementOK returns false. The PlacementOK expression loops through all columns that do not have a queen. For each such column, a queen on the square being tested attacks up to three squares: the square in the same row, the square on the same up diagonal, and the square on the same down diagonal as the potential new queen. For each of those squares, if SquareIsAvailable is true, it is set to false and NumSquaresAvailable for the column is decremented. As soon as this results in NumSquaresAvailable reaching 0 for some column, PlacementOK returns false. Here is a PlacementOK that uses forward pruning.
<| #expression PlacementOK |/>
{
  real OK;
  real LocalC;        (* local copy of PlacementCol *)
  real LocalR;        (* local copy of PlacementRow *)
  real CurrentCol;    (* column being tested for available squares *)
  real UpDiagonalRow; (* row in CurrentCol on same up diagonal *)
  real DnDiagonalRow; (* row in CurrentCol on same down diagonal *)
  real ColStartIdx;   (* index in SquareIsAvailable of start of column *)
  LocalC := PlacementCol;
  LocalR := PlacementRow;
  OK := 1;
  ColStartIdx := 8 * ( LocalC - 1 );
  if not SquareIsAvailable[ ColStartIdx + LocalR ] then
  {
    (* This square is unavailable *)
    OK := 0
  }
  else
  {
    CurrentCol := 0;
    UpDiagonalRow := LocalR - LocalC;
    DnDiagonalRow := LocalR + LocalC;
    while OK & ( CurrentCol < 8 ) do
    {
      CurrentCol := CurrentCol + 1;
      UpDiagonalRow := UpDiagonalRow + 1;
      DnDiagonalRow := DnDiagonalRow - 1;
      (* Don't test this column or any column that already has a queen *)
      if ( CurrentCol <> LocalC ) & ( NumSquaresAvailable[ CurrentCol ] > 0 ) then
      {
        ColStartIdx := 8 * ( CurrentCol - 1 );
        (* Update based on adding a queen in this row *)
        if SquareIsAvailable[ ColStartIdx + LocalR ] then
        {
          NumSquaresAvailable[ CurrentCol ] := NumSquaresAvailable[ CurrentCol ] - 1;
          SquareIsAvailable[ ColStartIdx + LocalR ] := false
        }
        else
          { 1 };
        (* If that leaves no squares in the column, fail *)
        if NumSquaresAvailable[ CurrentCol ] <= 0 then
          { OK := 0 }
        (* Update based on adding a queen on this up diagonal *)
        else if ( UpDiagonalRow >= 1 ) & ( UpDiagonalRow <= 8 ) & SquareIsAvailable[ ColStartIdx + UpDiagonalRow ] then
        {
          NumSquaresAvailable[ CurrentCol ] := NumSquaresAvailable[ CurrentCol ] - 1;
          SquareIsAvailable[ ColStartIdx + UpDiagonalRow ] := false
        }
        else
          { 1 };
        (* If that leaves no squares in the column, fail *)
        if NumSquaresAvailable[ CurrentCol ] <= 0 then
          { OK := 0 }
        (* Update based on adding a queen on this down diagonal *)
        else if ( DnDiagonalRow >= 1 ) & ( DnDiagonalRow <= 8 ) & SquareIsAvailable[ ColStartIdx + DnDiagonalRow ] then
        {
          NumSquaresAvailable[ CurrentCol ] := NumSquaresAvailable[ CurrentCol ] - 1;
          SquareIsAvailable[ ColStartIdx + DnDiagonalRow ] := false
        }
        else
          { 1 };
        (* If that leaves no squares in the column, fail *)
        if NumSquaresAvailable[ CurrentCol ] <= 0 then
          { OK := 0 }
        else
          { 1 }
      }
      else
        { 1 }
    };
    (* On success, mark that this column has a queen *)
    if OK then
      { NumSquaresAvailable[ LocalC ] := 0 }
    else
      { 1 }
  };
  ( OK = 1 )
}
<| #endexpression |/>
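The forward pruning test can also be sketched in Python for comparison. This is a hedged analogue rather than a translation: the names placement_ok and count_solutions are hypothetical, the board is indexed as [col][row] instead of as a flat 64-element array, and the sketch works on copies of the state, which stands in for the automatic restoration of SMPP automatic variables when a choice fails.

```python
from typing import List, Optional, Tuple

Board = List[List[bool]]  # board[col][row], index 0 unused in both dimensions


def placement_ok(col: int, row: int, square_free: Board,
                 num_free: List[int]) -> Optional[Tuple[Board, List[int]]]:
    """Return the updated state if a queen may go on (col, row), else None."""
    if not square_free[col][row]:
        return None
    squares = [column[:] for column in square_free]  # work on copies
    counts = num_free[:]
    for c in range(1, 9):
        # Skip the placement column and any column that already has a queen.
        if c == col or counts[c] == 0:
            continue
        # Same row, same up diagonal, same down diagonal in column c.
        for r in (row, row - col + c, row + col - c):
            if 1 <= r <= 8 and squares[c][r]:
                squares[c][r] = False
                counts[c] -= 1
        if counts[c] == 0:
            return None  # column c would have no square left: prune
    counts[col] = 0      # mark that this column now has a queen
    return squares, counts


def count_solutions() -> int:
    """Count complete placements, filling columns 1..8 in order."""
    def search(col: int, squares: Board, counts: List[int]) -> int:
        if col > 8:
            return 1
        total = 0
        for row in range(1, 9):
            result = placement_ok(col, row, squares, counts)
            if result is not None:
                total += search(col + 1, result[0], result[1])
        return total

    start_squares = [[True] * 9 for _ in range(9)]
    start_counts = [0] + [8] * 8
    return search(1, start_squares, start_counts)


print(count_solutions())  # 92
```

Pruning only discards branches that cannot reach a full placement, so the number of solutions found is unchanged; only the number of squares visited along the way shrinks.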

Note that if the placement is OK, NumSquaresAvailable for the placement column is set to 0 at the end. When PlacementOK is called for later columns, columns for which NumSquaresAvailable is 0 are not tested, since that indicates that the column already has a queen.

Finishing an iteration early: Another efficiency improvement is to cut down on the number of times PlacementOK is called for a given column. In each of the above solutions, it is called eight times, once for each value of the iterator for PlacementRow; that is, once for each square in the column. Suppose, however, that the squares for rows 7 and 8 are attacked. Then after we try the square in row 6, we've tried all available squares. That means we can leave the iteration after row 6 instead of continuing through row 8.


To do this, we maintain the array NumSquaresTried for each column. When ColumnToTry first selects a column, NumSquaresTried for that column is set to 0. Each time we try an available square in the column, NumSquaresTried is incremented. When NumSquaresTried reaches NumSquaresAvailable for the column, we can stop testing squares in the column, even if we are not yet at row 8. Thus we add:
<| #staticvars |/> NumSquaresTried: integer[8]; <| #endstaticvars |/>

ColumnToTry is:
{ PlacementCol := QueenNum; NumSquaresTried[ PlacementCol ] := 0 }

Instead of just testing SquareIsAvailable at the top of PlacementOK, the new logic is:
if NumSquaresTried[ LocalC ] >= NumSquaresAvailable[ LocalC ] then
  { OK := 0; PlacementRow := 8 }
else if not SquareIsAvailable[ ColStartIdx + LocalR ] then
  { OK := 0 }
else
{
  NumSquaresTried[ LocalC ] := NumSquaresTried[ LocalC ] + 1;

Note that when NumSquaresTried reaches NumSquaresAvailable, we force the iteration to end by setting PlacementRow to its terminal value of 8. NumSquaresTried is a static variable. Since it does not indicate the state of the board, but instead indicates our progress through choices in a column, it should not be reset each time we try a new choice; on the contrary, it is incremented each time.

Dynamic search rearrangement: Our algorithm so far tries to place a queen in each column 1 through 8 in succession. A more efficient strategy is to place the next queen in the column that has the fewest available squares. This effectively reduces the number of branches near the top of the search tree, which in turn reduces the number of paths searched. To implement this, instead of just setting PlacementCol to QueenNum, ColumnToTry scans columns that do not yet have a queen for the one that has the fewest available squares.


Here again, there is more work involved in column selection, but the trade-off is a reduction in the number of search paths. To mitigate the increased work, ColumnToTry avoids scanning for a minimum when placing the first two queens. Instead, the first queen is always placed in column 4. When there are no queens on the board, all columns have the same number (8) of available squares, so it doesn't matter which column we select. We select a column in the middle of the board because, as any chess player knows, a queen in a middle column attacks more squares in other columns than a queen on one of the end columns, causing greater restriction as we move down the search tree. When placing the second queen, rather than scan all 7 columns other than column 4 for the column with the fewest available squares, ColumnToTry simply places the second queen in column 5. A queen in column 4 always attacks at least as many squares in column 5 as it does in any other column, so our choices for column 5 are limited. Also, having the first two queens in columns 4 and 5 generally leaves fewer available squares on the entire board than would two queens in columns nearer the ends of the board. The ColumnToTry expression is thus:
<| #expression ColumnToTry |/>
{
  real TryColumn;
  real MinAvailable;
  if QueenNum = 1 then
    { PlacementCol := 4 }
  else if QueenNum = 2 then
    { PlacementCol := 5 }
  else
  {
    MinAvailable := 9;
    TryColumn := 0;
    while TryColumn < 8 do
    {
      TryColumn := TryColumn + 1;
      (* If a column has but one available square, use that column *)
      if NumSquaresAvailable[ TryColumn ] = 1 then
        { PlacementCol := TryColumn; TryColumn := 9 }
      else if ( NumSquaresAvailable[ TryColumn ] > 0 ) & ( NumSquaresAvailable[ TryColumn ] < MinAvailable ) then
      {
        (* Mark that this has the fewest available so far *)
        PlacementCol := TryColumn;
        MinAvailable := NumSquaresAvailable[ TryColumn ]
      }
      else
        { 1 }
    }
  };
  NumSquaresTried[ PlacementCol ] := 0
}
<| #endexpression |/>
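The selection rule in this expression can be restated as a small Python function. This is a sketch for illustration only: the name column_to_try and the 1-based list num_free (with index 0 unused) are assumptions, and the reset of NumSquaresTried is left out because it belongs to the early-finish logic rather than to column selection.

```python
from typing import List


def column_to_try(queen_num: int, num_free: List[int]) -> int:
    """Pick the column for the next queen; num_free[c] is 0 for filled columns."""
    if queen_num == 1:
        return 4          # first queen: fixed middle column
    if queen_num == 2:
        return 5          # second queen: fixed neighbouring column
    best_col, best_count = 0, 9
    for col in range(1, 9):
        count = num_free[col]
        if count == 1:
            return col    # cannot do better than a single candidate square
        if 0 < count < best_count:
            best_col, best_count = col, count
    return best_col


# Columns 4 and 5 already hold queens; columns 2 and 6 tie at two squares.
print(column_to_try(3, [0, 3, 2, 5, 0, 0, 2, 4, 6]))  # 2
```

Because the comparison is strict, ties go to the lowest-numbered column, which matches the left-to-right scan in the expression.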

While scanning for the column with the fewest available squares, if a column has just one available square, that column is chosen immediately without continuing the scan. No column that does not yet have a queen could have fewer than one available square, because PlacementOK never places a queen if that would cause a future column to have zero available squares.

Eight queens solution summary: What a perfect solution! The resulting program consists of just 92 printf statements. There are no variables, just output, and no backtracking. The result is a deterministic program generated using nondeterministic methods. This differs from PwAN in the refinement process, which is automatic: it is done by the program, not by a programmer. Further, it suggests refinements beyond what Floyd and his colleagues anticipated. Multiple versions of the expressions PlacementOK and ColumnToTry have been proposed here. If one is thinking about generative programming, one might put these possibilities as choices in a choose block. By defining a goodness function (such as fewest tree nodes visited), one could pick the best generator automatically.

Efficiency, alternatives and readability: The SMPP is used to produce software at SCS every day. It runs at interactive speeds, processing nearly one hundred thousand lines of source code in a second. Using the same source code base, it and its companion template processors emit production-quality, user-customizable web sites in multiple languages. Not only is the content of the web sites dynamic, so are the web sites themselves. Others have tried to use tools like the C++ template processor to achieve similar goals. Josh Walker [26] writes:

"Make your compiler work for you. What is Template Metaprogramming? The prefix meta- is usually used to indicate an additional level of abstraction: metadata is information about data. Similarly, metaprogramming is writing programs about programs. The emerging technique of template metaprogramming takes advantage of C++ templates to write programs that manipulate themselves or other programs at compile time. Template metaprogramming is both a curiosity and a powerful optimisation method. The curiosity is that a standard C++ compiler is a Turing-complete interpreter for a subset of C++. In theory, any computable problem can be solved at compile time without ever executing compiled code; in practice achievable results are limited by inefficiencies of the system and details of the compiler. Metaprogramming can facilitate increased performance by computing performance-critical results at compile time or using compile-time logic to choose the best algorithm for the job."


This sounds like all the right ideas, but near the end of Mr. Walker's article is this: "In fact, I cannot even compile the 8-queens solution because the machine runs out of virtual memory after exhausting all 2GB6! " Was that 2GB or 26GB? Whatever, it doesn't sound like there is a solution within easy reach of C++. Further, I have a clear preference for the readability of SMPP code over C++ template code, and for that matter over anything in C++.

The SMPP is also Turing-complete. The existence of while loops in a preprocessor is not only unusual but controversial. At SCS, that controversy was resolved in favor of including loops when we saw the utility of being able to establish multiple choice points without writing a sequence of choose blocks.

Acknowledgments: Robert J. Harwick is the author of SMPP, the detailed definition of the current version of choose, its implementation, and the versions of queens coded in SMPP that are presented here. William Bader is the author of the SCS Expression Evaluator used for computing and assigning values during SMPP macro expansion. He is also the author of PCL, the 8-queens in PCL, and several versions of queens in SMPP, including ones without choose, recursive calls or gotos. Martha Cichelli's support, questioning and proofreading of this paper have been invaluable.

References:

[1] Cichelli, R. J. Spice - The SCS Development Language. Software Consulting Services, LLC. 2007. http://newspapersystems.com/sites/default/files/Articles/Spice_SCS_Development_Language.pdf
[2] Voelter, M. Workshop: Moderne Softwareentwicklungs-Methoden. www.voelter.de/data/presentations/iir.ppt
[3] Floyd, R. W. Non-deterministic Algorithms. Journal of the Association for Computing Machinery, 14, 4 (October 1967), 636-644.
[4] Dijkstra, E. W. EWD316: A Short Introduction to the Art of Programming. 1971. http://www.cs.utexas.edu/users/EWD/transcriptions/EWD03xx/EWD316.9.html
[5] Dahl, O. J., Dijkstra, E. W. and Hoare, C. A. R. Structured Programming. Academic Press, London, 1972.
[6] Wikipedia. Eight queens puzzle. http://en.wikipedia.org/wiki/Eight_queens_puzzle
[7] Wirth, N. Algorithms + Data Structures = Programs. Prentice-Hall, 1976.
[8] Cichelli, R. J. and Harwick, R. J. Generative Programming Saves Development Effort. Software Consulting Services, LLC, 2009. http://newspapersystems.com/sites/default/files/Articles/GenerativeProgramming.pdf
[9] Cichelli, R. J. Minimal Perfect Hash Functions Made Simple. Communications of the Association for Computing Machinery, 23, 1 (January 1980), 17-19.
[10] SCS/ClassPag Product Page. Software Consulting Services, LLC. http://www.newspapersystems.com/Products/SCSClassPag
[11] Layout-8000 Product Page. Software Consulting Services, LLC. http://www.newspapersystems.com/Products/layout8000
[12] Cichelli, R. J. ColorAdBoss - Color Planning in the Advertising Department. Software Consulting Services, LLC. http://newspapersystems.com/sites/default/files/Articles/ColorAdBoss.pdf
[13] Batory, D. The Road to Utopia: A Future for Generative Programming. Springer-Verlag, 2004.
[14] Selinger, P., Astrahan, M. M., Chamberlin, D. D., Lorie, R. A. and Price, T. G. Access Path Selection in a Relational Database System. ACM SIGMOD (1979), 23-24.
[15] Kernighan, B. W. and Ritchie, D. M. The M4 Macro Processor (Technical Report). Bell Laboratories, Murray Hill, NJ, 1977.
[16] Brender, R. F. The BLISS programming language: a history. Software: Practice and Experience, 32 (2002), 955-981.
[17] Wikipedia. C preprocessor. http://en.wikipedia.org/wiki/C_preprocessor
[18] Wikipedia. Automatic programming. http://en.wikipedia.org/wiki/Automatic_programming
[19] Wikipedia. Model-view-controller. http://en.wikipedia.org/wiki/Model-view-controller
[20] Bassett, P. G. Frame-Based Software Engineering. IEEE Software, July 1987, 9-16.
[21] Wikipedia. Frame technology. http://en.wikipedia.org/wiki/Frame_Technology_%28software_engineering%29
[22] Jarzabek, S., Bassett, P., Zhang, H. and Zhang, W. XVCL: XML-based Variant Configuration Language. 2003.
[23] Cohen, J. and Carton, E. Nondeterministic FORTRAN. The Computer Journal, 17, 1 (1974), 44-51.
[24] Barman, S., Bodík, R., Chandra, S., Galenson, J., Kimelman, D., Rodarmor, C. and Tung, N. Programming with Angelic Nondeterminism. POPL'10, January 2010, 339.
[25] Helsgaun, K. CBack: A Simple Tool for Backtrack Programming in C. Software: Practice and Experience, 25, 8 (1995), 905-934.
[26] Walker, J. Template Metaprogramming + Programming Topics. Overload Journal 46, Dec 2001.
