\setlength{\unitlength}{5.5mm} \newpage \begin{center} {\Large\bf V} \vspace{3mm} {\large\bf HOW IT WORKS} \end{center} \hugeskip This section assumes that you have a reading knowledge of the C programming language and that you have read \S4 above. Traditionally magicians are not supposed to give away secrets of the trade, but having persevered this far you deserve some reward. So read on, and you too can Amaze Your Friends---or even Impress Your Supervisor---with your wonderful powers over little squares of numbers. We first describe MaGIC itself and then its parallelisation. \vspace{15mm}\noindent {\large\bf \S5.1\hs{3mm}Overview of the Program} \bigskip\noindent The major divisions of MaGIC, excluding IO, initialisation, etc. are \begin{enumerate} \item The main control logic. \item A function to get the parameters of the job to be performed. \item The user interface. \item The search controller. \item The routine for testing matrices for satisfaction of postulates. \item A routine for setting up search spaces ready for searching. \item The isomorphism eliminator. \end{enumerate} These are organised thus: $$\begin{picture}(0,10)(0,-5) \put(-2.5,-0.75){\framebox(5,1.5){{\small Search Control}}} \put(-2.5,2.25){\framebox(5,1.5){{\small Main Logic}}} \put(-2.5,-3.75){\framebox(5,1.5){{\small Set up Space}}} \put(-11.5,-0.75){\framebox(5,1.5){{\small Test Matrix}}} \put(-11.5,-3.75){\framebox(5,1.5){{\small Isomorphisms}}} \put(6.5,-0.75){\framebox(5,1.5){{\small Front End}}} \put(6.5,2.25){\framebox(5,1.5){{\small Get Job}}} \put(0,0.75){\line(0,1){1.5}} \put(9,0.75){\line(0,1){1.5}} \put(0,-0.75){\line(0,-1){1.5}} \put(-9,-0.75){\line(0,-1){1.5}} \put(-6.5,0){\line(1,0){4}} \put(6.5,3){\line(-1,0){4}} \end{picture}$$ Note that this is just a communications diagram, not a flowchart. One point to be noted straight away is that the program is in certain respects very modular. 
The user interface is detachable: as we have been at pains to stress throughout, the X Windows front end is largely independent of the program it talks to; another version with a very different look and feel could slip into its place with no change in MaGIC itself. More deeply, the whole method of generating algebraic models is modular. The search controller uses the transferred refutations algorithm, which has drawbacks of its own, notably that it is rather memory-intensive. It would be very easy to put a completely different algorithm---say Pritchard's SCD as used by Malkin and Martin for their \cite{MGT}---in its place. Conversely, the search control module exactly as it stands can have different test and setup routines plugged into it and be used for finite constraint satisfaction problems having nothing whatever to do with propositional logic. It does not need to know whether it is generating {\bf FD} matrices or railway timetables.
\smallskip
Little will be said in these notes about the technique for removing isomorphic copies from the output. Briefly, whenever a new matrix is generated MaGIC checks to see whether it is among those it has stacked up as isomorphs of ones previously generated. If the new matrix is in that stack, it is removed therefrom and forgotten. If not, all its automorphisms giving structures which lie within the search space are generated and added to the stack before the matrix is printed out. As will be seen on running the program, most jobs give rise to rather few of these.
\smallskip
The best place to start explaining the workings of MaGIC is with the matrix testing routine and the associated search controller.
\newpage\noindent
{\large\bf \S5.2\hs{3mm}Generating and testing matrices}
\bigskip
MaGIC has two ways of testing a formula against possible matrices.
The simple way is to make assignments of values to the variables in the formula, compute the resultant values of the formula according to the matrix in hand and check whether these are all designated. Under the influence of Polish notation MaGIC calls the implication matrix \verb|C|. The partial order of implication is called \verb|ord| and the current highest value is \verb|siz|\@. Now having defined a suitable universal quantifier
\begin{verbatim}
#define FORALL(x) for (x=0; x<=siz; x++)
\end{verbatim}
we can write a test, say for the ``assertion'' axiom \ $(p\Cdot p\C q\CC q)$
\begin{verbatim}
int assertion()
{
    int p, q;

    FORALL(p) FORALL(q)
        if ( !ord[p][C[C[p][q]][q]] ) return(0);
    return(1);
}
\end{verbatim}
Evidently this function returns 1 if the matrix validates the axiom and 0 otherwise. A similar function is easily composed for any other axiom, just by changing the condition on which to return 0.
\smallskip
So much for the basic test. The basic algorithm for stepping to the next matrix goes like this:
\begin{verbatim}
int changed()
{
    int p, q;

    FORALL(p) FORALL(q)
        if ( C[p][q] == siz ) C[p][q] = 0;
        else return( ++C[p][q] );
    return(0);
}
\end{verbatim}
and the top level, assuming \verb|siz| somehow defined, is simply
\begin{verbatim}
main()
{
    int p, q;

    FORALL(p) FORALL(q) C[p][q] = 0;
    do test(); while ( changed() );
}
\end{verbatim}
This algorithm is called ``Test and Change'', and clearly what it does is to apply \verb|test()| to every possible matrix in a lexicographic order, the low-numbered cells changing most rapidly. That the matrices are two-dimensional is irrelevant to the problem, so for convenience we may think of the matrix cells as coming in a (one-dimensional) vector \verb|Cvec|\@.
Underlying MaGIC is a more compact version of the Test and Change algorithm encoded as a recursive procedure \verb|search()| thus:
\begin{verbatim}
search( x )
int x;
{
    FORALL( Cvec[x] )
        if ( x ) search( x-1 );
        else test();
}

main()
{
    search( (siz+1)*(siz+1) - 1 );
}
\end{verbatim}
The parameter passed to \verb|search()| is the subscript of a matrix cell. Now the call to \verb|test()| is made from within the searching procedure, and the Change half of Test-and-Change is hidden in the iteration and recursion of that procedure. What it does is still exactly the same as the crude Test-and-Change, only the natural description is different: to search the space up to and including the $n$th cell, we fix the value of cell $n$ in all possible ways and in each case search the subspace up to and including cell $n-1$. In either version, Test-and-Change performs the idiot's search, trying every combination of values for cells. MaGIC is not idiotic, though, so we should expect it to do better than this.
\smallskip
The first way in which it does better is by making use of an array \verb|possval[x][y]| recording whether the $y$-th value is possible for the $x$-th cell. Clearly, since \verb|C[a][b]| is designated iff \verb|ord[a][b]|, we can mark only designated values as possible for cells where the implication order holds and only undesignated ones as possible where it does not. This setting up of \verb|possval| can be done at an initialisation stage. Then the search reads
\begin{verbatim}
FORALL( Cvec[x] )
    if ( possval[x][Cvec[x]] ) {
        if ( x ) search( x-1 );
        else test();
    }
\end{verbatim}
Now a separate search is needed for each successive partial order and choice of designated values, but then the initialisation is cheap compared with the search. MaGIC reads the orders from its data file, but there is no reason why a broadly similar program should not generate them for itself.
\smallskip
The second method MaGIC has of ensuring that the matrices it generates satisfy axioms is to incorporate those axioms in the search space while setting up \verb|possval|. For example, suppose the logic for which matrices are wanted has as one of its axioms ``reductio'' in the form
$$ \N p\C p\CC p $$
Negation is also read from the data file and is represented as an array \verb|N|. We can capture the reductio axiom thus:
\begin{verbatim}
FORALL(p) FORALL(q)
    if ( !ord[q][p] ) possval[N[p]][p][q] = 0;
\end{verbatim}
or in better C style, since the only values here are 0 and 1,
\begin{verbatim}
FORALL(p) FORALL(q)
    possval[N[p]][p][q] *= ord[q][p];
\end{verbatim}
Generally speaking this method of incorporating axioms is available only where they can be put in a form which does not involve any nested arrows, but in those cases it is enormously more efficient than the simple test.
\smallskip
In the case of a fragment including negation of a logic containing {\bf DW} the searching job is greatly reduced, since by contraposition the cell \verb|C[p][q]| must always have the same value as \verb|C[N[q]][N[p]]| and so the values of the two cells can be set together, only one of them featuring in the call to \verb|search()|\@. Thus the negation postulates are incorporated neither by testing them nor by priming the search space as represented by \verb|possval| but by structuring the recursion of the search procedure appropriately.
\smallskip
While Test-and-Change works fine for small matrices its long-run complexity is truly horrible. Suppose half of the values are designated and half undesignated, and that no axioms are incorporated at the initialisation stage. Then where there are $n$ values every cell has $n/2$ possible values. There are thus $(n/2)^{n^2}$ matrices to be tested, and this figure dominates the time complexity of Test-and-Change.
For $n=4$ the number of matrices tested per partial order (with two designated values) is $2^{16}$ or 65,536 which is just about feasible, but for $n=6$ (with three designated values) it increases to $3^{36}$ or a little over 150,000,000,000,000,000 which is going to take a while even on your pet supercomputer.
\smallskip
Consider the following array of values for \verb|C|
$$ \mbox{\begin{tabular}{c|cccccc}
$\C$ & 0 & 1 & 2 & 3 & 4 & 5\\\hline
0 & 5 & 5 & 5 & 5 & 5 & 5\\
1 & 0 & 4 & 4 & 4 & 5 & 5\\
2 & 0 & 3 & 4 & 4 & 4 & 5\\
3 & 0 & 2 & 2 & 4 & 4 & 5\\
4 & 0 & 1 & 2 & 3 & 4 & 5\\
5 & 0 & 0 & 0 & 0 & 0 & 5
\end{tabular}} $$
The implication order is the chain or total order, and the designated values are 4 and 5.\footnote{This matrix is actually tested by MaGIC during the search for {\bf DW} matrices satisfying assertion.} Now assertion fails for this matrix: when \verb|p| is 1 and \verb|q| is 4 we have values 5 for \verb|C[p][q]| and 0 for \verb|C[C[p][q]][q]|, but \verb|ord[1][0]| is 0. Clearly this badness of the matrix is not a global property but pertains to two cells only---cell \verb|<1,4>| and cell \verb|<5,4>|. Any other matrix with the same values in these cells will be bad for just the same reason. Following the terminology of Pritchard \cite{SCD} we say that the pair consisting of \verb|<1,4,5>| and \verb|<5,4,0>| is a {\em refutation}. It is a {\em 2-refutation\/} since its cardinality is 2.
\smallskip
The first moral is that instead of letting Test-and-Change thrash its way through the intervening matrices, all of which contain the same refutation, we should immediately backtrack to the point at which cell \verb|<1,4>| gets changed. MaGIC does this by means of an external variable \verb|skip| recording the ``skip-cell''.
On a test failure this gets set to the subscript of the least significant cell used in the refutation, and function \verb|search()| is complicated slightly to read
\begin{verbatim}
search( x )
int x;
{
    FORALL( Cvec[x] )
        if ( possval[x][Cvec[x]] && x >= skip ) {
            skip = 0;
            if ( x ) search( x-1 );
            else test();
        }
}
\end{verbatim}
That's pretty easy, and very old hat. It does make for a significant improvement in efficiency, though.
\smallskip
The second moral is far less easy and less well-worn. It is that in order to maximise efficiency in the search the program should somehow remember the 2-refutation it has found and avoid incorporating it ever again into any putative matrix. That is, it should build up a database of all the refutations it has run across so far and use this database to guide it in stepping through the search space to the next trial point. It is of course easy to do this in a crude way; the trick that makes MaGIC magic is to do it in such a way as to access the relevant portions of the accumulated data extremely fast.
\newpage\noindent
{\large\bf \S5.3\hs{3mm}Transferred refutations}
\bigskip
Recall that the general method of searching a space as above is to fix the value of the most significant cell and then to search the subspace below it. Now consider again the displayed 2-refutation produced by the failure of assertion:
\begin{verbatim}
<1,4,5>    <5,4,0>
\end{verbatim}
Suppose the program is at the stage of inserting a value in cell \verb|<5,4>|. If it inserts any value other than 0 the displayed refutation is beside the point, but if it inserts 0 then it will go on to search the subspace constituted by ringing the changes on cells \verb|<0,0>| up to \verb|<5,3>|, holding \verb|<5,4,0>| fixed, and within this subspace \verb|<1,4,5>| is a 1-refutation. That is, until such time as the search backs up to \verb|<5,4>| and changes its value to something other than 0, 5 is just not a possible value for cell \verb|<1,4>|.
\smallskip
In general the only part of a refutation that currently matters is the most significant value-at-cell involved. There it is said to be ``active'', while at the less significant places it is ``passive''. So before \verb|<5,4,0>| is inserted our sample refutation is active at \verb|<5,4,0>| and passive at \verb|<1,4,5>|. After the insertion it is active at \verb|<1,4,5>| and passive nowhere. Now when the search comes to consider putting a value in a cell it first scans the list of refutations active at that value-at-cell. An active 1-refutation of course blocks the insertion. An active 2-refutation is {\em transferred\/} downwards to the less significant value-at-cell by marking it as an active 1-refutation down there. Generally an active $n$-refutation, for $n>1$, is transferred in the same way by activating the passive ($n-1$)-refutation listed under the next most significant value-at-cell. When the search backs up to the point where a value is removed from a cell the list of active refutations for that value-at-cell is scanned again and each is transferred back up by de-activating at the lower cell. Associated with each value-at-cell, then, are two linked lists of database entries: those recording active refutations and those recording passive ones. In practice of course the active list will be a sublist of the total list of all the refutations involving the value-at-cell, and the passive list will just be the remainder. Only the active list is scanned when the value is inserted in the cell.
\smallskip
A database entry, then, is an object of a structure type \verb|ref|:
\begin{verbatim}
struct ref {
    int force, xy;
    struct ref *up, *down, *back, *forth, *next;
};
\end{verbatim}
Where \verb|r| is a database entry, the integer \verb|r.force| records how many times this entry is currently activated. If this drops to 0 then the entry becomes passive.
The integer \verb|r.xy| records that this entry belongs to the list pertaining to the $y$-th value for the $x$-th cell, the two coordinates being simply encoded in one integer. The pointers \verb|r.up| and \verb|r.down| are the links in the doubly linked list of active refutations pertaining to that value-at-cell, while \verb|r.next| links the singly linked list of all refutations, passive as well as active, pertaining to it. Both the active list and the passive list start at an index point \verb|index[x][y]| which is conveniently used to record the active 1-refutations (if any) currently blocking $y$ as a possible value for cell $x$\@. The pointer \verb|r.forth| picks out the entry that this refutation transfers to, which will always be one listed under a less significant value-at-cell. Finally, \verb|r.back| is a kind of converse of \verb|r.forth|, picking out one of the refutations that transfers forth to \verb|r| if there is one, and a dummy if not, for purposes not germane to this sketch. So at the heart of the method of Transferred Refutations is a data structure consisting of a 2-dimensional array of pairs of linked lists of database entries, each carrying information in a couple of integers and pointing to others in all directions. This is a bit complicated, and in fact in real cases it rapidly starts to look chaotic, but in principle it encodes a straightforward enough idea.
\smallskip
The method of transferred refutations is not limited to the problem of generating logical matrices. It can be applied to any finite constraint-satisfaction problem. MaGIC makes use of a general-purpose implementation of the algorithm called TRANSREF\@. This works on a vector of variables without asking what other data structures these represent: to TRANSREF they are just the first variable, the second variable, the third and so on.
It selects the values from a corresponding vector of sets of possibilities---essentially the array \verb|possval|---which may also represent anything at all as far as TRANSREF is concerned. The parent program is responsible for initialising the array of possible values, for translating the vectors of actual values into testable data structures, for carrying out the tests and for passing back to TRANSREF another vector recording which variables were used in the latest refutation discovered. So TRANSREF is called with three parameters: an integer recording the length of the vectors and the addresses of two functions, one to carry out the tests and one to prime the search space by removing impossible values from the cells. These two functions in turn call very case-specific routines appropriate to each logic and each pre-defined axiom known to MaGIC\@. The routine for carrying out tests, of course, also calls the one which ensures that isomorphisms are removed. It also does something with the good matrices: in the sequential program it calls the functions which print out results, while in the parallel version it merely stores the matrices in shared memory.
\smallskip
The rest of TRANSREF is a matter of detail and will not be discussed further here, though another succinct description may be useful:
\begin{verbatim}
SYNOPSIS
    #define SZ
    #define V_LENGTH
    #include "TR.c"

    int transref( length, Test, SetUpSpace )
    int length;
    int (*Test)();
    int (*SetUpSpace)();
\end{verbatim}
\verb|transref| constructs vectors of integers in the range 0\ldots\verb|SZ| and sends them to be tested for satisfaction of constraints. The constant \verb|V_LENGTH| is the maximum length of such a vector. It and \verb|SZ| must be defined in the calling program. The first parameter \verb|length| is the number of elements (cells) in the vectors currently being constructed.
The other two parameters are the addresses of functions:
\begin{verbatim}
SYNOPSIS
    int Test( vector )
    unsigned int vector[];

    int SetUpSpace( vector )
    unsigned int vector[];
\end{verbatim}
\verb|Test| must decide whether \verb|vector| is ``good'' or ``bad''. If it is good \verb|Test| returns 1 (TRUE)\@. If it is bad, \verb|Test| finds a subset of elements (the smaller the better) sufficing for the badness and rewrites the vector with 1 in each place corresponding to an element in the chosen subset and 0 everywhere else. It then returns 0 (FALSE)\@.
\smallskip\noindent
\verb|SetUpSpace| treats the unsigned integers in its input vector as bitwise representations of the sets of possible values for the cells of vectors to be tested. The order of cells is the same as for \verb|Test|. Value \verb|j| is possible for cell \verb|i| iff
\begin{verbatim}
vector[i] & (1 << j)
\end{verbatim}
is nonzero. \verb|SetUpSpace| should not add any new possibilities but may remove values which are impossible.
\smallskip
Finally, to round out the present treatment, here is the principal part of the main function of a simple version of MaGIC:
\newpage
\begin{verbatim}
while ( job_given() ) {
    job_start();
    while ( got_siz() )
        while ( got_neg() )
            while ( got_ord() )
                while ( got_des() )
                    transref( Vlength, Good_matrix, set_poss );
    job_stop();
}
\end{verbatim}
There is little to explain here. The various \verb|got...| functions read the setups from the data file, returning zero if `$-1$' is encountered and otherwise doing bits of initialisation before returning nonzero values. The functions \verb|job_start| and \verb|job_stop| take care of such trivia as opening and closing files, setting the clock and, in the case of \verb|job_stop|, sending the runtime statistics to the user's screen. The three parameters for \verb|transref| are the vector length and the two function addresses as described above. Finally, \verb|job_given()| is the dialog with the user which gets parameters for the search.
\verb|job_given()| returns TRUE (a non-zero value) if the user calls for the program to generate matrices and FALSE (zero) if the user elects to quit MaGIC.
\newpage\noindent
{\large\bf \S5.4\hs{3mm}Parallelisation}
\bigskip
As noted in the last paragraph, the core of the main logic of MaGIC is a fourfold `while' loop inside which TRANSREF is called to work on each setup in turn. The fundamental strategy of the parallel version of the program is to parallelise this loop since the cases are independent of each other. Preparing and searching a space for models of a logic is a substantial task, so the grain size, though variable, is quite large.
\smallskip
One part of MaGIC which is inherently serial is the output of matrices. Not only is the printing of an individual matrix a serial matter but they are printed in a standard order. The order of output is even more important in ``ugly'' format than in ``pretty''. Hence the natural way to allocate work is to assign one process to IO and send the rest away to generate matrices. In MaGIC 1.1P the master process---the main one called into existence when the program is executed---creates the others as its slaves and then itself handles the dialog with the user, the file input and all output of matrices. One outcome of this arrangement is that the maximum theoretically possible speedup is proportional to the number of ``slaves'', not to the total number of processes. As the number of processes increases, of course, this difference decreases in significance. Because process creation or forking is a relatively expensive operation, slaves are created only once, at the beginning of the execution, rather than being re-created every time menu option {\bf g} is selected. Between jobs they hang in a barrier from which the master releases them in due season.
\smallskip
Unlike the serial version, parallel MaGIC reads in the problems from its data file in large chunks rather than singly, filling a buffer of 1000 or so such problems from which the slaves take work. When the buffer is empty the slaves hang until the master refills it. Buffer refill is a serial bottleneck, but not a very serious one given that the buffer is of reasonable size. Allocation of problems from the queue is managed by the GETSUB monitor as defined in the Argonne package.\footnote{See \cite{PPPP} pp.\ 25--29.} This co-ordinates a self-scheduling loop by allowing each process to take the next unit of work as soon as it has finished its previous one, keeping track of the problem queue by incrementing an internal counter. The granularity of MaGIC is extremely variable---one problem may occupy a process for several seconds or even minutes and produce thousands of matrices while the next might yield nothing and be over in a hundredth of a second---so flexibility in the scheduling is essential to getting a good speedup.
\smallskip
Communication of the results back to the master process for output is by posting the matrices found on a ``blackboard'' in global memory. This produces some further complexities. The matrix-communication data structure is such that a unit of communication is an element of a (singly) linked list. A heap of such items is declared to be in shared memory and initially configured as a simple push-down stack. When a slave comes to record a matrix it pops an element from this stack, writes the matrix into it and pushes it onto the queue associated with the subscript of the problem on which the slave is currently working. When the master comes to print out the matrix it pops it from the queue, deals with its contents and pushes it back on the heap.
So the rough picture is this $$\begin{picture}(0,12)(0,-6) \put(-8,-4){\line(1,0){16}} \put(-4,3){\line(1,0){11}} \put(-4,5){\line(1,0){11}} \put(-4,-4){\line(0,1){4}} \put(-2,-4){\line(0,1){4}} \put(5,-4){\line(0,1){4}} \put(7,-4){\line(0,1){4}} \put(-1,-4){\line(0,1){2}} \put(1,-4){\line(0,1){2}} \put(2,-4){\line(0,1){6}} \put(4,-4){\line(0,1){6}} \put(-1,-2){\line(1,0){2}} \put(2,2){\line(1,0){2}} \put(-4,-2){\line(1,0){2}} \put(2,-2){\line(1,0){2}} \put(-4,0){\line(1,0){2}} \put(2,0){\line(1,0){2}} \put(5,0){\line(1,0){2}} \put(5,-2){\line(1,0){2}} \put(-6,-3){\line(1,0){2}} \put(-6,-3){\line(0,1){7}} \put(-6,4){\vector(1,0){2}} \put(-3,3){\vector(0,-1){3}} \multiput(-4,3)(2,0){6}{\line(0,1){2}} \put(7.5,3.8){heap} \put(7.5,-5.1){subscripts} \put(7.5,-1){results} \put(7.5,-1.8){queues} \put(-2.6,0.5){slave pushes} \put(-2.6,2.1){slave pops} \put(-9,4.5){master pushes} \put(-8.6,-2.2){master} \put(-8.6,-3){pops} \put(-8,-5){\makebox(0,0){\ldots}} \put(-10,-5){\makebox(0,0){\ldots}} \put(-6,-5){\makebox(0,0){100}} \put(-3,-5){\makebox(0,0){101}} \put(0,-5){\makebox(0,0){102}} \put(3,-5){\makebox(0,0){103}} \put(6,-5){\makebox(0,0){104}} \end{picture}$$ Clearly it is necessary to protect the matrix heap at least with a lock, to prevent collisions between processes trying to access its active end. In fact, MaGIC uses a stack management monitor to allow only one process at a time access to the heap. If the heap is empty any slave trying to pop from it hangs in a DELAY queue\footnote{{\em ibid}. pp.~20--24} until the master performs its next act of garbage collection. All of this is fairly obvious and simple. Less simple is the action to be taken to avoid clashes between the master and the hindmost slave if the former catches up with the latter so that they are both working on the same results queue. When the master starts work with a new problem subscript it first checks to see whether the slave whose problem that was has finished it and gone elsewhere. 
If so there is no difficulty but if not then master and slave must recognise the fact and go into careful mode to avoid treading on each other's toes. Therefore the master signals its presence and hangs until the slave finds the message and releases it. Then the slave gets its matrix communication blanks not from the communal heap but from a special supplementary heap. This is to avoid the possibility that slaves further down the line exhaust the heap, causing a deadlock. The slave's results queue is also treated as protected by a queue management monitor so that pushes and pops do not overlap. Slaves working with subscripts greater than that of the master, and the master if working on a completed results queue, stay in carefree mode, adding and processing results without monitor protection.
\smallskip
Using the Portable Monitor Macro Package to parallelise MaGIC had two important beneficial effects. Firstly, there was a great gain in time and programming effort over what would have been necessary to achieve similar results by low-level manipulation of locks and the like. Not least significant in this regard was that such relatively complex operations as declaring, initialising and accessing monitors come already well tried and debugged. Secondly, the high-level code of MaGIC is now fairly portable. It is expected that the program will run without modification on a Sequent Symmetry, on a Sequent Balance, on an Encore and even on a Butterfly, on all of which machines we plan soon to run the program at Argonne. Furthermore, parallel computing being in a state of very rapid development, it is important that applications should continue to work with a minimum of modification in new environments. The probability of this is increased by the use of programming tools designed to maximise portability.