.EQ delim $$ .EN .nr PS 13 .ps 13 .nr VS 15 .vs 15 .in 0 .B DISTRIBUTION OF MATHEMATICAL SOFTWARE .br VIA ELECTRONIC MAIL .in 0 .FS .nr PS 8 .ps 8 This work was supported in part by the National Science Foundation under Agreement No. DCR-8419437. Any opinions, findings, conclusions, and recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the National Science Foundation. .FE .nr PS 11 .ps 11 .nr VS 13 .vs 13 .I .in 1i A large collection of public-domain mathematical software is now available via electronic mail. Messages sent to "netlib@anl-mcs" (on the Arpanet/CSNET) or to "research!netlib" (on the UNIX\(rg network) wake up a server that distributes items from the collection. For example, the one-line message "send index" gets a library catalog by return mail. We describe how to use the service and some of the issues in its implementation. .R .in .nr PS 12 .ps 12 .nr VS 14 .vs 14 .in 0 .B JACK J. DONGARRA and ERIC GROSSE .R .nr PS 11 .ps 11 .nr VS 16 .vs 16 .nr PD 0.5v A large pool of high-quality mathematical software is in use at educational, research, and industrial institutions around the country. At present this software is available from a number of distribution agents \(em for example, AT&T Bell Laboratories for the PORT library, IMSL, the National Energy Software Center (NESC), and the Numerical Algorithms Group (NAG). All do a fine job with the distribution of large packages of mathematical software, but there is no provision for convenient distribution of small pieces of software. Currently scientists transmit such software by magnetic tapes, but contacting authors and deciphering alien tape formats wastes an intolerable amount of time. .PP A new system, .I netlib, .R provides quick, easy, and efficient distribution of public-domain software to the scientific computing community on an as-needed basis. It sends electronic mail over Arpanet, CSNET, Telenet, or UNIX uucp. 
.FS UNIX is a trademark of AT&T Bell Laboratories. .FE .B NETLIB IN USE .R .br Imagine an engineer who needs to compute several integrals numerically. He consults the resident numeric expert, who advises trying the routine $dqag$ for some preliminary estimates and then using $gaussq$ for the production runs. The engineer types at his terminal
.I
.nf
mail research!netlib
send dqag from quadpack
send gaussq from go
.
.fi
.R
In a short time, he receives back two pieces of mail from $netlibd$. The first contains the double precision Fortran subroutine $dqag$ and all the routines from $quadpack$ that $dqag$ calls; the second contains $gaussq$ and the routines it calls. .PP The utility routine $d1mach$ was not included with $gaussq$, since it is probably already installed on his system; if he had wanted it, he could have changed his request to .I "send gaussq from go core" .R to include the ``core library'' of machine constants and basic linear algebra modules in the search list. .PP Should the engineer later decide that the routine $dqags$ would be more effective, he could send the request .I "send dqags but not dqag from quadpack" .R to get $dqags$ and any subroutines not already sent with $dqag$. .PP This engineer happens to be connected to the UNIX network; if, instead, his machine were on the Arpanet, he would use the address .I netlib@anl-mcs. .R If he needed the code in upper case, he would send his request in all caps; to get single precision, he need simply change the names of the routines or the libraries, as appropriate. Finally, he could ask for several routines together:
.I
.nf
SEND RG RS FROM DEISPACK
SEND DGECO FROM LINPACK CORE
.fi
.R
.PP
Meanwhile, the numerical expert decides she should check on the current contents of netlib. She types
.nf
.I
mail research!netlib
send index
.fi
.R
The return mail shows a library $toeplitz$ she is not familiar with, so she sends mail .I "send index for toeplitz" .R to see what is included. 
Curious to see a typical routine, she tries .I "send only cslz from toeplitz" .R and gets just $cslz$, not any of the routines it calls. .PP More formally, requests have the following syntax:
.nf
$request_line$:
        send $options$ $names$ $exclusions$ $libraries$
        who is $names$
        find $keywords$
$options$:
        list of
        only
$exclusions$:
        but not $names$
$libraries$:
        for $names$
        from $names$
.fi
where $names$ is a list of words, separated by blanks. "Whois" searches for address and telephone information in a database maintained by Gene Golub; this is soon to be replaced by the membership files of SIAM. "Find" returns a one-line description of all routines in the collection that mention the keywords; this can be more convenient than checking the indexes for each sublibrary that might be relevant. "List of" sends just the file names rather than the contents; this can be helpful when one already has an entire library and just wants to know what pieces are needed in a particular application. .PP Just how quickly these requests are answered depends on the speed of the network communications involved, but five or ten minutes is typical for Arpanet. CSNET or UNIX uucp may require anywhere from minutes to days to transmit a message from sender to recipient. The actual processing time is insignificant. One user wrote back enthusiastically that the system was so fast he preferred using it to hunting around on his own machine for the library software. .PP Netlib has been available since April 1985. To give a feel for the number of requests for software and information, we provide the following data. .bp .TE And in March we received our first request from Japan! .B MATERIAL AVAILABLE THROUGH NETLIB .R .PP Currently netlib offers a wide collection of public domain software as listed below:
.TS
center;
l l.
Package	Description
_
LINPACK	Solution of linear equations [10]
EISPACK	Solution of eigenvalue problems [16, 23]
TOEPLITZ	Solution of systems of equations where the matrix is Toeplitz [1]
MINPACK	Optimization routines [21]
FNLIB	Special-function library [15]
FMM	Codes from book by Forsythe, Malcolm, and Moler [12]
QUADPACK	Quadrature routines [22]
PPPACK	Spline routines [3]
CALGO	Collected algorithms from ACM
FISHPAK	Finite-difference approximation for elliptic BVP [24]
ITPACK	Iterative linear-systems solvers [18]
BLAS	Basic Linear Algebra Subprograms and extensions [19]
SCPACK	Schwarz-Christoffel conformal mapping program [25]
PARANOIA	Floating-point test
PCHIP	Hermite cubics by Fritsch and Carlson [14]
MA28	Sparse matrix routine from the Harwell library [11]
Y12M	Package for sparse linear systems [26]
LASO	Block Lanczos code [8]
ODEPACK	Ordinary Differential Equations package
.TE
In addition there are miscellaneous other items, such as Golub and Welsch's GAUSSQ [17], biharmonic solvers [2], a public subset of FITPACK, and routines for machine constants and error handling and other public routines from the PORT library [13]. The multigrid program PLTMG by Bank and the multiple precision package by Brent are also in the collection, though they are probably too large to send by mail. .PP The various standard linear-algebra libraries are included for convenience, but the real heart of the collection lies in the recent research codes and the ``golden oldies'' that somehow never made it into standard libraries. Almost all of these programs are in Fortran, but some are in C, such as the routine $rainbow$ by Grosse for generating uniformly spaced colors. There is also a collection of errata for numerical books, descriptions and benchmark data for various computers, test data for linear programming collected by Gay, and the ``na-list'' electronic address book maintained by Gene Golub. 
.PP In addition, netlib itself\(emthat is, the shell scripts and C codes that do the automatic processing of requests\(emis also available. .PP We do \fInot\fP send out entire libraries. A computer center setting up a comprehensive numerical library should get magnetic tapes through the usual channels. .sp .B THE NETLIB SERVER .R .br The netlib server runs under the UNIX operating system (8th edition at Bell Labs and 4.2BSD at Argonne) and consists of a few shell scripts and C programs. The following discussion necessarily assumes some familiarity with UNIX commands. .PP When mail arrives for netlib, it is piped through a process that strips off punctuation, through a sort process that removes duplicates, and into a C program that parses the request, translates the given library names into a search list, and invokes the system loader with the given routine names as external symbols to be resolved. A requested routine may require that many routines be assembled, to resolve all references (perhaps across libraries). The resulting loader map is edited into a list of file names to satisfy the request. These files, along with a time stamp and disclaimer, are then mailed back to the requester. A logfile records the time, return address, number of characters sent, and requested routine and library names. When the incoming mail includes actual names as well as an electronic return address, the correspondence is also logged. .PP The programs can tolerate minor syntax deviations, since we do get requests like "\f2Please send me r1mach from port. Thank you.\f1" from people who don't realize they are talking to a program. Users sometimes submit a single request on the subject line of the mail message, so a "Subject:" prefix is also allowed. One user even sent .I "send index 4 eispack" .R instead of .I "send index for eispack", .R so "4" is a synonym for "for" and "from." 
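.PP
The request syntax given earlier, together with the tolerance just described, can be sketched in a few lines. The fragment is an illustration only: it is written in Python rather than the shell and C of the actual server, it is not taken from the netlib sources, and the function name parse_send and the exact punctuation it discards are assumptions. It drops an optional "Subject:" prefix, strips punctuation from each word, and treats "4", "for", and "from" alike as the word that introduces library names. Case is folded for simplicity, so the sketch does not record whether the request arrived in upper case.
.nf
```python
import string

# "4" is accepted as a synonym for "for" and "from".
LIBRARY_WORDS = {"for", "from", "4"}


def parse_send(line):
    """Split one 'send' request into option, names, exclusions, libraries."""
    text = line.strip()
    # A request may arrive on the subject line of the message.
    if text.lower().startswith("subject:"):
        text = text[len("subject:"):]
    # Strip surrounding punctuation from each word and fold to lower case.
    words = [w.strip(string.punctuation).lower() for w in text.split()]
    words = [w for w in words if w]
    if not words or words[0] != "send":
        return None  # not a send request; other commands are not handled
    words = words[1:]

    request = {"option": None, "names": [], "exclusions": [], "libraries": []}
    if words[:2] == ["list", "of"]:
        request["option"], words = "list of", words[2:]
    elif words[:1] == ["only"]:
        request["option"], words = "only", words[1:]

    target = "names"  # which list the next plain word belongs to
    i = 0
    while i < len(words):
        if words[i:i + 2] == ["but", "not"]:
            target = "exclusions"
            i += 2
        elif words[i] in LIBRARY_WORDS:
            target = "libraries"
            i += 1
        else:
            request[target].append(words[i])
            i += 1
    return request
```
.fi
.LP
The sketch makes no attempt to handle full-sentence requests. Under these assumptions "send index 4 eispack" parses exactly as "send index for eispack" does.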
.R (This is not such an unreasonable mistake, considering that the instructions for using netlib are often given over the phone.) However, we make no attempt to accept arbitrary English input. .PP One way to start up the mail processing is to have a daemon process that wakes up every few minutes and checks for a nonempty mailbox. In 8th edition UNIX, thanks to Dave Presotto, if a mailbox contains .I Pipe to rcv.cmd, .R then the mail delivery software, instead of appending the incoming text to a mailbox, will pipe the text to the command $rcv.cmd$. (Similar functionality is available from the Berkeley mail alias facility.) .PP The mailbox is owned by userid .I netlibd .R so that the process is run as netlibd; hence the return mail will have this mnemonic name attached. The userid is not just .I netlib .R because if the return mail command fails or if the remote user sends a reply, the message should not go back into the request processor. (Mail once came back announcing that a user had gone on vacation in the few hours before the netlib response had gotten to his mailbox.) Instead, mail to netlibd triggers a message explaining the difference in the two names. But further checks are needed to avoid sending this message to intermediate mail daemons that report problems with the original reply. .PP The file that describes the mapping from library names to loader search lists consists simply of lines of the form "eispack => \-leispack" . Several similar lines allow for variant spellings such as \f2eispac\f1 and \f2eispak\f1. This file is easily updated when new libraries are added to the collection. .sp .B SECURITY AND OTHER PROBLEMS .R .br A subtle security problem arises from the implementation: we construct commands to a shell based on text from a user. It could be catastrophic to blindly send mail to a return address of \f2kgbvax!\`rm -r *\`\f1, since the backquote characters tell the shell to first execute a command that removes all files! 
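.PP
A guard against such an address can be sketched as follows. This is a minimal illustration in Python, not the netlib code: the function name is_safe_address and the set of permitted characters are assumptions, chosen to cover the uucp and Arpanet address forms that appear in this paper.
.nf
```python
import string

# Characters assumed to be harmless in a uucp or Arpanet return address.
# Anything else (backquotes, blanks, semicolons, ...) causes rejection.
SAFE_CHARS = set(string.ascii_letters + string.digits + "!@%.-_+")


def is_safe_address(address):
    """Return True only if the address is nonempty and every character is permitted."""
    return bool(address) and all(ch in SAFE_CHARS for ch in address)
```
.fi
.LP
A list of permitted characters is used in the sketch, rather than a list of forbidden ones, because it fails safe when an unanticipated character turns up. With this test the address above is rejected, while ordinary addresses such as research!ehg pass.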
Therefore, the request parser checks for dangerous characters. Another potential security problem is that someone might tamper with the program text as it is en route to the user. For now, we feel that the threat is not serious enough to adopt encryption schemes, though those would be easy to add. .PP Even though there are standards, it is not particularly easy to extract a valid return address from a request. There are comment brackets and anticomment brackets to be recognized and address transformations to be unwound, but we now seem to answer correctly except when the return address contains blanks. .PP We do not use checksums since the network software already provides a reliable channel. We have received only one complaint, which involved noise on the link from a user's Vax to his PC; we regard that as his responsibility. If checksums were required, we would choose a scheme like that in MOSIS [20] which allows for anticipated, insignificant changes such as addition of trailing blanks on lines. To avoid problems with mail-processing programs in the various networks, our request syntax avoids colons and our replies start with a blank line so that message contents are not processed as header information along the mail route. Problems occasionally arise with computers that are willing to send us mail, but will not allow us to send mail back. Delays for multihop and internetwork mail are more common, but we have no way to collect statistics on these, and in any event they are out of our control. .PP The most difficult problem we have encountered has been length limitation; a few of the programs are more than 100 kilobytes, and that is more than the mail systems at many Arpanet sites will tolerate. Of course, the file transmission protocols can handle larger sizes, but those are too cumbersome and unstandardized for our purposes. 
We get around this by splitting up large items into several pieces of mail, but would prefer to see the mail systems themselves improved. We considered using Huffman coding to compress the files we send out, but that would save only about a factor of two and would require that we ship decoding programs. However, in setting up the netlib collection of test data for linear programming, David Gay did decide to adopt a program for compressing MPS format files. .PP If the request for a routine comes in to netlib in upper-case characters, then the response is sent out in upper case. Otherwise, netlib delivers the original source. .PP We chose this mode of interaction via electronic mail, keeping the intelligence local to the central depository, because mail is at present the only ubiquitous data communication service. We considered putting an interactive program at remote sites, communicating by mail with the depository. That would allow a better dialogue (``Do you want that in single or double?'') but would be difficult to write in the necessary portable way. .sp .B COMPARISON WITH OTHER SERVICES .R .br The netlib service provides its users with features not previously available: \(bu There are no administrative channels to go through. .sp \(bu Since no human processes the request, it is possible to get software at any time, even in the middle of the night. .sp \(bu The most up-to-date version is always available. .sp \(bu Individual routines or pieces of a package can be obtained instead of a whole collection. (One of the problems with receiving a large package of software is the volume of material. Often only a few routines are required from a package, yet the material is distributed as a whole collection and cannot easily be stripped off.) .PP On the other hand, netlib is simply a clearinghouse for contributed software and therefore subject to various disadvantages that have plagued such projects in the past. 
The only documents, example programs, and implementation tests are those supplied by the code author or other users. Also, there may be multiple codes for the same task and no help in choosing which is best. We have made an effort not to stock numerous copies of machine constants, but in general we have left submitted codes untouched. .PP In summary, we are not aware of any comparable software distribution service in existence. Our system has a different focus from, say, the Quantum Chemistry Exchange, and a more convenient distribution mechanism. Furthermore, we are more selective than many personal computer ``public bulletin board'' systems: we do not allow users to put their own software automatically in the collection. (This allows us some measure of control; we wish to avoid such problems as having our system confiscated because it contained a stolen telephone charge number.) .sp .B DOCUMENTATION .R .br As noted earlier, we do not distribute entire libraries. In fact, we feel it is unsocial for users to request the complete contents of a library. The reasons are simple: better ways exist to distribute large quantities of information; we provide little documentation and testing support; and we do not wish to tie up the netlib server and block the flow for others. .PP Several years ago there was a discussion on the Arpanet prompted by a query from Jim Pool as to whether the time was not ripe for a ``portable set of documentation for interactive access by users of a collection of mathematical software.'' His idea was that the SLAC NAPLUG [5] be put into an expert system form. We have not yet tackled that problem in netlib, although we do pass along whatever documentation comes from the original code authors. Since the time of that discussion, local mathematical typesetting with output on terminals has become more common but most of the other objections remain. 
The user cannot be assumed to describe his problem exactly as the numerical analyst would; thus the program must be able to translate from the engineering to the mathematical domain. Understanding only the general nature of the user's problem is not enough; this leaves too much documentation to wade through. A certain amount of insight is required to realize that a user may not need exactly what he thinks he needs. .IP .I "Do you need the matrix inverse? Maybe you just need the solution to a linear system." .IP .I "This is a correlation matrix, and I really do want to look at the elements." .R .LP The general user will be looking for a library routine only a few times a year; he will certainly not remember more than a few commands. Thus, a sophisticated search language is infeasible. Who is going to write all the documentation in the required format? At least a modest knowledge of numerical analysis and considerable consulting experience will be necessary, but the job is tedious and unrewarding. The best ``interactive documentation system'' is a good numerical analyst interested in the users' problems. Unfortunately, this system has its own difficulties: expensive to reproduce, inconsistent in intelligence and alertness, hard to transport, prone to use buzzwords, often unavailable, specialized, and difficult to keep current. So there have been continuing efforts to build online numerical help facilities, the most successful of these being GAMS at the National Bureau of Standards, the NAG online help facilities and decision trees, and NIT at Oak Ridge. Entirely new writing styles are possible. Beyond the graph-structured text popularized in programmed learning manuals a decade ago, specific documentation might be derived, rather than simply searching for and listing parts of a file. Instead of a single example, an online consultant could provide a complete program tailored to the problem at hand. 
Also, some knowledge of the previous experience of the reader might be used to modify the level of explanation and avoid needless repetition. .sp .B COST .R .br The main cost of running this service is for communications. If it becomes necessary, we will require uucp users to call the hosts to pick up their return mail so that such costs are distributed fairly. At an average of a few requests per day, the traffic has been small enough to impose a negligible load on the host systems. Disk costs are controlled by discarding files that the host administrators are not themselves interested in keeping. The current collection occupies 57 megabytes. Most important, the human costs for maintaining the collection are modest and consist mainly of collecting software. We do not see how we could run such a widely accessible and low-overhead operation if we had to charge for the service\(emand we are not interested in doing so. (See, however, [4] for a description of the Toolchest electronic ordering system. One problem mentioned there is that users want to see demonstrations of software before purchase.) .sp .B HOPES FOR THE FUTURE .R .br There are several areas where we would like to see netlib expand: \(bu \f2Editors.\f1 The coverage of netlib obviously will tend to reflect the interests of the collectors, so we would welcome ``associate editors'' to augment the collection. .sp \(bu \f2Depositories.\f1 At present, there are just two distribution sites. Mail delays would be reduced if machines on other networks or in other countries were willing to also serve as depositories. (On the other hand, it is difficult even to keep two locations in sync!) .sp \(bu \f2Contributions\f1. The software that netlib uses to reply to mail is itself available from netlib, so it would be fairly easy for someone to, say, announce a service for searching a bibliography that he has collected. .sp .PP Netlib, being free, cannot replace commercial software firms. 
We provide no consulting, make no claims for the quality of the software distributed, and do not even guarantee the service will continue. In compensation, the quick response time and the lack of bureaucratic, legal, and financial impediments encourage researchers to send us their codes. They know that their work can quickly be available to a wide audience for testing and use. We hope netlib will promote the use of modern numerical techniques in general scientific computing. .sp .I Acknowledgments. .R We express our gratitude to the many authors and editors who have permitted their codes to be freely distributed and to Gene Golub for his encouragement and help in starting this project. Greg Astfalk prepared the master index used by the "find" command. Rich Kensicki made helpful suggestions, including a couple we could not implement: when electronic mail fails, send a tape or paper copy by U. S. mail, possibly by ECOM. The trick of editing a loader map is taken from the GAMS system at the National Bureau of Standards. Finally, the managements of our organizations deserve thanks for sponsoring this public service. .B .nr PS 8 .ps 8 REFERENCES .R .IP 1. .R Arushanian, O.B., et al. The TOEPLITZ package users' guide. Tech. Rep. ANL-83-16. Math. and Computer Science Div. Argonne National Laboratory, Ill., 1983. .IP 2. Bj\o'o/'rstad, P. Fast numerical solution of the biharmonic Dirichlet problem on rectangles. .I SIAM J. on Numerical Analysis 20 .R (1983), 59-71. .IP 3. de Boor, C. .I A Practical Guide to Splines. .R Applied Mathematical Science, Vol. 27. Springer-Verlag, New York, 1978. .IP 4. Brooks, C.A. Experiences with electronic software distribution. .I USENIX Association 1985 Summer Conference Proceedings. .R Portland, Oregon, 1985. .IP 5. Chan, T.F., Coughran, W.M., Grosse, E.H., Heath, M.T., and Luk, F.T. Numerical analysis program library user's guide. SLAC Computing Services User Note 82. Stanford University, 1976. .IP 6. Cody, W.J. 
The construction of numerical subroutine libraries. .I SIAM Review .R 16 (1974), 36-46. .IP 7. Cody, W.J. Observations on the mathematical software effort. .I Sources and Development of Mathematical Software. .R Ed. W. Cowell. Prentice-Hall, Englewood Cliffs, N.J., 1984, pp. 1-19. .IP 8. Cullum, J.K., and Willoughby, R.A. .I Lanczos Algorithm for Large Symmetric Eigenvalue Computations, Vol. II - Programs. .R Progress in Scientific Computing, Birkhauser, 1985. .IP 9. Dennis, J.E., Gay, D.M., and Welsch, R.E. An adaptive nonlinear least squares algorithm. .I ACM Trans. on Mathematical Software, .R 7 (1981), 348-368, 369-383. .IP 10. Dongarra, J.J., Bunch, J.R., Moler, C.B., and Stewart, G.W. .I LINPACK Users' Guide. .R SIAM Publications, Philadelphia, 1979. .IP 11. Duff, I.S. .I MA28 - A set of Fortran Subroutines for Sparse Unsymmetric Linear Equations. .R AERE Harwell Report R.8730, HMSO, London, 1977. .IP 12. Forsythe, G.E., Malcolm, M.A., and Moler, C.B. .I Computer Methods for Mathematical Computations. .R Prentice-Hall, Englewood Cliffs, N.J., 1977. .IP 13. Fox, P.A., Hall, A.D., and Schryer, N.L. The PORT mathematical subroutine library. .I ACM Trans. on Mathematical Software, .R 4 (1978), 104-126, 177-188. .IP 14. Fritsch, F.N., and Carlson, R.E. Monotone piecewise cubic interpolation. .I SIAM J. on Numerical Analysis .R 17, 2 (April 1980), 238-246. .IP 15. Fullerton, W. .I FNLIB User's Manual. .R AT&T Bell Laboratories, 1981. .IP 16. Garbow, B.S., Boyle, J.M., Dongarra, J.J., and Moler, C.B. .I Matrix Eigensystem Routines - EISPACK Guide Extension. .R Lecture Notes in Computer Science, Vol. 51. Springer-Verlag, Berlin, 1977. .IP 17. Golub, G.H., and Welsch, J.H. Calculation of Gauss quadrature rules. .I Mathematics of Computation, .R 23 (1969), 221-230. .IP 18. .R Kincaid, D.R., Respess, J.R., and Young, D.M. ITPACK 2C: A Fortran package for solving large sparse linear systems by adaptive accelerated iterative methods. .I ACM Trans. 
on Mathematical Software, .R 8 (1982), 302-322. .IP 19. Lawson, C., Hanson, R., Kincaid, D., and Krogh, F. Basic linear algebra subprograms for Fortran usage. .I ACM Trans. on Mathematical Software, .R 5 (1979), 308-371. .IP 20. Lewicki, G., Cohen, D., Losleben, P., and Trotter, D. MOSIS: Present & future. .I 1984 Conf. on Advanced Research in VLSI, .R MIT, January 1984. .IP 21. Mor\*'e, J., Sorensen, D., Garbow, B., and Hillstrom, K. The MINPACK Project. .I Sources and Development of Mathematical Software. .R Ed. W. Cowell. Prentice-Hall, Englewood Cliffs, N.J., 1984, pp. 88-111. .IP 22. Piessens, R., deDoncker-Kapenga, E., Uberhuber, C., and Kahaner, D. .I Quadpack: A Subroutine Package for Automatic Integration. .R Series in Computational Mathematics, Vol. 1. Springer-Verlag, Berlin, 1983. .IP 23. Smith, B.T., Boyle, J.M., Dongarra, J.J., Garbow, B.S., Ikebe, Y., Klema, V.C., and Moler, C.B. .I Matrix Eigensystem Routines - EISPACK Guide. .R Lecture Notes in Computer Science, Vol. 6. 2nd edition. Springer-Verlag, Berlin, 1976. .IP 24. Swarztrauber, P.N., and Sweet, R.A. Efficient FORTRAN subroutines for the solution of separable elliptic equations, algorithm 541. .I ACM Trans. on Mathematical Software, .R 5 (1979), 352-364. .IP 25. Trefethen, L.N. Numerical computation of the Schwarz-Christoffel transformation. .I SIAM J. Scientific and Statistical Computing, .R 1 (1980), 82-102. .IP 26. Zlatev, Z., Wasniewski, J., and Schaumburg, K. .I Y12M - Solution of Large and Sparse Systems of Linear Equations. .R Lecture Notes in Computer Science, Vol. 121. Springer-Verlag, Berlin, 1981. Authors' Present Address: J. Dongarra, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439-4844 (electronic mail: anl-mcs!dongarra or dongarra@anl-mcs); Eric Grosse, AT&T Bell Laboratories, Murray Hill, NJ 07974 (electronic mail: research!ehg or ehg@btl.csnet). .