\documentstyle{article}    % Specifies the document style.

                           % The preamble begins here.
\title{Numerical Performance Results from the Shallow Water Equation Test Suite}
\author{ John B. Drake, {\em Ed.}
\thanks{send correspondence to bbd@ornl.gov or
Mathematical Sciences Section, Oak Ridge National Laboratory,
P.O. Box 2008, Oak Ridge, Tennessee 37831-8083 } }
\date{Updated: 3 May 1993}

\begin{document}           % End of preamble and beginning of text.

\maketitle                 % Produces the title.

\section{Introduction}

The DOE Computer Hardware, Advanced Mathematics and Model Physics (CHAMMP)
program seeks to provide climate researchers with an advanced modeling
capability for the study of global change issues and is interested in the
development of new methods for the study of climate dynamics.
The shallow water equations have been used as a kernel for both oceanic and
atmospheric general circulation models and are useful in evaluating numerical
methods for weather forecasting and climate modeling.
To promote development of new methods, a set of test cases has been proposed
\cite{Williamson-Drake} and example software and reference solutions provided
\cite{Jakob-Hack}.

This report summarizes the performance of methods that have been applied to
the test cases.
Promising schemes should be subjected to other tests appropriate to their
intended application.
It is hoped that the bibliography provided herein will offer pointers to the
appropriate literature for more comprehensive studies of the strengths and
weaknesses of individual methods.

\section{Comparison of Algorithms and Computing Platforms}

Table 1 gives a comparison of methods by accuracy and computational
performance.
In Table 1, execution time is given in seconds and represents the best
measurement of how long it takes to perform a 5 day integration on a
dedicated machine.
Dedicated time is not always available and measurements of time are often
peculiar to a given installation.
The accuracy reported is the normalized $l_2 (h)$ error as requested in
\cite[eq. 83]{Williamson-Drake}.
Gflops is an estimate of the rate of floating point operations, in billions
per second, sustained during the integration.
Hardware performance monitors are the preferred measurement method.

\begin{table}[htbp]
\begin{tabular}{||rrrrrrrr||} \hline
Algorithm & Resol. & Machine & $P$ & Accuracy & Gflops & Execution & Notes \\
 & & & & & & Time (sec) & \\ \hline
Spectral & T42 & Y-MP & 1 & $10^{-10}$ & 0.162 & 3.5 & 1 \\
Spectral & T42 & Y-MP & 6 & $10^{-10}$ & 0.567 & 1.0 & 2 \\ \hline
Spectral & T213 & Y-MP & 1 & $10^{-10}$ & 0.215 & 690.0 & 2 \\
Spectral & T213 & Y-MP & 6 & $10^{-10}$ & 1.210 & 130.0 & 2 \\ \hline
TIG Model & 2562 & C90 & 1 & $2.5 \times 10^{-4}$ & 0.087 & 20.4 & 3 \\ \hline
A-L & $72 \times 44$ & C90 & 1 & $2.5 \times 10^{-4}$ & 0.351 & 3.9 & 4 \\ \hline
Icosahedral PIC & 10242 & Y-MP & 1 & $7.5 \times 10^{-4}$ & 0.103 & 26.0 & 5 \\ \hline
\end{tabular}
\caption{Best CPU time - Accuracy for Test Case 2}
\end{table}

Table 2 compares the parallel performance of methods.
$P$ is the number of processors used in the computation and $S_P$ is the
parallel speedup with $P$ processors relative to the single processor time.
If $T_P$ denotes the execution time for $P$ processors, then
$S_P = \frac{T_1}{T_P}$.
The parallel efficiency is given by $E_P = \frac{S_P}{P}$.

\begin{table}[htbp]
\begin{tabular}{||rrrrrrrr||} \hline
Algorithm & Resol. & Machine & $P$ & $S_P$ & $E_P$ & Execution & Notes \\
 & & & & & & Time (sec) & \\ \hline
Spectral & T42 & Y-MP & 6 & 3.5 & 0.58 & 1.0 & 2 \\
Spectral & T213 & Y-MP & 6 & 5.4 & 0.90 & 130.0 & 2 \\ \hline
Spectral & T21 & iPSC/860 & 64 & 5.6 & 0.08 & 1.37 & 6 \\
Spectral & T42 & iPSC/860 & 128 & 18.4 & 0.14 & 3.92 & 6 \\
Spectral & T85 & iPSC/860 & 128 & 49.6 & 0.39 & 16.9 & 6 \\ \hline
\end{tabular}
\caption{Parallel Performance on Test Case 2}
\end{table}

\section{Notes}

\begin{enumerate}
\item Results of STSWM \cite{Jakob-Hack}.
The solution is exactly representable in the spectral expansion, so the
accuracy is not representative.
The Y-MP results were calculated in 64-bit arithmetic.
\item Rudy Jakob's results of multitasked STSWM reported at the Third CHAMMP
Workshop on Numerical Solution of PDE's in Spherical Geometry.
\item TIG is the twisted icosahedral grid method described in
\cite{Heikes-Randall}.
Execution time estimated from 600 sec timesteps at 0.0284 sec/step on
test case 5.
\item Arakawa-Lamb as described in \cite{Heikes-Randall}.
Execution time estimated from 600 sec timesteps at 0.0284 sec/step on
test case 5.
\item The PIC method is applied on an icosahedral grid of 10242 points.
90 timesteps were taken for the 5 day simulation.
Results presented by John Baumgardner at the Third CHAMMP Workshop on
Numerical Solution of PDE's in Spherical Geometry.
\item The Intel iPSC/860 results are in 32-bit arithmetic with accuracy
$O(10^{-5})$.
For the five day integration, the T21 case required 90 timesteps, the T42
case 180, and the T85 case 360.
\end{enumerate}

\section{Literature}

Seven test cases were proposed in \cite{Williamson-Drake}.
These cases collect several tests common in the literature but particularly
follow work in \cite{Browning-Hack-Swarztrauber}.
A code to solve the shallow water equations using the spectral transform
method (STSWM) is described in \cite{Hack-Jakob}.
High resolution test case solutions using the spectral code STSWM are given
in \cite{Jakob-Hack}.
The report \cite{Heikes-Randall} compares solutions using an icosahedral grid
twisted to maintain grid symmetry between hemispheres.
Parallel algorithms for the spectral transform are discussed in
\cite{Worley-PartI,Walker-PartII}.

\bibliographystyle{plain}
\bibliography{shallow}

\end{document}             % End of document.