From research!ihnp4!shell!heller@UCB-VAX 26 85 Feb CST (Tue) 18:53:11 Received: from su-score.arpa by csnet-relay.arpa id a003277; 27 Feb 85 4:28 EST Received: from UCB-VAX.ARPA by SU-SCORE.ARPA with TCP; Wed 27 Feb 85 01:17:39-PST Received: by UCB-VAX.ARPA (4.24/4.42) id AA02630; Wed, 27 Feb 85 01:12:18 pst From: ihnp4!shell!heller@UCB-VAX Message-Id: <8502270912.AA02630@UCB-VAX.ARPA> Date: 26 Feb 85 18:53:11 CST (Tue) Received: by ihnp4.ATT.UUCP; id AA17542; 26 Feb 85 18:53:11 CST (Tue) To: na.dis@SU-SCORE Subject: 2D FFT's Via: Csnet-Relay; 27 Feb 85 5:02-EDT In October I asked for some information on large complex 2D FFT's. Here's a summary of the responses with some comments added. First, the original query: Can anyone supply me with information or references concerning runtime of a 4096 x 512 complex 2-dimensional FFT on current supercomputers? I'm interested in actual runtime, not speculation (we do that in-house), and separate times for work and data manipulation if possible. Some indication of your algorithm would also be appreciated. Many thanks, .................................................. From: Robert Schreiber sorry, ours isnt up yet. (watch for the GuilTech machine) .................................................. From: Richard Bartels I don't have anything to offer in answer to your question. But I would like a few pointers into the 2-D FFT literature (and beyond to higher dimensions) if you have the time. Two references on applications: R.H. Tatham, "Multidimensional Filtering of Seismic Data," IEEE Proceedings, Oct. 1984, pp. 1357-1369 D.W. March and A.D. Bailey, "A review of the two-dimensional transform and its use in seismic processing," First Break, vol. 1, no. 1, 1983, pp. 9-21 (If you can't find this in your library, which is likely, I'll send a photocopy. There are some color photos which do not reproduce well.) References on algorithms: Elliot and Rao Fast Transforms; Algorithms, Analyses, Applications Academic Press, 1982 (relatively little on n-D transforms) papers by Clive Temperton in J. Computational Physics, Paul Swarztrauber in Parallel Computing (new journal). .................................................. From: mazama!stew (Stewart Levin) For the new Amdahl/Fujitsu machine call Dr. Chani Pangali at Amdahl (408)746-3312. He's in charge of benchmarking the machine and 2D FFT's are one of his tests. I believe they were running at about 400 Megaflops though don't quote me. By the way he'll be at the Houston SIAM-SEG-SPE Mathematical Computations meeting in January to present his results. I'll be there too talking about wraparound supression in frequency domain migrations. (Pangali did not attend, after all.) .................................................. From: Roland Sweet Bonnie Bowman of Mobil Expl. and Prod. cornered Paul Swarztrauber and me last year wanting to know how to do 2D ffts on non-core-contained arrays. You might see if she has any insights now. Her phone is (214) 573-7737 -- that is, if you competitors can talk to each other. (no comment) .................................................. From: OSCAR BUNEMAN I have a 2-d FFT on the MFE CRAY system which is ideal for your purpose. If K is the total number of words to be transformed (that would be 2NM for an N by M complex transform) then the number of cycles is 2K(logN + logM) where logs are to base 2. This is because there are logN levels in one dimension and logM levels in the other. In each level each word has to be fetched and stored. All the arithmetic is overlapped with these fetches and stores. There is a ~15% overhead from 'chaining delays'. A cycle takes 12 1/2 nanoseconds on the CRAY 1 and less on the more recent CRAYs, such as XMP's. You can get details of this FFT from the LIBRIS utility at MFE, Livermore. Since your project involves 4 million word storage, you must be thinking of rather an advanced type of CRAY, unless you are considering buffering data in and out of core (I do that in my 3-D plasma particle code). If you do want to buffer, you may be content with packed data precision. I use a 2:1 packer (also available through LIBRIS) to maintain typical IBM single precision. Incidentally, for maximum speed one of the dimension ought not to be divisible by 8. I would embed the 4096*512 complex array in a (512,4097) complex array. The person looking after LIBRIS at MFE is Carol Tull. Her telephone number is (415) 422-1556. She may know of a good way to ship LIBRIS codes out of MFE. The LIBRIS identifier is MATHS:D:D6. By all means, let others know what I told you about my FFT. Can I also sell you a purely real FAST "HARTLEY" TRANSFORM ? (Oscar later broadcast a description of this code.) ........................................... Don Heller Physics and Computer Science Dept. Shell Development Company P.O. Box 481 Houston, Texas 77001 713-663-2341 {ihnp4, pur-ee, ut-sally}!shell!heller .