Path: senator-bedfellow.mit.edu!bloom-beacon.mit.edu!micro-heart-of-gold.mit.edu!newsswitch.lcs.mit.edu!arclight.uoregon.edu!logbridge.uoregon.edu!enews.sgi.com!cnn.nas.nasa.gov!marcy.nas.nasa.gov!eugene From: eugene@marcy.nas.nasa.gov (Eugene N. Miya) Newsgroups: comp.benchmarks Subject: [l/m 3/17/92] "Equivalence" (20/28) c.be FAQ Date: 20 Dec 1999 13:25:01 GMT Organization: NASA Ames Research Center, Moffett Field, CA Lines: 172 Distribution: world Expires: 4 Message-ID: <83land$k88$1@sun500.nas.nasa.gov> Reply-To: eugene@amelia.nas.nasa.gov (Eugene N. Miya) NNTP-Posting-Host: marcy.nas.nasa.gov Summary: eugene's test Keywords: who, what, where, when, why, how Xref: senator-bedfellow.mit.edu comp.benchmarks:28280 20 "Equivalence" 21 TPC 22 23 24 25 Ridiculously short benchmarks 26 Other miscellaneous benchmarks 27 28 References 1 Introduction to FAQ chain and netiquette 2 3 PERFECT 4 5 Performance Metrics 6 Temporary scaffold of New FAQ material 7 Music to benchmark by 8 Benchmark types 9 Linpack 10 Network Performance 11 NIST source and .orgs 12 Benchmark Environments 13 SLALOM 14 15 12 Ways to Fool the Masses with Benchmarks 16 SPEC 17 Benchmark invalidation methods 18 19 WPI Benchmark A state of equilibrium is likely to be symmetric. --Hermann Weyl Understanding equivalence is necessary for comparison. It is among the least understood and researched portions of benchmarking theory. It is one reason why benchmarking is an art rather than a science. Few have time to think about. Few wish to get into its complications. Most fear to tread into this can of worms because of the economic and political fall out. As such, it is one of the "gotchas" of benchmarking. Benchmarking will not proceed without. Equivalence is not the same for the numerical analyst as it is for the hardware engineer or the end user. When we understand equivalence, we can begin making transformations. We can substitute and composite. We might even be able to predict. The last thing that you should believe is the statement: You are comparing apples versus oranges. The biologists did and they came up with DNA. Only the numerical analyst really appreciates the difference between a 32-machine and a 36-bit machine these days (with some few odd architectures around these days). You can tell the difference between programmers by asking "How many bits are in "SINGLE PRECISION" floating point?" To say nothing of past 48, 60 and 33 bit architectures. Types (Levels?) of Equivalence Equivalent at the bit-level. Equivalent at the [byte|word]-level. Computing differs from mathematics because, machine arithmetic lacks the associatively of mathematics: (a+b)+c != a+(b+c) The mathematician is incensed. The naive student does not understand what all the fuss is about, and the numerical analyst lives with it. Equivalent at the functional-level The results are the same (the time or storage might differ). Do not become fixated by time. Computing is a time-space problem. The naive also believe that operations or instructions execute in a single clock-cycle. And in fact, you will find architectures which can perform one 4x4 matrix multiply in one cycle. Temporal equivalence Parallelism clouds the mind of many. The search for ever faster computing causes some to ignore the different kinds of parallelism. Parallelism is a geometric concept which has properties that time does not. This is because time is irreversible, and because time flows it is the source of many problems. This is the source of Fred Brook's Mythical Man-Month: that we want to believe that we can exchange effort and progress. Why paraphase, just quote: Our estimating techniques fallaciously confuse effort with progress (hiding the assumption of interchangeability). The bearing of a child takes nine months, no matter how many women are assigned. Our techniques of estimating are poorly developed. More seriously, they reflect an unvoiced assumption which is quite untrue, i.e., that all will go well. Een schip op het strand is een baken in zee. [A ship on the beach is a lighthouse to the sea.] --Dutch proverb quoted in Brooks None love the bearer of bad news. --Sophocles Experience is a dear teacher, but fools will learn at no other. --Poor Richard's Almanac Chapter 8, "Calling the Shot." Plan to throw one away. How does a project get to be a year late? ... One day at a time. Similarly, benchmarking has been creating its Mythical MIP and MegaFLOP comparisons for years. This was okay of the uniprocessor architectures of the past. It is a fatal mistake for the parallel architectures of the future. Symmetry is a nice artifact. It makes some of the elegance in understanding . This is why FORTRAN to C translators should be regarded as suspect. Semantically, the usage of arrays and nested loops in FORTRAN are not necessarily equivalent in C. Orthogonality conflicts between the FORTRAN column-major ordered arrays versus C row-major ordering would cause programmers to code loop order in differently. The programming language designer is aware of this. This is particularly critical for vector loops and for architectures with certain types of interleaved memories (e.g., stride problems). Nested loops (overloaded parallelism) are a confusing problem. One solution is loop interchange, but this does not always work. The naive C student systems programmer unfamiliar with FORTRAN does not understand this. This is especially dangerous because of long-term effects: some company is going to hire that programmer down the road. And to managers unable to tell the difference between a systems programmer and applications programmer: we leave that for you to figure out the implications. The danger was that one programmer assumed the equivalence of some state or operation, some statement or some procdure. Unfortunately, at this time, the only way to verify equivalnce is testing. Low-level optimizations should not endanger comparisons, but an optimized program which is functionally equivalent to the end-user and compiler writer may not be equivalent to the benchmarker. The benchmarker sometimes needs to know the exact instructions run. This is where one resorts to analyzing non-portable assembly language and thus the generality of an observation is lost. Can you see this is where substitution plays one role? A necessary but not sufficient condition is to take more information at the "lower-levels" and raise it to higher levels. Benchmarks require increasing awareness of things like operation counts, statement counts, and their "exchange rates" to form an awareness at the higher level. Hopefully you can see now why we want equivalence, substitution, and composition. Yes we are going to have to deal with congruence (equivalance) classes, similarity in the mathematical sense, measure theory, and other theory topics. Some day. SO the next time someone asks you to compare machines or the relation between LINPACK numbers and SPEC number. Pause and think a moment. If none love the bearer of bad news, and if we learn best by our mistakes, then how are we expected to learn? ^ A s / \ r m / \ c h / \ h t / \ i i / \ t r / \ e o / \ c g / \ t l / \ u A / \ r <_____________________> e Language .