Path: senator-bedfellow.mit.edu!bloom-beacon.mit.edu!micro-heart-of-gold.mit.edu!newsswitch.lcs.mit.edu!arclight.uoregon.edu!logbridge.uoregon.edu!enews.sgi.com!cnn.nas.nasa.gov!marcy.nas.nasa.gov!eugene
From: eugene@marcy.nas.nasa.gov (Eugene N. Miya)
Newsgroups: comp.benchmarks
Subject: [l/m 3/17/92] "Equivalence"		(20/28) c.be FAQ
Date: 20 Dec 1999 13:25:01 GMT
Organization: NASA Ames Research Center, Moffett Field, CA
Lines: 172
Distribution: world
Expires: 4
Message-ID: <83land$k88$1@sun500.nas.nasa.gov>
Reply-To: eugene@amelia.nas.nasa.gov (Eugene N. Miya)
NNTP-Posting-Host: marcy.nas.nasa.gov
Summary: eugene's test
Keywords: who, what, where, when, why, how
Xref: senator-bedfellow.mit.edu comp.benchmarks:28280

20	"Equivalence"						<This panel>
21	TPC
22
23
24
25	Ridiculously short benchmarks
26	Other miscellaneous benchmarks
27
28	References
1	Introduction to FAQ chain and netiquette
2
3	PERFECT
4
5	Performance Metrics
6	Temporary scaffold of New FAQ material
7	Music to benchmark by
8	Benchmark types
9	Linpack
10	Network Performance
11	NIST source and .orgs
12	Benchmark Environments
13	SLALOM
14
15	12 Ways to Fool the Masses with Benchmarks
16	SPEC
17	Benchmark invalidation methods
18
19	WPI Benchmark

A state of equilibrium is likely to be symmetric.
	--Hermann Weyl

Understanding equivalence is necessary for comparison.
It is among the least understood and researched portions of benchmarking
theory.  It is one reason why benchmarking is an art rather than a science.
Few have time to think about.  Few wish to get into its complications.
Most fear to tread into this can of worms because of the economic and
political fall out.  As such, it is one of the "gotchas" of benchmarking.
Benchmarking will not proceed without.

Equivalence is not the same for the numerical analyst as it is for the
hardware engineer or the end user.
When we understand equivalence, we can begin making transformations.
We can substitute and composite.  We might even be able to predict.
The last thing that you should believe is the statement:
	You are comparing apples versus oranges.
The biologists did and they came up with DNA.

Only the numerical analyst really appreciates the difference between
a 32-machine and a 36-bit machine these days (with some few odd
architectures around these days).  You can tell the difference between
programmers by asking "How many bits are in "SINGLE PRECISION" floating point?"
To say nothing of past 48, 60 and 33 bit architectures.

Types (Levels?) of Equivalence

Equivalent at the bit-level.
	Equivalent at the [byte|word]-level.
Computing differs from mathematics because, machine arithmetic lacks the
associatively of mathematics:
	(a+b)+c != a+(b+c)
The mathematician is incensed.  The naive student does not understand what
all the fuss is about, and the numerical analyst lives with it.

Equivalent at the functional-level
The results are the same (the time or storage might differ).
Do not become fixated by time.  Computing is a time-space problem.

The naive also believe that operations or instructions execute in a
single clock-cycle.
And in fact, you will find architectures which can perform one 4x4 matrix
multiply in one cycle.

Temporal equivalence
Parallelism clouds the mind of many.
The search for ever faster computing causes some to ignore the different
kinds of parallelism.
Parallelism is a geometric concept which has properties that time does not.
This is because time is irreversible, and because time flows it is the
source of many problems.
This is the source of Fred Brook's Mythical Man-Month: that we want to believe
that we can exchange effort and progress.  Why paraphase, just quote:

	Our estimating techniques fallaciously confuse effort with progress
	(hiding the assumption of interchangeability).

	The bearing of a child takes nine months,
	no matter how many women are assigned.

	Our techniques of estimating are poorly developed.  More seriously,
	they reflect an unvoiced assumption which is quite untrue,
	i.e., that all will go well.

	Een schip op het strand is een baken in zee.
	[A ship on the beach is a lighthouse to the sea.]
				--Dutch proverb quoted in Brooks

	None love the bearer of bad news.       --Sophocles

	Experience is a dear teacher, but fools will learn at no other.
				--Poor Richard's Almanac
				Chapter 8, "Calling the Shot."
	Plan to throw one away.

	How does a project get to be a year late? ... One day at a time.

Similarly, benchmarking has been creating its Mythical MIP and MegaFLOP
comparisons for years.  This was okay of the uniprocessor architectures of the
past.  It is a fatal mistake for the parallel architectures of the future.

Symmetry is a nice artifact.
It makes some of the elegance in understanding .
This is why FORTRAN to C translators should be regarded as suspect.

Semantically, the usage of arrays and nested loops in FORTRAN are
not necessarily equivalent in C.  Orthogonality conflicts between
the FORTRAN column-major ordered arrays versus C row-major ordering
would cause programmers to code loop order in differently.
The programming language designer is aware of this.
This is particularly critical for vector loops and for architectures
with certain types of interleaved memories (e.g., stride problems).
Nested loops (overloaded parallelism) are a confusing problem.
One solution is loop interchange, but this does not always work.
The naive C student systems programmer unfamiliar with FORTRAN
does not understand this.  This is especially dangerous because of long-term
effects: some company is going to hire that programmer down the road.
And to managers unable to tell the difference between a systems programmer
and applications programmer: we leave that for you to figure out the
implications.

The danger was that one programmer assumed the equivalence of
some state or operation, some statement or some procdure.
Unfortunately, at this time, the only way to verify equivalnce is testing.

Low-level optimizations should not endanger comparisons,
but an optimized program which is functionally equivalent to
the end-user and compiler writer may not be equivalent to the benchmarker.
The benchmarker sometimes needs to know the exact instructions run.
This is where one resorts to analyzing non-portable assembly language
and thus the generality of an observation is lost.
Can you see this is where substitution plays one role?

A necessary but not sufficient condition is to take more information at
the "lower-levels" and raise it to higher levels.  Benchmarks require
increasing awareness of things like operation counts, statement counts,
and their "exchange rates" to form an awareness at the higher level.

Hopefully you can see now why we want equivalence, substitution,
and composition.
Yes we are going to have to deal with congruence (equivalance) classes,
similarity in the mathematical sense, measure theory, and
other theory topics.  Some day.

SO the next time someone asks you to compare machines or the relation
between LINPACK numbers and SPEC number.  Pause and think a moment.
If none love the bearer of bad news, and if we learn best by our mistakes,
then how are we expected to learn?

                   ^ A  
                s / \ r                
               m /   \ c              
              h /     \ h            
             t /       \ i          
            i /         \ t        
           r /           \ e      
          o /             \ c    
         g /               \ t  
        l /                 \ u
       A /                   \ r
        <_____________________> e   
                Language
 

.