FLOP and CHOP

FLOP and CHOP

Detailed information about the amount of work involved in matrix calculations and the resulting accuracy is provided by FLOP and CHOP. The basic unit of work is the "flop", or floating point operation. Roughly, one flop is one execution of a Fortran statement like

      S = S + X(I)*Y(I)

      Y(I) = Y(I) + T*X(I)

In other words, it consists of one floating point multiplication, together with one floating point addition and the associated indexing and storage reference operations.

MATLAB will print the number of flops required for a particular statement when the statement is terminated by an extra comma. For example, the line

      n = 20;  RAND(n)*RAND(n);,

ends with an extra comma. Two 20 by 20 random matrices are generated and multiplied together. The result is assigned to ANS, but the semicolon suppresses its printing. The only output is

        8800 flops

This is n**3 + 2*n**2 flops, n**2 for each random matrix and n**3 for the product.

FLOP is a predefined vector with two components. FLOP(1) is the number of flops used by the most recently executed statement, except that statements with zero flops are ignored. For example, after executing the previous statement,

      flop(1)/n**3

results in

      ANS   =

          1.1000

FLOP(2) is the cumulative total of all the flops used since the beginning of the MATLAB session. The statement

      FLOP = <0 0>

resets the total.

There are several difficulties associated with keeping a precise count of floating point operations. An addition or subtraction that is not paired with a multiplication is usually counted as a flop. The same is true of an isolated multiplication that is not paired with an addition. Each floating point division counts as a flop. But the number of operations required by system dependent library functions such as square root cannot be counted, so most elementary functions are arbitrarily counted as using only one flop.

The biggest difficulty occurs with complex arithmetic. Almost all operations on the real parts of matrices are counted. However, the operations on the complex parts of matrices are counted only when they involve nonzero elements. This means that simple operations on nonreal matrices require only about twice as many flops as the same operations on real matrices. This factor of two is not necessarily an accurate measure of the relative costs of real and complex arithmetic.

The result of each floating point operation may also be "chopped" to simulate a computer with a shorter word length. The details of this chopping operation depend upon the format of the floating point word. Usually, the fraction in the floating point word can be regarded as consisting of several octal or hexadecimal digits. The least significant of these digits can be set to zero by a logical masking operation. Thus the statement

      CHOP(p)

causes the p least significant octal or hexadecimal digits in the result of each floating point operation to be set to zero. For example, if the computer being used has an IBM 360 long floating point word with 14 hexadecimal digits in the fraction, then CHOP(8) results in simulation of a computer with only 6 hexadecimal digits in the fraction, i.e. a short floating point word. On a computer such as the CDC 6600 with 16 octal digits, CHOP(8) results in about the same accuracy because the remaining 8 octal digits represent the same number of bits as 6 hexadecimal digits.

Some idea of the effect of CHOP on any particular system can be obtained by executing the following statements.

      long,   t = 1/10
      long z, t = 1/10
      chop(8)
      long,   t = 1/10
      long z, t = 1/10

The following Fortran subprograms illustrate more details of FLOP and CHOP. The first subprogram is a simplified example of a system-dependent function used within MATLAB itself. The common variable FLP is essentially the first component of the variable FLOP. The common variable CHP is initially zero, but it is set to p by the statement CHOP(p). To shorten the DATA statement, we assume there are only 6 hexadecimal digits. We also assume an extension of Fortran that allows .AND. to be used as a binary operation between two real variables.

      REAL FUNCTION FLOP(X)
      REAL X
      INTEGER FLP,CHP
      COMMON FLP,CHP
      REAL MASK(5)
      DATA MASK/ZFFFFFFF0,ZFFFFFF00,ZFFFFF000,ZFFFF0000,ZFFF00000/
      FLP = FLP + 1
      IF (CHP .EQ. 0) FLOP = X
      IF (CHP .GE. 1 .AND. CHP .LE. 5) FLOP = X .AND. MASK(CHP)
      IF (CHP .GE. 6) FLOP = 0.0
      RETURN
      END

The following subroutine illustrates a typical use of the previous function within MATLAB. It is a simplified version of the Basic Linear Algebra Subprogram that adds a scalar multiple of one vector to another. We assume here that the vectors are stored with a memory increment of one.

      SUBROUTINE SAXPY(N,TR,TI,XR,XI,YR,YI)
      REAL TR,TI,XR(N),XI(N),YR(N),YI(N),FLOP
      IF (N .LE. 0) RETURN
      IF (TR .EQ. 0.0 .AND. TI .EQ. 0.0) RETURN
      DO 10 I = 1, N
         YR(I) = FLOP(YR(I) + TR*XR(I) - TI*XI(I))
         YI(I) = YI(I) + TR*XI(I) + TI*XR(I)
         IF (YI(I) .NE. 0.0D0) YI(I) = FLOP(YI(I))
   10 CONTINUE
      RETURN
      END

The saxpy operation is perhaps the most fundamental operation within LINPACK. It is used in the computation of the LU, the QR and the SVD factorizations, and in several other places. We see that adding a multiple of one vector with n components to another uses n flops if the vectors are real and between n and 2*n flops if the vectors have nonzero imaginary components.

The permanent MATLAB variable EPS is reset by the statement CHOP(p). Its new value is usually the smallest inverse power of two that satisfies the Fortran logical test

            FLOP(1.0+EPS) .GT. 1.0

However, if EPS had been directly reset to a larger value, the old value is retained.