Commands, text, files and macros.

Commands, text, files and macros.

MATLAB has several commands which control the output format and the overall execution of the system.

The HELP command allows on-line access to short portions of text describing various operations, functions and special characters. The entire HELP document is reproduced in an appendix.

Results are usually printed in a scaled fixed point format that shows 4 or 5 significant figures. The commands SHORT, LONG, SHORT E, LONG E and LONG Z alter the output format, but do not alter the precision of the computations or the internal storage.

The WHO, WHAT and WHY commands provide information about the functions and variables that are currently defined.

The CLEAR command erases all variables, except EPS, FLOP, RAND and EYE. The statement A = <> indicates that a "0 by 0" matrix is to be stored in A. This causes A to be erased so that its storage can be used for other variables.

The RETURN and EXIT commands cause return to the underlying operating system through the Fortran RETURN statement.

MATLAB has a limited facility for handling text. Any string of characters delineated by quotes (with two quotes used to allow one quote within the string) is saved as a vector of integer values with '1' = 1, 'A' = 10, ' ' = 36, etc. (The complete list is in the appendix under CHAR.) For example

   '2*A + 3'  is the same as  <2 43 10 36 41 36 3>

It is possible, though seldom very meaningful, to use such strings in matrix operations. More frequently, the text is used as a special argument to various functions.

   NORM(A,'inf')    computes the infinity norm of A .
   DISPLAY(T)       prints the text stored in T .
   EXEC('file')     obtains MATLAB input from an external file.
   SAVE('file')     stores all the current variables in a file.
   LOAD('file')     retrieves all the variables from a file.
   PRINT('file',X)  prints X on a file.
   DIARY('file')    makes a copy of the complete MATLAB session.

The text can also be used in a limited string substitution macro facility. If a variable, say T, contains the source text for a MATLAB statement or expression, then the construction

   > T <

causes T to be executed or evaluated. For example

   T = '2*A + 3';
   S = 'B = >T< + 5'
   A = 4;
   > S <

produces

   B     =

      16.

Some other examples are given under MACRO in the appendix. This facility is useful for fairly short statements and expressions. More complicated MATLAB "programs" should use the EXEC facility.

The operations which access external files cannot be handled in a completely machine-independent manner by portable Fortran code. It is necessary for each particular installation to provide a subroutine which associates external text files with Fortran logical unit numbers.

6. Census example

Our first extended example involves predicting the population of the United States in 1980 using extrapolation of various fits to the census data from 1900 through 1970. There are eight observations, so we begin with the MATLAB statement

   n = 8

The values of the dependent variable, the population in millions, can be entered with

   y = < 75.995   91.972  105.711  123.203   ...
        131.669  150.697  179.323  203.212>'

In order to produce a reasonably scaled matrix, the independent variable, time, is transformed from the interval [1900,1970] to [-1.00,0.75]. This can be accomplished directly with

   t = -1.0:0.25:0.75

or in a fancier, but perhaps clearer, way with

   t = 1900:10:1970;   t = (t - 1940*ones(t))/40

Either of these is equivalent to

   t = <-1 -.75 -.50 -.25 0 .25 .50 .75>

The interpolating polynomial of degree n-1 involves an Vandermonde matrix of order n with elements that might be generated by

   for i = 1:n, for j = 1:n, a(i,j) = t(i)**(j-1);

However, this results in an error caused by 0**0 when i = 5 and j = 1 . The preferable approach is

   A = ones(n,n);
   for i = 1:n, for j = 2:n, a(i,j) = t(i)*a(i,j-1);

Now the statement

   cond(A)

produces the output

   ANS   =

      1.1819E+03

which indicates that transformation of the time variable has resulted in a reasonably well conditioned matrix.

The statement

   c = A\y

results in

These are the coefficients in the interpolating polynomial

                          n-1
      c  + c t + ... + c t
       1    2           n

Our transformation of the time variable has resulted in t = 1 corresponding to the year 1980. Consequently, the extrapolated population is simply the sum of the coefficients. This can be computed by

   p = sum(c)

The result is

   P     =

     426.0950

which indicates a 1980 population of over 426 million. Clearly, using the seventh degree interpolating polynomial to extrapolate even a fairly short distance beyond the end of the data interval is not a good idea.

The coefficients in least squares fits by polynomials of lower degree can be computed using fewer than n columns of the matrix.

   for k = 1:n, c = A(:,1:k)\y,  p = sum(c)

would produce the coefficients of these fits, as well as the resulting extrapolated population. If we do not want to print all the coefficients, we can simply generate a small table of populations predicted by polynomials of degrees zero through seven. We also compute the maximum deviation between the fitted and observed values.

   for k = 1:n, X = A(:,1:k);  c = X\y;  ...
      d(k) = k-1;  p(k) = sum(c);  e(k) = norm(X*c-y,'inf');
   <d, p, e>

The resulting output is

      0   132.7227  70.4892
      1   211.5101   9.8079
      2   227.7744   5.0354
      3   241.9574   3.8941
      4   234.2814   4.0643
      5   189.7310   2.5066
      6   118.3025   1.6741
      7   426.0950   0.0000

The zeroth degree fit, 132.7 million, is the result of fitting a constant to the data and is simply the average. The results obtained with polynomials of degree one through four all appear reasonable. The maximum deviation of the degree four fit is slightly greater than the degree three, even though the sum of the squares of the deviations is less. The coefficients of the highest powers in the fits of degree five and six turn out to be negative and the predicted populations of less than 200 million are probably unrealistic. The hopefully absurd prediction of the interpolating polynomial concludes the table.

We wish to emphasize that roundoff errors are not significant here. Nearly identical results would be obtained on other computers, or with other algorithms. The results simply indicate the difficulties associated with extrapolation of polynomial fits of even modest degree.

A stabilized fit by a seventh degree polynomial can be obtained using the pseudoinverse, but it requires a fairly delicate choice of a tolerance. The statement

   s = svd(A)

produces the singular values

We see that the last three singular values are less than 0.1 , consequently, A can be approximately by a matrix of rank five with an error less than 0.1 . The Moore-Penrose pseudoinverse of this rank five matrix is obtained from the singular value decomposition with the following statements

   c = pinv(A,0.1)*y, p = sum(c), e = norm(a*c-y,'inf')

The output is

The resulting seventh degree polynomial has coefficients which are much smaller than those of the interpolating polynomial given earlier. The predicted population and the maximum deviation are reasonable. Any choice of the tolerance between the fifth and sixth singular values would produce the same results, but choices outside this range result in pseudoinverses of different rank and do not work as well.

The one term exponential approximation

     y(t) = k exp(pt)

can be transformed into a linear approximation by taking logarithms.

     log(y(t)) = log k + pt

               = c  + c t
                  1    2

The following segment makes use of the fact that a function of a vector is the function applied to the individual components.

   X = A(:,1:2);
   c = X\log(y)
   p = exp(sum(c))
   e = norm(exp(X*c)-y,'inf')

The resulting output is

The predicted population and maximum deviation appear satisfactory and indicate that the exponential model is a reasonable one to consider.

As a curiousity, we return to the degree six polynomial. Since the coefficient of the high order term is negative and the value of the polynomial at t = 1 is positive, it must have a root at some value of t greater than one. The statements

   X = A(:,1:7);
   c = X\y;
   c = c(7:-1:1);  //reverse the order of the coefficients
   z = roots(c)

produce

   Z     =

      1.1023-  0.0000*i
      0.3021+  0.7293*i
     -0.8790+  0.6536*i
     -1.2939-  0.0000*i
     -0.8790-  0.6536*i
      0.3021-  0.7293*i

There is only one real, positive root. The corresponding time on the original scale is

   1940 + 40*real(z(1))

     =  1984.091

We conclude that the United States population should become zero early in February of 1984.