Geometric Interpretation of Least Squares

Geometric Interpretation of Least Squares

Typically, the number of frequency constraints is much greater than the number of design variables (filter coefficients). In these cases, we have an overdetermined system of equations (more equations than unknowns). Therefore, we cannot generally satisfy all the equations, and are left with minimizing some error criterion to find the ``optimal compromise'' solution.

In the case of least-squares approximation, we are minimizing the Euclidean distance, which suggests the geometrical interpretation shown in Fig.4.19.

$\begin{psfrags} % latex2html id marker 14771\psfrag{Ax}{{\Large $\mathbf{A}{\underline{\hat{h}}}$}}\psfrag{b}{{\Large ${\underline{d}}$}}\psfrag{column}{{\Large column-space of $\mathbf{A}$}}\psfrag{space}{}\begin{figure}[htbp] \includegraphics[width=3in]{eps/lsq} \caption{Geometric interpretation of orthogonal projection of the vector $\protect{\underline{d}}$\ onto the column-space of $\mathbf{A}$.} \end{figure} \end{psfrags}$
Thus, the desired vector ${\underline{d}}$ is the vector sum of its best least-squares approximation $\mathbf{A}{\underline{\hat{h}}}$ plus an orthogonal error :

$\displaystyle {\underline{d}}\eqsp \mathbf{A}{\underline{\hat{h}}}+ {\underline{e}}.$

(5.42)

In practice, the least-squares solution ${\underline{\hat{h}}}$ can be found by minimizing the sum of squared errors:

$\displaystyle \hbox{Minimize}_{\underline{h}}\Vert{\underline{e}}\Vert _2 \eqsp \Vert{\underline{d}}-\mathbf{A}{\underline{h}}\Vert _2$

(5.43)

Figure 4.19 suggests that the error vector ${\underline{d}}-\mathbf{A}{\underline{\hat{h}}}$ is orthogonal to the column space of the matrix $\mathbf {A}$ , hence it must be orthogonal to each column in $\mathbf {A}$ :

$\displaystyle \mathbf{A}^T({\underline{d}}-\mathbf{A}{\underline{\hat{h}}})\eqsp 0 \quad\Rightarrow\quad \mathbf{A}^T\mathbf{A}{\underline{\hat{h}}}\eqsp \mathbf{A}^T{\underline{d}}$

(5.44)

This is how the orthogonality principle can be used to derive the fact that the best least squares solution is given by

$\displaystyle {\underline{\hat{h}}}\eqsp (\mathbf{A}^T\mathbf{A})^{-1}\mathbf{A}^T {\underline{d}}\eqsp \mathbf{A}^\dagger {\underline{d}}$

(5.45)

In matlab, it is numerically superior to use ``h= A h'' as opposed to explicitly computing the pseudo-inverse as in ``h = pinv(A) * d''. For a discussion of numerical issues in matrix least-squares problems, see, e.g., [92].

We will return to least-squares optimality in §5.7.1 for the purpose of estimating the parameters of sinusoidal peaks in spectra.

[How to cite this work] [Order a printed hardcopy] [Comment on this page via email]

``Spectral Audio Signal Processing'', by Julius O. Smith III, W3K Publishing, 2011, ISBN 978-0-9745607-3-1.
Copyright © 2022-02-28 by Julius O. Smith III
Center for Computer Research in Music and Acoustics (CCRMA), Stanford University

Geometric Interpretation of Least Squares

``Spectral Audio Signal Processing'', by Julius O. Smith III, W3K Publishing, 2011, ISBN 978-0-9745607-3-1. Copyright © 2022-02-28 by Julius O. Smith III Center for Computer Research in Music and Acoustics (CCRMA), Stanford University

``Spectral Audio Signal Processing'', by Julius O. Smith III, W3K Publishing, 2011, ISBN 978-0-9745607-3-1.
Copyright © 2022-02-28 by Julius O. Smith III
Center for Computer Research in Music and Acoustics (CCRMA), Stanford University