In an
-th order FDN, the core computations consist of
updates of the
delay lines and a matrix by vector multiplication. The delay line
operations can proceed in parallel. The matrix by vector multiplication
requires in general
operations (multiplications and additions). If
the matrices arise from the scattering coefficients of a waveguide
junction, the computations reduce to
. The same order of complexity
is required by the normalized waveguide junction. For the special case of a
junction of equal-impedance waveguides, the multiplications can be replaced
by shifts when
is a power of
[26]. In all these
efficient cases the eigenvalues of the feedback matrix are constrained to
be at +1 or -1. The circulant matrix offers a more general eigenvalue
distribution. Moreover, the matrix by vector multiplication can be
implemented very efficiently in hardware. This multiplication can be viewed
as a circular convolution of the column vector with the first row of the
matrix. Such a convolution can be performed, when
is a power of
,
using two FFTs (one of which can be precomputed), an elementwise product
between two
-vectors, and an inverse FFT. The complexity of this
algorithm is
. It is easy to implement this matrix-vector
product in VLSI by means of the butterfly or other hypercubic
architectures [12]. These architectures allow computations of
the FFT in
time steps, and the algorithm can be pipelined.
The parallel implementation of waveguide scattering matrices cannot be done
in less than
time steps, because of the scalar product that is
involved in Householder reflections of any kind. Hence, in parallel
implementations we lose the advantage of waveguide scattering matrices over
circulant matrices.
Moreover, we can use number-theoretic Fourier transforms [12] in order to compute the circular convolution. Such transforms work over commutative rings, and can be arranged in such a way that all multiplications are replaced by shifts. Since in the convolution we have both the direct and inverse transforms, the overall result remains correct.
The circulant structure of
is advantageous for purposes of real-time
control as well. In the first place, the entire matrix is determined by one
of its rows or columns, and no matter how a row or column is modified, as
long as the rest of the matrix is modified accordingly to preserve the
circulant structure, the matrix will have unit-modulus eigenvalues as
needed for losslessness. Furthermore, the top row of a circulant matrix is
obtained from its eigenvalues by means of an inverse DFT. Therefore, it is
possible to efficiently generate a continuous family of circulant matrices
by continuously varying the complex phases of the eigenvalues. Moreover,
if the matrix-vector multiplication is implemented in the frequency domain,
the inverse DFT is not needed. Thus, we may move the
eigenvalues to
arbitrary points on the unit circle and generate a wide family of
efficiently computed lossless feedback matrices.