What differences between the DFT and FFT make the FFT so fast?


16

I'm trying to understand the FFT. Here's what I have so far:

To find the magnitude of a frequency in a waveform, you probe for it by multiplying the wave by the frequency you are searching for, in two different phases (sine and cosine), and averaging each result. The phase is found from the relationship between the two. In code it looks something like this:

// simple pseudocode, tightened into runnable JavaScript
var wave = [/* ... */];          // an array of floats representing the wave's amplitude
var numSamples = wave.length;
var spectrum = [1, 2, 3, 4, 5, 6 /* ... */]; // all frequencies being tested for

function getMagnitudesOfSpectrum() {
    var magnitudesOut = [];
    var phasesOut = [];

    for (var i = 0; i < spectrum.length; i++) {
        var freq = spectrum[i];
        var magnitudeSin = 0;
        var magnitudeCos = 0;

        for (var sample = 0; sample < numSamples; sample++) {
            // probe the wave at this frequency in two phases
            magnitudeSin += Math.sin(2 * Math.PI * freq * sample / numSamples) * wave[sample];
            magnitudeCos += Math.cos(2 * Math.PI * freq * sample / numSamples) * wave[sample];
        }

        // magnitude is the length of the (cos, sin) vector; phase is its angle
        magnitudesOut[i] = Math.sqrt(magnitudeSin * magnitudeSin + magnitudeCos * magnitudeCos) / numSamples;
        phasesOut[i] = Math.atan2(magnitudeSin, magnitudeCos);
    }

    return { magnitudes: magnitudesOut, phases: phasesOut };
}

To do this very quickly for many frequencies, the FFT uses a number of tricks.

What are the tricks that make the FFT so much faster than the naive DFT?

P.S. I have tried looking at complete FFT algorithms on the web, but all the tricks tend to be condensed into one beautiful piece of code without much explanation. Before I can understand the whole thing, I first need some introduction to each of these efficiency changes as a concept.

Thanks.


7
"DFT" doesn't refer to an algorithm: it refers to a mathematical operation. "FFT" refers to a family of methods for computing that operation.

1
Just wanted to point out that the use of sudo in your code example could be confusing, since that is a well-known command in the computing world. You probably mean pseudocode.
rwfeather

1
@rwfeather He probably means "pseudocode".
user207421

Answers:


20

A naive implementation of an $N$-point DFT is basically a multiplication by an $N \times N$ matrix. This results in a complexity of $O(N^2)$.

The most common fast Fourier transform (FFT) algorithm is the radix-2 Cooley–Tukey decimation-in-time algorithm. It is a basic divide-and-conquer approach.

First define the "twiddle factor" as
$$W_N \triangleq e^{-j\frac{2\pi}{N}}$$
where $j \triangleq \sqrt{-1}$ is the imaginary unit. The DFT $X[k]$ of $x[n]$ is then given by
$$X[k] = \sum_{n=0}^{N-1} x[n] W_N^{kn}.$$
If $N$ is even (so that $\frac{N}{2}$ is an integer), the summation can be divided into two sums as follows:
$$X[k] = \sum_{n=0}^{N/2-1} x[2n] W_N^{2kn} + \sum_{n=0}^{N/2-1} x[2n+1] W_N^{k(2n+1)}$$
where the first summation deals with the even samples of $x[n]$ and the second with the odd samples of $x[n]$. Defining $x_e[n] \triangleq x[2n]$ and $x_o[n] \triangleq x[2n+1]$, and using the facts that

  1. $W_N^{k(2n+1)} = W_N^{2kn} W_N^k$, and
  2. $W_N^{2kn} = W_{N/2}^{kn}$,

this can be rewritten as

$$X[k] = \sum_{n=0}^{N/2-1} x_e[n] W_{N/2}^{kn} + W_N^k \sum_{n=0}^{N/2-1} x_o[n] W_{N/2}^{kn} = X_e[k] + W_N^k X_o[k]$$

where $X_e[k]$ and $X_o[k]$ are the $\frac{N}{2}$-point DFTs of the even and odd samples of $x[n]$. One $N$-point DFT has thus been replaced by two $\frac{N}{2}$-point DFTs plus $N$ extra operations, and
$$2\left(\tfrac{N}{2}\right)^2 + N < N^2 \quad \text{for } N > 2.$$

Applying the same decomposition recursively to each half-size DFT yields the well-known complexity of $O(N \log N)$ instead of $O(N^2)$.
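
To see this divide-and-conquer step as code: below is a minimal recursive sketch of the radix-2 decimation-in-time FFT, in the spirit of the pseudocode from the question. It is my own illustration, not part of the original answer; it assumes the input length is a power of two and represents complex samples as {re, im} pairs.

// Minimal radix-2 decimation-in-time FFT sketch (power-of-two length assumed).
function fft(x) {
    var N = x.length;
    if (N === 1) return [x[0]];       // a 1-point DFT is the identity

    var even = [], odd = [];
    for (var n = 0; n < N / 2; n++) {
        even.push(x[2 * n]);          // x_e[n] = x[2n]
        odd.push(x[2 * n + 1]);       // x_o[n] = x[2n+1]
    }

    var Xe = fft(even);               // two half-size DFTs
    var Xo = fft(odd);

    var X = new Array(N);
    for (var k = 0; k < N / 2; k++) {
        // twiddle factor W_N^k = e^{-j 2 pi k / N}
        var wRe = Math.cos(-2 * Math.PI * k / N);
        var wIm = Math.sin(-2 * Math.PI * k / N);
        // t = W_N^k * Xo[k]  (complex multiply)
        var tRe = wRe * Xo[k].re - wIm * Xo[k].im;
        var tIm = wRe * Xo[k].im + wIm * Xo[k].re;
        // X[k] = Xe[k] + W_N^k Xo[k]; and since W_N^{k+N/2} = -W_N^k,
        // X[k+N/2] = Xe[k] - W_N^k Xo[k] reuses the same half-size results.
        X[k]         = { re: Xe[k].re + tRe, im: Xe[k].im + tIm };
        X[k + N / 2] = { re: Xe[k].re - tRe, im: Xe[k].im - tIm };
    }
    return X;
}

Each level of the recursion does O(N) work across all sub-problems, and there are log2(N) levels, which is exactly where the count above comes from.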


Would you be willing to list what each variable stands for? I'm fairly new to this, so $W$, $j$, $X(\cdot)$, $N$ and $k$ haven't been defined for me.
Seph Reed

$W$ is already defined in my answer. I've tried to better define some of the other notation. $k$ represents the index in the frequency domain, and $n$ the index in the time domain.
anpar

19

http://nbviewer.jupyter.org/gist/leftaroundabout/83df89a7d3bdc24373ea470fb50be629

DFT, size 16

Diagram of the operations in a size-16 naïve DFT

FFT, size 16

Diagram of the operations in a size-16 radix-2 FFT

The difference in complexity is pretty evident from these, isn't it?


This is how I make sense of the FFT.

First, I always view the Fourier transform primarily as a transform of continuous functions, i.e. a bijective mapping $\mathrm{FT}: L^2(\mathbb{R}) \to L^2(\mathbb{R})$. From that point of view it is clear that there is actually no need to descend to the "deepest level" and iterate over individual elements, because a "single element" is a single point on the real line, of which there are uncountably many.

So how can this transform still be well-defined? Crucially, it does not operate on the general space of functions $\mathbb{R} \to \mathbb{C}$ but only on the space of (Lebesgue-, square-) integrable functions. Now, integrability is not a very strong property (it is much weaker than differentiability, say), but it does require that the function be "locally distinguishable with a countable amount of information". Such a description is given by the coefficients of a short-time Fourier transform. The simplest case is when your function is continuous and you partition it into regions so small that each is essentially constant. Then each STFT has its zeroth term as its strongest one. If you neglect the other coefficients (which decay anyway), each domain becomes just a single data point. Over all of these short-time low-frequency-limit coefficients you can then take a discrete Fourier transform. That is in fact exactly what you do when performing any FT on measured data!

However, measured data need not correspond to the underlying physical quantity. For example, when you measure some light intensity, you are really just measuring the amplitude of an electromagnetic wave whose frequency itself is far too high to sample with an ADC. But evidently you can still compute the DFT of a sampled light-intensity signal cheaply, even though the frequency of the light wave itself is absurdly high.

This can be understood as the most important reason the FFT is cheap:

Don't bother trying to see the individual oscillation cycles from the top level. Instead, transform only high-level information that has already been preprocessed locally.

That's not all, however. The great thing about the FFT is that it still gives you all the information the complete DFT would give, that is, all the information you would also get by sampling the exact electromagnetic wave of the light beam. Can that be achieved by transforming the photodiode signal? Can you measure the exact light frequency from it?

Well, the answer is no, you can't. That is, unless you apply additional tricks.
First of all, you need at least a rough measurement of the frequency over a short time. That is possible with a spectrometer, but it is only accurate to $\Delta\nu = 1/\Delta t$, the typical uncertainty relation.

Over a longer overall time span, we should then also be able to narrow down the frequency uncertainty. And that is indeed possible, if you measure not just the rough frequency locally but the phase of the wave as well. You know that a 1000 Hz signal will have exactly the same phase if you look at it again one second later, whereas a 1000.5 Hz signal, though indistinguishable from it over a short range, will be in antiphase one second later.
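
As a quick numeric check of that claim (a sketch of mine, not part of the answer): the phase a sinusoid accumulates after t seconds is 2π·f·t, modulo 2π.

// Phase (mod 2*pi) accumulated by a sinusoid of frequency f (in Hz) after t seconds.
function phaseAfter(f, t) {
    var cycles = f * t;                              // whole + fractional cycles elapsed
    return 2 * Math.PI * (cycles - Math.floor(cycles));
}

phaseAfter(1000, 1);    // 0: a 1000 Hz signal has the same phase one second later
phaseAfter(1000.5, 1);  // pi: a 1000.5 Hz signal is in antiphase one second later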

Fortunately, that phase information can be stored neatly in a single complex number. And this is how the FFT works! It starts with lots of small, local transforms. These are cheap: on the one hand, obviously, because they use only a small amount of data; on the other hand, because they know that, given the short time span, they cannot resolve frequency very precisely anyway. So it stays cheap even if you do a lot of such transforms.

However, these transforms also record the phase, and with it you can then make the frequency resolution more precise at the top level. The transform needed for that is again cheap, because it does not itself deal with any high-frequency oscillations, only with the preprocessed low-frequency data.


Yes, at this point my argument is somewhat circular. Let's just call it recursive and be fine with it...

This relation is not quantum mechanical, but the Heisenberg uncertainty principle actually has the same fundamental cause.


2
Nice pictures for this question. :-)
robert bristow-johnson

2
Don't you love diagrams that are repeated everywhere and never actually explained anywhere :)
user541686

1
I understood the picture after having just read anpar’s answer.
JDługosz

15

Here is a picture to add to Robert's good answer, demonstrating the "re-use" of operations, in this case for an 8-point DFT. The "twiddle factors" are represented in the diagram using the notation $W_N^{nk}$, which is equal to $e^{-j\frac{2\pi nk}{N}}$.

Note the highlighted path and the equation underneath it, which show the result for the frequency bin $X(1)$, as given by Robert's equation.

The dashed lines are no different from the solid lines; they are only drawn that way to make clear where the summation joins are.

FFT implementation
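
In code, the twiddle factor from the diagram is just a point on the unit circle. A small sketch of mine (the {re, im} pair convention is my assumption, not the diagram's notation):

// W_N^{nk} = e^{-j 2*pi*n*k/N}, returned as an {re, im} pair
function twiddle(N, n, k) {
    var angle = -2 * Math.PI * n * k / N;
    return { re: Math.cos(angle), im: Math.sin(angle) };
}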


8

essentially, in computing the naive DFT directly from the summation:

$$X[k] = \sum_{n=0}^{N-1} x[n] e^{-j 2\pi \frac{nk}{N}}$$

there are $N$ table lookups for the twiddle factor $e^{-j 2\pi \frac{nk}{N}}$, $N$ complex multiplications, and $N-1$ additions. and that's just for one value of $X[k]$ and one instance of $k$. then the naive DFT throws all of that intermediate data away and goes through it all again for $X[k+1]$.

  1. so the FFT holds on to some intermediate data.
  2. the FFT will also make use of factoring the twiddle factor a bit so that the same factor can be used for an intermediate combination of data.
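
a rough count of the complex multiplications puts numbers on that reuse (my own sketch, assuming a power-of-two N):

// complex multiplies in the naive DFT: N per output bin, N bins
function naiveMultiplies(N) { return N * N; }

// complex multiplies in a radix-2 FFT: N/2 butterflies per stage, log2(N) stages
function fftMultiplies(N) { return (N / 2) * Math.log2(N); }

naiveMultiplies(1024);  // 1048576
fftMultiplies(1024);    // 5120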

4

I am a visual person. I prefer to imagine the FFT as a matrix trick rather than as a summation trick.

To explain at a high level:

A naive DFT computes each output sample independently and uses every input sample in each computation (classic N² algorithm).

A common FFT uses symmetries and patterns in the DFT definition to do the computation in "layers" (log N of them), each layer with a constant-time cost per sample, yielding an N log N algorithm.

More specifics:

One way to visualize these symmetries is to look at the DFT as a 1×N input vector multiplied by an N×N matrix of all your complex exponentials. Let's start with the "radix 2" case. We're going to split out the even and odd rows of the matrix (corresponding to the even and odd input samples) and consider them as two separate matrix multiplications which add together to give the same final result.

Now look at these matrices: in the first one, the left half is identical to the right half. In the other, the right half is the left half times −1. This means we only really have to use the left half of each matrix for the multiplication, and can create the right half cheaply by multiplying by 1 or −1. Next, observe that the second matrix differs from the first by factors that are the same within each column, so we can factor them out and multiply them into the input; now both even and odd samples use the same matrix, but require a multiplier first. The final step is observing that the resulting N/2 × N/2 matrix is identical to an N/2 DFT matrix, and we can do this again and again until we reach a 1×1 matrix, where the DFT is the identity function. See the sketch below for a concrete check of these symmetries.
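
To make those symmetries concrete, here is a small sketch of my own (not from the answer) that builds DFT matrix entries for N = 4 and checks both observations; entry [n][k] = e^{-j 2πnk/N}, with row n as the input sample and column k as the output bin.

// One entry of the N x N DFT matrix, as an {re, im} pair.
function dftEntry(N, n, k) {
    var a = -2 * Math.PI * n * k / N;
    return { re: Math.cos(a), im: Math.sin(a) };
}

var N = 4;
for (var m = 0; m < N / 2; m++) {
    for (var k = 0; k < N / 2; k++) {
        var e  = dftEntry(N, 2 * m, k);              // even row, left half
        var e2 = dftEntry(N, 2 * m, k + N / 2);      // even row, right half
        // even rows: left half identical to right half
        console.assert(Math.abs(e.re - e2.re) < 1e-12 && Math.abs(e.im - e2.im) < 1e-12);

        var o  = dftEntry(N, 2 * m + 1, k);          // odd row, left half
        var o2 = dftEntry(N, 2 * m + 1, k + N / 2);  // odd row, right half
        // odd rows: right half is the left half times -1
        console.assert(Math.abs(o.re + o2.re) < 1e-12 && Math.abs(o.im + o2.im) < 1e-12);
    }
}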

To generalize beyond radix 2, you can look at splitting every third row and looking at three chunks of columns, or every 4th etc.

In the event of prime sized inputs, there exists a method to properly zero-pad, FFT, and truncate, but that is beyond the scope of this answer.

See: http://whoiskylefinn.com/MatrixFFT.html


prime FFT, various FFT. Using zero-pad is not the only option. Sorry, I just find zero-padding overused. One small question, I do not understand what you mean by "each layer with constant-time requirement per sample", if you could explain, it would be awesome.
Evil

1
Sorry, I didn't mean to say zero padding was THE way, just wanted to point to further reading. And "layer" means a recursion, or a translation from an N-point DFT to two N/2-point DFTs, with constant time per sample meaning this step is O(N).
kylefinn

So far, of all the descriptions, this one seems the closest to making a complex issue simple. The big thing it's missing, though, is an example of these matrices. Would you happen to have one?
Seph Reed

Uploaded this, should help: whoiskylefinn.com/MatrixFFT.html
kylefinn

1

The DFT does a brute force N^2 matrix multiply.

FFTs do clever tricks, exploiting properties of the matrix (de-generalizing the matrix multiply) in order to reduce computational cost.

Let us first look at a small DFT:

W = fft(eye(4));                 % build the 4x4 DFT matrix from the identity
x = rand(4,1) + 1j*rand(4,1);    % random complex test vector
X_ref = fft(x);                  % reference result from the library FFT
X = W*x;                         % DFT as a plain matrix-vector multiply
assert(max(abs(X - X_ref)) < 1e-7)

Great, so we are able to substitute MATLAB's call to the FFTW library with a small 4×4 (complex) matrix multiplication, by filling a matrix from the FFT function. So what does this matrix look like?

N = 4;
Wn = exp(-1j*2*pi/N);
f = (0:N-1)'*(0:N-1)

f =

     0     0     0     0
     0     1     2     3
     0     2     4     6
     0     3     6     9

W = Wn.^f

W =

     1     1     1     1
     1    -i    -1     i
     1    -1     1    -1
     1     i    -1    -i

Each element is either +1, -1, +1j or -1j. Obviously, this means that we can avoid full complex multiplications. Further, the first column is all ones, meaning that we multiply the first element of x over and over by the same factor.

It turns out that Kronecker tensor products, "twiddle factors" and a permutation matrix (where the index is changed according to its bit-reversed binary representation) are both compact and give an alternate perspective on how FFTs can be computed as a set of sparse matrix operations.

The lines below are a simple decimation-in-frequency (DIF) radix-2 forward FFT. While the steps may seem cumbersome, the approach is convenient to reuse for forward/inverse FFTs, radix-4/split-radix or decimation-in-time, while being a fair representation of how in-place FFTs tend to be implemented in the real world, I believe.

N = 4;
x = randn(N,1) + 1j*randn(N,1);
T1 = exp(-1j*2*pi*([zeros(1,N/2), 0:(N/2-1)]).'/N);  % twiddle factors, applied between stages
M0 = kron(eye(2), fft(eye(2)));                      % two 2-point DFTs (applied last)
M1 = kron(fft(eye(2)), eye(2));                      % butterflies across the halves (applied first)
X = bitrevorder(x.'*M1*diag(T1)*M0);                 % stages apply left to right; output is bit-reversed
X_ref = fft(x);
assert(max(abs(X(:) - X_ref(:))) < 1e-6)

C. F. Van Loan has a great book on this subject.

