USING FFT SOFTWARE LIBRARIES
OUTPUT FORMATS AND SCALING FACTORS

The Fast Fourier Transform is literally just a faster version of the Discrete Fourier Transform. In its naive form, the Discrete Fourier Transform is quite costly to compute, especially if you are interested in computing it in real-time. The FFT is an algorithmic optimization of the DFT which makes fast, real-time computation of the DFT possible on consumer grade hardware. I don’t want to go into the details of the FFT, but I feel obliged to share some quirks and gotchas you might encounter when using FFT libraries to develop software.

This rest of this section is esoteric and technical and there’s no need to read any further unless you’re interested in writing software which uses the DFT/FFT.

When you start developing software which incorporates an FFT library, the first eccentricity you’ll probably notice is that not all libraries output their data in the same format. All good libraries will give you a set of complex numbers, but the layout and ordering of these complex numbers may differ depending upon the library that you use.1 My suggestion is to familiarize yourself with the open-source package Octave, and use Octave’s implementation of the FFT as your ground-truth when comparing different FFT implementations.2

You can compute the FFT of an 8-sample sine wave in Octave using the following commands,

> eight_sample_sine_wave = [0, 0.707, 1, 0.707, 0, -0.707, -1, -0.707]
> fft(eight_sample_sine_wave)

The first command declares a discrete signal called eight_sample_sine_wave. The second command runs the FFT on this signal and will produce the following list of complex numbers as a result. You can compare this output to the results given earlier in our walk-through of the transform,

> ans =
  0.00000 + 0.00000i
  0.00000 - 3.99970i
  0.00000 + 0.00000i
  0.00000 + 0.00030i
  0.00000 + 0.00000i
  0.00000 - 0.00030i
  0.00000 - 0.00000i
  0.00000 + 3.99970i

I consider this to be "normal" output. Some high-performance libraries will output the result with a different layout in order to optimize the memory and speed characteristics of their FFT routines. For example, some libraries will omit the imaginary parts of the DC and Nyquist bins since these are always known to be zero for real-valued inputs.3 Omitting these two components, the library might choose to output the data in a format like this,

[R0, R4, R1, I1, R2, I2, R3, I3, R5, I5, R6, I6, R7, I7]

Since the bins after the Nyquist bin are complex-conjugate mirrors of the bins before the Nyquist bin, some libraries will omit these bins from the output altogether. For example, you might receive your results in a format like this,

[R0, R4, R1, I1, R2, I2, R3, I3]


This page gives a pretty nice overview of the different formats you might encounter when trying to interpret the output of your FFT library. Note too, that the output layout may vary depending upon platform. Good Luck!

The second eccentricity you’ll encounter is that different FFT libraries apply the scaling factor at different points in the FFT⇄IFFT roundtrip. The scaling factor may be applied during the forward transform, during the inverse transform, or partly in the forward transform and partly in the inverse transform. The scaling factor is used to ensure that when performing an FFT and IFFT (Inverse Fast Fourier Transform) the IFFT will produce the original signal. For example, try the following in Octave,4

> fft([0, 1, 0, -1])
> ans =
  0 + 0i
  0 - 2i
  0 + 0i
  0 + 2i
> ifft(ans)
> ans = 0 1 0 -1

> fft([0, 0.707, 1, 0.707, 0, -0.707, -1, -0.707])
> ans =
  0.00000 + 0.00000i
  0.00000 - 3.99970i
  0.00000 + 0.00000i
  0.00000 + 0.00030i
  0.00000 + 0.00000i
  0.00000 - 0.00030i
  0.00000 - 0.00000i
  0.00000 + 3.99970i
> ifft(ans)
> ans = 0.00000 0.70700 1.00000 0.70700 0.00000 -0.70700 -1.00000 -0.70700

Note that the maximum bin magnitude after performing the FFT is related to the length of the input signal. For a four sample sine wave, the maximum is 2, and for an eight sample sine wave the maximum is 4. We need to account for the length of the input signal, and this is where the scaling factor comes in. We must divide our signal by N/2 at some point during the FFT/IFFT round trip if we want to get the original signal back when performing the IFFT.

Some libraries don't perform this scaling. FFTW will return a scaled output and force you to apply the scaling factor yourself. They have good reasons for this, which you can read about here. Other packages like Octave will apply the scaling factor during the IFFT. Some libraries will perform a little bit of the scaling during both the forward and inverse transform by dividing by SQRT(N/2) in both the inverse and forward transform.



Long Story Short
Read the documentation of your library before becoming frustrated and confused by its output. If your output doesn't look like you would expect it to, you should check the output layout and understand how the scaling factor is applied in your library of choice. This stuff is not standardized.
















1. Some libraries will only output the bin magnitudes. This sort of output is fine if you’re only interested in creating simple visualizers, but it won't be appropriate if you want to perform any kind of serious signal processing or transformations on the signal data.

2. Octave is something like an open-source version of MATLAB. If you’re on a Mac, installing Octave is as easy as installing homebrew and then typing 'brew install octave' into your terminal window.

















3. You can convince yourself of this fact by revisiting the sine-waves which are used to compute the imaginary component of the DC and Nyquist bins in our walk-through of the DFT. Note that all of the samples for these sines are zero-valued.
























4. Outputs are shown in grey if you don't have Octave handy.