Building SoundFrequencyMapperFFT: Fast FFT-Based Frequency Mapping in C++

SoundFrequencyMapperFFT Tutorial: From Waveform to Frequency Bins

This tutorial walks through building a SoundFrequencyMapperFFT — a straightforward pipeline that converts an audio waveform into frequency bins using the Fast Fourier Transform (FFT). It covers concepts, windowing, FFT implementation, mapping bins to frequencies, and a simple C++ example using the FFTW library. Assumptions: mono PCM audio (floating-point), sample rate 44.1 kHz, and buffer sizes that are powers of two.

1. Overview — pipeline steps

  1. Capture or load PCM waveform samples.
  2. Apply a window function to reduce spectral leakage.
  3. Compute FFT on the windowed samples.
  4. Convert complex FFT output to magnitude (frequency spectrum).
  5. Map FFT bins to frequency values and group into desired bins (linear or logarithmic).
  6. Post-process (smoothing, peak detection, normalization) and visualize or use for analysis.

2. Key concepts

  • Sample rate (Fs): number of samples per second (Hz).
  • Frame / window size (N): number of samples per FFT; determines frequency resolution Δf = Fs / N.
  • Nyquist frequency: Fs/2, the highest representable frequency. FFT outputs N complex bins; positive frequencies are in bins 0..N/2.
  • Windowing: common windows — Hann, Hamming, Blackman — reduce leakage at the cost of widening peaks.
  • Magnitude spectrum: sqrt(re^2 + im^2) or use hypot(re, im); often converted to dB via 20log10(mag).
  • Bin frequency: bin k corresponds to k(Fs/N). For real input, use k = 0..N/2.

3. Choosing N and hop size

  • Frequency resolution Δf = Fs/N. For 1 Hz resolution at 44.1 kHz, N=44100 (large, high latency).
  • Time resolution trades off with frequency resolution. Choose hop size (overlap) typically 50%–75% of N for smoother spectrograms.

4. Window functions

Use a Hann window for general-purpose analysis: w[n] = 0.5(1 – cos(2πn/(N-1))), n=0..N-1. Apply sample-wise: xw[n] = x[n] * w[n].

5. FFT implementation notes

  • Use an efficient FFT library (FFTW, KissFFT, Intel MKL).
  • For real-valued input, use real-to-complex transforms to save computation and memory.
  • Normalize FFT output depending on library: some scale by N, others do not.

6. Mapping bins to frequency bins (linear and logarithmic)

  • Linear mapping: each FFT bin is its own frequency. For grouping, sum or average magnitudes across target bin ranges.
  • Logarithmic (octave or mel-like): map FFT bins to log-spaced bins — useful for audio perception and visualization. Compute target bin edges in Hz, then include FFT bins whose center frequencies fall within each edge range.

7. Post-processing

  • Convert magnitudes to dB: magdb = 20*log10(mag + ε).
  • Smoothing: exponential moving average per bin: s[t] = α*s[t-1] + (1-α)*current.
  • Peak detection: find local maxima across neighboring bins and/or frames.

8. C++ example using FFTW

  • Requirements: FFTW library installed. Link with -lfftw3 -lm.
  • Example code (mono float input, N power of two, Hann window, real-to-complex FFT, linear mapping to N/2+1 bins):

cpp

#include #include #include #include constexpr int N = 4096; constexpr double Fs = 44100.0; std::vector<double> make_hann(int N){ std::vector<double> w(N); for(int n=0;n<N;n++) w[n]=0.5(1.0 - cos(2.0M_PIn/(N-1))); return w; } int main(){ // Example: fill ‘in’ with N samples from audio source std::vector<double> in(N); // TODO: replace with real audio read for(int i=0;i<N;i++) in[i]=0.0; auto window = make_hann(N); std::vector<double> inw(N); for(int i=0;i<N;i++) inw[i]=in[i]window[i]; int outSize = N/2 + 1; std::vector<fftw_complex> out(outSize); fftw_plan plan = fftw_plan_dft_r2c_1d(N, inw.data(), out.data(), FFTW_ESTIMATE); fftw_execute(plan); std::vector<double> magn(outSize); for(int k=0;k<outSize;k++){ double re = out[k][0]; double im = out[k][1]; magn[k] = sqrt(rere + imim) / N; // normalize by N } // Map bin k to frequency f = k * Fs / N for(int k=0;k<outSize;k++){ double freq = k * Fs / N; printf(“Bin %d: f=%.2f Hz, mag=%.6f “, k, freq, magn[k]); } fftw_destroy_plan(plan); return 0; }

9. Practical tips

  • Use overlap-add or circular buffers for continuous streaming.
  • Apply pre-emphasis (high-pass) if focusing on high-frequency content.
  • Choose window and N based on trade-offs (speech vs music vs instrumentation).
  • For latency-sensitive apps, prefer smaller N and efficient libraries; consider hardware FFTs.

10. Further extensions

  • Compute mel-spectrograms by mapping FFT magnitudes to mel bands.
  • Use peak-picking and harmonic grouping for pitch detection.
  • Visualize with a log-frequency spectrogram or real-time waterfall display.

This provides a compact, actionable path from raw waveform to frequency bins using FFT. Use the C++ example with an actual audio input source and adjust N, window, and mapping to fit your application.

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *