Martin Guy
2015-12-18 20:32:34 UTC
Hi! I've been looking at "sox spectrogram" internals, and would
appreciate some feedback on my changes before I submit them. My
application wraps sox in a shell script to apply a log frequency axis
to a raw spectrogram. The changes I needed were:
- remove the maximum graph height limit (was 1200 for no obvious reason)
- add FFTW3 support for a 150-times speed increase on FFTs of size
other than 2^n
spectrogram has two FFT routines: lsx_safe_rdft(), which is fast but
only works for DFT sizes that are a power of two (-y values of a power
of two plus one), and its own private "rdft_p()" for all other sizes.
Surprisingly, for power-of-two sizes the built-in lsx_safe_rdft() is
faster than FFTW3.
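
To make the split between the two routines concrete, here is a minimal
sketch of the dispatch, not the actual sox code. It assumes the DFT size
is 2 * (height - 1), which is what "-y for powers of two plus one"
suggests; the helper names are made up for illustration.

```python
def is_power_of_two(n: int) -> bool:
    """True if n is a positive power of two (1, 2, 4, 8, ...)."""
    return n > 0 and (n & (n - 1)) == 0


def dft_size_for_height(height: int) -> int:
    """Assumed relation: height = dft_size / 2 + 1 frequency bins."""
    return 2 * (height - 1)


def pick_routine(height: int) -> str:
    """Name the routine that would handle a graph of this height (-y)."""
    if is_power_of_two(dft_size_for_height(height)):
        return "lsx_safe_rdft"  # fast path, power-of-two sizes only
    return "rdft_p"             # slow table-based fallback
```

Under that assumption, -y 129 or -y 1025 would take the fast path, while
-y 1200 or -y 8192 would fall through to rdft_p().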
For rdft_p() it's very different. As well as being between 60 and 150
times slower than lsx_safe_rdft(), it pre-allocates an immense array of
2 * N x N/2 doubles, which is 2GB for 8192-sized FFTs and 8GB for
16384-height.
By comparison, lsx_safe_rdft() uses 200MB for an 8192-sized DFT, not 2GB.
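
The table size can be checked with a little arithmetic. This is a rough
sketch that takes the 2 * N x N/2 formula at face value and assumes
8-byte doubles; the function name is made up for illustration.

```python
BYTES_PER_DOUBLE = 8  # assumption: IEEE 754 double precision


def rdft_p_table_bytes(dft_size: int) -> int:
    """Bytes needed for a 2 * N x N/2 table of doubles, with N = dft_size."""
    return 2 * dft_size * (dft_size // 2) * BYTES_PER_DOUBLE


for n in (8192, 16384, 32768):
    print(f"N = {n:5d}: {rdft_p_table_bytes(n) / 2**30:.1f} GiB")
```

With this formula the table grows as N squared, so doubling N
quadruples the memory. The 2GB and 8GB figures come out at N = 16384 and
N = 32768, which fits if the sizes quoted above are graph heights and
the DFT is about twice the height; that reading is my assumption.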
So, my question is, should I just dump rdft_p() and require FFTW
support for non-power-of-two FFTs? Or is it better to keep it, with
all its limitations, so that sox spectrogram will continue to work at
all sizes when sox is compiled without FFTW support?
Thanks
M
------------------------------------------------------------------------------