Discussion:
[SoX-devel] sox spectrogram patches
Martin Guy
2015-12-26 01:23:30 UTC
Permalink
I've started maintaining a branch of things in their absence.
Thanks Eric. I have three commits, all regarding "sox spectrogram":
- remove arbitrary limit on spectrogram output height (was 1200)
- add "spectrogram -n" flag to normalize the image brightness
regardless of audio file loudness
- add FFTW3 support, which is between 150 and 250 times faster than
the slow built-in FFT routine
see commentary in the patches for further details.

They are the last three commits on https://github.com/martinwguy/sox.git
which is a fork of the main sf.net repository and are also attached.

They only touch src/spectrogram.c, and are independent o each other
(i.e. you can apply any or all of them in any order as they do not
conflict with each other). They are also attached as patches.

This were all needed to be able to use "sox spectrogram" for the
production of log-frequency-axis spectrograms of music in
wikidelia.net

Cheers

M
Eric Wong
2015-12-26 10:43:16 UTC
Permalink
Post by Martin Guy
I've started maintaining a branch of things in their absence.
- remove arbitrary limit on spectrogram output height (was 1200)
Seems alright, did you check for possible integer overflow issues
from raising the limit? I was mildly alarmed at this:

+ if (end) memset(p->window, 0, sizeof(*(p->window)) * p->dft_size);

Until I noticed p->window was allocated with calloc which does overflow
checking:

p->window = lsx_calloc(p->dft_size + 1, sizeof(*(p->window)));
Post by Martin Guy
- add "spectrogram -n" flag to normalize the image brightness
regardless of audio file loudness
No issues here.
Post by Martin Guy
- add FFTW3 support, which is between 150 and 250 times faster than
the slow built-in FFT routine
Nice! I've tested with/without FFTW3 installed and can confirm the
speedup. I do have problems with eyesight and do not see well;
so I can't comment on actual images.

I've made some minor edits to configure.ac for portability and
robustness (I realize you're taking nearby statements as a
guideline, but I think we can do better when introducing new code)

- prefer AS_IF macro to generate portable "if" statements in sh

I also realize you're only testing auto-generated variables,
so the next two are unnecessary, but can aid in avoiding red flags
for other reviewers:

- avoid "-a" with "test" since it is error-prone compared to
using "&&" and multiple "test" statements.
E.g. test -n "$x" -a "$a" = "$b"
is buggy if "$x" is "="
(example taken from Documentation/CodingGuidelines in git.git)

- prefix variables with "x" in "test" conditionals.
E.g. test "$var" = yes
is buggy if "$var" is "-n" or other "test" operators
We avoid this problem with: test "x$var" = xyes

I'm also trying to keep configure.ac wrapped at 80 columns
(I wish 64 columns were standard, I want to use bigger fonts).

I also added #endif labels for the larger flow function.
I'd prefer if we could avoid CPP inside C functions entirely;
but that's a larger task and can be split into another patch.

So I'll squash the following and merge into my "pu" branch
for now:

diff --git a/configure.ac b/configure.ac
index 4bc3064..794f19c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -337,19 +337,18 @@ AC_ARG_WITH(fftw,
AS_HELP_STRING([--without-fftw],
[Don't try to use FFTW]))
using_fftw=no
-if test "$with_fftw" != "no"; then
- AC_CHECK_HEADERS(fftw3.h, using_fftw=yes)
- if test "$using_fftw" = "yes"; then
- AC_CHECK_LIB(fftw3, fftw_execute, FFTW_LIBS="$FFTW_LIBS -lfftw3", using_fftw=no)
- fi
- if test "$with_fftw" = "yes" -a "$using_fftw" = "no"; then
- AC_MSG_FAILURE([cannot find FFTW3])
- fi
-fi
-if test "$using_fftw" = yes; then
- AC_DEFINE(HAVE_FFTW, 1, [Define to 1 if you have FFTW3.])
-fi
-AM_CONDITIONAL(HAVE_FFTW, test x$using_fftw = xyes)
+AS_IF([test "x$with_fftw" != xno], [
+ AC_CHECK_HEADERS([fftw3.h], [using_fftw=yes])
+ AS_IF([test "x$using_fftw" = xyes],
+ [ AC_CHECK_LIB([fftw3], [fftw_execute],
+ [FFTW_LIBS="$FFTW_LIBS -lfftw3"], [using_fftw=no]) ])
+
+ AS_IF([test "x$with_fftw" = xyes && test "x$using_fftw" = xno],
+ [ AC_MSG_FAILURE([cannot find FFTW3]) ])
+])
+AS_IF([test "x$using_fftw" = xyes],
+ [ AC_DEFINE(HAVE_FFTW, 1, [Define to 1 if you have FFTW3.]) ])
+AM_CONDITIONAL(HAVE_FFTW, test x"$using_fftw" = xyes)
AC_SUBST(FFTW_LIBS)


diff --git a/src/spectrogram.c b/src/spectrogram.c
index 3d8c208..fb7313e 100644
--- a/src/spectrogram.c
+++ b/src/spectrogram.c
@@ -392,7 +392,7 @@ static int flow(sox_effect_t * effp,
]);
}
p->magnitudes[p->dft_size / 2] += sqr(p->dft_buf[p->dft_size / 2]);
-#else
+#else /* ! HAVE_FFTW */
if (is_p2(p->dft_size)) {
lsx_safe_rdft(p->dft_size, 1, p->dft_buf);
p->magnitudes[0] += sqr(p->dft_buf[0]);
@@ -401,7 +401,7 @@ static int flow(sox_effect_t * effp,
p->magnitudes[p->dft_size >> 1] += sqr(p->dft_buf[1]);
}
else rdft_p(*p->shared_ptr, p->dft_buf, p->magnitudes, p->dft_size);
-#endif
+#endif /* ! HAVE_FFTW */

if (++p->block_num == p->block_steps && do_column(effp) == SOX_EOF)
return SOX_EOF;
Post by Martin Guy
They are the last three commits on https://github.com/martinwguy/sox.git
which is a fork of the main sf.net repository and are also attached.
Thanks, I've also setup a mirror of yours at:

git://80x24.org/mirrors/martinwguy/sox.git

(as well as one for Mans @ s/martinwguy/mansr/)

And I forgot to mention I made a mirror of my own repo at:

git://repo.or.cz/sox/ew.git

------------------------------------------------------------------------------
Martin Guy
2015-12-26 13:37:47 UTC
Permalink
Post by Eric Wong
Post by Martin Guy
- remove arbitrary limit on spectrogram output height (was 1200)
Seems alright, did you check for possible integer overflow issues
from raising the limit?
Only by inspection and testing a range of heights/widths.
Post by Eric Wong
Post by Martin Guy
- add FFTW3 support, which is between 150 and 250 times faster than
the slow built-in FFT routine
I do have problems with eyesight and do not see well;
so I can't comment on actual images.
"cmp" tells me that the PNG files are identical with/without FFTW
Post by Eric Wong
I've made some minor edits to configure.ac for portability and
robustness
Yes, I've done relatively little configure.ac hacking and was hoping
someone who knows better might improve things. Thanks for the heads-up
on these possible issues
Post by Eric Wong
I'd prefer if we could avoid CPP inside C functions entirely;
but that's a larger task and can be split into another patch.
Do you mean, where there is a configure option, have two separate
functions, one with the HAVE_ code and one with the HAVE_NOT code?
Yes, it's ugly, I agree.
One alternative would be to use "if (HAVE_FFTW)" and let the compiler
turn if(0) or if(1) into constant code removal.

Thanks again for stepping in to work on this. I was about to leave sox
to die, given that its fathers have abandoned it. If we can get a last
twitch of life out of them, it would be best they appoint you as its
official maintainer.

Blessings

M

------------------------------------------------------------------------------
Eric Wong
2015-12-26 21:18:13 UTC
Permalink
Post by Martin Guy
"cmp" tells me that the PNG files are identical with/without FFTW
Thanks. I'll note that if I get around to making a test suite.
Post by Martin Guy
Yes, I've done relatively little configure.ac hacking and was hoping
someone who knows better might improve things. Thanks for the heads-up
on these possible issues
No problem. I'm still not too experienced in this area,
either but I started forcing myself to learn in another project.
Post by Martin Guy
Post by Eric Wong
I'd prefer if we could avoid CPP inside C functions entirely;
but that's a larger task and can be split into another patch.
Do you mean, where there is a configure option, have two separate
functions, one with the HAVE_ code and one with the HAVE_NOT code?
Yes, it's ugly, I agree.
One alternative would be to use "if (HAVE_FFTW)" and let the compiler
turn if(0) or if(1) into constant code removal.
I mean defining functions with common signatures so using
them is transparent to higher-level callers.

The #ifdef in the struct seems inevitable, but below I've created
shared_{start,stop} functions which do the same thing inside
their respective "start" and "stop" functions. It helps identify
common code and even brought me to find a memory leak (see below).

diff --git a/src/spectrogram.c b/src/spectrogram.c
index fb7313e..1b82aad 100644
--- a/src/spectrogram.c
+++ b/src/spectrogram.c
@@ -70,7 +70,11 @@ typedef struct {
sox_bool using_stdout; /* output image to stdout */

/* Shared work area */
+#if HAVE_FFTW
+ fftw_plan fftw_plan; /* Used if FFT_type == FFT_fftw */
+#else
double * shared, * * shared_ptr;
+#endif

/* Per-channel work area */
int WORK; /* Start of work area is marked by this dummy variable. */
@@ -84,9 +88,6 @@ typedef struct {
double block_norm, max;
double * magnitudes; /* [(dft_size / 2) + 1] */
float * dBfs;
-#if HAVE_FFTW
- fftw_plan fftw_plan; /* Used if FFT_type == FFT_fftw */
-#endif
} priv_t;

#define secs(cols) \
@@ -169,7 +170,6 @@ static int getopts(sox_effect_t * effp, int argc, char **argv)
p->spectrum_points += 2;
if (p->alt_palette)
p->spectrum_points = min(p->spectrum_points, (int)alt_palette_len);
- p->shared_ptr = &p->shared;
if (!strcmp(p->out_name, "-")) {
if (effp->global_info->global_info->stdout_in_use_by) {
lsx_fail("stdout already in use by `%s'", effp->global_info->global_info->stdout_in_use_by);
@@ -204,8 +204,21 @@ static double make_window(priv_t * p, int end)
return sum;
}

-#if !HAVE_FFTW
+#if HAVE_FFTW
+/* Initialize the FFT routine */
+static void shared_start(sox_effect_t * effp)
+{
+ priv_t * p = (priv_t *)effp->priv;
+ /* We have one FFT plan per flow because the input/output arrays differ. */
+ p->fftw_plan = fftw_plan_r2r_1d(p->dft_size, p->dft_buf, p->dft_buf,
+ FFTW_R2HC, FFTW_MEASURE);
+}
+
+static void shared_stop(priv_t * p) {
+ (void)p; /* nothing */
+}

+#else /* !HAVE_FFTW */
static double * rdft_init(size_t n)
{
double * q = lsx_malloc(2 * (n / 2 + 1) * n * sizeof(*q)), * p = q;
@@ -227,7 +240,24 @@ static void rdft_p(double const * q, double const * in, double * out, int n)
}
}

-#endif /* HAVE_FFTW */
+static void shared_start(sox_effect_t * effp)
+{
+ priv_t * p = (priv_t *)effp->priv;
+
+ p->shared_ptr = &p->shared;
+ if (!is_p2(p->dft_size) && !effp->flow) {
+ if (p->y_size) {
+ p->shared = rdft_init((size_t)(p->dft_size));
+ }
+ lsx_safe_rdft(p->dft_size, 1, p->dft_buf);
+ }
+}
+
+static void shared_stop(priv_t * p)
+{
+ free(p->shared);
+}
+#endif /* !HAVE_FFTW */

static int start(sox_effect_t * effp)
{
@@ -277,10 +307,6 @@ static int start(sox_effect_t * effp)

if (p->y_size) {
p->dft_size = 2 * (p->y_size - 1);
-#if !HAVE_FFTW
- if (!is_p2(p->dft_size) && !effp->flow)
- p->shared = rdft_init((size_t)(p->dft_size));
-#endif
} else {
int y = max(32, (p->Y_size? p->Y_size : 550) / effp->in_signal.channels - 2);
for (p->dft_size = 128; p->dft_size <= y; p->dft_size <<= 1);
@@ -292,15 +318,7 @@ static int start(sox_effect_t * effp)
p->window = lsx_calloc(p->dft_size + 1, sizeof(*(p->window)));
p->magnitudes = lsx_calloc((p->dft_size / 2) + 1, sizeof(*(p->magnitudes)));

- /* Initialize the FFT routine */
-#if HAVE_FFTW
- /* We have one FFT plan per flow because the input/output arrays differ. */
- p->fftw_plan = fftw_plan_r2r_1d(p->dft_size, p->dft_buf, p->dft_buf,
- FFTW_R2HC, FFTW_MEASURE);
-#else
- if (is_p2(p->dft_size) && !effp->flow)
- lsx_safe_rdft(p->dft_size, 1, p->dft_buf);
-#endif
+ shared_start(effp);

lsx_debug("duration=%g x_size=%i pixels_per_sec=%g dft_size=%i", duration, p->x_size, pixels_per_sec, p->dft_size);

@@ -596,7 +614,7 @@ static int stop(sox_effect_t * effp) /* only called, by end(), on flow 0 */
double limit;
float autogain = 0.0; /* Is changed if the -n flag was supplied */

- free(p->shared);
+ shared_stop(p);
if (p->using_stdout) {
SET_BINARY_MODE(stdout);
file = stdout;

Which also brings me to the question:

Do we need to we need to teardown fftw_plan to avoid resource leaks?

valgrind --leak-check=full says we do:

18,040 (24 direct, 18,016 indirect) bytes in 1 blocks are definitely lost in loss record 204 of 206
at 0x4C27ED6: memalign (vg_replace_malloc.c:755)
by 0x5D67944: fftw_malloc_plain (in /usr/lib/x86_64-linux-gnu/libfftw3.so.3.3.2)
by 0x5E2EFBE: fftw_mkapiplan (in /usr/lib/x86_64-linux-gnu/libfftw3.so.3.3.2)
by 0x5E338E2: fftw_plan_many_r2r (in /usr/lib/x86_64-linux-gnu/libfftw3.so.3.3.2)
by 0x5E33A28: fftw_plan_r2r (in /usr/lib/x86_64-linux-gnu/libfftw3.so.3.3.2)
by 0x5E33948: fftw_plan_r2r_1d (in /usr/lib/x86_64-linux-gnu/libfftw3.so.3.3.2)
by 0x4E8560F: start (spectrogram.c:298)
by 0x4E5CF4F: sox_add_effect (effects.c:157)
by 0x406EB0: add_effect (sox.c:708)
by 0x403CED: main (sox.c:1073)

24,744 (24 direct, 24,720 indirect) bytes in 1 blocks are definitely lost in loss record 205 of 206
at 0x4C27ED6: memalign (vg_replace_malloc.c:755)
by 0x5D67944: fftw_malloc_plain (in /usr/lib/x86_64-linux-gnu/libfftw3.so.3.3.2)
by 0x5E2EFBE: fftw_mkapiplan (in /usr/lib/x86_64-linux-gnu/libfftw3.so.3.3.2)
by 0x5E338E2: fftw_plan_many_r2r (in /usr/lib/x86_64-linux-gnu/libfftw3.so.3.3.2)
by 0x5E33A28: fftw_plan_r2r (in /usr/lib/x86_64-linux-gnu/libfftw3.so.3.3.2)
by 0x5E33948: fftw_plan_r2r_1d (in /usr/lib/x86_64-linux-gnu/libfftw3.so.3.3.2)
by 0x4E8560F: start (spectrogram.c:298)
by 0x4E5D0D7: sox_add_effect (effects.c:202)
by 0x406EB0: add_effect (sox.c:708)
by 0x403CED: main (sox.c:1073)

Btw, I haven't touched "flow", that is an exercise for the reader :)
Post by Martin Guy
Thanks again for stepping in to work on this. I was about to leave sox
to die, given that its fathers have abandoned it. If we can get a last
twitch of life out of them, it would be best they appoint you as its
official maintainer.
No problem. I hope they come back. It is certainly a scary thought
if I were to become an official... anything :x

------------------------------------------------------------------------------
Martin Guy
2015-12-26 22:14:52 UTC
Permalink
Post by Eric Wong
--- a/src/spectrogram.c
+++ b/src/spectrogram.c
@@ -70,7 +70,11 @@ typedef struct {
sox_bool using_stdout; /* output image to stdout */
/* Shared work area */
+#if HAVE_FFTW
+ fftw_plan fftw_plan; /* Used if FFT_type == FFT_fftw */
+#else
double * shared, * * shared_ptr;
+#endif
/* Per-channel work area */
int WORK; /* Start of work area is marked by this dummy variable. */
@@ -84,9 +88,6 @@ typedef struct {
double block_norm, max;
double * magnitudes; /* [(dft_size / 2) + 1] */
float * dBfs;
-#if HAVE_FFTW
- fftw_plan fftw_plan; /* Used if FFT_type == FFT_fftw */
-#endif
} priv_t;
#define secs(cols) \
+ /* We have one FFT plan per flow because the input/output arrays differ. */
+ p->fftw_plan = fftw_plan_r2r_1d(p->dft_size, p->dft_buf, p->dft_buf,
+ FFTW_R2HC, FFTW_MEASURE);
+}
This may not be right. FFTW3 plans depend on the memory addresses of
the input and output vectors, so if you have two FFTs that are exactly
the same except for the input and output buffer addresses, they need
separate plans.
In this case, the plan depends on dft_buf, which seems to be specific
to each flow, so I think you should have a separare plan for each
flow.

I don't know what "shared" is used for, so I just left it alone. AFter
all, it's ore important that the code work, than to save four bytes of
RAM and risk breaking it...

By the way, that "/* Used if FFT_type == FFT_fftw */" comment is
stale, dating back to when I had an option to choose the FFT algorithm
at runtime - sorry about that...

Thanks

M

------------------------------------------------------------------------------
Eric Wong
2015-12-26 23:15:53 UTC
Permalink
Post by Martin Guy
Post by Eric Wong
--- a/src/spectrogram.c
+++ b/src/spectrogram.c
@@ -70,7 +70,11 @@ typedef struct {
sox_bool using_stdout; /* output image to stdout */
/* Shared work area */
+#if HAVE_FFTW
+ fftw_plan fftw_plan; /* Used if FFT_type == FFT_fftw */
+#else
double * shared, * * shared_ptr;
+#endif
/* Per-channel work area */
int WORK; /* Start of work area is marked by this dummy variable. */
@@ -84,9 +88,6 @@ typedef struct {
double block_norm, max;
double * magnitudes; /* [(dft_size / 2) + 1] */
float * dBfs;
-#if HAVE_FFTW
- fftw_plan fftw_plan; /* Used if FFT_type == FFT_fftw */
-#endif
} priv_t;
#define secs(cols) \
+ /* We have one FFT plan per flow because the input/output arrays differ. */
+ p->fftw_plan = fftw_plan_r2r_1d(p->dft_size, p->dft_buf, p->dft_buf,
+ FFTW_R2HC, FFTW_MEASURE);
+}
This may not be right. FFTW3 plans depend on the memory addresses of
the input and output vectors, so if you have two FFTs that are exactly
the same except for the input and output buffer addresses, they need
separate plans.
In this case, the plan depends on dft_buf, which seems to be specific
to each flow, so I think you should have a separare plan for each
flow.
I'm not entirely sure what you mean, here. Basically the shared_start
function doesn't change anything for FFTW users; and I've verified the
PNGs of using FFTW with/without my work-in-progress patch via cmp(1)
with identical results.

Or did you mean the location of fftw_plan field inside priv_t?
That doesn't seem to matter, does it?

I don't know the FFT stuff and math stuff well at all,
so I could be wrong.
Post by Martin Guy
I don't know what "shared" is used for, so I just left it alone. AFter
all, it's ore important that the code work, than to save four bytes of
RAM and risk breaking it...
Heh, I did break it --without-fftw :x I wasn't even thinking about
the RAM usage, but about documenting how the structure would be used.

Here's my work-in-progress fix, output now matches the FFTW-using PNG:

diff --git a/src/spectrogram.c b/src/spectrogram.c
index 1b82aad..d1e98c5 100644
--- a/src/spectrogram.c
+++ b/src/spectrogram.c
@@ -72,9 +72,8 @@ typedef struct {
/* Shared work area */
#if HAVE_FFTW
fftw_plan fftw_plan; /* Used if FFT_type == FFT_fftw */
-#else
- double * shared, * * shared_ptr;
#endif
+ double * shared, * * shared_ptr; /* unused for FFTW */

/* Per-channel work area */
int WORK; /* Start of work area is marked by this dummy variable. */
@@ -170,6 +169,7 @@ static int getopts(sox_effect_t * effp, int argc, char **argv)
p->spectrum_points += 2;
if (p->alt_palette)
p->spectrum_points = min(p->spectrum_points, (int)alt_palette_len);
+ p->shared_ptr = &p->shared;
if (!strcmp(p->out_name, "-")) {
if (effp->global_info->global_info->stdout_in_use_by) {
lsx_fail("stdout already in use by `%s'", effp->global_info->global_info->stdout_in_use_by);
@@ -244,12 +244,14 @@ static void shared_start(sox_effect_t * effp)
{
priv_t * p = (priv_t *)effp->priv;

- p->shared_ptr = &p->shared;
- if (!is_p2(p->dft_size) && !effp->flow) {
- if (p->y_size) {
- p->shared = rdft_init((size_t)(p->dft_size));
+ if (!effp->flow) {
+ if (!is_p2(p->dft_size)) {
+ if (p->y_size) {
+ p->shared = rdft_init((size_t)(p->dft_size));
+ }
+ } else {
+ lsx_safe_rdft(p->dft_size, 1, p->dft_buf);
}
- lsx_safe_rdft(p->dft_size, 1, p->dft_buf);
}
}
Post by Martin Guy
By the way, that "/* Used if FFT_type == FFT_fftw */" comment is
stale, dating back to when I had an option to choose the FFT algorithm
at runtime - sorry about that...
No worries. Better we catch our mistakes now instead of
throwing a nasty heap at Chris and the gang when they
decide to return :)

------------------------------------------------------------------------------
Måns Rullgård
2015-12-28 22:11:03 UTC
Permalink
Post by Eric Wong
Post by Martin Guy
Thanks again for stepping in to work on this. I was about to leave sox
to die, given that its fathers have abandoned it. If we can get a last
twitch of life out of them, it would be best they appoint you as its
official maintainer.
No problem. I hope they come back. It is certainly a scary thought
if I were to become an official... anything :x
I'd be willing to co-maintain it, should it come to that.
--
Måns Rullgård

------------------------------------------------------------------------------
Eric Wong
2015-12-29 22:26:26 UTC
Permalink
Post by Måns Rullgård
I'd be willing to co-maintain it, should it come to that.
Thank you, much appreciated!

------------------------------------------------------------------------------
Loading...