Jan Stary
2014-03-18 11:09:21 UTC
There seem to be problems with the silence effect.
I believe it has been brought up some time ago,
but here is a (longer) complete story with examples.
This is the test file I will be testing on, using 14.4.1:
sox -D -n -c 1 file.wav synth 3 trap 440 sin 480 gain -6 pad ***@0 ***@1 ***@2 ***@3
That makes it three seconds of a dial tone, interpadded with four
seconds of silence: "silence TONE silence TONE silence TONE silence",
seven seconds in total.
First, basic silence trimming at the beginning of the file:
When above-periods is non-zero, you must also specify a duration
and threshold. Duration indications the amount of time that non-
silence must be detected before it stops trimming audio.
I think it would be an improvement if the manpage said explicitly
that the non-silence that is detected (and stops the trimming)
remains itself in the output stream; as opposed to only starting
the output with samples that come _after_ that non-silence.
At least that's how I understand what the manpage says,
and it is what each of the following commands does,
resulting in six seconds starting with the first tone.
sox file.wav out.wav silence 1 0.1 10%
sox file.wav out.wav silence 1 0.5 10%
sox file.wav out.wav silence 1 1.0 10%
There seems to be some rounding (buffers?) involved, e.g.
sox file.wav out.wav silence 1 1.01 10%
produces the same, although there is no occurrence
of a non-silence of length 1.01 in the source file.
On the other hand,
sox file.wav out.wav silence 1 1.02 10%
already does the expected thing, i.e. results in an empty file.
However, SoX does not fill in the zero length in the header:
Input File : 'out.wav'
Channels : 1
Sample Rate : 48000
Precision : 32-bit
Sample Encoding: 32-bit Signed Integer PCM
Now, trimming up to the _second_ non-silence
already presents a problem for me:
sox file.wav out.wav silence 2 0.1 10%
I would expect this to trim the leading "silence TONE silence"
and result in an output file starting with the second TONE
(as the second above-period). That's the intended behaviour, right?
For example, if you had an audio file with two songs that each
contained 2 seconds of silence before the song, you could specify
an above-period of 2 to strip out both silence periods and the first
song.
That's my situation. But no, the result is a 00:00:05.90 file
where the first silence and the first 0.1 second of the first
tone are removed. If this is the intended behaviour,
the two-songs example is wrong.
It seems that instead of the first TONE counting
as the first above-period (to be trimmed) and the second TONE
counting as the second above-period (to start the output),
only the first 0.1 seconds of the first TONE count as
the first above-period (trimmed), and after that the output begins.
That's what the above command seems to do.
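To make the two readings concrete, here is the arithmetic for file.wav as a small shell sketch. It assumes the layout described above (one second of leading silence, one-second tones, one-second gaps, seven seconds in total). The variable names are only illustrative, and nothing here calls SoX itself.

```shell
#!/bin/sh
# All lengths in milliseconds; layout of file.wav as described above (assumed).
total_ms=7000      # silence TONE silence TONE silence TONE silence
silence_ms=1000    # each silent stretch
tone_ms=1000       # each tone
duration_ms=100    # the "0.1" in "silence 2 0.1 10%"

# Reading A (what I expected): trim "silence TONE silence",
# so the output starts with the second TONE.
expected_ms=$((total_ms - silence_ms - tone_ms - silence_ms))

# Reading B (what seems to happen): trim the leading silence
# plus the first 0.1 s of the first TONE.
observed_ms=$((total_ms - silence_ms - duration_ms))

echo "expected: ${expected_ms} ms, observed: ${observed_ms} ms"
```

Reading B gives 5900 ms, which matches the 00:00:05.90 that the output file actually has; reading A would give 4000 ms.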
But is that intended? With the above two-songs example
from the manpage, specifying "silence 2 3 2%" would
just trim the first silence and the first 3 seconds
of the first song, as in my example, right? Let's try:
sox -D -n -c 1 songs.wav synth 60 trap 440 sin 480 gain -6 pad ***@0 ***@30
That's 00:02 of silence, 00:30 of song, 00:02 of silence, 00:30 of song,
as in the manpage example. Now running
sox songs.wav out.wav silence 1 3 10%
does the expected thing: trims the first 00:02 of silence away,
and leaves the rest as 00:30 + 00:02 + 00:30 of output.
Now running "sox songs.wav out.wav silence 2 3 10%" should trim
the first silence, the first song, and the second silence - right?
That's what the example says, but that's not the case:
the result is the same as before, i.e. only the first
00:02 of silence is removed.
That seems wrong, and is also inconsistent with the previous example:
if it were to do the same, the first 00:03 above-period (i.e. the first
00:03 of the first song) would be removed and the rest would
go in the output, right?
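The same arithmetic for songs.wav shows that the three candidate outcomes all differ. This is again only a sketch under the layout described above (00:02 silence, 00:30 song, 00:02 silence, 00:30 song); the variable names are illustrative and SoX is not invoked.

```shell
#!/bin/sh
# All lengths in seconds; layout of songs.wav as described above (assumed).
total=64       # 2 + 30 + 2 + 30
silence=2      # silence before each song
song=30        # each song
duration=3     # the "3" in "silence 2 3 10%"

# Manpage example: strip both silences and the first song.
manpage=$((total - silence - song - silence))

# Behaviour consistent with the file.wav case: strip the first
# silence and the first 3 seconds of the first song.
like_file_wav=$((total - silence - duration))

# What "silence 2 3 10%" actually leaves: only the first silence is gone.
actual=$((total - silence))

echo "manpage: ${manpage} s, like file.wav: ${like_file_wav} s, actual: ${actual} s"
```

That is 30, 59 and 62 seconds respectively, so the observed result agrees with neither the manpage example nor the earlier file.wav behaviour.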
Whichever the expected behaviour is, there seems to be a bug.
Or am I missing something in what the manpage says?
There are other problems with the silence effect
(trimming from the end), but let's resolve this first.
Thank you for your time
Jan