Recognize WAV files with silence in Java

Question

I need a function in Java, something like this:

Input: .wav file (or byte[] fileBytes)
Output: true/false (the file consists of silence only)

What is the best way to do it?

Thank you.

UPDATE:

  1. The command that I use for recording:

    arecord --format=S16_LE --max-file-time=60 --rate=16000 --file-type=wav randomvalue_i.wav

  2. Silent = no audio at all

Answer 1

Score: 2

The short answer is that you'll want to scan the .WAV data and compute the min/max of the sample values. In a "silent" file, the values should essentially all be 0.

The longer answer is that you'll want to understand the .WAV format, which is described here (http://soundfile.sapp.org/doc/WaveFormat/). You can probably skip over the first 44 bytes (the RIFF header and 'fmt ' chunk of a canonical PCM file) to get down to the data, then start looking at the bytes. The 'bits-per-sample' value from the header might be important, as 16-bit samples mean you'd need to combine 2 bytes to get a single sample. Even so, both bytes would be 0 for a silent 16-bit sample. Ditto for NumChannels: in theory you should take it into account, but again, the samples of every channel should be 0 for true silence. If all the data is 0, the file is silent.
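
The 44-byte offset holds only for the canonical header layout; some tools write extra chunks (for example a LIST chunk) before the samples. As a minimal sketch, assuming a well-formed RIFF/WAVE byte array, the 'data' chunk can be located by walking the chunk list instead. The class name `WavDataLocator` and the method `findDataOffset` are hypothetical names chosen for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper: locates the sample data inside a RIFF/WAVE byte array. */
final class WavDataLocator {

    /** Returns the offset of the first sample byte, or -1 if no "data" chunk is found. */
    static int findDataOffset(byte[] wav) {
        if (wav.length < 12 || !tagAt(wav, 0, "RIFF") || !tagAt(wav, 8, "WAVE")) {
            return -1;
        }
        // Chunk sizes are stored as little-endian 32-bit values.
        ByteBuffer buf = ByteBuffer.wrap(wav).order(ByteOrder.LITTLE_ENDIAN);
        int pos = 12; // the first sub-chunk follows the 12-byte RIFF header
        while (pos + 8 <= wav.length) {
            long size = buf.getInt(pos + 4) & 0xFFFFFFFFL; // treat the size as unsigned
            if (tagAt(wav, pos, "data")) {
                return pos + 8; // samples start right after the 8-byte chunk header
            }
            // Skip this chunk; chunks are padded to an even number of bytes.
            long next = pos + 8 + size + (size & 1);
            if (next > wav.length) {
                return -1;
            }
            pos = (int) next;
        }
        return -1;
    }

    /** Checks whether the 4-character ASCII tag starts at the given offset. */
    private static boolean tagAt(byte[] b, int off, String tag) {
        for (int i = 0; i < 4; i++) {
            if (b[off + i] != (byte) tag.charAt(i)) {
                return false;
            }
        }
        return true;
    }
}
```

For a file with the canonical layout this returns 44, so the returned value can stand in for the hard-coded offset when scanning the bytes.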

"Silent" is a bit ambiguous. Above, I was strict and assumed it meant true 0 only. However, in a silent room there would still be a very low level of background ambient noise. In that case, you'd need to be a bit more forgiving in the comparison, e.g. calculate the min/max over the samples and ensure that the range is within some tolerance. It can still be determined, but it takes more code.

For completeness:

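// Strict check: returns true only if every byte after the header is 0.
// Assumes the canonical 44-byte WAV header.
public boolean isSilent(byte[] info) {
    for (int idx = 44; idx < info.length; ++idx) {
        if (info[idx] != 0)
            return false;
    }
    return true;
}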
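
The more forgiving comparison mentioned above could be sketched as follows, assuming 16-bit little-endian PCM (the S16_LE format from the question). The class name `PeakSilenceCheck`, the `dataOffset` parameter, and the `tolerance` parameter are hypothetical; a suitable tolerance depends on the recording setup and would need to be tuned.

```java
/** Hypothetical helper: peak-amplitude silence check for 16-bit little-endian PCM. */
final class PeakSilenceCheck {

    /**
     * @param wav        full file bytes
     * @param dataOffset offset of the first sample byte (44 for the canonical header)
     * @param tolerance  largest absolute sample value still treated as silence (assumed value, tune it)
     */
    static boolean isSilent(byte[] wav, int dataOffset, int tolerance) {
        for (int i = dataOffset; i + 1 < wav.length; i += 2) {
            // S16_LE: low byte first; the high byte carries the sign.
            int sample = (wav[i] & 0xFF) | (wav[i + 1] << 8);
            if (Math.abs(sample) > tolerance) {
                return false; // one sample outside the tolerance is enough
            }
        }
        return true;
    }
}
```

With a tolerance of 0 this behaves like the strict check. Because channels are interleaved sample by sample, the same loop covers mono and multi-channel files.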

Answer 2

Score: 0

I wrote a function that seems to do a really good job of detecting silence vs. non-silence:

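// StandardDeviation is assumed to be the Apache Commons Math class; logger is assumed to be an SLF4J-style logger.
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;
import java.util.Arrays;
import org.apache.commons.math3.stat.descriptive.moment.StandardDeviation;

private boolean isSilent(byte[] byteArray) {
    // Reads the bytes as big-endian 32-bit ints; the threshold below is tied to this interpretation.
    IntBuffer intBuf = ByteBuffer.wrap(byteArray).order(ByteOrder.BIG_ENDIAN).asIntBuffer();
    int[] array = new int[intBuf.remaining()];
    intBuf.get(array);
    StandardDeviation sd = new StandardDeviation();
    double[] doubles = Arrays.stream(array).asDoubleStream().toArray();
    double stddev = sd.evaluate(doubles);
    logger.info("stddev: {}", stddev);
    return !(stddev > 10000000D);
}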

It basically analyzes the sound: if the standard deviation is small, it presumes the clip is "mostly" quiet, and if the standard deviation is larger, it presumes the clip is not silent. The difference between "silent" or "quiet" clips and ones with sound is fairly large. I found that a value above roughly 10^6 indicates that there is no silence in the audio clip (the code uses a cutoff of 10^7).
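
For comparison, here is a dependency-free sketch of the same idea. It is a variant, not the function above: it decodes the bytes as 16-bit little-endian samples (the S16_LE format from the question) and computes the sample standard deviation with Welford's running update. Because the values are on a different scale, the thresholds quoted above do not carry over; the class name `SampleStdDev` and the `threshold` parameter are hypothetical, and the cutoff would have to be found experimentally.

```java
/** Hypothetical helper: standard deviation of 16-bit little-endian PCM samples. */
final class SampleStdDev {

    /** Sample (n - 1) standard deviation of the samples starting at dataOffset. */
    static double stdDev(byte[] wav, int dataOffset) {
        long n = 0;
        double mean = 0.0;
        double m2 = 0.0; // running sum of squared deviations from the mean
        for (int i = dataOffset; i + 1 < wav.length; i += 2) {
            // S16_LE: low byte first; the high byte carries the sign.
            int sample = (wav[i] & 0xFF) | (wav[i + 1] << 8);
            n++;
            double delta = sample - mean;
            mean += delta / n;
            m2 += delta * (sample - mean);
        }
        return n < 2 ? 0.0 : Math.sqrt(m2 / (n - 1));
    }

    /** Treats the clip as silent when the deviation does not exceed the (assumed) threshold. */
    static boolean isSilent(byte[] wav, int dataOffset, double threshold) {
        return stdDev(wav, dataOffset) <= threshold;
    }
}
```

A single pass with a running mean avoids allocating an intermediate array, which may matter for 60-second recordings processed in bulk.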

Answer 3

Score: -2

You could have a .wav file that is what you consider "silence" and compare it to the other .wav file to see if they have the same frequency.

huangapple
  • Posted on October 4, 2020, 19:28:23
  • When reposting, please keep the link to this article: https://go.coder-hub.com/64194039.html