Why does the glissando frequency go too high when using Java's AudioSystem?

Question

I am trying to create a glissando (a smooth pitch rise) from a start note to an end note (Java code below). I raise the frequency linearly from the start note's frequency to the stop note's frequency like this:

    for (i = 0; i < b1.length; i++) {
        instantFrequency = startFrequency + (i * deltaFreq / nrOfSamples);
        b1[i] = (byte) (127 * Math.sin(2 * Math.PI * instantFrequency * i / sampleRate));
    }

In the resulting audio fragment, the end of the glissando clearly has a higher pitch than the stop note. Is there something wrong with my math or is there an audiological reason why this rising sine seems to overshoot? Any ideas are greatly appreciated!

    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.File;
    import java.io.IOException;
    import javax.sound.sampled.AudioFileFormat;
    import javax.sound.sampled.AudioFileFormat.Type;
    import javax.sound.sampled.AudioFormat;
    import javax.sound.sampled.AudioInputStream;
    import javax.sound.sampled.AudioSystem;

    public static void main(String[] args) throws IOException {
        int sampleRate = 44100;
        int sampleSizeInBits = 8;
        int nrOfChannels = 1;

        // 0.5 s at 220 Hz, a 4 s glissando from 220 Hz to 440 Hz, then 2 s at 440 Hz
        byte[] sine220 = createTimedSine(220, sampleRate, 0.5);
        byte[] gliss220to440 = createTimedGlissando(220, 440, sampleRate, 4);
        byte[] sine440 = createTimedSine(440, sampleRate, 2);
        byte[] fullWave = concatenate(sine220, gliss220to440, sine440);

        // signed 8-bit mono, so one byte per sample frame
        AudioInputStream stream = new AudioInputStream(new ByteArrayInputStream(fullWave),
                new AudioFormat(sampleRate, sampleSizeInBits, nrOfChannels, true, false), fullWave.length);

        // path and filename are defined elsewhere (not shown in the question)
        File fileOut = new File(path, filename);
        Type wavType = AudioFileFormat.Type.WAVE;
        try {
            AudioSystem.write(stream, wavType, fileOut);
        } catch (IOException e) {
            System.out.println("Error writing output file '" + filename + "': " + e.getMessage());
        }
    }

    public static byte[] createTimedSine(float frequency, int samplingRate, double duration) {
        int nrOfSamples = (int) Math.round(duration * samplingRate);
        return (createSampledSine(nrOfSamples, frequency, samplingRate));
    }

    public static byte[] createSampledSine(int nrOfSamples, float frequency, int sampleRate) {
        byte[] b1 = new byte[nrOfSamples];

        int i;
        for (i = 0; i < b1.length; i++) {
            b1[i] = (byte) (127 * Math.sin(2 * Math.PI * frequency * i / sampleRate));
        }
        System.out.println("Freq of sine: " + frequency);
        return b1;
    }

    public static byte[] createTimedGlissando(float startFrequency, float stopFrequency, int samplingRate,
            double duration) {
        int nrOfSamples = (int) Math.round(duration * samplingRate);

        return (createGlissando(nrOfSamples, startFrequency, stopFrequency, samplingRate));
    }

    public static byte[] createGlissando(int nrOfSamples, float startFrequency, float stopFrequency, int sampleRate) {
        byte[] b1 = new byte[nrOfSamples];
        float deltaFreq = (stopFrequency - startFrequency);
        float instantFrequency = 0;
        int i;
        for (i = 0; i < b1.length; i++) {
            // this is the loop the question is about
            instantFrequency = startFrequency + (i * deltaFreq / nrOfSamples);
            b1[i] = (byte) (127 * Math.sin(2 * Math.PI * instantFrequency * i / sampleRate));
        }
        System.out.println("Start freq glissando :" + startFrequency);
        System.out.println("Stop freq glissando :" + instantFrequency);
        return b1;
    }

    static byte[] concatenate(byte[] a, byte[] b, byte[] c) throws IOException {
        ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
        outputStream.write(a);
        outputStream.write(b);
        outputStream.write(c);

        byte d[] = outputStream.toByteArray();
        return d;
    }

Console output:

Freq of sine: 220.0
Start freq glissando :220.0
Stop freq glissando :439.9975
Freq of sine: 440.0

Answer 1

Score: 0

The problem arises because the distance advanced through the wave between adjacent frames is too large. The calculation of instantFrequency is fine, but getting the position in the wave by multiplying it by i is the flaw. When you go from frame n to frame n+1, the distance progressed is as follows:

distance = ((n+1) * instantFrequency[n+1]) - (n * instantFrequency[n]) 

This is larger than the desired step, which should simply equal the instantFrequency value, e.g.:

distance = ((n+1) * instantFrequency[n]) - (n * instantFrequency[n]) 

The following code helped me figure out the problem, which had me puzzled for several hours. It was only after sleeping on it that I was able to get to the above succinct explanation (added in an edit).

Here is a simpler case that illustrates the issue. Since the problem occurs before the sin function calculations, I excluded them and all the operations that follow the trig calculation.

    public class CuriousSeries {

        public static void main(String[] args) {

            double aa = 1;  // analogous to your 220
            double bb = 2;  // analogous to your 440

            double delta = bb - aa;

            int steps = 10;
            double[] travelVals = new double[steps + 1];

            // trip aa
            for (int i = 0; i <= 10; i++) {
                travelVals[i] = aa * i;
                System.out.println("aa trip. travelVals[" + i + "] = " + travelVals[i]);
            }

            // trip ab
            for (int i = 0; i <= 10; i++) {
                double instantFreq = aa + (i / 10.0) * delta;
                travelVals[i] = instantFreq * i;
                System.out.println("ab trip. travelVals[" + i + "] = " + travelVals[i]);
            }

            // trip bb
            for (int i = 0; i <= 10; i++) {
                travelVals[i] = bb * i;
                System.out.println("bb trip. travelVals[" + i + "] = " + travelVals[i]);
            }

            // trip cc
            travelVals[0] = 0;
            for (int i = 1; i <= 10; i++) {
                double travelIncrement = aa + (i / 10.0) * delta;
                travelVals[i] = travelVals[i-1] + travelIncrement;
                System.out.println("cc trip. travelVals[" + i + "] = " + travelVals[i]);
            }
        }
    }

Let's consider aa as analogous to 220 Hz, and bb as analogous to 440 Hz. In each section, we start at 0 and go to position 10. The amount we go forward is calculated similarly to your calculations. For the "fixed rate", we simply multiply the value of the step by i (trips aa and bb). In trip ab I use a calculation similar to yours. The problem with it is that the last steps are too large. You can see this if you inspect the output lines:

ab trip. travelVals[9] = 17.099999999999998
ab trip. travelVals[10] = 20.0

The distance traveled that "step" was close to 3, not the desired 2!
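
The size of that step can also be checked on paper. The following is only a sketch in continuous form; the symbols x (standing in for the loop index i), N (the number of steps), a and b (the start and end values), f (the ramped value) and p (the position, i.e. the product that goes into the sine) are introduced here for illustration and do not appear in the code.

```latex
\begin{align*}
f(x) &= a + \frac{b-a}{N}\,x, \qquad p(x) = x\,f(x) \\
\frac{dp}{dx} &= f(x) + x\,f'(x) = a + \frac{2\,(b-a)}{N}\,x \\
\left.\frac{dp}{dx}\right|_{x=N} &= a + 2\,(b-a) = 2b - a \\[4pt]
\text{whereas}\quad p(x) &= a\,x + \frac{b-a}{2N}\,x^{2}
\quad\Longrightarrow\quad \frac{dp}{dx} = f(x)
\end{align*}
```

With a = 1 and b = 2 the rate at the end is 3 rather than 2, which matches the step of about 2.9 between positions 9 and 10. With a = 220 and b = 440 the sweep ends near 660 Hz instead of 440 Hz. The last line shows the position function whose rate of change really is f(x); adding the increment step by step is its discrete counterpart.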

In the last example, trip cc, the calculation for travelIncrement is the same as for instantFrequency. But in this case the increment is simply added to the previous position.

In fact, for purposes of audio synthesis (when creating waveforms computationally), it makes sense to use addition to minimize CPU cost. Along those lines, I usually do something more like the following, removing as many calculations from the inner loop as possible:

    double cursor = 0;
    double prevCursor = 0;
    double pitchIncrement = 2 * Math.PI * frequency / sampleRate;

    for (int i = 0; i < n; i++) {
        cursor = prevCursor + pitchIncrement;
        audioVal[i] = Math.sin(cursor);
        prevCursor = cursor;
    }
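
Applied to the glissando from the question, the same idea of accumulating the position might look like the sketch below. It is not the asker's code or a definitive fix: the class name GlissandoSketch and the two lastStepHz helpers are hypothetical names made up for this illustration, and double is used instead of float for convenience. The helpers only report the frequency implied by the final step of each approach so the two can be compared.

```java
public class GlissandoSketch {

    // Sketch of a linear glissando that accumulates the phase instead of
    // multiplying the instantaneous frequency by the sample index.
    static byte[] createGlissando(int nrOfSamples, double startFrequency,
                                  double stopFrequency, int sampleRate) {
        byte[] b1 = new byte[nrOfSamples];
        double deltaFreq = stopFrequency - startFrequency;
        double phase = 0.0; // running phase in radians
        for (int i = 0; i < nrOfSamples; i++) {
            b1[i] = (byte) (127 * Math.sin(phase));
            double instantFrequency = startFrequency + i * deltaFreq / nrOfSamples;
            // advance by exactly one sample's worth of the current frequency
            phase += 2 * Math.PI * instantFrequency / sampleRate;
        }
        return b1;
    }

    // Hypothetical helper: frequency (Hz) implied by the last step of the
    // original "instantFrequency * i" formula.
    static double lastStepHzMultiplied(int n, double f0, double f1) {
        double d = f1 - f0;
        double last = (n - 1) * (f0 + (n - 1) * d / n);
        double prev = (n - 2) * (f0 + (n - 2) * d / n);
        return last - prev;
    }

    // Hypothetical helper: frequency (Hz) implied by the last step when the
    // phase is accumulated, i.e. the instantaneous frequency at sample n - 2.
    static double lastStepHzAccumulated(int n, double f0, double f1) {
        return f0 + (n - 2) * (f1 - f0) / n;
    }

    public static void main(String[] args) {
        int n = 4 * 44100; // 4 seconds at 44.1 kHz, as in the question
        System.out.println("multiplied:  " + lastStepHzMultiplied(n, 220, 440));
        System.out.println("accumulated: " + lastStepHzAccumulated(n, 220, 440));
        System.out.println("samples:     " + createGlissando(n, 220, 440, 44100).length);
    }
}
```

With the question's parameters, the multiplied version ends near 660 Hz while the accumulated version ends near 440 Hz.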

huangapple
  • Posted on October 10, 2020, 22:08:26
  • When reposting, please keep the link to this article: https://go.coder-hub.com/64294324.html