April 10, 2020, 06:42:23


How to enable Neural Text-to-Speech (NTTS) in Java using Amazon Polly

Question

I am trying to use Amazon Polly to convert text to speech with the Java API. According to Amazon, several US English voices support the Neural engine: <https://docs.aws.amazon.com/polly/latest/dg/voicelist.html>

The code I am running in my Java application is as follows:

package com.amazonaws.demos.polly;

import java.io.IOException;
import java.io.InputStream;

import com.amazonaws.ClientConfiguration;
import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;
import com.amazonaws.regions.Region;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.polly.AmazonPollyClient;
import com.amazonaws.services.polly.model.DescribeVoicesRequest;
import com.amazonaws.services.polly.model.DescribeVoicesResult;
import com.amazonaws.services.polly.model.OutputFormat;
import com.amazonaws.services.polly.model.SynthesizeSpeechRequest;
import com.amazonaws.services.polly.model.SynthesizeSpeechResult;
import com.amazonaws.services.polly.model.Voice;

import javazoom.jl.player.advanced.AdvancedPlayer;
import javazoom.jl.player.advanced.PlaybackEvent;
import javazoom.jl.player.advanced.PlaybackListener;

public class PollyDemo {

    private final AmazonPollyClient polly;
    private final Voice voice;
    private static final String JOANNA = "Joanna";
    private static final String KENDRA = "Kendra";
    private static final String MATTHEW = "Matthew";
    private static final String SAMPLE = "Congratulations. You have successfully built this working demo of Amazon Polly in Java. Have fun building voice enabled apps with Amazon Polly (that's me!), and always look at the AWS website for tips and tricks on using Amazon Polly and other great services from AWS";

    public PollyDemo(Region region) {
        // create an Amazon Polly client in a specific region
        polly = new AmazonPollyClient(new DefaultAWSCredentialsProviderChain(),
                new ClientConfiguration());
        polly.setRegion(region);

        // Create describe voices request.
        DescribeVoicesRequest describeVoicesRequest = new DescribeVoicesRequest();
        // Synchronously ask Amazon Polly to describe available TTS voices.
        DescribeVoicesResult describeVoicesResult = polly.describeVoices(describeVoicesRequest);
        // voice = describeVoicesResult.getVoices().get(0);
        voice = describeVoicesResult.getVoices().stream().filter(p -> p.getName().equals(MATTHEW)).findFirst().get();
    }

    public InputStream synthesize(String text, OutputFormat format) throws IOException {
        SynthesizeSpeechRequest synthReq =
                new SynthesizeSpeechRequest().withText(text).withVoiceId(voice.getId())
                        .withOutputFormat(format);
        SynthesizeSpeechResult synthRes = polly.synthesizeSpeech(synthReq);
        return synthRes.getAudioStream();
    }

    public static void main(String args[]) throws Exception {
        // create the test class
        PollyDemo helloWorld = new PollyDemo(Region.getRegion(Regions.US_WEST_1));
        // get the audio stream
        InputStream speechStream = helloWorld.synthesize(SAMPLE, OutputFormat.Mp3);
        // create an MP3 player
        AdvancedPlayer player = new AdvancedPlayer(speechStream,
                javazoom.jl.player.FactoryRegistry.systemRegistry().createAudioDevice());
        player.setPlayBackListener(new PlaybackListener() {
            @Override
            public void playbackStarted(PlaybackEvent evt) {
                System.out.println("Playback started");
                System.out.println(SAMPLE);
            }

            @Override
            public void playbackFinished(PlaybackEvent evt) {
                System.out.println("Playback finished");
            }
        });

        // play it!
        player.play();
    }
}

By default, it uses the Standard engine for the Matthew voice. Please suggest what needs to change to make the speech Neural for the Matthew voice.

Thanks

Answer 1

Score: 2

Thanks @ASR for your feedback.

I was able to find the engine parameter as you suggested.

Here is how I solved it:

Update the aws-java-sdk-polly version in pom.xml from 1.11.77 (the version shown in their documentation) to 1.11.762, the latest at the time of writing, and rebuild the Maven project. This brings in the current definition of the SynthesizeSpeechRequest class. With 1.11.77, the class had no withEngine method.

<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-java-sdk-polly</artifactId>
  <version>1.11.762</version>
</dependency>
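To confirm that the upgraded artifact is really the one on the classpath, the method can be looked up reflectively. This is only a rough sketch: the class name is taken from the imports in the question, while `SdkCheck` and `hasMethod` are hypothetical names introduced here and are not part of the AWS SDK.

```java
final class SdkCheck {

    // Hypothetical helper: returns true only if the class can be loaded
    // and exposes a public method with the given name and parameter types.
    static boolean hasMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            Class.forName(className).getMethod(methodName, paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name copied from the imports in the question.
        String request = "com.amazonaws.services.polly.model.SynthesizeSpeechRequest";
        boolean available = hasMethod(request, "withEngine", String.class);
        System.out.println("withEngine(String) available: " + available);
    }
}
```

If this prints false after the version bump, an older copy of the SDK is probably still being resolved, for example through a transitive dependency.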

Add withEngine("neural") to the request, as below:

SynthesizeSpeechRequest synthReq =
        new SynthesizeSpeechRequest().withText(text).withVoiceId(voice.getId())
                .withOutputFormat(format).withEngine("neural");

As described in https://docs.aws.amazon.com/polly/latest/dg/NTTS-main.html, Neural voices are only available in specific regions, so I had to choose a supported region, as follows:

PollyDemo helloWorld = new PollyDemo(Region.getRegion(Regions.US_WEST_2));

After these changes, the Neural voice worked perfectly.
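Because the engine and the region have to agree, it may help to decide the engine value in one place. The following is a minimal sketch, not AWS code: `EngineChooser`, `NEURAL_REGIONS`, and `engineFor` are hypothetical names, and the set contains only us-west-2 because that is the one region confirmed to work above. Any other entries would have to come from the NTTS documentation page.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class EngineChooser {

    // Assumption: only us-west-2 is listed, since it is the region confirmed above.
    // Extend this set from the NTTS documentation before relying on it.
    private static final Set<String> NEURAL_REGIONS =
            Collections.unmodifiableSet(new HashSet<>(Collections.singletonList("us-west-2")));

    // Returns the value to pass to withEngine(...): "neural" for a listed
    // region, otherwise "standard" as a fallback.
    static String engineFor(String regionName) {
        if (regionName == null) {
            return "standard";
        }
        return NEURAL_REGIONS.contains(regionName.toLowerCase(Locale.ROOT)) ? "neural" : "standard";
    }

    public static void main(String[] args) {
        System.out.println(engineFor("us-west-2")); // prints "neural"
        System.out.println(engineFor("us-west-1")); // prints "standard" (not in the set)
    }
}
```

In the demo, the result could then be passed along as `.withEngine(EngineChooser.engineFor(region.getName()))`, so that a region outside the set falls back to the Standard engine.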

Answer 2

Score: 0

I am assuming you are using AWS Java SDK 1.11.

The AWS documentation states that you need to set the engine parameter of the speech synthesis request to neural. The AWS Java SDK documentation describes the withEngine method, which sets it.

PS: The documentation page doesn't seem to provide per-method URLs, so you will have to search the page for the method.
