How to enable Neural Text-to-Speech (NTTS) in Java using Amazon Polly
Question
I am trying to use Amazon Polly to convert text to speech with the Java API. According to Amazon's voice list, several US English voices support the Neural engine: <https://docs.aws.amazon.com/polly/latest/dg/voicelist.html>
The code I am running in my Java application is as follows:
package com.amazonaws.demos.polly;

import java.io.IOException;
import java.io.InputStream;

import com.amazonaws.ClientConfiguration;
import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;
import com.amazonaws.regions.Region;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.polly.AmazonPollyClient;
import com.amazonaws.services.polly.model.DescribeVoicesRequest;
import com.amazonaws.services.polly.model.DescribeVoicesResult;
import com.amazonaws.services.polly.model.OutputFormat;
import com.amazonaws.services.polly.model.SynthesizeSpeechRequest;
import com.amazonaws.services.polly.model.SynthesizeSpeechResult;
import com.amazonaws.services.polly.model.Voice;

import javazoom.jl.player.advanced.AdvancedPlayer;
import javazoom.jl.player.advanced.PlaybackEvent;
import javazoom.jl.player.advanced.PlaybackListener;

public class PollyDemo {

    private final AmazonPollyClient polly;
    private final Voice voice;
    private static final String JOANNA = "Joanna";
    private static final String KENDRA = "Kendra";
    private static final String MATTHEW = "Matthew";
    private static final String SAMPLE = "Congratulations. You have successfully built this working demo of Amazon Polly in Java. Have fun building voice enabled apps with Amazon Polly (that's me!), and always look at the AWS website for tips and tricks on using Amazon Polly and other great services from AWS";

    public PollyDemo(Region region) {
        // create an Amazon Polly client in a specific region
        polly = new AmazonPollyClient(new DefaultAWSCredentialsProviderChain(),
                new ClientConfiguration());
        polly.setRegion(region);
        // Create describe voices request.
        DescribeVoicesRequest describeVoicesRequest = new DescribeVoicesRequest();
        // Synchronously ask Amazon Polly to describe available TTS voices.
        DescribeVoicesResult describeVoicesResult = polly.describeVoices(describeVoicesRequest);
        // voice = describeVoicesResult.getVoices().get(0);
        voice = describeVoicesResult.getVoices().stream()
                .filter(p -> p.getName().equals(MATTHEW))
                .findFirst()
                .get();
    }

    public InputStream synthesize(String text, OutputFormat format) throws IOException {
        SynthesizeSpeechRequest synthReq =
                new SynthesizeSpeechRequest().withText(text).withVoiceId(voice.getId())
                        .withOutputFormat(format);
        SynthesizeSpeechResult synthRes = polly.synthesizeSpeech(synthReq);
        return synthRes.getAudioStream();
    }

    public static void main(String[] args) throws Exception {
        // create the test class
        PollyDemo helloWorld = new PollyDemo(Region.getRegion(Regions.US_WEST_1));
        // get the audio stream
        InputStream speechStream = helloWorld.synthesize(SAMPLE, OutputFormat.Mp3);
        // create an MP3 player
        AdvancedPlayer player = new AdvancedPlayer(speechStream,
                javazoom.jl.player.FactoryRegistry.systemRegistry().createAudioDevice());
        player.setPlayBackListener(new PlaybackListener() {
            @Override
            public void playbackStarted(PlaybackEvent evt) {
                System.out.println("Playback started");
                System.out.println(SAMPLE);
            }

            @Override
            public void playbackFinished(PlaybackEvent evt) {
                System.out.println("Playback finished");
            }
        });
        // play it!
        player.play();
    }
}
By default it uses the Standard version of Matthew's voice. What do I need to change to make the speech use the Neural engine for Matthew?
Thanks
Answer 1
Score: 2
Thanks @ASR for your feedback.
I was able to find the engine parameter as you suggested.
This is how I solved it:
- Update the aws-java-sdk-polly version in pom.xml from 1.11.77 (the version used in their documentation) to the latest (1.11.762) and rebuild the Maven project. This pulls in the latest definition of the SynthesizeSpeechRequest class. With 1.11.77 I could not find the withEngine method in its definition.
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-polly</artifactId>
    <version>1.11.762</version>
</dependency>
- Add withEngine("neural") to the request, as below:
SynthesizeSpeechRequest synthReq =
        new SynthesizeSpeechRequest().withText(text).withVoiceId(voice.getId())
                .withOutputFormat(format).withEngine("neural");
- As described in https://docs.aws.amazon.com/polly/latest/dg/NTTS-main.html, neural voices are only available in specific regions, so I had to choose the region as follows:
PollyDemo helloWorld = new PollyDemo(Region.getRegion(Regions.US_WEST_2));
After this, the Neural voice worked perfectly.
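Because the voice, the region, and the engine all have to agree, a request that hard-codes "neural" is expected to be rejected when the chosen voice or region only offers the standard engine. Here is a minimal sketch of a fallback, which is not part of the original answer. The hypothetical helper EngineChooser picks "neural" only when it appears in a list of supported engine names, and "standard" otherwise. It assumes the list comes from something like voice.getSupportedEngines() on the Voice returned by describeVoices. Treat that method name as an assumption and check the Javadoc of your SDK version. The helper itself uses only the JDK.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of the AWS SDK. */
final class EngineChooser {

    static final String NEURAL = "neural";
    static final String STANDARD = "standard";

    /**
     * Returns "neural" when the voice lists it among its supported engines,
     * otherwise falls back to "standard" (also for a null or empty list).
     */
    static String chooseEngine(List<String> supportedEngines) {
        if (supportedEngines != null && supportedEngines.contains(NEURAL)) {
            return NEURAL;
        }
        return STANDARD;
    }

    public static void main(String[] args) {
        // A voice that supports both engines: prints "neural".
        System.out.println(chooseEngine(Arrays.asList("neural", "standard")));
        // A voice that supports only the standard engine: prints "standard".
        System.out.println(chooseEngine(Arrays.asList("standard")));
    }
}
```

The returned string can then be passed to withEngine(...) in the synthesize method in place of the literal "neural".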
Answer 2
Score: 0
I am assuming you are using AWS Java SDK 1.11.
The AWS documentation states that you need to set the engine parameter in the speech synthesis request to neural. The AWS Java SDK documentation describes the withEngine method, which sets it to neural.
PS: the documentation page doesn't seem to provide per-method URLs, so you will have to search for it.
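If searching the Javadoc is inconvenient, one way to check whether the SDK version on your classpath already has withEngine is to look for it by reflection. This is a minimal sketch using only the JDK. SdkMethodCheck and hasPublicMethod are hypothetical names introduced for illustration, and the class name is the one imported in the question's code.

```java
import java.lang.reflect.Method;

/** Hypothetical diagnostic helper; not part of the AWS SDK. */
final class SdkMethodCheck {

    /**
     * Returns true if the named class can be loaded and exposes at least one
     * public method with the given name. Returns false if the class is not
     * on the classpath or cannot be linked.
     */
    static boolean hasPublicMethod(String className, String methodName) {
        try {
            // Load without running static initializers; only the method list is needed.
            Class<?> type = Class.forName(className, false,
                    SdkMethodCheck.class.getClassLoader());
            for (Method method : type.getMethods()) {
                if (method.getName().equals(methodName)) {
                    return true;
                }
            }
            return false;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String request = "com.amazonaws.services.polly.model.SynthesizeSpeechRequest";
        System.out.println("withEngine available: "
                + hasPublicMethod(request, "withEngine"));
    }
}
```

Run it with the same classpath as the application. If it reports false, the aws-java-sdk-polly dependency is missing or older than a release that includes the engine parameter.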