Microsoft TTS and Apache HttpRequest

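I'm using Microsoft Cognitive Services and submitting some TTS requests through the REST API. If I submit the XML below directly to the service with Postman, it works fine. If I submit the same XML from my Java code, I get the following error: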

HttpResponseProxy{HTTP/1.1 400 Synthesis failed. StatusCode: FailedPrecondition, Details: SSML parsing error: 8004507A. [Server: openresty/1.15.8.2, Date: Mon, 07 Jun 2021 20:48:40 GMT, Content-Type: text/xml, Transfer-Encoding: chunked, Connection: keep-alive, Strict-Transport-Security: max-age=15724800; includeSubDomains] ResponseEntityProxy{[Content-Type: text/xml,Chunked: true]}}

The Java code works perfectly as long as I don't inject this SSML:

<phoneme alphabet="ipa" ph="təˈmeɪtoʊ"> tomato </phoneme>

The only thing I can think of that might be wrong is the entity for the body. In Postman it is a raw body. Is there something else I need to do in Java?

HttpPost httpPost = new HttpPost("https://eastus.tts.speech.microsoft.com/cognitiveservices/v1");
httpPost.setEntity(new StringEntity(xml));
httpPost.addHeader("Content-Type", "application/ssml+xml");
httpPost.addHeader("Ocp-Apim-Subscription-Key", key);
httpPost.addHeader("X-Microsoft-OutputFormat", "audio-48khz-192kbitrate-mono-mp3");
org.apache.http.HttpResponse resp = httpclient.execute(httpPost);

<speak xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="http://www.w3.org/2001/mstts" xmlns:emo="http://www.w3.org/2009/10/emotionml" version="1.0" xml:lang="en-US"><voice name="en-US-JennyNeural"><mstts:express-as style="assistant"><prosody rate="5%" pitch="13%">
We can now absolutely make you a <phoneme alphabet="ipa" ph="təˈmeɪtoʊ"> tomato </phoneme></prosody></mstts:express-as></voice></speak>
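
One likely cause, offered as an assumption rather than a confirmed diagnosis: in Apache HttpClient 4.x the single-argument `StringEntity(String)` constructor encodes the body as ISO-8859-1, which cannot represent the IPA characters in the `ph` attribute. The sketch below uses only the JDK to show what happens to that string under each charset. The class name `SsmlCharsetDemo` and the helper `roundTrip` are hypothetical names made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class SsmlCharsetDemo {

    // Encode the text to bytes and decode it again with the same charset,
    // approximating what the receiving side would see.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // The IPA string from the question (təˈmeɪtoʊ), written with Unicode
        // escapes so the source file's own encoding does not matter.
        String ph = "t\u0259\u02C8me\u026Ato\u028A";

        // ISO-8859-1 has no mapping for the IPA symbols, so getBytes()
        // substitutes '?' for each of them.
        String latin1 = roundTrip(ph, StandardCharsets.ISO_8859_1);

        // UTF-8 can encode every character, so the text survives intact.
        String utf8 = roundTrip(ph, StandardCharsets.UTF_8);

        System.out.println("ISO-8859-1 round trip intact: " + latin1.equals(ph));
        System.out.println("UTF-8 round trip intact:      " + utf8.equals(ph));
    }
}
```

If this is indeed the problem, passing the charset explicitly in the question's code, for example `httpPost.setEntity(new StringEntity(xml, StandardCharsets.UTF_8));`, should send the phoneme string unchanged. Postman sends raw bodies as UTF-8, which would explain why the same XML works there.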