Why is the speech REST API response different from the Go SDK API response?

Question

When calling the Speech-to-Text API via REST, the response structure is slightly different from the one returned by the Go SDK.

For example, I submitted an asynchronous speech job via the Go SDK. Below are the results of querying Google Cloud for that transcription job in two ways, via REST and via the Go SDK; the two responses differ slightly.

Method 1: REST call

GET https://speech.googleapis.com/v1/operations/{id}

{id} is the operation ID, e.g. 2593790426826555555

RESULT 1: camelCased attributes, with startTime and endTime typed as strings.

"words": [
  {
    "startTime": "0s",
    "endTime": "0.400s",
    "word": "We",
    "confidence": 0.98762906
  },
...

Method 2: Go SDK

// error handling omitted
client, err := speech.NewClient(ctx)
op, err := client.LROClient.GetOperation(ctx, &lropb.GetOperationRequest{Name: id})
resp := new(speechpb.LongRunningRecognizeResponse)
err = op.GetResponse().UnmarshalTo(resp)
js, err := json.Marshal(resp)
ioutil.WriteFile("sdk-response.json", js, 0644)

RESULT 2: snake_cased attributes, with start_time and end_time serialized as objects.

"words": [
{
  "start_time": {},
  "end_time": {
    "nanos": 400000000
  },
  "word": "We",
  "confidence": 0.98762906
},
...

If you hunt down the type information in the SDK code, the struct does use start_time as its json tag, so I suppose this is expected behavior. Or am I unmarshalling the response incorrectly with op.GetResponse().UnmarshalTo(resp)? Any help or advice is appreciated.

StartTime *durationpb.Duration `protobuf:"bytes,1,opt,name=start_time,json=startTime,proto3" json:"start_time,omitempty"`
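
The effect of the two tags can be reproduced without the SDK. The sketch below uses hypothetical stand-in types, `wordInfo` and `duration`, that copy only the tag layout quoted above (the real generated types have more fields). `encoding/json` reads the `json:"..."` struct tag and ignores the `json=startTime` name embedded in the `protobuf:"..."` tag, which is why the output is snake_cased and the durations come out as objects.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// duration is a hypothetical stand-in for durationpb.Duration.
type duration struct {
	Seconds int64 `protobuf:"varint,1,opt,name=seconds,proto3" json:"seconds,omitempty"`
	Nanos   int32 `protobuf:"varint,2,opt,name=nanos,proto3" json:"nanos,omitempty"`
}

// wordInfo is a hypothetical stand-in that mirrors the tag layout of the
// generated field quoted in the question.
type wordInfo struct {
	StartTime *duration `protobuf:"bytes,1,opt,name=start_time,json=startTime,proto3" json:"start_time,omitempty"`
	EndTime   *duration `protobuf:"bytes,2,opt,name=end_time,json=endTime,proto3" json:"end_time,omitempty"`
	Word      string    `protobuf:"bytes,3,opt,name=word,proto3" json:"word,omitempty"`
}

// marshalWord encodes w with encoding/json, which only honors the json tag.
func marshalWord(w wordInfo) (string, error) {
	b, err := json.Marshal(w)
	return string(b), err
}

func main() {
	out, err := marshalWord(wordInfo{
		StartTime: &duration{},
		EndTime:   &duration{Nanos: 400000000},
		Word:      "We",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
	// prints {"start_time":{},"end_time":{"nanos":400000000},"word":"We"}
}
```

The zero start time becomes `{}` because every field inside it is tagged `omitempty`, matching RESULT 2.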

Using Go 1.18.1 and cloud.google.com/go/speech v1.4.0.

Update, elaborating on the rationale for the question: I have two sets of transcripts that were downloaded via different methods (storage buckets vs. SDK). One set was pulled from Google Cloud Storage buckets, where Google persists the transcripts with camelCased keys (the same format as the REST API). The other set was pulled via the SDK API and persisted using JSON encoding in Go, which applies snake_casing per the SDK's struct layout.

It isn't a huge deal to write some code to normalize everything to a single format, but it is somewhat inconsistent in my opinion. I'm raising the question to learn whether I'm doing something wrong that could be corrected, or whether this is to be expected.
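
For transcripts that were already saved in the snake_cased form, a normalization pass might look like the sketch below. It uses only the standard library; the function names (`normalizeJSON`, `formatDuration`, and so on) are hypothetical, and it assumes non-negative durations and that only the `start_time` and `end_time` keys hold duration objects. Other differences between the two encodings (for example, int64 values that the REST format writes as strings) are not handled.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// snakeToCamel converts "start_time" to "startTime".
func snakeToCamel(s string) string {
	parts := strings.Split(s, "_")
	for i := 1; i < len(parts); i++ {
		if parts[i] != "" {
			parts[i] = strings.ToUpper(parts[i][:1]) + parts[i][1:]
		}
	}
	return strings.Join(parts, "")
}

// formatDuration renders seconds/nanos in the REST style ("0s", "0.400s"),
// using 3, 6, or 9 fractional digits. Assumes non-negative values.
func formatDuration(seconds, nanos int64) string {
	if nanos == 0 {
		return fmt.Sprintf("%ds", seconds)
	}
	frac := fmt.Sprintf("%09d", nanos)
	for strings.HasSuffix(frac, "000") {
		frac = frac[:len(frac)-3]
	}
	return fmt.Sprintf("%d.%ss", seconds, frac)
}

// toInt reads a decoded JSON number; a missing key yields 0.
func toInt(v any) int64 {
	f, _ := v.(float64)
	return int64(f)
}

// normalize walks decoded JSON, renaming keys and rewriting duration objects.
func normalize(v any) any {
	switch t := v.(type) {
	case map[string]any:
		out := make(map[string]any, len(t))
		for k, val := range t {
			if m, ok := val.(map[string]any); ok && (k == "start_time" || k == "end_time") {
				out[snakeToCamel(k)] = formatDuration(toInt(m["seconds"]), toInt(m["nanos"]))
				continue
			}
			out[snakeToCamel(k)] = normalize(val)
		}
		return out
	case []any:
		for i := range t {
			t[i] = normalize(t[i])
		}
		return t
	default:
		return v
	}
}

// normalizeJSON converts an SDK-style document to the REST-style layout.
func normalizeJSON(in []byte) ([]byte, error) {
	var v any
	if err := json.Unmarshal(in, &v); err != nil {
		return nil, err
	}
	return json.Marshal(normalize(v))
}

func main() {
	in := []byte(`{"words":[{"start_time":{},"end_time":{"nanos":400000000},"word":"We","confidence":0.98762906}]}`)
	out, err := normalizeJSON(in)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
	// prints {"words":[{"confidence":0.98762906,"endTime":"0.400s","startTime":"0s","word":"We"}]}
}
```

Keys come out in alphabetical order because `encoding/json` sorts map keys when marshaling.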

Answer 1

Score: 3

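The Go structs you are marshaling to JSON are generated protobuf messages: their json struct tags are snake_cased, and the times are google.protobuf.Duration messages (durationpb.Duration in Go), which encoding/json serializes as objects with seconds and nanos fields.

Try using the protojson package from the Go protobuf module (google.golang.org/protobuf/encoding/protojson) instead of encoding/json. It implements the protobuf JSON mapping, so it should map between JSON and the Go protobuf structs in both directions.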

Posted by huangapple on June 15, 2022, 10:03:58
Source: https://go.coder-hub.com/72625170.html