How to match multiple JSON keys with Go struct tags

Question

I have JSON data with a dynamic key, like:

{
  "id_employee": 123
}

or

{
  "id_user": 321
}

I'm trying to unmarshal the data into a struct.

How can I write a struct tag that matches both keys from the example, "id_user" and "id_employee", when unmarshalling the data?

interface User struct {
   Id int64 .....
}

Answer 1

Score: 3

Before we begin

A small disclaimer: I wrote all the code snippets below off the top of my head, without proofreading or anything of the sort. The code is not copy-paste ready. The point of this answer is to give you some approaches that do what you're asking, with some explanation of why a given option may or may not be a good idea. The third approach is definitely the best of the bunch, but given the limited information (no specifics WRT the problem you're trying to solve), you might need to do some more digging to get to a final solution.

Next, I have to ask why you're trying to do something like this. If you want a single type you can use to unmarshal different payloads, I think you're introducing a lot of code smell. If the payloads differ, they must represent different data. Wanting to use a single catch-all type for multiple data-sets IMO is just asking for trouble. I'll give a couple of ways you can do this, but I want to be very clear on this before I begin:

Even though this is possible, it is a bad idea

A smaller issue, but I have to point it out: you're including an example type like this:

interface User struct {
    Id int64
}

This is just outright wrong. A struct with fields is not an interface, so I'm going to assume 2 things moving forwards. One is that you want dedicated user types like:

type Employee struct {
    Id int64
}
type Employer struct {
    Id int64
}

And a single:

type User interface {
   ID() int64
}

Unmarshalling this stuff

So there are a number of ways you can accomplish what you're trying to do. The messy, but simple way is to have a single type that contains all possible permutations of the fields:

type AllUser struct {
    UID int64 `json:"user_id"`
    EID int64 `json:"employee_id"`
}

This ensures that both the user_id and employee_id fields in your JSON input will find a home, and that one of the ID fields will be populated. The real mess quickly becomes apparent when you want to implement the User interface, though:

func (a AllUser) ID() int64 {
    if a.UID != 0 {
        return a.UID
    }
    if a.EID != 0 {
        return a.EID
    }
    // and so on
    return 0 // probably an error?
}

For getters, that's just a lot of boilerplate to get through, but what about setters? The field may not have been set yet. You'd need to figure out a way to set the correct ID field from a single setter. Passing in an enum/constant to specify what field you're looking to set may at first seem like a reasonable approach, but think about it: it kind of defeats the purpose of having an interface in the first place. You'd lose any and all abstraction. So this approach is quite flawed.
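To make the catch-all approach concrete, here is a minimal sketch that unmarshals both payload shapes into AllUser and reads the ID back through the getter. It assumes the user_id and employee_id keys used in this answer (for the keys in the question, swap the tags to id_user and id_employee). The decodeAllUser helper is a hypothetical name introduced only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AllUser has one field per possible ID key.
type AllUser struct {
	UID int64 `json:"user_id"`
	EID int64 `json:"employee_id"`
}

// ID returns the user ID if it is non-zero, otherwise the employee ID
// (which is 0 when neither key was present).
func (a AllUser) ID() int64 {
	if a.UID != 0 {
		return a.UID
	}
	return a.EID
}

// decodeAllUser is a hypothetical helper that unmarshals one payload.
func decodeAllUser(payload string) (AllUser, error) {
	var u AllUser
	err := json.Unmarshal([]byte(payload), &u)
	return u, err
}

func main() {
	for _, p := range []string{`{"employee_id": 123}`, `{"user_id": 321}`} {
		u, err := decodeAllUser(p)
		if err != nil {
			fmt.Println("decode failed:", err)
			continue
		}
		fmt.Println(u.ID())
	}
}
```

Note that a payload with a legitimate ID of 0 is indistinguishable from a missing key here, which is one more reason this approach is fragile.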

Furthermore, if you have an employee ID set, the other ID fields will keep their zero values (0 for int64). Marshalling the type again will result in JSON output like this (assuming the type also has an employer_id field):

{
    "employee_id": 123,
    "user_id": 0,
    "employer_id": 0
}

You can address this issue by changing your type to use pointers and adding omitempty, which skips nil fields in the JSON output:

type AllUser struct {
    EID *int64 `json:"employee_id,omitempty"`
    UID *int64 `json:"user_id,omitempty"`
}

Again, this is nasty business, and will result in you having to deal with pointer fields (which may or may not be nil at different points in time) throughout the code. It's not that difficult to do, but it adds a lot of noise, makes the code more prone to bugs, and is just all-round a PITA that you should avoid if you can. And you can avoid it quite easily.
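As a rough illustration of the pointer variant, the sketch below decodes a payload and encodes it again, so you can see that the unset ID no longer appears in the output. The roundTrip helper is a hypothetical name, not something from encoding/json.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AllUser uses pointers so that "key absent" (nil) differs from "key is 0".
type AllUser struct {
	EID *int64 `json:"employee_id,omitempty"`
	UID *int64 `json:"user_id,omitempty"`
}

// roundTrip is a hypothetical helper: decode a payload, then encode it again.
func roundTrip(payload string) (string, error) {
	var u AllUser
	if err := json.Unmarshal([]byte(payload), &u); err != nil {
		return "", err
	}
	out, err := json.Marshal(u)
	return string(out), err
}

func main() {
	out, err := roundTrip(`{"employee_id": 123}`)
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Println(out) // prints {"employee_id":123}
}
```

Every caller that reads EID or UID now has to check for nil before dereferencing, which is the noise the paragraph above warns about.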

Custom marshalling

A better approach would be to create a base type that embeds data-specific types. Assume we have created our specific types (Employee, Employer, Customer, and so on). These types all have an ID field, each with its own tag, like this:

type Employee struct {
    ID int64 `json:"employee_id"`
}

type FooUser struct {
    ID int64 `json:"foo_id"`
}

The next thing to do is to create a semi-generic type that embeds all specific user types. Shared fields (e.g. if all data-sets have a name field) can be added to this type. You then embed this composite type into yet another type that implements custom marshalling/unmarshalling. This lets you set extra fields while decoding (in the example here: a field that specifies the type of user you're dealing with).

type UserType int

const (
    EmployeeUserType UserType = iota
    FooUserType
    // go-style enum values for all user-types
)

type BaseUser struct {
    WrappedUser
}

type WrappedUser struct {
    *Employee // embed pointers to these types
    *FooUser
    Name       string   `json:"name"`
    Type       UserType `json:"-"` // ignore this in JSON unmarshalling
}

func (b *BaseUser) UnmarshalJSON(data []byte) error {
    if err := json.Unmarshal(data, &b.WrappedUser); err != nil {
        return err
    }
    if b.Employee != nil {
        b.Type = EmployeeUserType // set the user-type flag
    }
    if b.FooUser != nil {
        b.Type = FooUserType
    }
    return nil
}

func (b BaseUser) MarshalJSON() ([]byte, error) {
    return json.Marshal(b.WrappedUser) // wrapped user doesn't have any custom handling
}

To implement the User interface now, you can implement it on the WrappedUser type (the BaseUser embeds it, so the methods will be accessible either way), and you now know precisely what fields you need to get/set because you have the type flag to tell you:

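func (w WrappedUser) ID() int64 {
    switch w.Type {
    case EmployeeUserType:
        if w.Employee != nil { // guard: the zero value of Type is EmployeeUserType
            return w.Employee.ID
        }
    case FooUserType:
        if w.FooUser != nil {
            return w.FooUser.ID
        }
    }
    return 0
}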

The same can be done with setters:

func (w *WrappedUser) SetID(id int64) {
    switch w.Type {
    case EmployeeUserType:
        if w.Employee == nil {
            w.Employee = &Employee{}
        }
        w.Employee.ID = id
    case FooUserType:
        if w.FooUser == nil {
            w.FooUser = &FooUser{}
        }
        w.FooUser.ID = id
    }
}
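Pulling the pieces of this approach together, the following is a minimal sketch of the decode path only. It relies on the fact that encoding/json allocates an embedded struct pointer only when one of its keys is present in the input, which is what makes the nil checks in UnmarshalJSON meaningful. The decodeBaseUser helper and the sample payload are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type UserType int

const (
	EmployeeUserType UserType = iota
	FooUserType
)

type Employee struct {
	ID int64 `json:"employee_id"`
}

type FooUser struct {
	ID int64 `json:"foo_id"`
}

// WrappedUser embeds pointers to every specific user type.
type WrappedUser struct {
	*Employee
	*FooUser
	Name string   `json:"name"`
	Type UserType `json:"-"`
}

type BaseUser struct {
	WrappedUser
}

// UnmarshalJSON decodes into the wrapped type, then records which
// embedded pointer the decoder allocated.
func (b *BaseUser) UnmarshalJSON(data []byte) error {
	if err := json.Unmarshal(data, &b.WrappedUser); err != nil {
		return err
	}
	if b.Employee != nil {
		b.Type = EmployeeUserType
	}
	if b.FooUser != nil {
		b.Type = FooUserType
	}
	return nil
}

// ID dispatches on the type flag and guards against nil pointers.
func (w WrappedUser) ID() int64 {
	switch {
	case w.Type == EmployeeUserType && w.Employee != nil:
		return w.Employee.ID
	case w.Type == FooUserType && w.FooUser != nil:
		return w.FooUser.ID
	}
	return 0
}

// decodeBaseUser is a hypothetical helper that unmarshals one payload.
func decodeBaseUser(payload string) (BaseUser, error) {
	var b BaseUser
	err := json.Unmarshal([]byte(payload), &b)
	return b, err
}

func main() {
	b, err := decodeBaseUser(`{"foo_id": 7, "name": "Ann"}`)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(b.ID(), b.Name, b.Type == FooUserType) // prints 7 Ann true
}
```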

Using custom marshalling and embedded types like this is slightly better, but as you can probably tell from even this fairly simple example, it quickly becomes cumbersome to handle and maintain.

Flipping the script

Now I'm assuming that you want to be able to unmarshal different payloads into a single type because a lot of fields are shared, but things like the ID field might be different (user_id vs employee_id in this case). That's perfectly normal. You're asking how you can use a single catch-all type. That's kind of an X-Y problem. Instead of asking how you can use a single type for all specific data-sets, why not simply create a type for the shared fields, and include that into specific types in turn? It's very similar to the approach with the custom marshalling, but it's ~1,000,000 times simpler:

// BaseUser contains all fields all specific user-types share
type BaseUser struct {
    Name   string `json:"name"`
    Active bool   `json:"active"`
    // etc...
}

// Employee is a user, that happens to be an employee
type Employee struct {
    ID int64 `json:"employee_id"`
    BaseUser // embed the other fields that all users share here
}

type FooUser struct {
    ID int64 `json:"foo_id"`
    BaseUser
    Name string `json:"foo_user"` // override the name field of BaseUser
}

Implement all shared methods of the User interface on the BaseUser type, implement just the ID getter/setter on the specific types, and you're done. If you need to override a field, like I did for Name on the FooUser type, then you just override the getter/setter for that field on that single type:

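// Go does not allow a field and a method with the same name on one type,
// so the getter is called GetName here rather than Name.
func (f FooUser) GetName() string {
    return f.Name
}
func (f *FooUser) SetName(n string) {
    f.Name = n
}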
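Here is a minimal, compilable sketch of this third approach. Because Go rejects a field and a method with the same name on one type, the sketch departs from the snippets above in two labeled ways: the ID fields are named EmployeeID and FooID, and the shared getter is called DisplayName. Those names, the decodeEmployee helper, and the sample payload are all assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// User is the common interface; DisplayName is a hypothetical method name.
type User interface {
	ID() int64
	DisplayName() string
}

// BaseUser contains the fields all specific user types share.
type BaseUser struct {
	Name   string `json:"name"`
	Active bool   `json:"active"`
}

// DisplayName is implemented once and promoted to every embedding type.
func (b BaseUser) DisplayName() string { return b.Name }

// Employee carries its own ID key and embeds the shared fields.
type Employee struct {
	EmployeeID int64 `json:"employee_id"`
	BaseUser
}

func (e Employee) ID() int64 { return e.EmployeeID }

type FooUser struct {
	FooID int64 `json:"foo_id"`
	BaseUser
}

func (f FooUser) ID() int64 { return f.FooID }

// decodeEmployee is a hypothetical helper that unmarshals one payload.
func decodeEmployee(payload string) (Employee, error) {
	var e Employee
	err := json.Unmarshal([]byte(payload), &e)
	return e, err
}

func main() {
	e, err := decodeEmployee(`{"employee_id": 123, "name": "Ann", "active": true}`)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	var u User = e // Employee satisfies User through its own and promoted methods
	fmt.Println(u.ID(), u.DisplayName()) // prints 123 Ann
}
```

No custom UnmarshalJSON is needed: encoding/json treats the fields of the embedded BaseUser as if they were declared directly on Employee.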

That's all you need to do. Nice and easy. You're consuming JSON data. That implies you're getting that data from somewhere (either an API, or as a response to a query against some kind of data store). If you're processing data you requested, you should at the very least know what kind of response data you expect. APIs are contracts: I make call X, and the service responds with either the data I requested in a given format, or an error. I query data-set Y from a store, and I either get the requested data, or I don't get anything (potentially, I get an error).

If you're ingesting data from a file, or from some service, and you can't predict what you're getting back, you need to fix your data source. You shouldn't be trying to code around a more fundamental problem. If needs must, I'd spend some time writing a small program that, for instance, reads the source file, unmarshals it into something as crude as a map[string]interface{}, checks what keys each object contains, and writes the data out into distinct files, grouped by type, so I can ingest the data in a more sane way.
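The triage step described above could look roughly like this. It is a sketch under stated assumptions: the input is a JSON array of objects, the ID key alone identifies the record type, and the grouping is kept in memory instead of being written to files. The classify and groupByKind helpers and the kind labels are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// classify decodes one object into a generic map and reports which kind
// of record it is, based on which ID key it contains.
func classify(raw []byte) (string, error) {
	var obj map[string]interface{}
	if err := json.Unmarshal(raw, &obj); err != nil {
		return "", err
	}
	if _, ok := obj["employee_id"]; ok {
		return "employee", nil
	}
	if _, ok := obj["user_id"]; ok {
		return "user", nil
	}
	return "unknown", nil
}

// groupByKind splits a JSON array into per-kind buckets of raw objects,
// which could then be written to separate files.
func groupByKind(input []byte) (map[string][]json.RawMessage, error) {
	var items []json.RawMessage
	if err := json.Unmarshal(input, &items); err != nil {
		return nil, err
	}
	groups := make(map[string][]json.RawMessage)
	for _, item := range items {
		kind, err := classify(item)
		if err != nil {
			return nil, err
		}
		groups[kind] = append(groups[kind], item)
	}
	return groups, nil
}

func main() {
	input := []byte(`[{"employee_id": 1}, {"user_id": 2}, {"employee_id": 3}, {"x": 4}]`)
	groups, err := groupByKind(input)
	if err != nil {
		fmt.Println("grouping failed:", err)
		return
	}
	fmt.Println(len(groups["employee"]), len(groups["user"]), len(groups["unknown"])) // prints 2 1 1
}
```

Once the data is split this way, each group can be unmarshalled into its own dedicated type.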

huangapple
  • Posted on March 14, 2022 at 18:52:52
  • When reposting, please keep the link to this article: https://go.coder-hub.com/71466562.html