How to filter out duplicates while unmarshalling JSON into a struct?

Question

I have this JSON, which I am trying to unmarshal into my struct:

{
  "clientMetrics": [
    {
      "clientId": 951231,
      "customerData": {
        "Process": [
          "ABC"
        ],
        "Mat": [
          "KKK"
        ]
      },
      "legCustomer": [
        8773
      ]
    },
    {
      "clientId": 1234,
      "legCustomer": [
        8789
      ]
    },
    {
      "clientId": 3435,
      "otherIds": [
        4,
        32,
        19
      ],
      "legCustomer": [
        10005
      ]
    },
    {
      "clientId": 9981,
      "catId": 8,
      "legCustomer": [
        13769
      ]
    },
    {
      "clientId": 12124,
      "otherIds": [
        33,
        29
      ],
      "legCustomer": [
        12815
      ]
    },
    {
      "clientId": 8712,
      "customerData": {
        "Process": [
          "College"
        ]
      },
      "legCustomer": [
        951
      ]
    },
    {
      "clientId": 23214,
      "legCustomer": [
        12724,
        12727
      ]
    },
    {
      "clientId": 119812,
      "catId": 8,
      "legCustomer": [
        14519
      ]
    },
    {
      "clientId": 22315,
      "otherIds": [
        32
      ],
      "legCustomer": [
        12725,
        13993
      ]
    },
    {
      "clientId": 765121,
      "catId": 8,
      "legCustomer": [
        14523
      ]
    }
  ]
}

I used this tool to generate the structs shown below:

type AutoGenerated struct {
	ClientMetrics []ClientMetrics `json:"clientMetrics"`
}
type CustomerData struct {
	Process []string `json:"Process"`
	Mat     []string `json:"Mat"`
}
type ClientMetrics struct {
	ClientID     int          `json:"clientId"`
	CustomerData CustomerData `json:"customerData,omitempty"`
	LegCustomer  []int        `json:"legCustomer"`
	OtherIds     []int        `json:"otherIds,omitempty"`
	CatID        int          `json:"catId,omitempty"`
}

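My confusion is this: I have a lot of string and int arrays, so how can I filter out duplicates? I believe Go has no built-in set data type, so how can I get the same effect here? Basically, when I unmarshal the JSON into my struct, I need to make sure no duplicates are present. Is there a way to achieve this? If so, can someone provide an example for the JSON above and show how I should design my structs?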

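Update

So basically I just use it like this and change my struct definitions, and that's all? json.Unmarshal will call UnmarshalJSON internally and take care of the duplicates? I will pass the JSON string and the struct to the JSONStringToStructure method.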

func JSONStringToStructure(jsonString string, structure interface{}) error {
	jsonBytes := []byte(jsonString)
	return json.Unmarshal(jsonBytes, structure)
}

type UniqueStrings []string

func (u *UniqueStrings) UnmarshalJSON(in []byte) error {
	var arr []string
	if err := json.Unmarshal(in, &arr); err != nil {
		return err
	}
	*u = UniqueStrings(dedupStr(arr))
	return nil
}

func dedupStr(in []string) []string {
	seen := make(map[string]struct{})
	w := 0
	for i := range in {
		if _, s := seen[in[i]]; !s {
			seen[in[i]] = struct{}{}
			in[w] = in[i]
			w++
		}
	}
	return in[:w]
}

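Answer 1

Score: 2

Ideally, you should post-process these arrays to remove duplicates. However, you can also do it during unmarshalling by using a custom type that implements json.Unmarshaler: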

type UniqueStrings []string

func (u *UniqueStrings) UnmarshalJSON(in []byte) error {
	var arr []string
	if err := json.Unmarshal(in, &arr); err != nil {
		return err
	}
	*u = UniqueStrings(dedupStr(arr))
	return nil
}

where

func dedupStr(in []string) []string {
	seen := make(map[string]struct{})
	w := 0
	for i := range in {
		if _, s := seen[in[i]]; !s {
			seen[in[i]] = struct{}{}
			in[w] = in[i]
			w++
		}
	}
	return in[:w]
}
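
To make the behaviour of dedupStr concrete, here is a small standalone run. The sample values are made up for illustration. It shows that the first occurrence of each value is kept, and that the function compacts its result into the caller's backing array rather than allocating a new one.

```go
package main

import "fmt"

// dedupStr is copied from the answer above: it keeps the first occurrence of
// each value and writes the survivors to the front of the same backing array.
func dedupStr(in []string) []string {
	seen := make(map[string]struct{})
	w := 0
	for i := range in {
		if _, s := seen[in[i]]; !s {
			seen[in[i]] = struct{}{}
			in[w] = in[i]
			w++
		}
	}
	return in[:w]
}

func main() {
	// Illustrative input only; not taken from the question's JSON.
	input := []string{"ABC", "KKK", "ABC", "College", "KKK"}
	out := dedupStr(input)

	fmt.Println(out)   // [ABC KKK College]
	fmt.Println(input) // [ABC KKK College College KKK]: overwritten in place
}
```

Inside UnmarshalJSON the in-place overwrite is harmless because arr is a local slice. If you call dedupStr on a slice you still need in its original form, copy it first.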

You can use a similar approach for []int.
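
A minimal sketch of what the []int counterpart might look like. The names UniqueInts, dedup and parseInts are assumptions introduced here, not part of the original answer. The helper uses type parameters, so this version assumes Go 1.18 or later; on older toolchains you would write a dedupInt that mirrors dedupStr instead.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dedup is a hypothetical generic helper (Go 1.18+). Unlike dedupStr it
// builds a new slice, so the input is left untouched.
func dedup[T comparable](in []T) []T {
	seen := make(map[T]struct{}, len(in))
	out := make([]T, 0, len(in))
	for _, v := range in {
		if _, ok := seen[v]; ok {
			continue // already emitted once, skip the repeat
		}
		seen[v] = struct{}{}
		out = append(out, v)
	}
	return out
}

// UniqueInts is an assumed name that mirrors UniqueStrings for []int fields.
type UniqueInts []int

// UnmarshalJSON decodes a JSON array of numbers and drops repeated values.
func (u *UniqueInts) UnmarshalJSON(in []byte) error {
	var arr []int
	if err := json.Unmarshal(in, &arr); err != nil {
		return err
	}
	*u = UniqueInts(dedup(arr))
	return nil
}

// parseInts is a small illustrative wrapper around json.Unmarshal.
func parseInts(s string) (UniqueInts, error) {
	var u UniqueInts
	err := json.Unmarshal([]byte(s), &u)
	return u, err
}

func main() {
	ids, err := parseInts(`[32, 4, 32, 19, 4]`)
	if err != nil {
		panic(err)
	}
	fmt.Println(ids) // [32 4 19]
}
```

Fields such as LegCustomer and OtherIds could then be declared as UniqueInts in place of []int.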

You then use these custom types in your structs:

type CustomerData struct {
    Process UniqueStrings `json:"Process"`
    Mat     UniqueStrings `json:"Mat"`
}
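
On the follow-up in the question: yes, json.Unmarshal checks whether a destination value implements json.Unmarshaler and calls its UnmarshalJSON method, so changing the field types is enough. The sketch below puts the pieces together as one runnable program. The input string with duplicates and the decodeCustomerData wrapper are made up for illustration; JSONStringToStructure is the helper from the question.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type UniqueStrings []string

// UnmarshalJSON is invoked by encoding/json for every UniqueStrings field.
func (u *UniqueStrings) UnmarshalJSON(in []byte) error {
	var arr []string
	if err := json.Unmarshal(in, &arr); err != nil {
		return err
	}
	*u = UniqueStrings(dedupStr(arr))
	return nil
}

// dedupStr keeps the first occurrence of each string, filtering in place.
func dedupStr(in []string) []string {
	seen := make(map[string]struct{})
	w := 0
	for i := range in {
		if _, s := seen[in[i]]; !s {
			seen[in[i]] = struct{}{}
			in[w] = in[i]
			w++
		}
	}
	return in[:w]
}

type CustomerData struct {
	Process UniqueStrings `json:"Process"`
	Mat     UniqueStrings `json:"Mat"`
}

// JSONStringToStructure is the helper from the question; structure must be
// a pointer to the destination value.
func JSONStringToStructure(jsonString string, structure interface{}) error {
	return json.Unmarshal([]byte(jsonString), structure)
}

// decodeCustomerData is a hypothetical convenience wrapper for this demo.
func decodeCustomerData(s string) (CustomerData, error) {
	var cd CustomerData
	err := JSONStringToStructure(s, &cd)
	return cd, err
}

func main() {
	// Illustrative input containing duplicates.
	const input = `{"Process": ["ABC", "College", "ABC"], "Mat": ["KKK", "KKK"]}`
	cd, err := decodeCustomerData(input)
	if err != nil {
		panic(err)
	}
	fmt.Println(cd.Process) // [ABC College]
	fmt.Println(cd.Mat)     // [KKK]
}
```

Note that the method has a pointer receiver, so the value passed to json.Unmarshal must be addressable (pass &cd, not cd).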

huangapple
  • Published on 2021-12-31 05:11:07
  • When reposting, please keep the link to this article: https://go.coder-hub.com/70536744.html