How to find the most frequent integer in a slice of structs with Golang

Question


Disclaimer: I'm not a professional developer. I've been tinkering with Golang for about 8 months (Udemy + YouTube) and I still have no idea how to solve a simple problem like the one below.

Here's a summary of the problem:

  • I'm trying to find the most frequent "age" in a slice of structs decoded from a .json file (each struct contains a string "name" and an integer "age").

  • After that, I need to print the "name" values that belong to the "age" with the maximum occurrence count.

  • The printed names need to be sorted alphabetically.

Input (.json):

[
{"name": "John","age": 15},
{"name": "Peter","age": 12},
{"name": "Roger","age": 12},
{"name": "Anne","age": 44},
{"name": "Mary","age": 15},
{"name": "Nancy","age": 15}
]

Output: ['John', 'Mary', 'Nancy'].

Explanation: Because the most frequent age in the data is 15 (it occurs 3 times), the output should be an array of strings with the three people's names, in this case ['John', 'Mary', 'Nancy'].

Exception:

  • If multiple "age" values share the same maximum occurrence count, I need to split the data and print the names in separate arrays (e.g. when Anne's age is 12, the result is: ['John', 'Mary', 'Nancy'], ['Anne', 'Peter', 'Roger']).

This is what I've tried (in Golang):

package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// 1. prepare the struct that mirrors the .json entries
type Passanger struct {
	Name string `json:"name"`
	Age  int    `json:"age"`
}

func main() {
	// 2. load the json file
	content, err := os.ReadFile("passanger.json")
	if err != nil {
		fmt.Println(err.Error())
		return
	}

	// 3. parse the json file into a slice
	var passangers []Passanger
	err2 := json.Unmarshal(content, &passangers)
	if err2 != nil {
		fmt.Println("Error JSON Unmarshalling")
		fmt.Println(err2.Error())
		return
	}

	// 4. find the most frequent age numbers (?)
	for _, v := range passangers {
		// this code only shows the Age on every line
		fmt.Println(v.Age)
	}

	// 5. print the names and sort them alphabetically (?)
	// use sort.Slice from the "sort" package (add it to the imports once it is used)
	// implement grouping based on the "max-occurrence age"
}

I've been stuck since yesterday. I've also tried to adapt solutions from various coding-challenge questions, like:

func majorityElement(arr []int) int {
	sort.Ints(arr)
	return arr[len(arr)/2]
}

but I'm still struggling to understand how to pass the "age" values from the Passanger slice as the integer input (arr []int) to the code above.

Another solution I found online iterates through a map[int]int to find the maximum frequency:

func main() {
	arr := []int{90, 70, 30, 30, 10, 80, 40, 50, 40, 30}
	freq := make(map[int]int)
	for _, num := range arr {
		freq[num] = freq[num] + 1
	}
	fmt.Println("Frequency of the Array is : ", freq)
}

But then again, the .json file contains not only integers (age) but also strings (name), and I still don't know how to handle "name" and "age" separately.

I really need some proper guidance here.

Here's the source code (main.go) and the .json file mentioned above:
https://github.com/ariejanuarb/golang-json

Answer 1

Score: 2


What to do before implementing a solution

During my first years of college, my teachers would always repeat the same advice to me and my classmates: don't write code first, especially if you are a beginner. Follow these steps instead:

  • Write down what you want to happen
  • Break the problem down into small steps
  • Write out all the scenarios and cases where they branch out
  • Write the inputs and outputs (method/function signatures)
  • Check that they fit each other

Let's follow these steps...

Write what you want to happen

You have defined your problem well, so I will skip this step.

Let's detail this further

  1. You have a passenger list
  2. You want to group the passengers by their age
  3. You want to find which groups are the most common, i.e. which have the most passengers.
  4. You want to print the names in alphabetical order

Branching out

  • Scenario one: one group is bigger than all the others.
  • Scenario two: two or more groups have the same size and are bigger than the others.

There might be more scenarios, but they are yours to find.
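
To make the two scenarios concrete, here is a minimal sketch, assuming the ages have already been counted into a map[int]int like the frequency snippet in the question. The name MostFrequent is hypothetical and not part of the original answer; the point is only that returning a slice covers both the single-winner and the tie case.

```go
package main

import (
	"fmt"
	"sort"
)

// MostFrequent is a hypothetical helper: given value -> count, it returns
// every value whose count equals the maximum, in ascending order.
func MostFrequent(counts map[int]int) []int {
	var winners []int
	best := 0
	for value, count := range counts {
		switch {
		case count > best:
			// A strictly larger count replaces all earlier winners.
			best, winners = count, []int{value}
		case count == best:
			// Same count as the current maximum: this is a tie.
			winners = append(winners, value)
		}
	}
	sort.Ints(winners) // map iteration order is random, so sort for a stable result
	return winners
}

func main() {
	fmt.Println(MostFrequent(map[int]int{15: 3, 12: 2, 44: 1})) // scenario one: [15]
	fmt.Println(MostFrequent(map[int]int{15: 3, 12: 3}))        // scenario two: [12 15]
}
```

A caller that only cares about scenario one can still check that the returned slice has length 1.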

Input and output

Now that we know what we have to do, let's check the input and output of each step.

The steps:

  1. You have a passenger list

    • input => none, or a filename (string)
    • output => []Passenger

  2. You want to group the passengers by their age

    • input => []Passenger // passenger list
    • output => map[int][]int or map[int][]*Passenger // ageGroups

    The key type, the one inside the brackets, is the age shared by the whole group.

    The value type is a list that contains either:

    • the passenger's position within the list
    • the address of the passenger object in memory

    Which one you pick is not important, as long as we can easily retrieve the passenger from the list without iterating over it again.

  3. You want to find which groups are the most common, i.e. which have the most passengers.

    • input => groups (ageGroups)

    Here scenarios 1 and 2 branch out, which means the function must either be valid for all scenarios or use a condition to tell them apart.

    • output for scenario 1 => most common age (int)
    • output for scenario 2 => most common ages ([]int)

    We can see that the output for scenario 1 can be merged into the output for scenario 2: a single age is just a slice of length one.

  4. You want to print, in alphabetical order, the names of all passengers in an ageGroup

    • input => groups (ageGroups) + ages ([]int) + passenger list ([]Passenger)
    • output => string, []byte, or nothing if you just print it

    To be honest, you can skip this one if you want to.
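
If you do want step 4 as its own function, the signature described above could look roughly like this. This is only a sketch: NamesForAges is a hypothetical name, and returning [][]string (one sorted name list per age) is an assumption made to match the output format asked for in the question.

```go
package main

import (
	"fmt"
	"sort"
)

type Passenger struct {
	Name string
	Age  int
}

// NamesForAges is a hypothetical helper: for every age in ages it collects the
// names of the passengers at the indices stored in groups[age] and sorts them
// alphabetically. Ages are visited in ascending order so the result is stable.
func NamesForAges(passengers []Passenger, groups map[int][]int, ages []int) [][]string {
	sortedAges := append([]int(nil), ages...) // copy, so the caller's slice is untouched
	sort.Ints(sortedAges)

	result := make([][]string, 0, len(sortedAges))
	for _, age := range sortedAges {
		names := make([]string, 0, len(groups[age]))
		for _, idx := range groups[age] {
			names = append(names, passengers[idx].Name)
		}
		sort.Strings(names)
		result = append(result, names)
	}
	return result
}

func main() {
	// Tie case from the question: Anne is 12 instead of 44.
	passengers := []Passenger{
		{"John", 15}, {"Peter", 12}, {"Roger", 12},
		{"Anne", 12}, {"Mary", 15}, {"Nancy", 15},
	}
	groups := map[int][]int{15: {0, 4, 5}, 12: {1, 2, 3}}
	fmt.Println(NamesForAges(passengers, groups, []int{15, 12}))
	// prints [[Anne Peter Roger] [John Mary Nancy]]
}
```

Because the names are sorted per group here, the full passenger list does not need to be sorted beforehand.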

After checking, time to code

Let's first create functions that fit our signatures:

package main

import (
	"encoding/json"
	"fmt"
	"os"
	"sort"
)

type Passenger struct {
	Name string `json:"name"`
	Age  int    `json:"age"`
}

func GetPassengerList() []Passenger {
	// 2. load the json file
	content, err := os.ReadFile("passanger.json")
	if err != nil {
		fmt.Println(err.Error())
		return nil
	}

	// 3. parse json file to slice
	var passengers []Passenger

	err2 := json.Unmarshal(content, &passengers)
	if err2 != nil {
		fmt.Println("Error JSON Unmarshalling")
		fmt.Println(err2.Error())
		return nil
	}

	return passengers
}

// 4a. group by age: age => positions of the passengers in the slice
func GroupByAge(passengers []Passenger) map[int][]int {
	group := make(map[int][]int)

	for index, passenger := range passengers {
		ageGroup := group[passenger.Age]
		ageGroup = append(ageGroup, index)
		group[passenger.Age] = ageGroup
	}

	return group
}

// 4b. find the most frequent ages
func FindMostCommonAge(ageGroups map[int][]int) []int {
	mostFrequentAges := make([]int, 0)
	biggestGroupSize := 0

	for age, ageGroup := range ageGroups {
		if biggestGroupSize < len(ageGroup) {
			// this age is more frequent than anything seen so far
			biggestGroupSize = len(ageGroup)
			mostFrequentAges = []int{age}
		} else if biggestGroupSize == len(ageGroup) {
			// this age ties with the most frequent ages seen so far
			mostFrequentAges = append(mostFrequentAges, age)
		}
		// otherwise it is not one of the most frequent ages, so do nothing
	}

	return mostFrequentAges
}

func main() {
	passengers := GetPassengerList()

	// I am lazy, but if you want you could sort only the
	// smaller slices before printing to increase performance
	sort.Slice(passengers, func(i, j int) bool {
		if passengers[i].Age == passengers[j].Age {
			return passengers[i].Name < passengers[j].Name
		}
		return passengers[i].Age < passengers[j].Age
	})

	// age => []position
	// the length of each slice is the number of occurrences
	ageGrouper := GroupByAge(passengers)

	mostFrequentAges := FindMostCommonAge(ageGrouper)

	// map iteration order is random, so sort the ages to print the groups in a stable order
	sort.Ints(mostFrequentAges)

	// print the passengers
	for _, age := range mostFrequentAges {
		fmt.Println("{")
		for _, passengerIndex := range ageGrouper[age] {
			fmt.Println("\t", passengers[passengerIndex].Name)
		}
		fmt.Println("}")
	}
}
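
The steps above also mention grouping by passenger address instead of by position. A minimal sketch of that alternative, assuming the same Passenger struct (GroupByAgePtr is a hypothetical name, not part of the code above):

```go
package main

import "fmt"

// Passenger mirrors the struct used in this thread.
type Passenger struct {
	Name string `json:"name"`
	Age  int    `json:"age"`
}

// GroupByAgePtr is a hypothetical variant of the grouping step: it maps each
// age to pointers into the original slice instead of to indices.
func GroupByAgePtr(passengers []Passenger) map[int][]*Passenger {
	groups := make(map[int][]*Passenger)
	for i := range passengers {
		// Take the address of the slice element itself, not of a loop copy.
		p := &passengers[i]
		groups[p.Age] = append(groups[p.Age], p)
	}
	return groups
}

func main() {
	passengers := []Passenger{
		{"John", 15}, {"Peter", 12}, {"Roger", 12},
		{"Anne", 44}, {"Mary", 15}, {"Nancy", 15},
	}
	groups := GroupByAgePtr(passengers)
	for _, p := range groups[15] {
		fmt.Println(p.Name)
	}
}
```

One thing to keep in mind with this variant: the pointers refer to elements of the original slice, so if the slice is sorted after grouping, each pointer then refers to whatever passenger ends up in that slot. Group after sorting, as main does above.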


Answer 2

Score: -1


It shouldn't be any more complicated than this:

  • Sort the source slice by age and name
  • Break it up into sequences with a common age, and
  • As you go along, track the most common age

Something like this:

https://goplay.tools/snippet/6pCpkTEaDXN

import "sort"

type Person struct {
	Age  int
	Name string
}

func MostCommonAge(persons []Person) (mostCommonAge int, names []string) {
	// work on a copy so the caller's slice keeps its order
	sorted := make([]Person, len(persons))
	copy(sorted, persons)

	// sort the slice by age and then by name
	sort.Slice(sorted, func(x, y int) bool {
		left, right := sorted[x], sorted[y]
		switch {
		case left.Age < right.Age:
			return true
		case left.Age > right.Age:
			return false
		default:
			return left.Name < right.Name
		}
	})

	// keep the given run of equal ages if it is longer than the best one so far
	updateMostCommonAge := func(seq []Person) (int, []string) {
		if len(seq) > len(names) {
			buf := make([]string, len(seq))
			for i := 0; i < len(seq); i++ {
				buf[i] = seq[i].Name
			}
			mostCommonAge = seq[0].Age
			names = buf
		}
		return mostCommonAge, names
	}

	// walk the sorted slice one run of equal ages at a time: sorted[lo:hi]
	for lo, hi := 0, 0; lo < len(sorted); lo = hi {
		for hi = lo; hi < len(sorted) && sorted[lo].Age == sorted[hi].Age; {
			hi++
		}
		mostCommonAge, names = updateMostCommonAge(sorted[lo:hi])
	}

	return mostCommonAge, names
}
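
Note that the function above returns a single age, so in the tie case from the question only one group comes back. A hedged sketch of how the same sort-and-scan idea could be extended to return every tied group (MostCommonAgeGroups is a hypothetical name, and the [][]string return type is an assumption, not part of the original answer):

```go
package main

import (
	"fmt"
	"sort"
)

type Person struct {
	Age  int
	Name string
}

// MostCommonAgeGroups is a hypothetical tie-aware variant: it returns one
// alphabetically sorted name list per age that reaches the maximum count,
// ordered by ascending age.
func MostCommonAgeGroups(persons []Person) [][]string {
	sorted := make([]Person, len(persons))
	copy(sorted, persons)
	sort.Slice(sorted, func(x, y int) bool {
		if sorted[x].Age != sorted[y].Age {
			return sorted[x].Age < sorted[y].Age
		}
		return sorted[x].Name < sorted[y].Name
	})

	var groups [][]string
	best := 0
	for lo, hi := 0, 0; lo < len(sorted); lo = hi {
		// Advance hi to the end of the run of equal ages starting at lo.
		for hi = lo; hi < len(sorted) && sorted[hi].Age == sorted[lo].Age; hi++ {
		}
		size := hi - lo
		if size < best {
			continue
		}
		if size > best {
			best, groups = size, nil // a longer run discards earlier winners
		}
		names := make([]string, 0, size)
		for _, p := range sorted[lo:hi] {
			names = append(names, p.Name)
		}
		groups = append(groups, names)
	}
	return groups
}

func main() {
	// Tie case from the question: Anne is 12 instead of 44.
	persons := []Person{
		{15, "John"}, {12, "Peter"}, {12, "Roger"},
		{12, "Anne"}, {15, "Mary"}, {15, "Nancy"},
	}
	fmt.Println(MostCommonAgeGroups(persons))
	// prints [[Anne Peter Roger] [John Mary Nancy]]
}
```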

Another approach uses more memory, but is a little simpler. Here, we build a map of names by age and then iterate over it to find the key with the longest list of names.

https://goplay.tools/snippet/_zmMys516IM

import "sort"

type Person struct {
	Age  int
	Name string
}

func MostCommonAge(persons []Person) (mostCommonAge int, names []string) {
	// age => names of everyone with that age
	namesByAge := map[int][]string{}

	for _, p := range persons {
		value, found := namesByAge[p.Age]
		if !found {
			value = []string{}
		}
		namesByAge[p.Age] = append(value, p.Name)
	}

	// find the age with the longest list of names
	for age, nameList := range namesByAge {
		if len(nameList) > len(names) {
			mostCommonAge, names = age, nameList
		}
	}

	// the map collects names in input order, so sort them before returning
	sort.Strings(names)

	return mostCommonAge, names
}
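
One caveat with the map-based version: Go randomizes map iteration order, so when two ages tie, which one "wins" can differ from run to run. A small sketch of one way to make it deterministic, assuming you want ties to go to the smallest age (that tie-break rule and the name MostCommonAgeStable are my assumptions, not part of the original answer):

```go
package main

import (
	"fmt"
	"sort"
)

type Person struct {
	Age  int
	Name string
}

// MostCommonAgeStable is a hypothetical variant of the map-based approach.
// Ties between ages go to the smallest age, and the winning names are sorted
// alphabetically before they are returned.
func MostCommonAgeStable(persons []Person) (mostCommonAge int, names []string) {
	namesByAge := map[int][]string{}
	for _, p := range persons {
		// append works on a nil slice, so no existence check is needed
		namesByAge[p.Age] = append(namesByAge[p.Age], p.Name)
	}

	for age, nameList := range namesByAge {
		longer := len(nameList) > len(names)
		tieButSmaller := len(nameList) == len(names) && age < mostCommonAge
		if longer || tieButSmaller {
			mostCommonAge, names = age, nameList
		}
	}

	sort.Strings(names)
	return mostCommonAge, names
}

func main() {
	// Tie case from the question: Anne is 12 instead of 44.
	persons := []Person{
		{15, "John"}, {12, "Peter"}, {12, "Roger"},
		{12, "Anne"}, {15, "Mary"}, {15, "Nancy"},
	}
	fmt.Println(MostCommonAgeStable(persons))
	// prints 12 [Anne Peter Roger]
}
```

This still returns only one group; if you need every tied group, return a slice of groups instead.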

huangapple
  • Posted on 2022-09-30 01:30:08
  • When reposting, please keep the link to this article: https://go.coder-hub.com/73899548.html