How to reorder a CSV file to group by contents of a particular column
Question
I am very new to Go, and my question may not be very clear, but this is what I am trying to achieve.
I have a CSV file like the one below, and I mainly want to rearrange/sort it by the last column (status = passed, failed, or skipped):
test,test-cat,skipped
test,test-cat,failed
test,test-cat,passed
test,test-cat,skipped
test,test-cat,passed
test,test-cat,failed
I expect rows to be grouped together when they have the same status in the last column:
test,test-cat,skipped
test,test-cat,skipped
test,test-cat,failed
test,test-cat,failed
test,test-cat,passed
test,test-cat,passed
The code I wrote does not look good :-) but it works the way I wanted.
package main

import (
    "bufio"
    "fmt"
    "os"
    "strings"
)

func main() {
    var FailStat, SkipStat, PassStat []string
    file, err := os.Open("test.csv")
    if err != nil {
        fmt.Println(err)
    } else {
        scanner := bufio.NewScanner(file)
        for scanner.Scan() {
            line := scanner.Text()
            if strings.Contains(line, "failed") {
                FailStat = append(FailStat, line)
            }
            if strings.Contains(line, "skipped") {
                SkipStat = append(SkipStat, line)
            }
            if strings.Contains(line, "passed") {
                PassStat = append(PassStat, line)
            }
        }
    }
    file.Close()
    var finalstat []string
    finalstat = append(SkipStat, FailStat...)
    finalstat = append(finalstat, PassStat...)
    for _, line := range finalstat {
        fmt.Println(line)
    }
}
Test run:
$ ./readfile
test,test-cat,skipped
test,test-cat,skipped
test,test-cat,failed
test,test-cat,failed
test,test-cat,passed
test,test-cat,passed
There must be much better ways to do this; please advise. Sorry for the newbie question!
Answer 1
Score: 2
Inian's solution will work if the order of the status groups doesn't matter (Go does not guarantee map iteration order, so you should never expect the groups to come out in the same order from run to run).
If you need the groups in a consistent order, sort the rows by status:
package main

import (
    "encoding/csv"
    "io"
    "log"
    "os"
    "sort"
    "strings"
)

type Row struct {
    Name, Category, Status string
}

func main() {
    in := `test,test-cat,skipped
test,test-cat,failed
test,test-cat,passed
test,test-cat,skipped
test,test-cat,passed
test,test-cat,failed
`
    r := csv.NewReader(strings.NewReader(in))
    rows := make([]Row, 0)
    for {
        record, err := r.Read()
        if err == io.EOF {
            break
        }
        if err != nil {
            log.Fatal(err)
        }
        row := Row{record[0], record[1], record[2]}
        rows = append(rows, row)
    }
    sort.Slice(rows, func(i, j int) bool { return rows[i].Status < rows[j].Status })
    w := csv.NewWriter(os.Stdout)
    for _, row := range rows {
        w.Write([]string{row.Name, row.Category, row.Status})
    }
    w.Flush()
    if err := w.Error(); err != nil {
        log.Fatal(err)
    }
}
And we get:
test,test-cat,failed
test,test-cat,failed
test,test-cat,passed
test,test-cat,passed
test,test-cat,skipped
test,test-cat,skipped
Change the < to > in the anonymous func passed to sort.Slice to reverse the sort order.
If you don't want to bother with the Row struct and converting between []Row and [][]string:
// ...
rows := make([][]string, 0)
for {
    row, err := r.Read()
    // ...
    rows = append(rows, row)
}
sort.Slice(rows, func(i, j int) bool { return rows[i][2] < rows[j][2] })
w := csv.NewWriter(os.Stdout)
for _, row := range rows {
    w.Write(row)
}
// ...
In a comment you mentioned wanting a specific order for the groups, and now I can see in your original code what you were aiming for 🙂
In that case, Inian's solution is headed in the right direction:
// ...
recordGroups := make(map[string][][]string)
for {
    records, err := r.Read()
    if err == io.EOF {
        break
    }
    if err != nil {
        log.Fatal(err)
    }
    groupName := records[2]
    recordGroups[groupName] = append(recordGroups[groupName], records)
}
w := csv.NewWriter(os.Stdout)
// Control the order with this slice of group names
groupNames := []string{"failed", "passed", "skipped", "Bogus group!"}
for _, groupName := range groupNames {
    recordGroup, ok := recordGroups[groupName]
    if !ok {
        log.Printf("did not find expected group %q\n", groupName)
        continue
    }
    for _, record := range recordGroup {
        if err := w.Write(record); err != nil {
            log.Fatalln("error writing record to csv:", err)
        }
    }
}
// ...
2009/11/10 23:00:00 did not find expected group "Bogus group!"
test,test-cat,failed
test,test-cat,failed
test,test-cat,passed
test,test-cat,passed
test,test-cat,skipped
test,test-cat,skipped
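If you would rather keep a single sorted slice than a map of groups, the two ideas above can be combined: sort with a comparison that looks up each status in a rank table. This is only a sketch under my own assumptions; sortByStatus and its parameters are hypothetical names, not something from the answers here. It uses sort.SliceStable instead of sort.Slice so that rows with the same status keep their original relative order, and it puts any status missing from the order list at the end.

```go
package main

import (
    "encoding/csv"
    "log"
    "os"
    "sort"
    "strings"
)

// sortByStatus reorders rows in place so that the values in column col
// follow the sequence given in order. Values not listed in order sort last.
// Hypothetical helper; it assumes every row has more than col fields.
func sortByStatus(rows [][]string, col int, order []string) {
    rank := make(map[string]int, len(order))
    for i, name := range order {
        rank[name] = i
    }
    rankOf := func(row []string) int {
        if r, ok := rank[row[col]]; ok {
            return r
        }
        return len(order) // unknown values come after all known ones
    }
    // SliceStable keeps the input order among rows of equal rank.
    sort.SliceStable(rows, func(i, j int) bool {
        return rankOf(rows[i]) < rankOf(rows[j])
    })
}

func main() {
    in := `test,test-cat,skipped
test,test-cat,failed
test,test-cat,passed
test,test-cat,skipped
test,test-cat,passed
test,test-cat,failed
`
    rows, err := csv.NewReader(strings.NewReader(in)).ReadAll()
    if err != nil {
        log.Fatal(err)
    }
    // The order the question asks for: skipped, then failed, then passed.
    sortByStatus(rows, 2, []string{"skipped", "failed", "passed"})
    // WriteAll writes every record and flushes the writer.
    if err := csv.NewWriter(os.Stdout).WriteAll(rows); err != nil {
        log.Fatal(err)
    }
}
```

The rank table keeps the desired order in one place, so changing it means editing only the slice passed to sortByStatus.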
Answer 2
Score: 1
It is better to use the encoding/csv package from the standard library for this purpose. The logic involves creating a map from a string to a slice of records, where the key is the value of the column you want to group on and the value is the list of rows that share that key.
Once the map is populated, the next step is to print the result back in CSV format. The example below reads the input from a variable and prints to stdout. You can refer to the other methods in the package to do the same with a text file.
package main

import (
    "encoding/csv"
    "io"
    "log"
    "os"
    "strings"
)

func main() {
    in := `test,test-cat,skipped
test,test-cat,failed
test,test-cat,passed
test,test-cat,skipped
test,test-cat,passed
test,test-cat,failed
`
    r := csv.NewReader(strings.NewReader(in))
    dictMap := make(map[string][][]string)
    for {
        records, err := r.Read()
        if err == io.EOF {
            break
        }
        if err != nil {
            log.Fatal(err)
        }
        dictMap[records[2]] = append(dictMap[records[2]], records)
    }
    w := csv.NewWriter(os.Stdout)
    for _, records := range dictMap {
        for idx := range records {
            if err := w.Write(records[idx]); err != nil {
                log.Fatalln("error writing record to csv:", err)
            }
        }
    }
    w.Flush()
    if err := w.Error(); err != nil {
        log.Fatal(err)
    }
}
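For the file case, here is a rough sketch of how the same grouping might look when reading from disk. The names groupFile and writeSample are my own, purely for illustration; writeSample only exists so the example runs without a pre-existing file. To sidestep the unpredictable map iteration order, this version also records each key the first time it is seen and emits the groups in that order. With the sample data from the question that happens to be skipped, failed, passed.

```go
package main

import (
    "encoding/csv"
    "fmt"
    "log"
    "os"
    "path/filepath"
)

// writeSample creates a small CSV file in a temporary directory so the
// example is runnable on its own. In real use the file already exists.
func writeSample() (string, error) {
    dir, err := os.MkdirTemp("", "csvgroup")
    if err != nil {
        return "", err
    }
    path := filepath.Join(dir, "test.csv")
    data := "test,test-cat,skipped\n" +
        "test,test-cat,failed\n" +
        "test,test-cat,passed\n" +
        "test,test-cat,skipped\n" +
        "test,test-cat,passed\n" +
        "test,test-cat,failed\n"
    return path, os.WriteFile(path, []byte(data), 0o644)
}

// groupFile reads the CSV file at path and returns its rows grouped by the
// value in column col. Groups appear in the order their key is first seen,
// so the result does not depend on map iteration order.
func groupFile(path string, col int) ([][]string, error) {
    f, err := os.Open(path)
    if err != nil {
        return nil, err
    }
    defer f.Close()

    rows, err := csv.NewReader(f).ReadAll()
    if err != nil {
        return nil, err
    }

    groups := make(map[string][][]string)
    var keys []string // group keys in first-seen order
    for _, row := range rows {
        if col < 0 || col >= len(row) {
            return nil, fmt.Errorf("row %v has no column %d", row, col)
        }
        key := row[col]
        if _, seen := groups[key]; !seen {
            keys = append(keys, key)
        }
        groups[key] = append(groups[key], row)
    }

    out := make([][]string, 0, len(rows))
    for _, key := range keys {
        out = append(out, groups[key]...)
    }
    return out, nil
}

func main() {
    path, err := writeSample()
    if err != nil {
        log.Fatal(err)
    }
    rows, err := groupFile(path, 2)
    os.RemoveAll(filepath.Dir(path)) // remove the temporary directory
    if err != nil {
        log.Fatal(err)
    }
    // WriteAll writes every record and flushes the writer.
    if err := csv.NewWriter(os.Stdout).WriteAll(rows); err != nil {
        log.Fatal(err)
    }
}
```

ReadAll loads the whole file into memory, which is fine for small reports like this one; for very large files a Read loop as in the example above would be the safer choice.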