How to filter a dataframe based on a column's value in Golang?

Question

Input sample:

name,price,pay_time,refund_time
job,19.0,20220622 12:23:23,20220622 13:23:23
kim,0,20220623 12:23:23,20220623 13:23:23

Expected output:

name,price,final_time
job,19.0,20220622 12:23:23
kim,0,20220623 13:23:23

The rule: if price equals 0, use refund_time as final_time; otherwise use pay_time.
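For reference, the rule can be sketched with only the Go standard library, independent of any dataframe package. This is a minimal illustration, not gota code: the helper name `finalTimes` is hypothetical, and it assumes the four columns always appear in the order shown in the input sample.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
)

// finalTimes is a hypothetical helper for this sketch. It reads CSV text with
// the columns name,price,pay_time,refund_time (in that order) and returns
// rows of name,price,final_time, header included.
func finalTimes(csvText string) ([][]string, error) {
	records, err := csv.NewReader(strings.NewReader(csvText)).ReadAll()
	if err != nil {
		return nil, err
	}
	if len(records) == 0 || len(records[0]) < 4 {
		return nil, fmt.Errorf("expected a header with at least 4 columns")
	}
	out := [][]string{{"name", "price", "final_time"}}
	for _, rec := range records[1:] { // skip the header row
		price, err := strconv.ParseFloat(rec[1], 64)
		if err != nil {
			return nil, fmt.Errorf("bad price %q: %w", rec[1], err)
		}
		final := rec[2] // default: pay_time
		if price == 0 {
			final = rec[3] // price is zero: use refund_time
		}
		out = append(out, []string{rec[0], rec[1], final})
	}
	return out, nil
}

func main() {
	input := `name,price,pay_time,refund_time
job,19.0,20220622 12:23:23,20220622 13:23:23
kim,0,20220623 12:23:23,20220623 13:23:23
`
	rows, err := finalTimes(input)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, r := range rows {
		fmt.Println(strings.Join(r, ","))
	}
}
```

Every input row produces exactly one output row; only the choice of timestamp depends on the price.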

Currently I am doing this with https://github.com/go-gota/gota
My code is shown below:

package frame

import (
	"fmt"
	"github.com/go-gota/gota/dataframe"
	"github.com/go-gota/gota/series"
	"strings"
)

func LoadCSV() {
	csvStr := `
name,price,pay_time,refund_time
job,19.0,20220622 12:23:23,20220622 13:23:23
kim,0,20220623 12:23:23,20220623 13:23:23
`
	df := dataframe.ReadCSV(strings.NewReader(csvStr))
	df = df.Filter(dataframe.F{Colname: "price", Comparator: series.Eq, Comparando: 0})
	fmt.Println("df -->", df)
}

But the output is:

    name     price    pay_time          refund_time
 0: kim      0.000000 20220623 12:23:23 20220623 13:23:23
    <string> <float>  <string>          <string>

The other row has been removed, which is not what I want.
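The output above is consistent with a filter being a row selection: rows that fail the predicate are dropped, which is different from computing a new value for every row. A rough plain-Go sketch of that difference follows. The `order` type and both functions are hypothetical and only mirror the sample data; they are not part of gota.

```go
package main

import "fmt"

// order is a hypothetical row type used only for this illustration.
type order struct {
	Name       string
	Price      float64
	PayTime    string
	RefundTime string
}

// keepZeroPrice mimics a filter: rows that fail the predicate are dropped.
func keepZeroPrice(rows []order) []order {
	var kept []order
	for _, r := range rows {
		if r.Price == 0 {
			kept = append(kept, r)
		}
	}
	return kept
}

// finalTimeOf applies the per-row rule and never drops a row.
func finalTimeOf(r order) string {
	if r.Price == 0 {
		return r.RefundTime
	}
	return r.PayTime
}

func main() {
	rows := []order{
		{"job", 19.0, "20220622 12:23:23", "20220622 13:23:23"},
		{"kim", 0, "20220623 12:23:23", "20220623 13:23:23"},
	}

	// Selection: only the zero-price row survives.
	fmt.Println("rows after filter:", len(keepZeroPrice(rows)))

	// Derivation: every row is kept and gets a computed value.
	for _, r := range rows {
		fmt.Println(r.Name, r.Price, finalTimeOf(r))
	}
}
```

The expected output in the question needs the second shape: a derived column over all rows rather than a subset of rows.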

Answer 1

Score: 1

Does this meet your requirements? It builds the final_time column row by row instead of filtering rows out.

package main

import (
	"fmt"
	"strings"

	"github.com/go-gota/gota/dataframe"
	"github.com/go-gota/gota/series"
)

func main() {
	csvStr := `
name,price,pay_time,refund_time
job,19.0,20220622 12:23:23,20220622 13:23:23
kim,0,20220623 12:23:23,20220623 13:23:23
`
	df := dataframe.ReadCSV(strings.NewReader(csvStr))
	// Keep rows with price >= 0. Every row of the sample passes this filter.
	df = df.Filter(dataframe.F{Colname: "price", Comparator: series.GreaterEq, Comparando: 0})

	// Build the final_time column row by row.
	var finalTimeList []string
	for i := 0; i < df.Nrow(); i++ {
		// Column order: 0 name, 1 price, 2 pay_time, 3 refund_time.
		finalTime := df.Elem(i, 2).Val().(string) // default: pay_time
		price := df.Elem(i, 1).Val().(float64)
		if price == 0 {
			finalTime = df.Elem(i, 3).Val().(string) // price is 0: use refund_time
		}
		finalTimeList = append(finalTimeList, finalTime)
	}
	// Append the new column, then drop the two source columns.
	df = df.Mutate(series.New(finalTimeList, series.String, "final_time"))
	df = df.Drop([]string{"pay_time", "refund_time"})
	fmt.Println("df -->", df)
}

huangapple
  • Posted on June 22, 2022, 11:36:10
  • When reposting, please keep the link to this article: https://go.coder-hub.com/72709390.html