Sum of a Julia DataFrame column where the values of another column are in a list

Question

How do I write a line of Julia code that sums the values of col2 for the rows where the value of col1 is in list? I'm pretty new to Julia, and the line below fails with this error:

Exception has occurred: DimensionMismatch
DimensionMismatch: arrays could not be broadcast to a common size; got a dimension with lengths 10 and 3

total_sum = sum(df[ismember(df[:, :col1], list), :col2])

Answer 1

Score: 2

One way could be:

julia> using DataFrames

julia> df = DataFrame(reshape(1:12,4,3),:auto)
4×3 DataFrame
 Row │ x1     x2     x3    
     │ Int64  Int64  Int64 
─────┼─────────────────────
   1 │     1      5      9
   2 │     2      6     10
   3 │     3      7     11
   4 │     4      8     12

julia> list = [2,3]
2-element Vector{Int64}:
 2
 3

julia> sum(df.x2[df.x1 .∈ Ref(list)])
13

This broadcasts in (Julia's counterpart of ismember), which can also be written as ∈. Ref(list) is used to prevent broadcasting over list.
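To see why the Ref wrapper matters, here is a minimal sketch using plain vectors as stand-ins for the two columns (the names col1, col2 and the data are made up for illustration). Without Ref, broadcasting tries to pair the column with the list element by element, which is presumably where a DimensionMismatch like the one in the question comes from.

```julia
# Hypothetical stand-ins for df.col1 and df.col2.
col1 = [1, 2, 3, 4]
col2 = [5, 6, 7, 8]
list = [2, 3]

# Without Ref, broadcasting pairs col1 (length 4) with list (length 2)
# element by element, so the shapes do not match.
err = try
    col1 .∈ list
    nothing
catch e
    e
end
println(err isa DimensionMismatch)

# Ref(list) is treated as a scalar, so every element of col1 is
# tested for membership in the whole list.
mask = col1 .∈ Ref(list)
total = sum(col2[mask])

# Two equivalent spellings of the same membership test.
total_in = sum(col2[in.(col1, Ref(list))])
total_curried = sum(col2[in(list).(col1)])

println(total)
```

The curried form in(list) returns a one-argument predicate, so it can be broadcast over the column directly without needing Ref.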

Answer 2

Score: 1

Depending on what you want to do, filter! is also worth knowing (using the code from Dan Getz's answer):

julia> sum(filter!(:x1 => x1 -> x1 ∈ [2,3], df).x2)
13
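One thing to keep in mind is that filter! modifies df in place, so the rows that do not match are gone afterwards. If the original table is still needed, the non-mutating filter may be the better fit. A small sketch, reusing the df from the first answer (the variable name kept is just for illustration):

```julia
using DataFrames

df = DataFrame(reshape(1:12, 4, 3), :auto)

# filter returns a new DataFrame and leaves df untouched.
kept = filter(:x1 => x1 -> x1 ∈ [2, 3], df)
total = sum(kept.x2)
rows_before = nrow(df)   # df still has all of its rows here

# filter! drops the non-matching rows from df itself.
filter!(:x1 => in([2, 3]), df)
rows_after = nrow(df)

println((total, rows_before, rows_after))
```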

Answer 3

Score: 0

Not exactly sure if this is what you're asking, but try intersect:

julia> using DataFrames

julia> df = DataFrame(a = 1:5, b = 2:6)
5×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1      2
   2 │     2      3
   3 │     3      4
   4 │     4      5
   5 │     5      6

julia> list = collect(3:10);

julia> sum(df.b[intersect(df.a, list)])
15
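A caveat worth checking before relying on this: intersect returns the matching values of column a, not row positions. The example above gives the expected result because a happens to equal the row number in that table. The sketch below uses made-up data where that is not the case, and uses findall to get actual row indices instead:

```julia
using DataFrames

# Hypothetical data where the values of a differ from the row numbers.
df = DataFrame(a = [10, 20, 30], b = [1, 2, 3])
list = [20, 30]

# These are values of a, so using them as indices into df.b
# would be out of bounds for this table.
vals = intersect(df.a, list)

# findall with a membership predicate returns the row positions.
rows = findall(in(list), df.a)
total = sum(df.b[rows])

println((vals, rows, total))
```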

huangapple
  • Posted on 2023-02-14 07:56:27
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75442253.html