Sum of Julia Dataframe column where values of another column are in a list

Question

How can I write a single line of Julia that sums the values of col2 over the rows where the value of col1 is in list? I'm pretty new to Julia, and the following line:

    total_sum = sum(df[ismember(df[:, :col1], list), :col2])

fails with this error:

    Exception has occurred: DimensionMismatch
    DimensionMismatch: arrays could not be broadcast to a common size; got a dimension with lengths 10 and 3

Answer 1

Score: 2

One way could be:

    julia> using DataFrames

    julia> df = DataFrame(reshape(1:12,4,3),:auto)
    4×3 DataFrame
     Row │ x1     x2     x3
         │ Int64  Int64  Int64
    ─────┼─────────────────────
       1 │     1      5      9
       2 │     2      6     10
       3 │     3      7     11
       4 │     4      8     12

    julia> list = [2,3]
    2-element Vector{Int64}:
     2
     3

    julia> sum(df.x2[df.x1 .∈ Ref(list)])
    13

This broadcasts in (Julia's counterpart to ismember), which can also be written as ∈. Ref(list) prevents broadcasting over list, so every element of df.x1 is tested against the whole list.
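
To see why Ref matters, here is a minimal sketch that uses plain vectors in place of DataFrame columns; the names col1 and col2 and all values are made up for illustration. Without Ref, broadcasting tries to pair the 10 elements of col1 with the 3 elements of list, which reproduces the DimensionMismatch from the question. With Ref(list), list is treated as a single value and each element of col1 is tested against all of it.

```julia
# Hypothetical stand-ins for df.col1 and df.col2 (10 rows each)
col1 = collect(1:10)
col2 = collect(10:10:100)
list = [2, 3, 5]              # 3 values to match against

# Without Ref: broadcasting tries to line up lengths 10 and 3 and throws
err = try
    col1 .∈ list
    nothing
catch e
    e
end

# With Ref: list is held fixed, so we get one Bool per element of col1
mask = col1 .∈ Ref(list)

# Sum col2 over the rows where col1 is in list
total = sum(col2[mask])
```

Here err holds the caught exception, mask is a Boolean vector with one entry per row, and total adds up the col2 entries in rows 2, 3 and 5. The equivalent function-call spelling is in.(col1, Ref(list)).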

Answer 2

Score: 1

Depending on what you want to do, filter! is also worth knowing (using the code from Dan Getz's answer). Note that filter! modifies df in place, dropping the rows that do not match:

    julia> sum(filter!(:x1 => x1 -> x1 ∈ [2,3], df).x2)
    13
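
If the original table should stay intact, the non-mutating filter can be used in the same way. The following is a small sketch under the assumption that df and list are built as in the first answer; in(list) creates a one-argument membership test, so no anonymous function is needed.

```julia
using DataFrames

# Same example data as in the first answer
df = DataFrame(reshape(1:12, 4, 3), :auto)
list = [2, 3]

# filter returns a new DataFrame; df itself keeps all of its rows
kept = filter(:x1 => in(list), df)

# Sum x2 over the rows that survived the filter
total = sum(kept.x2)
```

After this runs, kept contains only the rows whose x1 is in list, while df still has its original four rows.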

Answer 3

Score: 0

I'm not exactly sure this is what you're asking, but try intersect. Note that this indexes df.b with the values returned by intersect, so it gives the intended sum only when the values in a coincide with the row numbers, as they do here:

    julia> using DataFrames

    julia> df = DataFrame(a = 1:5, b = 2:6)
    5×2 DataFrame
     Row │ a      b
         │ Int64  Int64
    ─────┼──────────────
       1 │     1      2
       2 │     2      3
       3 │     3      4
       4 │     4      5
       5 │     5      6

    julia> list = collect(3:10);

    julia> sum(df.b[intersect(df.a, list)])
    15
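
For data where the column values are not row numbers, one option is to look up the matching row positions first. This is only a sketch with made-up data (the values in a and list are hypothetical), using findall from Base rather than intersect:

```julia
using DataFrames

# Hypothetical data: the values of a are NOT the row numbers
df = DataFrame(a = [10, 20, 30, 40, 50], b = 2:6)
list = [30, 40, 50, 60]

# findall returns the positions of the rows whose a is in list
rows = findall(in(list), df.a)

# Index b by row position, then sum
total = sum(df.b[rows])
```

Here rows holds positions 3, 4 and 5, so total adds the b values of exactly those rows, regardless of what values a contains.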

Posted on February 14, 2023 at 07:56:27. Link to this article: https://go.coder-hub.com/75442253.html