Sum of Julia Dataframe column where values of another column are in a list
Question
How do I write a single line of Julia code that sums the values of `col2` over the rows where the value of `col1` is in `list`? I'm pretty new to Julia, and trying the line below produces the error:
Exception has occurred: DimensionMismatch
DimensionMismatch: arrays could not be broadcast to a common size; got a dimension with lengths 10 and 3
total_sum = sum(df[ismember(df[:, :col1], list), :col2])
Answer 1
Score: 2
One way could be:
julia> using DataFrames
julia> df = DataFrame(reshape(1:12,4,3),:auto)
4×3 DataFrame
 Row │ x1     x2     x3
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      5      9
   2 │     2      6     10
   3 │     3      7     11
   4 │     4      8     12
julia> list = [2,3]
2-element Vector{Int64}:
 2
 3
julia> sum(df.x2[df.x1 .∈ Ref(list)])
13
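This broadcasts `in` (Julia's counterpart to `ismember`), which can also be written as `∈`. `Ref(list)` is used to prevent broadcasting over `list`: wrapped in `Ref`, the whole list is treated as a single value, so each element of `df.x1` is tested against it. Without `Ref`, broadcasting tries to pair the elements of the column and the list one by one, which produces the DimensionMismatch error when their lengths differ.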
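To make the role of `Ref` concrete, here is a minimal sketch using plain vectors in place of the DataFrame columns. The names `col1`, `col2`, and `list` echo the question; the values are made up for illustration.

```julia
# Illustrative data: stand-ins for df.col1 and df.col2 (values are assumptions)
col1 = [1, 2, 3, 4]
col2 = [5, 6, 7, 8]
list = [2, 3, 9]          # length 3, deliberately different from length(col1)

# Without Ref, broadcasting tries to pair col1 and list element by element
# and fails because the lengths (4 and 3) do not match.
err = try
    col1 .∈ list
catch e
    e
end
println(typeof(err))

# With Ref, list is treated as a single value and each element of col1
# is tested for membership in it.
mask = col1 .∈ Ref(list)
total = sum(col2[mask])

# Equivalent spelling: in(list) builds a one-argument predicate,
# which is then broadcast over col1.
total_alt = sum(col2[in(list).(col1)])
println(total, " ", total_alt)
```

For long lists, wrapping the list in a `Set` (for example `col1 .∈ Ref(Set(list))`) should make each lookup cheaper, at the cost of building the set once.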
Answer 2
Score: 1
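Depending on what you want to do, `filter!` is also worth knowing (using the code from Dan Getz's answer). Note that `filter!` removes the non-matching rows from `df` in place; use `filter` to get a new DataFrame instead: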
julia> sum(filter!(:x1 => x1 -> x1 ∈ [2,3], df).x2)
13
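If the original table should stay intact, a non-mutating variant may be preferable. The sketch below assumes a small example frame with columns `x1` and `x2` like the one used above, and shows `filter` and `subset` from DataFrames.jl as two possible spellings.

```julia
using DataFrames

# Example data (an assumption for illustration, mirroring the x1/x2 columns above)
df = DataFrame(x1 = 1:4, x2 = 5:8)
list = [2, 3]

# filter returns a new DataFrame; df itself is left unchanged.
total_filter = sum(filter(:x1 => in(list), df).x2)

# subset works column-wise, so the predicate is wrapped in ByRow.
total_subset = sum(subset(df, :x1 => ByRow(in(list))).x2)

println(total_filter, " ", total_subset, " ", nrow(df))
```

Both variants copy the matching rows; when only the sum is needed, indexing the column with a Boolean mask as in the first answer avoids building an intermediate DataFrame.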
Answer 3
Score: 0
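Not exactly sure if this is what you're asking, but you could try `intersect`. Note that `intersect(df.a, list)` returns the matching values, not row indices, so indexing `df.b` with it only works here because column `a` happens to equal the row numbers: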
julia> using DataFrames
julia> df = DataFrame(a = 1:5, b = 2:6)
5×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1      2
   2 │     2      3
   3 │     3      4
   4 │     4      5
   5 │     5      6
julia> list = collect(3:10);
julia> sum(df.b[intersect(df.a, list)])
15
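Since `intersect` yields values rather than row positions, a more general variant would convert those values into row indices first. The following is a sketch under the assumption that column `a` holds arbitrary keys; the data is invented for illustration.

```julia
using DataFrames

# Illustrative data: here the values in `a` are NOT the row numbers
df = DataFrame(a = [10, 20, 30, 40, 50], b = 2:6)
list = [30, 40, 50, 60]

# Values that occur both in the column and in the list
common = intersect(df.a, list)

# Turn the values into row positions; df.b[common] would raise a BoundsError here
rows = findall(in(common), df.a)

total = sum(df.b[rows])
println(common, " ", rows, " ", total)
```

In this form the `intersect` step is optional, because `findall(in(list), df.a)` selects the same rows directly.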