英文:
Share out a value on multiple entries with DataFrames in Julia
问题
如何根据另一列中的国家数量和组合来共享列的值?
英文:
I'm new with the Julia programming language and I would like to group scores by country in a DataFrame like :
Row │ Name Score Country
│ String15 Float64 String15
─────┼────────────────────────────────
1 │ Oliver 5.0 France
2 │ Patrick 3.0 Spain
3 │ Jules 2.0 France
4 │ Steven 3.5 USA
5 │ Karl 4.0 France
6 │ Alexander 3.0 France/USA
7 │ Julian 1.0 Spain/USA
I have grouped my data by Country with
combine(groupby(db_test, :Country), :Score=>sum)
and I get :
Row │ Country Score_sum
│ String15 Float64
─────┼───────────────────────
1 │ France 11.0
2 │ Spain 3.0
3 │ USA 3.5
4 │ France/USA 3.0
5 │ Spain/USA 1.0
But I would like to share the score of France/USA and Spain/USA to France, Spain and USA to obtain this :
Row │ Country Score_sum
│ String15 Float64
─────┼───────────────────────
1 │ France 12.5
2 │ Spain 3.5
3 │ USA 5.5
How can I share the value of a column according to the number and the combination of countries in another column ?
答案1
得分: 1
以下是代码的翻译部分:
julia> using CSV, DataFrames
julia> data = """
score,country
5.0,France
3.0,Spain
2.0,France
3.5,USA
4.0,France
3.0,France/USA
1.0,Spain/USA"""
"score,country\n5.0,France\n3.0,Spain\n2.0,France\n3.5,USA\n4.0,France\n3.0,France/USA\n1.0,Spain/USA"
julia> df = CSV.read(IOBuffer(data), DataFrame)
7×2 DataFrame
Row │ score country
│ Float64 String15
─────┼─────────────────────
1 │ 5.0 France
2 │ 3.0 Spain
3 │ 2.0 France
4 │ 3.5 USA
5 │ 4.0 France
6 │ 3.0 France/USA
7 │ 1.0 Spain/USA
julia> df.countrys = split.(df.country, "/")
7-element Vector{Vector{SubString{String15}}}:
["France"]
["Spain"]
["France"]
["USA"]
["France"]
["France", "USA"]
["Spain", "USA"]
julia> df.scores = df.score ./ length.(df.countrys)
7-element Vector{Float64}:
5.0
3.0
2.0
3.5
4.0
1.5
0.5
julia> df2 = flatten(df, :countrys)
9×4 DataFrame
Row │ score country countrys scores
│ Float64 String15 SubStrin… Float64
─────┼─────────────────────────────────────────
1 │ 5.0 France France 5.0
2 │ 3.0 Spain Spain 3.0
3 │ 2.0 France France 2.0
4 │ 3.5 USA USA 3.5
5 │ 4.0 France France 4.0
6 │ 3.0 France/USA France 1.5
7 │ 3.0 France/USA USA 1.5
8 │ 1.0 Spain/USA Spain 0.5
9 │ 1.0 Spain/USA USA 0.5
julia> combine(groupby(df2, :countrys), :scores=>sum)
3×2 DataFrame
Row │ countrys scores_sum
│ SubStrin… Float64
─────┼───────────────────────
1 │ France 12.5
2 │ Spain 3.5
3 │ USA 5.5
这是代码的翻译部分。
英文:
Here is a full code doing this. I do it step-by-step to make it easy to understand what is going on:
julia> using CSV, DataFrames
julia> data = """score,country
5.0,France
3.0,Spain
2.0,France
3.5,USA
4.0,France
3.0,France/USA
1.0,Spain/USA"""
"score,country\n5.0,France\n3.0,Spain\n2.0,France\n3.5,USA\n4.0,France\n3.0,France/USA\n1.0,Spain/USA"
julia> df = CSV.read(IOBuffer(data), DataFrame)
7×2 DataFrame
Row │ score country
│ Float64 String15
─────┼─────────────────────
1 │ 5.0 France
2 │ 3.0 Spain
3 │ 2.0 France
4 │ 3.5 USA
5 │ 4.0 France
6 │ 3.0 France/USA
7 │ 1.0 Spain/USA
julia> df.countrys = split.(df.country, "/")
7-element Vector{Vector{SubString{String15}}}:
["France"]
["Spain"]
["France"]
["USA"]
["France"]
["France", "USA"]
["Spain", "USA"]
julia> df.scores = df.score ./ length.(df.countrys)
7-element Vector{Float64}:
5.0
3.0
2.0
3.5
4.0
1.5
0.5
julia> df2 = flatten(df, :countrys)
9×4 DataFrame
Row │ score country countrys scores
│ Float64 String15 SubStrin… Float64
─────┼─────────────────────────────────────────
1 │ 5.0 France France 5.0
2 │ 3.0 Spain Spain 3.0
3 │ 2.0 France France 2.0
4 │ 3.5 USA USA 3.5
5 │ 4.0 France France 4.0
6 │ 3.0 France/USA France 1.5
7 │ 3.0 France/USA USA 1.5
8 │ 1.0 Spain/USA Spain 0.5
9 │ 1.0 Spain/USA USA 0.5
julia> combine(groupby(df2, :countrys), :scores=>sum)
3×2 DataFrame
Row │ countrys scores_sum
│ SubStrin… Float64
─────┼───────────────────────
1 │ France 12.5
2 │ Spain 3.5
3 │ USA 5.5
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论