Sharing a value across multiple entries with DataFrames in Julia
Question
I'm new to the Julia programming language and I would like to group scores by country in a DataFrame like this:
 Row │ Name       Score    Country       
     │ String15   Float64  String15   
─────┼────────────────────────────────
   1 │ Oliver         5.0  France
   2 │ Patrick        3.0  Spain
   3 │ Jules          2.0  France
   4 │ Steven         3.5  USA
   5 │ Karl           4.0  France
   6 │ Alexander      3.0  France/USA
   7 │ Julian         1.0  Spain/USA
I have grouped my data by Country with
combine(groupby(db_test, :Country), :Score=>sum)
and I get:
 Row │ Country     Score_sum 
     │ String15    Float64   
─────┼───────────────────────
   1 │ France           11.0
   2 │ Spain             3.0
   3 │ USA               3.5
   4 │ France/USA        3.0
   5 │ Spain/USA         1.0
But I would like to split the scores of France/USA and Spain/USA equally among France, Spain and USA to obtain this:
 Row │ Country     Score_sum 
     │ String15    Float64   
─────┼───────────────────────
   1 │ France           12.5
   2 │ Spain             3.5
   3 │ USA               5.5
How can I share the value of a column according to the number and combination of countries in another column?
Answer 1
Score: 1
Here is the full code that does this. I do it step by step to make it easy to see what is going on:
julia> using CSV, DataFrames
julia> data = """score,country
5.0,France
3.0,Spain
2.0,France
3.5,USA
4.0,France
3.0,France/USA
1.0,Spain/USA"""
"score,country\n5.0,France\n3.0,Spain\n2.0,France\n3.5,USA\n4.0,France\n3.0,France/USA\n1.0,Spain/USA"
julia> df = CSV.read(IOBuffer(data), DataFrame)
7×2 DataFrame
 Row │ score    country
     │ Float64  String15
─────┼─────────────────────
   1 │     5.0  France
   2 │     3.0  Spain
   3 │     2.0  France
   4 │     3.5  USA
   5 │     4.0  France
   6 │     3.0  France/USA
   7 │     1.0  Spain/USA
julia> df.countrys = split.(df.country, "/")
7-element Vector{Vector{SubString{String15}}}:
 ["France"]
 ["Spain"]
 ["France"]
 ["USA"]
 ["France"]
 ["France", "USA"]
 ["Spain", "USA"]
julia> df.scores = df.score ./ length.(df.countrys)
7-element Vector{Float64}:
 5.0
 3.0
 2.0
 3.5
 4.0
 1.5
 0.5
julia> df2 = flatten(df, :countrys)
9×4 DataFrame
 Row │ score    country     countrys   scores
     │ Float64  String15    SubStrin…  Float64
─────┼─────────────────────────────────────────
   1 │     5.0  France      France         5.0
   2 │     3.0  Spain       Spain          3.0
   3 │     2.0  France      France         2.0
   4 │     3.5  USA         USA            3.5
   5 │     4.0  France      France         4.0
   6 │     3.0  France/USA  France         1.5
   7 │     3.0  France/USA  USA            1.5
   8 │     1.0  Spain/USA   Spain          0.5
   9 │     1.0  Spain/USA   USA            0.5
julia> combine(groupby(df2, :countrys), :scores=>sum)
3×2 DataFrame
 Row │ countrys   scores_sum
     │ SubStrin…  Float64
─────┼───────────────────────
   1 │ France           12.5
   2 │ Spain             3.5
   3 │ USA               5.5
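If you need this more than once, the same steps can be wrapped in a single function that leaves the original table untouched. This is only a sketch under a few assumptions: the name `share_scores`, the `sep` keyword and the output column name `score_sum` are made up for illustration, the table is built by hand with plain `String` columns instead of being read with CSV, and each score is split equally among the countries listed in its row.

```julia
using DataFrames

# Sample data from the question, built by hand (assumption: plain String columns).
df = DataFrame(
    score = [5.0, 3.0, 2.0, 3.5, 4.0, 3.0, 1.0],
    country = ["France", "Spain", "France", "USA", "France", "France/USA", "Spain/USA"],
)

# Hypothetical helper: split each row's score equally among its countries,
# then total the shares per country. It does not modify `df`.
function share_scores(df; sep = "/")
    parts = split.(df.country, sep)           # one vector of country names per row
    shares = df.score ./ length.(parts)       # equal share for each listed country
    tmp = DataFrame(country = parts, score = shares)
    long = flatten(tmp, :country)             # one row per (country, share) pair
    return combine(groupby(long, :country), :score => sum => :score_sum)
end

result = share_scores(df)
```

With the sample data, `result` has one row each for France, Spain and USA, with totals 12.5, 3.5 and 5.5, the same as the step-by-step version above.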