Julia DataFrame: the best way to replace the data in DataFrame?
Question
I am looking for the best way to replace data in a DataFrame. Here is what I mean:
- df1 looks like this:
<pre>
 Row │ c_name   t_name       a_number
     │ String   Any          Int64
─────┼───────────────────────────────
   1 │ f1.id    ["f1"]              2
   2 │ f2.name  ["f2"]              2
   3 │ f.id     ["f","f2"]          1
   4 │ f.name   ["f"]               1
   5 │ f3.id    ["f3","f"]          1
</pre>
- A second DataFrame, df2, looks like this:
<pre>
 Row │ t_name
     │ String
─────┼────────
   1 │ f
   2 │ f1
   3 │ f2
   4 │ f3
</pre>
- Match df1.t_name against df2.t_name, then replace each name in df1.t_name with the row number of the same name in df2, e.g. f->1, f1->2...
<pre>
 Row │ c_name   t_name  a_number
     │ String   Any     Int64
─────┼──────────────────────────
   1 │ f1.id    [2]            2
   2 │ f2.name  [3]            2
   3 │ f.id     [1,3]          1
   4 │ f.name   [1]            1
   5 │ f3.id    [4,1]          1
</pre>
I think it can be done with iteration, but that seems clumsy. Could a join be used? It looks hard to replace the names with row numbers that way.
Julia's DataFrame may have a smarter way.
I would appreciate it if you could point it out to me.
Thank you.
Answer 1
Score: 1
Use a dictionary for this:
# The lookup only makes sense if every name in df2 is unique
@assert allunique(df2.t_name)
# Map each name to its row number in df2
d = Dict(df2.t_name .=> axes(df2, 1))
# Replace every name in each inner vector with its row number
df1.t_name = [getindex.(Ref(d), v) for v in df1.t_name]
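The dictionary lookup can be tried without DataFrames.jl. The following is a minimal sketch that uses plain vectors in place of the two columns; the names `names_df2`, `t_name`, and `replaced` are made up for illustration, and `eachindex` plays the role that `axes(df2, 1)` plays for a DataFrame.

```julia
# Plain vectors stand in for df2.t_name and df1.t_name (hypothetical names).
names_df2 = ["f", "f1", "f2", "f3"]
t_name = [["f1"], ["f2"], ["f", "f2"], ["f"], ["f3", "f"]]

# Map each name to its position, i.e. its row number in df2.
@assert allunique(names_df2)
d = Dict(names_df2 .=> eachindex(names_df2))

# Ref(d) keeps the dictionary fixed while getindex is broadcast
# over the names of each inner vector.
replaced = [getindex.(Ref(d), v) for v in t_name]
println(replaced)
```

Note that `getindex` throws a `KeyError` for a name that is absent from the dictionary, so a name in `t_name` that does not occur in `names_df2` fails loudly rather than being skipped.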
Answer 2
Score: 0
Wow, thank you so much again, Dr. Kamiński. This is exactly the elegant code I wanted.
It worked perfectly; however, I had to correct it slightly, which I note here for future readers with similar questions. :^)
d = Dict(df2.t_name .=> axes(df2, 1)) <- add ")" after axes()
df1.t_name = [getindex.(Ref(d), x) for x in df1.t_name] <- change "v" to "x"
This code works because every value in df2.t_name is unique.
Jeszcze raz dziękuję, Dr. Kamiński
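As a small illustration of why the uniqueness check matters, here is a sketch with a made-up column (`dup_names` is hypothetical, not from the question) in which one name is repeated. When a `Dict` is built from pairs that share a key, the later pair overwrites the earlier one, so the lookup would silently return only one of the matching row numbers.

```julia
# Hypothetical column with a repeated name: "f" appears in rows 1 and 3.
dup_names = ["f", "f1", "f"]

# Building a Dict from pairs with a repeated key keeps the last pair.
d_dup = Dict(dup_names .=> eachindex(dup_names))
println(d_dup["f"])

# Checking first makes the problem visible instead of silently picking a row.
println(allunique(dup_names))
```

This is why the answer guards the lookup with `@assert allunique(df2.t_name)` before building the dictionary.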