explode raises ValueError: columns must have matching element counts
Question
I have the following DataFrame:
import pandas as pd
list1 = [1, 6, 7, [46, 56, 49], 45, [15, 10, 12]]
list2 = [[49, 57, 45], 3, 7, 8, [16, 19, 12], 41]
data = {'A':list1,
'B': list2}
data = pd.DataFrame(data)
I can explode the dataframe using this piece of code:
data.explode('A').explode('B')
But when I run the following to do the same thing, a ValueError is raised:
data.explode(['A', 'B'])
ValueError Traceback (most recent call last)
<ipython-input-97-efafc6c7cbfa> in <module>
5 'B': list2}
6 data = pd.DataFrame(data)
----> 7 data.explode(['A', 'B'])
~\AppData\Roaming\Python\Python38\site-packages\pandas\core\frame.py in explode(self, column, ignore_index)
9033 for c in columns[1:]:
9034 if not all(counts0 == self[c].apply(mylen)):
-> 9035 raise ValueError("columns must have matching element counts")
9036 result = DataFrame({c: df[c].explode() for c in columns})
9037 result = df.drop(columns, axis=1).join(result)
ValueError: columns must have matching element counts
Can anyone explain why?
Answer 1
Score: 1
df.explode(["A", "B"]) and df.explode("A").explode("B") do not do the same thing. It seems you are aiming to get all the combinations, whereas the multi-column explode addresses a different scenario: one where your columns hold paired lists. You can see the rationale in the original GitHub feature request. This behavior seems to have been chosen to avoid duplicating values in one of the columns.
In the feature request there is a link to a GitHub gist/notebook that explores how explode could be implemented, but it does not seem to handle exploding lists of mismatched lengths in parallel.
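To make the difference concrete, here is a minimal sketch. The `paired` frame is a made-up example (not from the question) in which each row's lists have equal lengths. It assumes pandas 1.3 or later, where `explode` accepts a list of columns. The second half reruns the question's data, where row 0 holds a scalar in A but a three-element list in B, so the per-row element counts differ.

```python
import pandas as pd

# Hypothetical paired data: within each row, A and B hold lists of equal length.
paired = pd.DataFrame({"A": [[1, 2], [3]], "B": [["x", "y"], ["z"]]})

# Multi-column explode walks the lists in parallel, one output row per pair.
parallel = paired.explode(["A", "B"])

# Chained explode yields every A/B combination within each original row.
chained = paired.explode("A").explode("B")

print(len(parallel), len(chained))

# The question's data: the per-row element counts of A and B do not match.
list1 = [1, 6, 7, [46, 56, 49], 45, [15, 10, 12]]
list2 = [[49, 57, 45], 3, 7, 8, [16, 19, 12], 41]
data = pd.DataFrame({"A": list1, "B": list2})

try:
    data.explode(["A", "B"])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

If all combinations are what you want, chaining the single-column calls, as in the question, is the form that gives them.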
Answer 2
Score: 1
Try this, if it works in your case:
import numpy as np
data = pd.DataFrame({'A': np.hstack(list1), 'B': np.hstack(list2)})
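As a rough check of what this produces, the sketch below runs the same idea on the question's lists. Note that it is a different operation from either explode call: `np.hstack` flattens each list independently, so values are paired by position in the flattened arrays rather than by original row. It builds a frame here only because both lists happen to flatten to the same total length. The names `flat_a`, `flat_b` and `flat` are illustrative.

```python
import numpy as np
import pandas as pd

list1 = [1, 6, 7, [46, 56, 49], 45, [15, 10, 12]]
list2 = [[49, 57, 45], 3, 7, 8, [16, 19, 12], 41]

# np.hstack flattens each input into one 1-D array, dropping row boundaries.
flat_a = np.hstack(list1)
flat_b = np.hstack(list2)

# Both arrays have 10 elements here, so the DataFrame constructor accepts them.
flat = pd.DataFrame({"A": flat_a, "B": flat_b})

print(flat.shape)
```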