Posted: 2023-05-15 14:11:53

Explode a column vertically to create new columns

Question

I have a DataFrame like this:

name	zones
aa	[]
bb	[{"rack":11,"bin":22},{"rack":33,"bin":44}]

Now I want to transform it into something like this:

name	rack	bin
aa	-	-
bb	11	22
bb	33	44

I tried this:

cols = ['zones']

df1 = (df.drop(cols, axis=1)
         .join(pd.concat([pd.json_normalize(df[x].explode()).add_prefix(f'{x}.')
                          for x in cols], axis=1)))

But it only gives the values of the first dictionary:

  name  zones.rack  zones.bin
0   aa         NaN        NaN
1   bb        11.0       22.0


Answer 1

Score: 1

You could [`explode`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.explode.html) the column first, then convert the dictionaries to columns with [`json_normalize`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.json_normalize.html) and join the result back:

out = df.explode('zones', ignore_index=True)
out = out.join(pd.json_normalize(out.pop('zones')).add_prefix('zones.'))

*NB. Add `.fillna('-')` if you want to fill the NaNs with `'-'`.*

Output:

  name  zones.rack  zones.bin
0   aa         NaN        NaN
1   bb        11.0       22.0
2   bb        33.0       44.0
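
For a self-contained run, here is a minimal sketch that rebuilds the question's DataFrame by hand and applies the same explode-then-normalize steps. Two things are my own assumptions rather than part of the answer: the sample data is retyped from the question, and the prefix is left off so the columns come out as the `name`, `rack`, `bin` layout the question asks for. It also assumes a reasonably recent pandas, where `json_normalize` tolerates the NaN left behind by the empty list.

```python
import pandas as pd

# Sample data retyped from the question: "aa" has no zones, "bb" has two.
df = pd.DataFrame({
    "name": ["aa", "bb"],
    "zones": [[], [{"rack": 11, "bin": 22}, {"rack": 33, "bin": 44}]],
})

# One row per list element; the empty list becomes a single NaN row.
out = df.explode("zones", ignore_index=True)

# pop() removes "zones" from `out`; the dicts are expanded into columns
# and joined back on the fresh 0..n-1 index.
out = out.join(pd.json_normalize(out.pop("zones")))

# Optional: display missing values as "-" (the columns become object dtype).
pretty = out.fillna("-")
print(pretty)
```

Keeping `out` numeric and creating `pretty` only for display avoids mixing strings into columns you may still want to compute on.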


If you have more than one column to handle, use a loop and [`concat`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html):

cols = ['zones', 'zones2']

out = df2.explode(cols, ignore_index=True)
out = out.join(pd.concat([pd.json_normalize(out.pop(col))
                            .add_prefix(f'{col}.')
                          for col in cols
                         ], axis=1)
              )

Example, if there were also a column `zones2`:

  name  zones.rack  zones.bin  zones2.rack  zones2.bin
0   aa         NaN        NaN          NaN         NaN
1   bb        11.0       22.0         11.0        22.0
2   bb        33.0       44.0         33.0        44.0
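
The answer does not show how `df2` was built. As a sketch, assuming a hypothetical `zones2` column that simply repeats `zones`, the multi-column variant can be tried as follows. Passing a list of columns to `explode` needs pandas 1.3 or later, and the lists in each row must have matching lengths across the exploded columns.

```python
import pandas as pd

zones = [[], [{"rack": 11, "bin": 22}, {"rack": 33, "bin": 44}]]
# Hypothetical frame: "zones2" is just a copy of "zones" for illustration.
df2 = pd.DataFrame({"name": ["aa", "bb"], "zones": zones, "zones2": zones})

cols = ["zones", "zones2"]

# Explode both list columns together so their elements stay paired row by row.
out = df2.explode(cols, ignore_index=True)

# Normalize each exploded column separately, prefix it, then place side by side.
parts = [pd.json_normalize(out.pop(col)).add_prefix(f"{col}.") for col in cols]
out = out.join(pd.concat(parts, axis=1))
print(out)
```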





# Answer 2
**Score**: 0

After [`json_normalize`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.json_normalize.html), set the index of the result to the index of the Series created by [`Series.explode`](http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.explode.html), so that the join lines each exploded row up with its original row.
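
To see what the re-indexing buys, here is a small sketch on the question's data (retyped by hand). `explode` repeats the original row label for every list element, whereas a frame that `json_normalize` builds from a plain list is numbered 0, 1, 2, so a join on the index drops the row labelled 2. Converting the Series to a list is my own addition: it makes the result numbered from 0 regardless of how a given pandas version treats a Series argument.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["aa", "bb"],
    "zones": [[], [{"rack": 11, "bin": 22}, {"rack": 33, "bin": 44}]],
})

s = df["zones"].explode()
print(list(s.index))        # labels repeat: one per list element

# Built from a plain list, so the rows get a fresh 0..n-1 index.
flat = pd.json_normalize(list(s)).add_prefix("zones.")
print(list(flat.index))

base = df.drop(columns="zones")
misaligned = base.join(flat)                  # label 2 has no match in `base`
aligned = base.join(flat.set_index(s.index))  # labels 0, 1, 1 match `base`

print(len(misaligned), len(aligned))
```

The `aligned` frame keeps the repeated labels 0, 1, 1; call `reset_index(drop=True)` afterwards if a unique index is needed.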

```python
cols = ['zones']

dfs = []
for x in cols:
    # explode keeps the original row label for every list element
    s = df[x].explode()
    # reuse those (repeated) labels so the join below aligns correctly
    df1 = pd.json_normalize(s).set_index(s.index).add_prefix(f'{x}.')
    dfs.append(df1)

df1 = df.drop(cols, axis=1).join(pd.concat(dfs, axis=1))
print(df1)
```

Output:

  name  zones.rack  zones.bin
0   aa         NaN        NaN
1   bb        11.0       22.0
1   bb        33.0       44.0


Answer 3

Score: 0

Here is another option:

out = (
    # note: pop() removes "zones" from df in place
    df.join(df.pop("zones").explode().dropna()
              .apply(pd.Series))
      .fillna("-")  # this step is optional
)

Output:

print(out)

  name  rack   bin
0   aa     -     -
1   bb 11.00 22.00
1   bb 33.00 44.00

