Add a string to column B where a string exists in column A, using np.where() and pandas

Question

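I need to add a secondary ID to a dataset that covers multiple individuals so that each entry is unique. To do this I tried `np.where()`, but after implementing it I realized that each call overwrites the result of the previous one. Here is an example of the original approach: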

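```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Example': ['1', '2', '3', '4']})

df['Add'] = ''

# Each call rebuilds the whole column, so non-matching rows are reset to ''
# and only the result of the last call survives.
df['Add'] = np.where(df['Example'] == '1', 'One', '')
df['Add'] = np.where(df['Example'] == '2', 'Two', '')
df['Add'] = np.where(df['Example'] == '3', 'Three', '')
df['Add'] = np.where(df['Example'] == '4', 'Four', '')

df.head()
```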

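As a workaround I tried adding `str.contains('')`, thinking it would evaluate to `True` when the string is empty, so that the new string would be inserted only in that case: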

```python
df = pd.DataFrame({'Example': ['1', '2', '3', '4']})

df['Add'] = ''

df['Add'] = np.where(df['Example'].str.contains('') == '1', 'One', '')
df['Add'] = np.where(df['Example'].str.contains('') == '2', 'Two', '')
df['Add'] = np.where(df['Example'].str.contains('') == '3', 'Three', '')
df['Add'] = np.where(df['Example'].str.contains('') == '4', 'Four', '')

df.head()
```

In that case every row ends up filled with an empty string.

Is there a simple way to check whether a cell is empty before writing to it with `np.where()`?



# Answer 1
**Score**: 2

Use `map` with a dictionary that pairs each value in `Example` with its label:

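```python
# Look up each value of 'Example' in the dict; unmatched values become NaN,
# which fillna('') replaces with an empty string.
dmap = {'1': 'One', '2': 'Two', '3': 'Three', '4': 'Four'}

df['Add'] = df['Example'].map(dmap).fillna('')
```

Output:

```
>>> df
  Example    Add
0       1    One
1       2    Two
2       3  Three
3       4   Four
```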
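
If you would rather stay with `np.where()`, one option is to pass the existing column as the third argument, so that rows which do not match keep their current value instead of being reset to `''`. The following is only a sketch under that assumption and is not part of the answer above: the extra `'5'` row and the `Add_select` column are hypothetical additions for illustration, and `np.select` is shown as a possible single-call alternative.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the '5' row has no label, to show what happens to unmatched rows.
df = pd.DataFrame({'Example': ['1', '2', '3', '4', '5']})
df['Add'] = ''

# Use the current column as the "else" value so earlier writes are preserved.
pairs = [('1', 'One'), ('2', 'Two'), ('3', 'Three'), ('4', 'Four')]
for key, label in pairs:
    df['Add'] = np.where(df['Example'] == key, label, df['Add'])

# Alternative: np.select takes parallel lists of conditions and choices,
# plus a default for rows where no condition is true.
conditions = [df['Example'] == key for key, _ in pairs]
choices = [label for _, label in pairs]
df['Add_select'] = np.select(conditions, choices, default='')

print(df)
```

Both columns should hold the four labels, with the unmatched `'5'` row left as an empty string. For a plain value-to-label lookup like this one, `map` is still the shorter route; the `np.where()` form is mainly useful when the conditions are more than simple equality checks.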

  • Posted by huangapple on July 11, 2023, 04:37:14
  • When reposting, please keep this link: https://go.coder-hub.com/76657176.html