Python Fill in column values based on ID

Question

I want to fill in the NaN values in 'Col' based on the ID number.
I have tried groupby. This is the input:

df=pd.DataFrame({
    'ID':[1,2,1,2,1,2],
    'Col':['One','NaN','NaN','Two','NaN','NaN']
})

This is the expected output:

df=pd.DataFrame({
    'ID':[1,2,1,2,1,2],
    'Col':['One','Two','One','Two','One','Two']
})

I know this is an easy example, but I would appreciate any help you could give me. Also, I have a dataframe with 1 million rows, so anything time-efficient would be appreciated.

What I have tried:

x=df_total[df_total['id'].astype(str)=='2']
buck_map = dict(x[~x['buckets'].isnull()][['id','buckets']].values)
x['buckets']=x['id'].map(buck_map)

Answer 1

Score: 1

It is not clear what you really want: whether this is just a translation and substitution, or whether groupby is needed. Assuming the column holds strings (including the literal string 'NaN') and that you only want a substitution, you need a way of translating an ID such as 1 to 'One' (a dictionary is ideal) and then applying it to each row. You can use:

import pandas as pd

df=pd.DataFrame({
    'ID':[1,2,1,2,1,2],
    'Col':['One','NaN','NaN','Two','NaN','NaN']
})

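def func(row):
    # Hard-coded lookup from ID to its label
    d = {0: 'zero', 1: 'One', 2: 'Two'}
    # Only rows holding the literal string 'NaN' are replaced
    if row['Col'] == 'NaN':
        val = d[row['ID']]
    else:
        val = row['Col']
    return val

# Apply the function row by row (axis=1)
df['Col'] = df.apply(func, axis=1)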

print(df)

which gives:

   ID  Col
0   1  One
1   2  Two
2   1  One
3   2  Two
4   1  One
5   2  Two
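
Since the question mentions 1 million rows, a row-wise apply may be slow. Here is a rough vectorized sketch of the same substitution. It assumes, as above, that the missing entries are the literal string 'NaN' and that the ID-to-label dictionary is known up front; the names id_to_label and is_missing are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 1, 2, 1, 2],
    'Col': ['One', 'NaN', 'NaN', 'Two', 'NaN', 'NaN'],
})

# Assumed lookup table (same as the dictionary in func above)
id_to_label = {0: 'zero', 1: 'One', 2: 'Two'}

# Boolean mask of rows holding the literal string 'NaN'
is_missing = df['Col'] == 'NaN'

# Replace only the masked rows with the label looked up from ID
df['Col'] = df['Col'].mask(is_missing, df['ID'].map(id_to_label))

print(df)
```

This avoids calling a Python function once per row; the lookup and the replacement each run as a single column-wise operation.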

Answer 2

Score: 1

Your question is ambiguous, as there are several ways to produce the desired output based on your example.

Assuming that you are looking for the "majority value" per ID, and that the NaNs are actual float('NaN') values (which mode() ignores by default) rather than the string 'NaN', the following would be quite efficient:

def majority(s):
    return s.mode()[0]

newdf = df.assign(Col=df.groupby('ID')['Col'].transform(majority))

>>> newdf
   ID  Col
0   1  One
1   2  Two
2   1  One
3   2  Two
4   1  One
5   2  Two

Note: to make sure the 'NaN' entries are real NaN values and not strings, do this first:

df = df.assign(Col=df['Col'].replace({'NaN': float('Nan')}))
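
Putting the two steps together, a minimal end-to-end sketch might look like the following. One caveat: s.mode()[0] fails for an ID whose values are all NaN, because the mode is then empty. The guard below is my own addition for that case, and the extra row with ID 3 is made up purely to exercise it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 1, 2, 1, 2, 3],
    'Col': ['One', 'NaN', 'NaN', 'Two', 'NaN', 'NaN', 'NaN'],
})

# Step 1: turn the literal string 'NaN' into a real missing value
df = df.assign(Col=df['Col'].replace({'NaN': np.nan}))

def majority(s):
    # mode() skips NaN by default; it is empty if the group has no values
    modes = s.mode()
    return modes.iloc[0] if not modes.empty else np.nan

# Step 2: broadcast each ID's majority value to all of its rows
newdf = df.assign(Col=df.groupby('ID')['Col'].transform(majority))

print(newdf)
```

IDs without any known value simply stay NaN instead of raising an error.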

Answer 3

Score: 1

You can create a dictionary mapping ID values to fill values:

fill_dict = df.groupby('ID')['Col'].last().to_dict()

Then replace the NaN values with the fill values using the dictionary:

df['Col'] = df['Col'].fillna(df['ID'].map(fill_dict))
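
For completeness, here is a self-contained sketch of the two lines above. It assumes the missing entries are real NaN values (np.nan), not the string 'NaN'. It relies on groupby(...).last() returning the last non-null value per group. The variable alt shows a variant without the intermediate dictionary, using transform('first'); that variant is my own suggestion rather than part of the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 1, 2, 1, 2],
    'Col': ['One', np.nan, np.nan, 'Two', np.nan, np.nan],
})

# Variant: broadcast the first non-null value of each ID to its rows
alt = df['Col'].fillna(df.groupby('ID')['Col'].transform('first'))

# Dictionary approach: last non-null value per ID, e.g. {1: 'One', 2: 'Two'}
fill_dict = df.groupby('ID')['Col'].last().to_dict()
df['Col'] = df['Col'].fillna(df['ID'].map(fill_dict))

print(df)
```

Both variants only touch rows that are missing, so existing values in 'Col' are left as they are.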

huangapple
  • Published on March 4, 2023 at 01:49:05
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75630327.html