New column based on a filter and an index of multiple columns?

Question

I've been searching for and thinking about an answer, probably involving a melt or stack, but I still can't work it out.

Here's my DataFrame:

import pandas as pd

d = {'type': [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
     'company': ['A', 'B', 'C', 'D', 'E', 'A', 'B', 'C', 'D', 'E'],
     'value type': ['value car', 'value car', 'value car', 'value car', 'value car',
                    'value train', 'value train', 'value train', 'value train', 'value train'],
     'value': [0.1, 0.2, 0.3, 0.4, 0.5, 0.15, 0.25, 0.35, 0.45, 0.55]}

df = pd.DataFrame(d)

Here is what I want (I have the table on the left, I want the one on the right):

[image not preserved]

As you can see, I want a new column "train value" based on the combination (type, company).

Something like

for each row : 
    if (df['value type'] == 'value train'):
        #and (type,company) is the same
        df['train value'] = df['value']
        remove row

For example, company A of type 1 will get its train value in a new column.
Is there a proper way to do this?

EDIT: There was a good answer, but I didn't explain myself clearly. I only want one new column, holding a single "value type". For example, my new DataFrame:

d = {'type' : [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
 'company' : ['A', 'B', 'C', 'D', 'E','A', 'B', 'C', 'D', 'E'],
 'month' : ['jan', 'feb', 'marc', 'apr', 'may', 'jan', 'feb', 'marc', 'apr', 'sep'],
 'business' : ['business1', 'business2', 'business3', 'business4', 'business5', 'business6', 'business7', 'business8', 'business9', 'business10'], 
 'value time': ['past', 'past', 'past', 'past', 'present', 'present', 'present', 'present', 'future', 'future'],
 'value': [0.1, 0.2, 0.3, 0.4, 0.11, 0.21, 0.31, 0.41, 0.45, 0.55] }

df = pd.DataFrame(d)

Here's what I want this time:

[image not preserved]

If possible, only the values marked "present" should go in the new column. Something like:

if df['value time'] == 'present' then add to new column
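
One possible reading of the pseudocode above, as a minimal pandas sketch. The target table was an image that did not survive, so the exact layout is an assumption: every row is kept, and a new column is filled only where 'value time' is 'present', with NaN elsewhere. The column name 'present value' is made up for illustration, and the DataFrame is a trimmed copy of the one in the edit.

```python
import pandas as pd

# Trimmed copy of the edited DataFrame (the 'month' and 'business' columns are omitted).
d = {'type': [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
     'company': ['A', 'B', 'C', 'D', 'E', 'A', 'B', 'C', 'D', 'E'],
     'value time': ['past', 'past', 'past', 'past', 'present',
                    'present', 'present', 'present', 'future', 'future'],
     'value': [0.1, 0.2, 0.3, 0.4, 0.11, 0.21, 0.31, 0.41, 0.45, 0.55]}
df = pd.DataFrame(d)

# Assumption: 'present value' is a hypothetical name for the new column.
# Series.where keeps a value where the condition is True and puts NaN elsewhere.
df['present value'] = df['value'].where(df['value time'] == 'present')
```

If the rows that are not 'present' should be dropped rather than left as NaN, filtering with df[df['value time'] == 'present'] would be the corresponding step.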

Answer 1

Score: 2

You should pivot your dataframe:

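# Remember each company's type, since pivot() below keeps only 'company' as the key
company_to_type = df.set_index('company')['type'].to_dict()
# One row per company, one column per distinct 'value type'
df = df.pivot(index='company', columns='value type', values='value').reset_index()
# Restore the 'type' column from the saved mapping
df['type'] = df.company.map(company_to_type)
# Drop the leftover 'value type' label on the columns axis
df = df.rename_axis(None, axis=1)
df = df[['type', 'company', 'value train', 'value car']]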

and you'll get:

   type company  value train  value car
0     1       A         0.15        0.1
1     2       B         0.25        0.2
2     3       C         0.35        0.3
3     4       D         0.45        0.4
4     5       E         0.55        0.5
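
Since the question asks for the combination (type, company) as the key, a variant worth considering is to put both columns in the index before reshaping, which avoids the separate company-to-type mapping. This is only a sketch and not part of the original answer: it uses set_index and unstack, starts again from the long-format DataFrame in the question, and stores the result in a new name, wide, chosen here for illustration.

```python
import pandas as pd

# The original long-format DataFrame from the question.
d = {'type': [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
     'company': ['A', 'B', 'C', 'D', 'E', 'A', 'B', 'C', 'D', 'E'],
     'value type': ['value car'] * 5 + ['value train'] * 5,
     'value': [0.1, 0.2, 0.3, 0.4, 0.5, 0.15, 0.25, 0.35, 0.45, 0.55]}
df = pd.DataFrame(d)

wide = (
    df.set_index(['type', 'company', 'value type'])['value']
      .unstack('value type')      # one column per distinct 'value type'
      .rename_axis(None, axis=1)  # drop the 'value type' label on the columns axis
      .reset_index()              # turn 'type' and 'company' back into columns
)
```

Note that unstack raises a ValueError if the same (type, company, value type) combination occurs more than once. In that case an aggregating reshape such as pivot_table would be needed instead.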

Posted by huangapple on January 7, 2020, 00:19:12
When reposting, please keep the link to this article: https://go.coder-hub.com/59615488.html