Convert rows into columns based on values in another column

Question

I have a pandas DataFrame that looks like this:

    train   val     score
1   0.6125  0.0827  loss
2   0.8565  0.9845  precision
3   0.7596  0.982   recall
4   0.0466  0.0454  loss
5   0.9897  0.9949  precision
6   0.9884  0.9949  recall

I want to convert it to something like this, with one column per combination of train/val and score:

    train_loss  train_precision  train_recall  val_loss  val_precision  val_recall
1   0.6125      0.8565           0.7596        0.0827    0.9845         0.982
2   0.0466      0.9897           0.9884        0.0454    0.9949         0.9949

Answer 1

Score: 2

You can build the new columns with a loop over the rows:

import pandas as pd

# Recreate the DataFrame from the question
df = pd.DataFrame({
    'train': [0.6125, 0.8565, 0.7596, 0.0466, 0.9897, 0.9884],
    'val': [0.0827, 0.9845, 0.982, 0.0454, 0.9949, 0.9949],
    'score': ['loss', 'precision', 'recall'] * 2,
})

# One empty list per output column, e.g. 'train_loss' or 'val_recall'
new_cols = {f'{i}_{j}': [] for i in ['train', 'val'] for j in df.score.unique()}

# Append each row's values to the lists that match its score.
# Access by column label; positional row[0] is deprecated on a labelled Series.
for _, row in df.iterrows():
    new_cols[f'train_{row["score"]}'].append(row['train'])
    new_cols[f'val_{row["score"]}'].append(row['val'])

# Build the new DataFrame from the dictionary of lists
new_df = pd.DataFrame(new_cols)
print(new_df)

This returns the requested DataFrame:

   train_loss  train_precision  train_recall  val_loss  val_precision  val_recall
0      0.6125           0.8565        0.7596    0.0827         0.9845      0.9820
1      0.0466           0.9897        0.9884    0.0454         0.9949      0.9949
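
As a possible alternative to the explicit loop, the same reshaping can be sketched with pandas' built-in groupby and pivot tools. This is not part of the original answer; the names `run` and `wide` are chosen here for illustration, and the sketch assumes that the n-th occurrence of each score belongs to the n-th output row.

```python
import pandas as pd

df = pd.DataFrame({
    "train": [0.6125, 0.8565, 0.7596, 0.0466, 0.9897, 0.9884],
    "val": [0.0827, 0.9845, 0.982, 0.0454, 0.9949, 0.9949],
    "score": ["loss", "precision", "recall"] * 2,
})

# Number each occurrence of a score: the first loss/precision/recall get 0,
# the second get 1, and so on. This becomes the row index of the result.
run = df.groupby("score").cumcount()

# Pivot so that every (train/val, score) pair becomes one column.
wide = df.assign(run=run).pivot(index="run", columns="score", values=["train", "val"])

# Flatten the two-level column index into names such as "train_loss".
wide.columns = [f"{split}_{metric}" for split, metric in wide.columns]
wide = wide.rename_axis(None)
print(wide)
```

If a score is missing from one of the groups, the corresponding cell would come out as NaN rather than raising an error, which may or may not be what you want.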

Answer 2

Score: 0

import pandas as pd

def transform(dataframe):
    # One list per output column
    train_loss, train_precision, train_recall, val_loss, val_precision, val_recall = ([] for i in range(6))

    # Route each row's train/val values to the lists for its score
    for idx in dataframe.index:
        score = dataframe.loc[idx, 'score']
        if score == 'loss':
            train_loss.append(dataframe.loc[idx, 'train'])
            val_loss.append(dataframe.loc[idx, 'val'])
        elif score == 'precision':
            train_precision.append(dataframe.loc[idx, 'train'])
            val_precision.append(dataframe.loc[idx, 'val'])
        elif score == 'recall':
            train_recall.append(dataframe.loc[idx, 'train'])
            val_recall.append(dataframe.loc[idx, 'val'])

    return train_loss, train_precision, train_recall, val_loss, val_precision, val_recall

# `dataframe` is the original DataFrame from the question
df = pd.DataFrame()
df['train_loss'], df['train_precision'], df['train_recall'], df['val_loss'], df['val_precision'], df['val_recall'] = transform(dataframe)
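
Both answers collect one list per score and then combine the lists into columns, so they rely on every score appearing the same number of times. A minimal sketch of a check you could run first is shown below; `check_balanced` is a hypothetical helper introduced here, not something from either answer.

```python
import pandas as pd

def check_balanced(frame, key="score"):
    """Return True if every value in column `key` occurs equally often."""
    counts = frame[key].value_counts()
    return counts.nunique() == 1

df = pd.DataFrame({
    "train": [0.6125, 0.8565, 0.7596, 0.0466, 0.9897, 0.9884],
    "val": [0.0827, 0.9845, 0.982, 0.0454, 0.9949, 0.9949],
    "score": ["loss", "precision", "recall"] * 2,
})

balanced = check_balanced(df)              # full data: two of each score
truncated = check_balanced(df.iloc[:-1])   # last 'recall' row dropped
print(balanced, truncated)
```

With unequal counts the per-score lists have different lengths, and `pd.DataFrame(new_cols)` in Answer 1 would raise a ValueError instead of producing a table.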

huangapple
  • Posted on 2023-06-13 00:58:32
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76458804.html