Duplicating every row in a DataFrame N times, where N is random

Question

import pandas as pd
import numpy as np

# Your original DataFrame
df = pd.DataFrame({'id': [1, 2, 3, 4],
                   'name': ['tim', 'jim', 'john', 'bill']})

# Duplicate each row randomly 0-5 times
df = df.loc[df.index.repeat(np.random.randint(0, 6, len(df)))]

# Resetting index to maintain a clean DataFrame
df.reset_index(drop=True, inplace=True)

I have a data set like so:

Input:

id name
1  tim
2  jim
3  john
4  bill

I want to duplicate each row in my data set a random number of times, anywhere from 0 to 5.

So my final data set might look something like this:

Output:

id name
1  tim
1  tim
2  jim
3  john
3  john
3  john
3  john
4  bill
4  bill
4  bill

How can I do this in Python with pandas?

Answer 1

Score: 1

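You can generate a random array of integer repeat counts with numpy.random.randint and use it to repeat your index; df.loc then selects each row once per repeated label. The upper bound of randint is exclusive, hence 5+1:

out = df.loc[df.index.repeat(np.random.randint(0, 5+1, size=len(df)))]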

Example:

   id  name
0   1   tim
0   1   tim
0   1   tim
0   1   tim
0   1   tim
1   2   jim
1   2   jim
2   3  john
3   4  bill
3   4  bill
3   4  bill
3   4  bill
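
The repeated index labels in the example come from how Index.repeat works. Here is a minimal sketch of the mechanism, assuming fixed repeat counts in place of random ones so the result is predictable (the names counts, repeated_labels and out_clean are illustrative only, not part of the original answer):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3, 4],
                   'name': ['tim', 'jim', 'john', 'bill']})

# Assumption for illustration: fixed counts instead of random ones.
counts = np.array([2, 0, 1, 3])

# Index.repeat repeats each label by its count; a count of 0 drops the row.
repeated_labels = df.index.repeat(counts)

# .loc selects rows by label, so each row appears once per repeated label
# and keeps its original index value.
out = df.loc[repeated_labels]

# Optional: renumber the rows 0..n-1 if the duplicated labels are unwanted.
out_clean = out.reset_index(drop=True)

print(out)
print(out_clean)
```

Whether to call reset_index depends on the use case: keeping the original labels makes it easy to trace each copy back to its source row.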

Output with np.random.seed(42):

   id  name
0   1   tim
0   1   tim
1   2   jim
1   2   jim
1   2   jim
1   2   jim
1   2   jim
2   3  john
2   3  john
2   3  john
2   3  john
3   4  bill
3   4  bill
3   4  bill
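
For reproducible results without touching the global seed, one alternative is NumPy's Generator API. This is a sketch under that assumption, not the method used in the answer above; the seed value and the variable names are arbitrary, and a Generator will generally produce different counts than np.random.seed with the same number:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3, 4],
                   'name': ['tim', 'jim', 'john', 'bill']})

# A local, seeded generator; the same seed gives the same counts every run.
rng = np.random.default_rng(42)

# Generator.integers excludes the upper bound by default, so 6 means 0..5.
counts = rng.integers(0, 6, size=len(df))

# Repeat each index label by its count, select the rows, then renumber them.
out = df.loc[df.index.repeat(counts)].reset_index(drop=True)

print(counts)
print(out)
```

A local generator keeps the randomness isolated from other code that may also rely on NumPy's global random state.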

huangapple
  • Posted on March 7, 2023, 22:30:11
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75663283.html