Conditionally shifting a pandas column
Question
I want to conditionally shift a pandas column: only the rows where i > 6 should be shifted. Below is what I am doing, and it is not working:
import pandas as pd
import numpy as np
# Creating data for 'i' and 'price' columns
n = 10 # Number of entries
i_values = list(range(1, n+1))
price_values = [10.99, 19.99, 5.99, 8.49, 12.99, 15.99, 9.99, 14.99, 6.99, 11.99]
# Creating DataFrame
data = {'i': i_values,
'price': price_values}
df = pd.DataFrame(data)
df['price_new'] = df.loc[df.i>6, 'price'].shift(-3)
Expected output:
n = 10 # Number of entries
i_values = list(range(1, n+1))
price_values = [10.99, 19.99, 5.99, 8.49, 12.99, 15.99, 9.99, 14.99, 6.99, 11.99]
new_price_values = [np.nan, np.nan, np.nan, 9.99, 14.99, 6.99, 11.99, np.nan, np.nan, np.nan]  # np.nan: the np.NaN alias was removed in NumPy 2.0
# Creating DataFrame
data = {'i': i_values,
'price': price_values,
'new_price': new_price_values}
df = pd.DataFrame(data)
Answer 1
Score: 1
Apply the shift, then select the cells you wish to keep. It looks like you're attempting to do it all at once and simply getting the indices wrong in the process.
What you seek as a one-liner
shift_from = 6
shift_by = -3
df['price_new'] = df.loc[df.i>(shift_from+shift_by),'price'].shift(shift_by)
This produces exactly your expected output.
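As a quick sanity check, the one-liner can be compared against the expected column from the question. This is only a sketch: it assumes the default RangeIndex and that i increases by exactly 1 per row, which is what makes the `shift_from + shift_by` arithmetic line up. The `expected` series is built here for the comparison and is not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "i": list(range(1, 11)),
    "price": [10.99, 19.99, 5.99, 8.49, 12.99, 15.99, 9.99, 14.99, 6.99, 11.99],
})

shift_from = 6
shift_by = -3

# Select rows with i > 3, then shift up by 3: the prices from rows with
# i > 6 land on the rows with i = 4..7, and the tail becomes NaN.
df["price_new"] = df.loc[df.i > (shift_from + shift_by), "price"].shift(shift_by)

# Expected column from the question, rebuilt for comparison.
expected = pd.Series(
    [np.nan, np.nan, np.nan, 9.99, 14.99, 6.99, 11.99, np.nan, np.nan, np.nan],
    name="price_new",
)
pd.testing.assert_series_equal(df["price_new"], expected)
```

If i had gaps or was not sorted, the threshold arithmetic would no longer correspond to row positions, so masking first and shifting the full-length column afterwards would be the safer route.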
Decomposed into two steps for clarity
Using a throwaway intermediate column.
1) Apply shift
df['price_shift'] = df['price'].shift(shift_by)
df
i price price_shift
0 1 10.99 8.49
1 2 19.99 12.99
2 3 5.99 15.99
3 4 8.49 9.99
4 5 12.99 14.99
5 6 15.99 6.99
6 7 9.99 11.99
7 8 14.99 NaN
8 9 6.99 NaN
9 10 11.99 NaN
2) Select cells
df['price_new'] = df.loc[df.i>(shift_from+shift_by), 'price_shift']
df
i price price_shift price_new
0 1 10.99 8.49 NaN
1 2 19.99 12.99 NaN
2 3 5.99 15.99 NaN
3 4 8.49 9.99 9.99
4 5 12.99 14.99 14.99
5 6 15.99 6.99 6.99
6 7 9.99 11.99 11.99
7 8 14.99 NaN NaN
8 9 6.99 NaN NaN
9 10 11.99 NaN NaN
Answer 2
Score: 0
Here's one approach:
df['new_price'] = df['price'].where(df.index >= 6, np.nan).shift(-3)
The problem with df.loc[df.i>6, 'price'].shift(-3)
is that it selects only the last four rows (the ones where i is greater than 6):
>>> df.loc[df.i>6, 'price']
6 9.99
7 14.99
8 6.99
9 11.99
and then it shifts only those four values, so three of them fall off the top:
>>> df.loc[df.i>6, 'price'].shift(-3)
6 11.99
7 NaN
8 NaN
9 NaN
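To make the consequence concrete, here is a small sketch of what happens when that shifted sub-series is assigned back to the frame. pandas aligns the assignment on index labels, so only label 6 receives a value and every other row is filled with NaN. The variable name `sub` is introduced here purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "i": list(range(1, 11)),
    "price": [10.99, 19.99, 5.99, 8.49, 12.99, 15.99, 9.99, 14.99, 6.99, 11.99],
})

# The selection keeps only labels 6..9; shifting by -3 within those four
# rows leaves a single value, at label 6.
sub = df.loc[df.i > 6, "price"].shift(-3)

# Assignment aligns on index labels: rows 0..5 have no matching label in
# `sub`, so they become NaN as well.
df["price_new"] = sub

print(df["price_new"].dropna())
```

The shift therefore has to operate on a series that still contains rows 3 to 5, either by masking the full-length column before shifting or by widening the selection.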
Answer 3
Score: 0
Here is another approach.
import pandas as pd
import numpy as np
# Creating data for 'i' and 'price' columns
n = 10 # Number of entries
i_values = list(range(1, n+1))
price_values = [10.99, 19.99, 5.99, 8.49, 12.99, 15.99, 9.99, 14.99, 6.99, 11.99]
# Creating DataFrame
data = {'i': i_values,
'price': price_values}
df = pd.DataFrame(data)
df['price_new'] = df.loc[df.i>6, 'price']
df['price_new'] = df['price_new'].shift(-3)
So, first create the new column (price_new), which holds the price only where i > 6 and NaN elsewhere, and then apply the shift to that full-length column.
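The mask-then-shift idea can be wrapped in a small helper. This is a sketch, not something from the answers: `shift_where` is a hypothetical name, and it relies only on `Series.where` (which fills non-matching rows with NaN by default) and `Series.shift`.

```python
import pandas as pd


def shift_where(series: pd.Series, mask: pd.Series, periods: int) -> pd.Series:
    """Keep values where `mask` is True (NaN elsewhere), then shift.

    Because the masked series keeps its full length, the shifted values
    land on existing rows instead of being dropped by index alignment.
    """
    return series.where(mask).shift(periods)


df = pd.DataFrame({
    "i": list(range(1, 11)),
    "price": [10.99, 19.99, 5.99, 8.49, 12.99, 15.99, 9.99, 14.99, 6.99, 11.99],
})

# Prices from rows with i > 6 move up by three rows.
df["price_new"] = shift_where(df["price"], df["i"] > 6, -3)
print(df)
```

Masking on the condition itself avoids any arithmetic on the threshold, so this variant does not depend on i being consecutive.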