2023年2月19日 15:30:05go评论91阅读模式

英文:

If cell number in certain dataframe column is equal to or greater, insert duplicate row and add balance

问题

我有以下数据框：

import pandas as pd
df = pd.DataFrame()
df['number'] = (651, 651, 651, 4267, 4267, 4267, 4267, 4267, 4267, 4267, 8806, 8806, 8806, 6841, 6841, 6841, 6841)
df['name'] = ('Alex', 'Alex', 'Alex', 'Ankit', 'Ankit', 'Ankit', 'Ankit', 'Ankit', 'Ankit', 'Ankit', 'Abhishek', 'Abhishek', 'Abhishek', 'Blake', 'Blake', 'Blake', 'Blake')
df['hours'] = (8.25, 7.5, 7.5, 7.5, 14, 12, 15, 11, 6.5, 14, 15, 15, 13.5, 8, 8, 8, 8)
df['loc'] = ('Nar', 'SCC', 'RSL', 'UNIT-C', 'UNIT-C', 'UNIT-C', 'UNIT-C', 'UNIT-C', 'UNIT-C', 'UNIT-C', 'UNI', 'UNI', 'UNI', 'UNKING', 'UNKING', 'UNKING', 'UNKING')
print(df)

对于每一行的小时数在>=10且<=12之间，我需要：

将当前行的小时总数更改为10
在上面插入一个重复的行，并将剩余的小时数添加到这行

对于每一行的小时数>12：

将当前行的小时总数更改为10
在上面插入一个重复的行，并将2小时添加到这行
在上面再次插入一个重复的行，并将剩余的小时数添加到这行

新数据框的结果应该如下所示：

英文:

I have the following data-frame

import pandas as pd
df = pd.DataFrame()
df[&#39;number&#39;] = (651,651,651,4267,4267,4267,4267,4267,4267,4267,8806,8806,8806,6841,6841,6841,6841)
df[&#39;name&#39;]=(&#39;Alex&#39;,&#39;Alex&#39;,&#39;Alex&#39;,&#39;Ankit&#39;,&#39;Ankit&#39;,&#39;Ankit&#39;,&#39;Ankit&#39;,&#39;Ankit&#39;,&#39;Ankit&#39;,&#39;Ankit&#39;,&#39;Abhishek&#39;,&#39;Abhishek&#39;,&#39;Abhishek&#39;,&#39;Blake&#39;,&#39;Blake&#39;,&#39;Blake&#39;,&#39;Blake&#39;)
df[&#39;hours&#39;]=(8.25,7.5,7.5,7.5,14,12,15,11,6.5,14,15,15,13.5,8,8,8,8)
df[&#39;loc&#39;]=(&#39;Nar&#39;,&#39;SCC&#39;,&#39;RSL&#39;,&#39;UNIT-C&#39;,&#39;UNIT-C&#39;,&#39;UNIT-C&#39;,&#39;UNIT-C&#39;,&#39;UNIT-C&#39;,&#39;UNIT-C&#39;,&#39;UNIT-C&#39;,&#39;UNI&#39;,&#39;UNI&#39;,&#39;UNI&#39;,&#39;UNKING&#39;,&#39;UNKING&#39;,&#39;UNKING&#39;,&#39;UNKING&#39;)
print(df)

For each row that has hours between >=10 and <=12, I need to:

change the total of the hours of the current row to 10
insert a duplicate row to the above and add the balance of hours to this row

For each row that has hours >12

change the total of the hours to 10
insert below a duplicate row to the above and 2 hours to this row
insert another duplicate row to the above and add the balance of hours to this row

The result of the new dataframe should look like the following:

答案1

得分: 4

以下是翻译好的代码部分：

# 根据小时值对数据框进行子集化
s = df[df['hours'] < 10]
s1 = df[df['hours'] > 12]
s2 = df[df['hours'].between(10, 12)]
# 为每个子集分配重复行并连接
pd.concat([
    s,
    s1.assign(hours=10), 
    s1.assign(hours=2), 
    s1.assign(hours=s1['hours'] - 12),
    s2.assign(hours=10),
    s2.assign(hours=s2['hours'] - 10)]
).sort_index(kind='stable', ignore_index=True)

请注意，这是代码的翻译部分。

英文:

Code

# subset the dataframe based on hour value
s =  df[df[&#39;hours&#39;] &lt; 10]
s1 = df[df[&#39;hours&#39;] &gt; 12]
s2 = df[df[&#39;hours&#39;].between(10, 12)]
# Assign duplicate rows per subset and concat
pd.concat([
    s,
    s1.assign(hours=10), 
    s1.assign(hours=2), 
    s1.assign(hours=s1[&#39;hours&#39;] - 12),
    s2.assign(hours=10),
    s2.assign(hours=s2[&#39;hours&#39;] - 10)]
).sort_index(kind=&#39;stable&#39;, ignore_index=True)

Result

    number      name  hours     loc
0      651      Alex   8.25     Nar
1      651      Alex   7.50     SCC
2      651      Alex   7.50     RSL
3     4267     Ankit   7.50  UNIT-C
4     4267     Ankit  10.00  UNIT-C
5     4267     Ankit   2.00  UNIT-C
6     4267     Ankit   2.00  UNIT-C
7     4267     Ankit  10.00  UNIT-C
8     4267     Ankit   2.00  UNIT-C
9     4267     Ankit  10.00  UNIT-C
10    4267     Ankit   2.00  UNIT-C
11    4267     Ankit   3.00  UNIT-C
12    4267     Ankit  10.00  UNIT-C
13    4267     Ankit   1.00  UNIT-C
14    4267     Ankit   6.50  UNIT-C
15    4267     Ankit  10.00  UNIT-C
16    4267     Ankit   2.00  UNIT-C
17    4267     Ankit   2.00  UNIT-C
18    8806  Abhishek  10.00     UNI
19    8806  Abhishek   2.00     UNI
20    8806  Abhishek   3.00     UNI
21    8806  Abhishek  10.00     UNI
22    8806  Abhishek   2.00     UNI
23    8806  Abhishek   3.00     UNI
24    8806  Abhishek  10.00     UNI
25    8806  Abhishek   2.00     UNI
26    8806  Abhishek   1.50     UNI
27    6841     Blake   8.00  UNKING
28    6841     Blake   8.00  UNKING
29    6841     Blake   8.00  UNKING
30    6841     Blake   8.00  UNKING

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

如果数据框中某列中的单元格号码等于或大于，则插入重复行并添加余额。

问题

答案1

Code

Result

如何使用VSCode调试Python命令行二进制文件（特指Poetry）。

将内容推送到浏览器…改变DOM？

问题：在Python中读取包含字符“ê：você”的葡萄牙文文本文件时出现问题。

vall-e model throws 'NoneType' object has no attribute 'optimizer_name' error while training, how to fix it?

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。