April 4, 2023 03:16:54

Find the latest task from the dataframe columns based on the task role and replace the column value with the latest column value

Question

I have a dataframe with the following columns:

df1:
Task Start Date  |  Task Finish Date  |  Task Role
01-01-2021       |  01-03-2021        |  Lead
01-04-2021       |  02-02-2021        |  Team member
01-04-2021       |  02-23-2021        |  Unknown

I want to create another column, 'Origin Role', based on the following conditions:

If the Task Role is 'Lead' or any value other than 'Unknown', copy the same Task Role into the new column 'Origin Role'.

If the Task Role is 'Unknown':

Take the Task Start Date of the 'Unknown' record and find the latest record (if there are several) whose Task Finish Date is less than or equal to that Task Start Date.
Then put that record's Task Role into 'Origin Role' in place of 'Unknown'.

Expected result:

df1:
Task Start Date  |  Task Finish Date  |  Task Role    |  Origin Role
01-01-2021       |  01-03-2021        |  Lead         |  Lead
01-04-2021       |  02-02-2021        |  Team member  |  Team member
01-04-2021       |  02-23-2021        |  Unknown      |  Lead

Answer 1

Score: 2

One option is to use merge_asof:

import numpy as np
import pandas as pd

# ensure datetime (the dates are month-first)
cols = ['Task Start Date', 'Task Finish Date']
df[cols] = df[cols].apply(pd.to_datetime, dayfirst=False)

# keep known roles; fill each "Unknown" row with the role of the latest
# known-role task whose finish date is <= that row's start date
df['Origin Role'] = df['Task Role'].replace({'Unknown': np.nan}).fillna(
    pd.merge_asof(df['Task Start Date'].sort_values().reset_index(),
                  df[['Task Finish Date', 'Task Role']]
                    .loc[lambda d: d['Task Role'].ne('Unknown')]
                    .sort_values(by='Task Finish Date'),
                  left_on='Task Start Date', right_on='Task Finish Date',
                  ).set_index('index')['Task Role']
)

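Steps:

replace "Unknown" with NaN
sort the start dates and the known-role finish dates (merge_asof requires sorted keys)
merge each start date with the closest previous (or equal) finish date, ignoring "Unknown" rows
fillna the NaN roles with the merged value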
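
The steps above can also be written out as separate statements, which makes each intermediate result easy to inspect. This is only a sketch on the sample data from the question; the names `known_role`, `left`, `right`, `matched`, and `lookup` are illustrative and not part of the original answer, and `mask` is used here as a stand-in for the `replace` step.

```python
import pandas as pd

# Sample data from the question (dates are month-first).
df = pd.DataFrame({
    "Task Start Date": ["01-01-2021", "01-04-2021", "01-04-2021"],
    "Task Finish Date": ["01-03-2021", "02-02-2021", "02-23-2021"],
    "Task Role": ["Lead", "Team member", "Unknown"],
})
for col in ["Task Start Date", "Task Finish Date"]:
    df[col] = pd.to_datetime(df[col], format="%m-%d-%Y")

# Step 1: blank out "Unknown" so only those rows get filled later.
known_role = df["Task Role"].mask(df["Task Role"].eq("Unknown"))

# Step 2: sort both sides; reset_index keeps the original row labels
# in a column named "index" so the result can be realigned afterwards.
left = df["Task Start Date"].sort_values().reset_index()
right = (
    df.loc[df["Task Role"].ne("Unknown"), ["Task Finish Date", "Task Role"]]
    .sort_values("Task Finish Date")
)

# Step 3: for each start date, pick the last known-role row whose
# finish date is less than or equal to that start date.
matched = pd.merge_asof(
    left, right,
    left_on="Task Start Date", right_on="Task Finish Date",
    direction="backward",
)
lookup = matched.set_index("index")["Task Role"]

# Step 4: fill only the rows that were "Unknown".
df["Origin Role"] = known_role.fillna(lookup)
```

With this approach, an "Unknown" row that has no earlier known-role task stays NaN in 'Origin Role', because there is nothing to merge in.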

Output:

  Task Start Date Task Finish Date    Task Role  Origin Role
0      2021-01-01       2021-01-03         Lead         Lead
1      2021-01-04       2021-02-02  Team member  Team member
2      2021-01-04       2021-02-23      Unknown         Lead
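
To sanity-check a vectorized result on a small frame, the rule from the question can also be spelled out row by row. The helper below is hypothetical (`origin_role_reference` is not from the original answer) and is meant as a slow cross-check, not a replacement for merge_asof. It assumes the date columns are already datetimes and that an "Unknown" row with no earlier task should simply stay "Unknown".

```python
import pandas as pd

def origin_role_reference(df: pd.DataFrame) -> pd.Series:
    """Row-by-row version of the rule, intended only as a cross-check."""
    known = df[df["Task Role"] != "Unknown"]
    roles = []
    for _, row in df.iterrows():
        if row["Task Role"] != "Unknown":
            # Known roles are copied through unchanged.
            roles.append(row["Task Role"])
            continue
        # Known-role tasks that finished on or before this task started.
        earlier = known[known["Task Finish Date"] <= row["Task Start Date"]]
        if earlier.empty:
            # Assumption: nothing precedes this task, so keep "Unknown".
            roles.append(row["Task Role"])
        else:
            latest = earlier.sort_values("Task Finish Date", kind="stable")
            roles.append(latest["Task Role"].iloc[-1])
    return pd.Series(roles, index=df.index, name="Origin Role")

# Sample data from the question (dates are month-first).
sample = pd.DataFrame({
    "Task Start Date": pd.to_datetime(
        ["01-01-2021", "01-04-2021", "01-04-2021"], format="%m-%d-%Y"),
    "Task Finish Date": pd.to_datetime(
        ["01-03-2021", "02-02-2021", "02-23-2021"], format="%m-%d-%Y"),
    "Task Role": ["Lead", "Team member", "Unknown"],
})
sample["Origin Role"] = origin_role_reference(sample)
```

Unlike the merge_asof version, this helper leaves unmatched rows as "Unknown" rather than NaN, so the two results should be compared with that difference in mind.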
