May 11, 2023, 19:31:48

Create pandas data from one date and timestrings without colons

Question

I want to read times from a file that includes [GNSS][1] times, among a lot of other data. The expected result is a pandas array (Index or Series) with a datetime dtype, with the date of the dataset applied.

In an intermediate step, I have a list of timestamps in the format `hhmmss` with some invalid data mixed in:

```python
import datetime as dt
import pandas as pd

date = dt.date(2023, 5, 9)
# The first entry is an example of invalid data from the file
times_from_file = [",,,,,,", "123456", "123457", "123458", "123459", "123500"]
```

I can get the desired output with this lengthy code snippet:

```python
# Parse the times (invalid entries become NaT), then attach the date one element at a time
datetimes = pd.to_datetime(
    times_from_file, format="%H%M%S", errors="coerce"
).map(
    lambda datetime: pd.NaT
    if pd.isnull(datetime)
    else dt.datetime.combine(date, datetime.time())
)
```

Output:

```text
DatetimeIndex([                'NaT', '2023-05-09 12:34:56',
               '2023-05-09 12:34:57', '2023-05-09 12:34:58',
               '2023-05-09 12:34:59', '2023-05-09 12:35:00'],
              dtype='datetime64[ns]', freq=None)
```

However, this looks overly complicated. I was hoping this could be solved with [`pd.to_timedelta`](https://pandas.pydata.org/docs/reference/api/pandas.to_timedelta.html) instead, but unfortunately that doesn't allow passing a format string. Even the `na_action` keyword of [`pandas.Index.map`][2] is ignored, which is why I used `if pd.isnull(datetime)` instead.

Is there a simpler way to do this, preferably leveraging purpose-built pandas functions or methods?

  [1]: https://en.wikipedia.org/wiki/Satellite_navigation
  [2]: https://pandas.pydata.org/docs/reference/api/pandas.Index.map.html
# Answer 1
**Score**: 1
Convert `times_from_file` to a Series if it isn't one already, prepend the date as a string, and parse date and time together:

```python
>>> pd.to_datetime('2023-05-09 ' + pd.Series(times_from_file), format="%Y-%m-%d %H%M%S", errors='coerce')
0                   NaT
1   2023-05-09 12:34:56
2   2023-05-09 12:34:57
3   2023-05-09 12:34:58
4   2023-05-09 12:34:59
5   2023-05-09 12:35:00
dtype: datetime64[ns]
```
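If you would rather reuse the `date` object than hard-code the date string, one possible variant (a sketch, not part of the original answer) is to parse the times on their own, subtract the placeholder date that `pd.to_datetime` fills in, and add the resulting offsets to the real date. The names `parsed` and `offsets` are only illustrative.

```python
import datetime as dt
import pandas as pd

date = dt.date(2023, 5, 9)
times_from_file = [",,,,,,", "123456", "123457", "123458", "123459", "123500"]

# Parse the time-of-day only; unparseable entries become NaT
parsed = pd.to_datetime(
    pd.Series(times_from_file), format="%H%M%S", errors="coerce"
)

# Subtract midnight of the placeholder date to get plain time offsets
# (NaT stays NaT through the subtraction)
offsets = parsed - parsed.dt.normalize()

# Add the offsets to the real date of the dataset
datetimes = pd.Timestamp(date) + offsets
print(datetimes)
```

This stays vectorized and needs no explicit null check, because `NaT` propagates through the arithmetic.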
