How to fill in missing dates in a list of tuples

Question

I have a couple of lists of tuples like these:

list_1 = [('2023-01-01', 'a'), ('2023-01-02', 'b'), ('2023-01-10', 'c')]
list_2 = [('2023-01-02', 'd'), ('2023-01-05', 'e'), ('2023-01-07', 'f')]
list_3 = [('2023-01-01', 'g'), ('2023-01-03', 'h'), ('2023-01-10', 'i')]

I need to fill in the missing dates with a None value in each of the lists:

list_1 = [('2023-01-01', 'a'),  ('2023-01-02', 'b'),  ('2023-01-03', None), ('2023-01-05', None), ('2023-01-07', None), ('2023-01-10', 'c')]
list_2 = [('2023-01-01', None), ('2023-01-02', 'd'),  ('2023-01-03', None), ('2023-01-05', 'e'), ('2023-01-07', 'f'), ('2023-01-10', None)]
list_3 = [('2023-01-01', 'g'),  ('2023-01-02', None), ('2023-01-03', 'h'),  ('2023-01-05', None), ('2023-01-07', None), ('2023-01-10', 'i')]

The number of tuple elements can vary.

What is the best and most efficient way to do this?

Answer 1

Score: 1

I cannot prove that it's the "best and most efficient", but here is one approach.

Unfortunately, Python's built-in datetime module doesn't have a "get all dates in a range" function, but pandas does: date_range.

from pandas import date_range

list_1 = [('2023-01-01', 'a'), ('2023-01-02', 'b'), ('2023-01-10', 'c')]

# Dates already present in the list
dates_in_list1 = {d for d, _ in list_1}
# Every calendar day from the earliest to the latest date, as 'YYYY-MM-DD' strings
dates_in_range = {d.strftime('%Y-%m-%d') for d in date_range(min(dates_in_list1), max(dates_in_list1), freq='D')}
missing_dates = dates_in_range - dates_in_list1

# Add a (date, None) tuple for each missing day, then sort by date string
new_list_1 = sorted(list_1 + [(d, None) for d in missing_dates])

print(new_list_1)
# [('2023-01-01', 'a'), ('2023-01-02', 'b'), ('2023-01-03', None), ('2023-01-04', None), ('2023-01-05', None), ('2023-01-06', None), ('2023-01-07', None), ('2023-01-08', None), ('2023-01-09', None), ('2023-01-10', 'c')]

Important note: everything here is done with strings. It works only because the dates use the YYYY-MM-DD format, so the lexicographical order of the strings matches the chronological order of the dates.
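If adding pandas as a dependency is not desirable, the same day-by-day range can be built with the standard library alone. The following is only a sketch: the function name fill_date_gaps is made up for illustration, and it assumes every tuple starts with a valid 'YYYY-MM-DD' string and that each date occurs at most once per list.

```python
from datetime import date, timedelta

def fill_date_gaps(pairs):
    """Return a new list covering every day from the earliest to the latest
    date in `pairs`, using (date_string, None) for days that are missing.

    `fill_date_gaps` is a hypothetical helper name, not part of any library.
    """
    if not pairs:
        return []
    # Index the existing tuples by their date string (first element).
    known = {t[0]: t for t in pairs}
    first = date.fromisoformat(min(known))
    last = date.fromisoformat(max(known))
    n_days = (last - first).days + 1
    # Walk the range one day at a time and convert back to ISO strings.
    all_days = [(first + timedelta(days=i)).isoformat() for i in range(n_days)]
    return [known.get(d, (d, None)) for d in all_days]

list_1 = [('2023-01-01', 'a'), ('2023-01-02', 'b'), ('2023-01-10', 'c')]
new_list_1 = fill_date_gaps(list_1)
```

Because the result is generated in date order, no final sort is needed, and the original tuples are reused as-is whatever their length.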

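Note that the expected output in the question does not contain every calendar day, only the dates that occur in at least one of the lists. One possible way to get that behaviour, again as a sketch with a made-up function name (fill_from_union) and the same assumptions about the date format and unique dates per list:

```python
def fill_from_union(*lists):
    """Pad each input list so that it has an entry for every date that
    appears in any of the inputs; missing entries become (date, None).

    `fill_from_union` is a hypothetical helper name used for illustration.
    """
    # Union of all dates across all lists, sorted as ISO strings.
    all_dates = sorted({t[0] for lst in lists for t in lst})
    filled = []
    for lst in lists:
        by_date = {t[0]: t for t in lst}
        filled.append([by_date.get(d, (d, None)) for d in all_dates])
    return filled

list_1 = [('2023-01-01', 'a'), ('2023-01-02', 'b'), ('2023-01-10', 'c')]
list_2 = [('2023-01-02', 'd'), ('2023-01-05', 'e'), ('2023-01-07', 'f')]
list_3 = [('2023-01-01', 'g'), ('2023-01-03', 'h'), ('2023-01-10', 'i')]

new_1, new_2, new_3 = fill_from_union(list_1, list_2, list_3)
```

Each list is scanned once to build a dictionary, so the work grows roughly linearly with the total number of tuples plus the cost of sorting the distinct dates.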
huangapple
  • Published on 2023-02-07 01:14:21
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75364498.html