Create a function to concatenate strings in Python

Question

I have a Pandas df as defined below:

       ID       Start         End
    0  77  2018-07-02  2020-07-02
    1  88  2019-07-02  2021-07-02
    2  99  2020-07-02  2022-07-02

I want to create a function to return the following result (string):

    ((Date BETWEEN '2018-07-02' AND '2020-07-02') AND ID = 77)
    OR ((Date BETWEEN '2019-07-02' AND '2021-07-02') AND ID = 88)
    OR ((Date BETWEEN '2020-07-02' AND '2022-07-02') AND ID = 99)

Here is what I have written, but it did not generate the expected result:

    def create_string(df):
        date_start = df.loc[
            (df['ID'] == ID), 'Start']
        date_end = df.loc[
            (df['ID'] == ID), 'End']
        date_string = F"(((Date BETWEEN '{date_start}' AND '{date_end}') AND ID = '{ID}') OR"
        string = ""
        final_string = string + date_string
        return final_string

    IDs = df.ID.copy()
    for ID in IDs:
        print(create_string(df))

Any suggestions would be greatly appreciated!

Answer 1

Score: 0

You can build one condition string per row like this:

    df["cond"] = df.apply(lambda r: f"((Date BETWEEN '{r['Start']}' AND '{r['End']}') AND ID = {r['ID']})", axis=1)

The final string can then be computed with:

    final_str = " OR ".join(df["cond"].tolist())
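
For reference, the two steps above can be wrapped in a single function. This is a minimal sketch, assuming the `Start` and `End` columns hold plain date strings as in the question; the function name `build_condition` is made up for illustration, and the sketch skips the intermediate `cond` column so the input frame is left unchanged.

```python
import pandas as pd

# Sample data copied from the question; dates are kept as plain strings.
df = pd.DataFrame({
    "ID": [77, 88, 99],
    "Start": ["2018-07-02", "2019-07-02", "2020-07-02"],
    "End": ["2020-07-02", "2021-07-02", "2022-07-02"],
})

def build_condition(df):
    """Return one OR-joined condition string covering every row of df.

    Hypothetical helper name; not part of the original answer.
    """
    # Build one condition per row (axis=1 passes each row as a Series).
    conds = df.apply(
        lambda r: f"((Date BETWEEN '{r['Start']}' AND '{r['End']}') AND ID = {r['ID']})",
        axis=1,
    )
    # Join the per-row conditions so that no trailing OR is left over.
    return " OR ".join(conds)

result = build_condition(df)
print(result)
```

Using `join` places the separator only between conditions, which avoids the dangling `OR` produced by appending `" OR"` to every piece.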

Answer 2

Score: 0

I got an example working:

    import pandas as pd

    d = {'ID': [77, 78, 99],
         'Start': ['2018-07-02', '2018-07-03', '2018-07-04'],
         'End': ['2018-07-05', '2018-07-06', '2018-07-07']}
    df = pd.DataFrame(data=d)

    def create_string(df, id_, string):
        # Select the row for this ID, then read Start (column 1) and End (column 2).
        row = df.loc[df['ID'] == id_]
        date_start = row.iloc[0, 1]
        date_end = row.iloc[0, 2]
        date_string = f"((Date BETWEEN '{date_start}' AND '{date_end}') AND ID = {id_})"
        # Separate conditions with OR without leaving a trailing OR at the end.
        if string:
            string += " OR "
        string += date_string
        return string

    string = ""
    for id_ in [77, 78, 99]:
        string = create_string(df, id_, string)
    print(string)

Your problems were:

  • indexing mistakes in the DataFrame selection: the selection returned a Series rather than a single value
  • the string was reset to an empty string on every call of the function
  • the function did not return the accumulated string
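
The first point can be checked directly. The snippet below is a small illustration, assuming the question's data with dates stored as strings: selecting with a boolean mask yields a pandas Series even when only one row matches, so formatting it into an f-string embeds the whole Series representation instead of the date.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [77, 88, 99],
    "Start": ["2018-07-02", "2019-07-02", "2020-07-02"],
    "End": ["2020-07-02", "2021-07-02", "2022-07-02"],
})

# A boolean-mask selection returns a Series, even for a single match.
selected = df.loc[df["ID"] == 77, "Start"]

# Taking the first element gives the actual date value.
scalar = selected.iloc[0]

as_series = f"'{selected}'"  # contains the index label and other Series metadata
as_scalar = f"'{scalar}'"    # contains only the date

print(type(selected).__name__)
print(as_series)
print(as_scalar)
```

Extracting a single element, for example with `.iloc[0]`, is what turns the selection into a value that formats cleanly.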

Answer 3

Score: 0

You are almost there. You just need to call values[0] when accessing the dates and make minor changes to the function parameters.

    def create_string(df, ID):
        date_start = df.loc[
            (df['ID'] == ID), 'Start'].values[0]
        date_end = df.loc[
            (df['ID'] == ID), 'End'].values[0]
        date_string = f"((Date BETWEEN '{date_start}' AND '{date_end}') AND ID = {ID})"
        return date_string

    IDs = df.ID.copy()
    strings = []
    for ID in IDs:
        strings.append(create_string(df, ID))
    print(" OR\n".join(strings))

Alternatively, you could split the logic into smaller functions, which makes the code easier to read.

    def get_dates(df, ID):
        date_start = df.loc[df['ID'] == ID, 'Start'].values[0]
        date_end = df.loc[df['ID'] == ID, 'End'].values[0]
        return date_start, date_end

    def create_date_string(df, ID):
        date_start, date_end = get_dates(df, ID)
        date_string = f"((Date BETWEEN '{date_start}' AND '{date_end}') AND ID = {ID})"
        return date_string

    def create_all_strings(df):
        IDs = df.ID.copy()
        all_strings = [create_date_string(df, ID) for ID in IDs]
        return " OR\n".join(all_strings)

    print(create_all_strings(df))
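
Both versions above look up each ID with a separate boolean-mask selection. As a possible variation, not part of the original answer, the rows can be walked once with `DataFrame.itertuples`, which avoids the repeated lookups. The sketch assumes the column names `ID`, `Start` and `End` from the question and string-valued dates.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [77, 88, 99],
    "Start": ["2018-07-02", "2019-07-02", "2020-07-02"],
    "End": ["2020-07-02", "2021-07-02", "2022-07-02"],
})

def create_all_strings(df):
    # itertuples yields one namedtuple per row, with fields named after the columns.
    parts = [
        f"((Date BETWEEN '{row.Start}' AND '{row.End}') AND ID = {row.ID})"
        for row in df.itertuples(index=False)
    ]
    # Same separator as in the answer: OR followed by a line break.
    return " OR\n".join(parts)

all_conditions = create_all_strings(df)
print(all_conditions)
```

Because each row is read exactly once, this variation also behaves predictably if an ID appears more than once, whereas `.values[0]` would always pick the first matching row.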

Answer 4

Score: 0

Assuming some of the parentheses are redundant, the following works with any DataFrame that has the column layout shown above (ID, Start, End, in that order).

    ' OR '.join(map(lambda r: "(DATE BETWEEN '{}' AND '{}' AND ID = {})".format(r[1], r[2], r[0]), df.values.tolist()))

Output:

    "(DATE BETWEEN '2018-07-02' AND '2020-07-02' AND ID = 77) OR (DATE BETWEEN '2019-07-02' AND '2021-07-02' AND ID = 88) OR (DATE BETWEEN '2020-07-02' AND '2022-07-02' AND ID = 99)"
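
One caveat worth illustrating: the output above relies on the dates being stored as strings. If `Start` and `End` were datetime columns instead (an assumption, since the question does not state the dtypes), formatting the values directly would typically include a time component. A hedged sketch that formats the dates explicitly with `Series.dt.strftime` before joining:

```python
import pandas as pd

# Same data as the question, but with Start/End parsed as datetimes (assumed here).
df = pd.DataFrame({
    "ID": [77, 88, 99],
    "Start": pd.to_datetime(["2018-07-02", "2019-07-02", "2020-07-02"]),
    "End": pd.to_datetime(["2020-07-02", "2021-07-02", "2022-07-02"]),
})

# Convert the datetime columns to YYYY-MM-DD strings first.
starts = df["Start"].dt.strftime("%Y-%m-%d")
ends = df["End"].dt.strftime("%Y-%m-%d")

# Walk the three columns in parallel and join the conditions with OR.
condition = " OR ".join(
    f"(DATE BETWEEN '{s}' AND '{e}' AND ID = {i})"
    for i, s, e in zip(df["ID"], starts, ends)
)
print(condition)
```

Accessing the columns by name also removes the dependence on column order that the positional `r[0]`, `r[1]`, `r[2]` indexing has.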

huangapple
  • Posted on July 6, 2023, 14:53:33
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76626186.html