Create a function to concatenate strings in Python

Question

I have a Pandas df as defined below:

    ID     Start          End
0   77     2018-07-02    2020-07-02
1   88     2019-07-02    2021-07-02
2   99     2020-07-02    2022-07-02

I want to create a function to return the following result (string):

((Date BETWEEN '2018-07-02' AND '2020-07-02') AND ID = 77)
OR ((Date BETWEEN '2019-07-02' AND '2021-07-02') AND ID = 88)
OR ((Date BETWEEN '2020-07-02' AND '2022-07-02') AND ID = 99)

Here is what I have written, but it did not generate the expected result:

def create_string(df):
    date_start = df.loc[
       (df['ID'] == ID), 'Start']
    date_end = df.loc[
       (df['ID'] == ID), 'End']
    date_string = F"(((Date BETWEEN '{date_start}' AND '{date_end}') AND ID = '{ID}') OR"
    string = ""
    final_string = string + date_string
    return final_string

IDs = df.ID.copy()

for ID in IDs:
    print(create_string(df))

Any suggestions would be greatly appreciated!

Answer 1

Score: 0

You can do it like this:

df["cond"] = df.apply(lambda r: f"((Date BETWEEN '{r['Start']}' AND '{r['End']}') AND ID = {r['ID']})", axis=1)

Then the output string can be built with:

final_str = " OR ".join(df["cond"].tolist())
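For reference, here is a self-contained sketch of the same join-based idea, run against the data from the question. The function name `build_where_clause` is hypothetical, and the sketch assumes the dates are stored as plain strings (datetime columns would need formatting first). It builds the conditions in a list comprehension rather than in a `cond` column, so the DataFrame itself is left unchanged.

```python
import pandas as pd

# Data from the question; dates are assumed to be plain strings
df = pd.DataFrame({
    "ID": [77, 88, 99],
    "Start": ["2018-07-02", "2019-07-02", "2020-07-02"],
    "End": ["2020-07-02", "2021-07-02", "2022-07-02"],
})


def build_where_clause(df):
    """Hypothetical helper: one condition per row, joined with OR."""
    conditions = [
        f"((Date BETWEEN '{start}' AND '{end}') AND ID = {id_})"
        for id_, start, end in zip(df["ID"], df["Start"], df["End"])
    ]
    # join() puts the separator only between items, so no trailing OR
    return "\nOR ".join(conditions)


print(build_where_clause(df))
```

Using `"\nOR "` as the separator reproduces the line layout asked for in the question; `" OR "` would give the same clause on a single line.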

Answer 2

Score: 0

Here is a working example:

import pandas as pd

d = {'ID': [77, 78, 99], 'Start': ['2018-07-02', '2018-07-03', '2018-07-04'],
     'End': ['2018-07-05', '2018-07-06', '2018-07-07']}

df = pd.DataFrame(data=d)

def create_string(df, id_, string):
    # Take the first matching row; columns 1 and 2 are Start and End
    date_start = df.loc[df['ID'] == id_].iloc[0, 1]
    date_end = df.loc[df['ID'] == id_].iloc[0, 2]
    date_string = f"((Date BETWEEN '{date_start}' AND '{date_end}') AND ID = {id_})"
    # Add the separator only between conditions, so there is no trailing OR
    if string:
        string += " OR "
    string += date_string
    return string

string = ""
for id_ in [77, 78, 99]:
    string = create_string(df, id_, string)
print(string)

Your problems were:

  • some indexing mistakes in the DataFrame selection (the selection returned a Series, not a single value)
  • the string being reset to "" on every call of the function
  • the function not returning the accumulated string value
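The first point in the list above can be seen in isolation with a small sketch. The variable names here are illustrative only: a boolean selection with `df.loc[mask, column]` returns a Series even when exactly one row matches, so interpolating it into an f-string embeds the Series' printed form (index, name, dtype) rather than the date itself.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [77, 88],
    "Start": ["2018-07-02", "2019-07-02"],
    "End": ["2020-07-02", "2021-07-02"],
})

# A boolean mask selects a one-element Series, not a string
selected = df.loc[df["ID"] == 77, "Start"]
print(type(selected).__name__)

# Formatting the Series directly embeds its whole printed form
as_series = f"'{selected}'"
print(as_series)

# Extracting the first element gives the scalar date string
as_scalar = f"'{selected.iloc[0]}'"
print(as_scalar)
```

Taking `.iloc[0]` (or `.values[0]`) before formatting is what turns the selection into a single value.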

Answer 3

Score: 0

You are almost there. You just need to take .values[0] when accessing the dates and make minor changes to the function parameters.

def create_string(df, ID):
    date_start = df.loc[
       (df['ID'] == ID), 'Start'].values[0]
    date_end = df.loc[
       (df['ID'] == ID), 'End'].values[0]
    date_string = f"((Date BETWEEN '{date_start}' AND '{date_end}') AND ID = {ID})"
    return date_string
    

IDs = df.ID.copy()
strings = []

for ID in IDs:
    strings.append(create_string(df, ID))

print(" OR\n".join(strings))

Alternatively, you could write it in this manner. This makes your code look much cleaner.


def get_dates(df, ID):
    date_start = df.loc[df['ID'] == ID, 'Start'].values[0]
    date_end = df.loc[df['ID'] == ID, 'End'].values[0]
    return date_start, date_end

def create_date_string(df, ID):
    date_start, date_end = get_dates(df, ID)
    date_string = f"((Date BETWEEN '{date_start}' AND '{date_end}') AND ID = {ID})"
    return date_string

def create_all_strings(df):
    IDs = df.ID.copy()
    all_strings = [create_date_string(df, ID) for ID in IDs]
    return " OR\n".join(all_strings)

print(create_all_strings(df))
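One caveat with the functions above: each ID is looked up again with a boolean filter, and `.values[0]` keeps only the first match, so a DataFrame with a repeated ID would yield the same date range twice. A possible variation, sketched here under the assumption that the columns are named `ID`, `Start` and `End` as in the question, walks the rows once with `itertuples`. The function name `create_all_strings_single_pass` is hypothetical.

```python
import pandas as pd


def create_all_strings_single_pass(df):
    """Hypothetical variant: build one condition per row in a single pass."""
    parts = [
        f"((Date BETWEEN '{row.Start}' AND '{row.End}') AND ID = {row.ID})"
        for row in df.itertuples(index=False)
    ]
    return " OR\n".join(parts)


# Example with a repeated ID, where a per-ID lookup would repeat the first range
df = pd.DataFrame({
    "ID": [77, 77],
    "Start": ["2018-07-02", "2019-01-01"],
    "End": ["2018-12-31", "2020-07-02"],
})

print(create_all_strings_single_pass(df))
```

Because each row supplies its own values, no filtering by ID is needed and every row contributes its own date range.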

Answer 4

Score: 0

Assuming some of the parentheses are redundant, the following will work with any DataFrame of the shape shown above (columns in the order ID, Start, End).

' OR '.join(map(lambda r: "(DATE BETWEEN '{}' AND '{}' AND ID = {})".format(r[1], r[2], r[0]), df.values.tolist()))

Output:

"(DATE BETWEEN '2018-07-02' AND '2020-07-02' AND ID = 77) OR (DATE BETWEEN '2019-07-02' AND '2021-07-02' AND ID = 88) OR (DATE BETWEEN '2020-07-02' AND '2022-07-02' AND ID = 99)"

huangapple
  • Published on July 6, 2023 at 14:53:33
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76626186.html