python字符串解析问题在将SQL命令保存到文件时

huangapple go评论68阅读模式
英文:

python string parsing issues when saving sql commands to file

问题

以下是代码部分的翻译:

我有字典数据正在循环遍历以构建 PostgreSQL 的 `INSERT` 命令其中字典的父键是列名父值是列值

data = {
    'id': 45,
    'col1': "foo's",
    'col2': {'dict_key': 5}
}

columns = ', '.join(data.keys())
# 用双引号替换单引号,以形成用于 PostgreSQL 列的 JSON 类型
data['col2'] = data['col2'].replace("'", '"')

with open("file.sql", "w") as f:
    command = "INSERT INTO table1({}) VALUES {};"
    f.write(command.format(columns, tuple(data.values())))

如果需要任何其他翻译或帮助,请随时告诉我。

英文:

I have dicts data I am looping through to form INSERT commands for postgresql. where the parent keys of the dicts are the column names and the parent values are the column values:

data = {
        'id': 45,
        'col1': "foo's",
        'col2': {'dict_key': 5}
       }

columns = ', '.join(data.keys())
# replace single quote with double to form a json type for psql column
data['col2'] = data['col2'].replace("'", '"')

with open("file.sql", "w") as f:
    command = "INSERT INTO table1({}) VALUES {};"    
    f.write(command.format(columns, tuple(data.values()))

The problem is that the output of this is not formatted correctly for sql to execute. This is the output of the above:

INSERT INTO table1(id, col1, col2) VALUES (45, "foo's", '{"dict_key":5}');

The json field is formatted correctly with the single quotes around the value. But col2 keeps the double quotes if the string in col2 contains a single quote. This is a problem because postgresql requires single quotes to identify TEXT input.

Is there a better way to parse data into psql insert commands?

答案1

得分: 1

Did you try using json.dump() and repr()?

columns = ', '.join(data.keys())

data['col1'] = repr(data['col1'])
data['col2'] = json.dumps(data['col2'])
...
英文:

Did you try using json.dump() and repr()?

columns = ', '.join(data.keys())

data['col1'] = repr(data['col1'])
data['col2'] = json.dumps(data['col2'])
...

答案2

得分: 0

这似乎是Python中的一种限制(或者说是一种实现细节),涉及到字符串或str__repr__()方法的定义。

尝试运行以下示例代码:

value = 'wont fix'; assert f'{value!r}' == "'wont fix'"
value = 'won\'t fix'; assert f'{value!r}' == "\"won't fix\""

可以看到,在字符串的repr中,单引号是首选的,除非字符串本身包含单引号 - 在这种情况下,会使用双引号来包装字符串的repr

一个“快速而简单”的解决方案是实现一个自定义的字符串子类,名为SQStr,它有效地覆盖了默认的repr,始终使用单引号来包装字符串:

class SQStr(str):
    def __repr__(self):
        value = self.replace("'", r"\'")
        return f"'{value}'"

如果您还想支持双转义的单引号,比如r"\\'",可以使用以下代码:

class SQStr(str):
    def __repr__(self, _escaped_sq=r"\'", _tmp_symbol="|+*+|",
                 _default_repr=str.__repr__):
        if "'" in self:
            if _escaped_sq in self:
                value = (self
                         .replace(_escaped_sq, _tmp_symbol)
                         .replace("'", _escaped_sq)
                         .replace(_tmp_symbol, r"\\'"))
            else:
                value = self.replace("'", _escaped_sq)
            return f"'{value}'"
        # 否则,字符串不包含单引号,所以我们可以使用默认的str.__repr__()
        return _default_repr(self)

现在似乎可以按预期工作:

value = 'wont fix'; assert f'{SQStr(value)!r}' == "'wont fix'"
value = 'won\'t fix'; assert f'{SQStr(value)!r}' == r"'won\'t fix'"
# 可选
value = r"won't fix, won\'t!"; assert f'{SQStr(value)!r}' == r"'won\'t fix, won\\'t!'"
英文:

This appears to be a limitation (or rather an implementation detail) in Python, with how the __repr__() for strings or str is defined.

Try this sample code out:

value = 'wont fix'; assert f'{value!r}' == "'wont fix'"
value = 'won\'t fix'; assert f'{value!r}' == '"won\'t fix"'

As can be seen, single quotes are preferred in the repr for strings, unless the string itself contains a single quote - in that case, double quotes are used to wrap the repr for the string.

A "quick and dirty" solution is to implement a custom string subclass, SQStr, which effectively overrides the default repr to always wrap a string with single quotes:

class SQStr(str):
    def __repr__(self):
        value = self.replace("'", r"\'")
        return f"'{value}'"

If you want to also support double-escaped single quotes like r"\\\'", then something like this:

class SQStr(str):
    def __repr__(self, _escaped_sq=r"\'", _tmp_symbol="|+*+|",
                 _default_repr=str.__repr__):
        if "'" in self:
            if _escaped_sq in self:
                value = (self
                         .replace(_escaped_sq, _tmp_symbol)
                         .replace("'", _escaped_sq)
                         .replace(_tmp_symbol, r"\\\'"))
            else:
                value = self.replace("'", _escaped_sq)
            return f"'{value}'"
        # else, string doesn't contain single quotes, so we
        # can use the default str.__repr__()
        return _default_repr(self)

Now it appears to work as expected:

value = 'wont fix'; assert f'{SQStr(value)!r}' == "'wont fix'"
value = 'won\'t fix'; assert f'{SQStr(value)!r}' == r"'won\'t fix'"
# optional
value = r"won't fix, won\'t!"; assert f'{SQStr(value)!r}' == r"'won\'t fix, won\\\'t!'"

huangapple
  • 本文由 发表于 2023年2月7日 03:11:54
  • 转载请务必保留本文链接:https://go.coder-hub.com/75365612.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定