How to escape string while matching pattern in PostgreSQL
Question
I want to find rows where a text column begins with a user-supplied string, e.g. SELECT * FROM users WHERE name LIKE 'rob%', but "rob" is unvalidated user input. If the user enters a string containing a special pattern character, such as "rob_", it will match both "robert42" and "rob_the_man". I need to be sure that the string is matched literally. How would I do that? Do I need to handle the escaping at the application level, or is there a more elegant way?
I'm using PostgreSQL 9.1 and go-pgsql for Go.
Answer 1
Score: 8
The _ and % characters have to be quoted (escaped) to be matched literally in a LIKE pattern; there's no way around it. The choice is between doing it client-side or server-side (typically with the SQL replace() function, see below). To get it 100% right in the general case, there are also a few things to consider.
By default, the quote character to use before _ or % is the backslash (\), but it can be changed with an ESCAPE clause immediately following the LIKE pattern.
In any case, the quote character itself has to be doubled in the pattern to be matched literally as one character.
Example: ... WHERE field LIKE 'john^%node1^^node2.uucp@%' ESCAPE '^'
would match john%node1^node2.uucp@ followed by anything.
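To see how the escape character changes what a pattern matches, here is a minimal Go sketch that imitates LIKE in memory by translating a pattern into a regular expression. The names likeToRegexp and likeMatch are made up for this illustration, and the code only approximates the behaviour described above; it is not how PostgreSQL implements LIKE (for instance, a trailing escape character is silently ignored here).

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// likeToRegexp translates a LIKE pattern into an anchored Go regexp.
// esc is the escape character, as given in an ESCAPE clause.
// Hypothetical helper for illustration only.
func likeToRegexp(pattern string, esc rune) *regexp.Regexp {
	var b strings.Builder
	b.WriteString("(?s)^") // (?s): let '.' match newlines too
	escaped := false
	for _, r := range pattern {
		switch {
		case escaped:
			// The character after the escape is always literal.
			b.WriteString(regexp.QuoteMeta(string(r)))
			escaped = false
		case r == esc:
			escaped = true
		case r == '%':
			b.WriteString(".*") // any sequence of characters
		case r == '_':
			b.WriteString(".") // exactly one character
		default:
			b.WriteString(regexp.QuoteMeta(string(r)))
		}
	}
	b.WriteString("$")
	return regexp.MustCompile(b.String())
}

// likeMatch reports whether s matches the LIKE pattern.
func likeMatch(s, pattern string, esc rune) bool {
	return likeToRegexp(pattern, esc).MatchString(s)
}

func main() {
	// Unescaped underscore is a wildcard.
	fmt.Println(likeMatch("robert42", "rob_%", '^'))
	// Escaped underscore only matches a literal underscore.
	fmt.Println(likeMatch("robert42", "rob^_%", '^'))
	fmt.Println(likeMatch("rob_the_man", "rob^_%", '^'))
	// The example pattern from the answer.
	fmt.Println(likeMatch("john%node1^node2.uucp@example", "john^%node1^^node2.uucp@%", '^'))
}
```

The first call shows the problem from the question (the wildcard matches "robert42"); the second and third show that escaping the underscore restricts the match to a literal "rob_" prefix.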
There's a problem with the default choice of backslash: it is already used as the string-literal escape character when standard_conforming_strings is OFF (PG 9.1 has it ON by default, but earlier versions are still in wide use, so this is a point to consider).
Also, if the quoting of LIKE wildcards is done client-side in a user-input injection scenario, it comes in addition to the normal string quoting already necessary for user input.
A glance at a go-pgsql example shows that it uses $N-style placeholders for variables. So here's an attempt to write it in a reasonably generic way: it works with standard_conforming_strings either ON or OFF, uses server-side replacement of % and _, an alternative quote character, quoting of the quote character itself, and avoids SQL injection:
db.Query("SELECT * FROM users WHERE name LIKE replace(replace(replace($1, '^', '^^'), '%', '^%'), '_', '^_') || '%' ESCAPE '^'",
    variable_user_input)
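For comparison, the same three replacements can be done client-side before the value is bound to the placeholder. The following is a minimal sketch, assuming '^' as the escape character as in the query above; likeEscaper and prefixPattern are hypothetical names introduced for this example. strings.NewReplacer scans the input once, so characters it has already escaped are not escaped a second time.

```go
package main

import (
	"fmt"
	"strings"
)

// likeEscaper doubles the escape character '^' and prefixes the
// LIKE wildcards % and _ with it. Hypothetical helper for illustration.
var likeEscaper = strings.NewReplacer(
	"^", "^^",
	"%", "^%",
	"_", "^_",
)

// prefixPattern builds a LIKE pattern that matches any string beginning
// with the literal text of input. It must be used with ESCAPE '^'.
func prefixPattern(input string) string {
	return likeEscaper.Replace(input) + "%"
}

func main() {
	fmt.Println(prefixPattern("rob_"))  // rob^_%
	fmt.Println(prefixPattern("100%^")) // 100^%^^%
}
```

The result would still be passed as a bound parameter rather than concatenated into the SQL text, for example with a query such as SELECT * FROM users WHERE name LIKE $1 ESCAPE '^'.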
Answer 2
Score: 4
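To escape the underscore and the percent sign so they can be used literally in a LIKE pattern, prefix them with the escape character. The version below uses E'' string literals so that the backslashes are read the same way whether standard_conforming_strings is ON or OFF, and it escapes the backslash itself first:
SELECT * FROM users WHERE name LIKE replace(replace(replace(user_input, E'\\', E'\\\\'), '_', E'\\_'), '%', E'\\%');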
Answer 3
Score: 2
As far as I can tell, the only special characters for the LIKE operator are percent and underscore, and these can easily be escaped manually with a backslash. It's not very elegant, but it works.
SELECT * FROM users WHERE name LIKE
regexp_replace('rob', '(%|_)', '\\\1', 'g') || '%';
I find it strange that no such function ships with PostgreSQL. Who wants their users to write their own patterns?
Answer 4
Score: -2
The best answer is that you shouldn't be interpolating user input into your SQL at all. Even escaping the SQL is still dangerous.
The following, which uses Go's database/sql package, illustrates a much safer way. Substitute the Prepare and Query calls with whatever your Go PostgreSQL library's equivalents are.
// $1 tells the database server that the LIKE pattern will be supplied
// later, in the Query call (PostgreSQL uses $N placeholders, not ?).
query := "SELECT * FROM users WHERE name LIKE $1"
// No SQL string quoting is needed, since the value is never interpolated
// into the SQL text. Any % or _ in user_input still acts as a LIKE wildcard.
value := user_input + "%"
// Prepare the statement; it contains no user input.
stmt, err := db.Prepare(query)
if err != nil {
    return err
}
defer stmt.Close()
// Execute the statement, binding value to $1.
rows, err := stmt.Query(value)
if err != nil {
    return err
}
defer rows.Close()
The benefit is that user input can be used safely, without fear of it injecting SQL into the statements you run. You can also reuse the prepared statement for multiple queries, which can be more efficient in certain cases.