SQL逃逸而无需使用PreparedStatements Criteria-API。

huangapple go评论69阅读模式
英文:

SQL Escape without PreparedStatements Criteria-API

问题

我必须转义参数以避免SQL注入问题。我有一个大的CriteriaBuilder SQL,在其中找到了下面的内容:

Expression<Integer> containsFunction = cb.function("CONTAINS", Integer.class,
    joinParty.get(MyEntity_.name), cb.literal(sb.toString())
);

这个 "sb" 是一个 StringBuilder,其中包含 SQL 注入。不管怎样,在这个操作中有一个不常见(我不熟悉的)句子 "DEFINEMERGE",其中包含连接的参数:

StringBuilder sb = new StringBuilder("DEFINEMERGE(((NEAR((");
for (int i = 0; i < nameValues.length; i++) {
    sb.append("{").append(nameValues[i]).append("}");
    if(i + 1 < nameValues.length) {
        sb.append(",");
    }
}
sb.append("),0)),(").append(nameValues[0]).append(" AND {").append(nameValues[1]).append("}))");
if(nameValues.length > 3) {
    sb.append(",(").append(nameValues[1]).append(" AND {").append(nameValues[2]).append("})");
    if(nameValues.length == 4) {
        sb.append(",(").append(nameValues[2]).append(" AND {").append(nameValues[3]).append("})");
    }
}
sb.append("), AND, MIN)");

问题是,一些 nameValues 中包含 "(",破坏了 SQL。我不确定在这种情况下如何最好地保护这些值,因为这个 CONTAINS 与 SQL 字符串字面值一起工作,而不是与 Criteria 对象一起工作。

这是生成的预期 SQL:

CONTAINS(
    table.name, 
    'DEFINEMERGE (
        (
            (NEAR( (?, {?},{?}), 0, FALSE)),
            (? AND {?} and {?})
        ),
        AND,
        min 
    )'
    ,
    1
)

这是生成的查询示例(为了公司隐私和安全性,隐藏了实际表和字段的真实名称)。这也包含了参数绑定数组:

SELECT 
    COUNT(t0.<VALUE>) 
FROM 
    SCHEME.<TABLE_NAME> t0 
    LEFT OUTER JOIN SCHEME.<TABLE_NAME> t2 ON (t2.<VALUE> = t0.<VALUE>) 
    LEFT OUTER JOIN SCHEME.<TABLE_NAME> t6 ON (t6.<VALUE> = t2.<VALUE>) 
    LEFT OUTER JOIN SCHEME.<TABLE_NAME> t7 ON (t7.<VALUE> = t6.<VALUE>) 
    LEFT OUTER JOIN SCHEME.<TABLE_NAME> t8 ON (t8.<VALUE> = t7.<VALUE>) 
    LEFT OUTER JOIN SCHEME.<TABLE_NAME> t3 ON (t3.<VALUE> = t2.<VALUE>) 
    LEFT OUTER JOIN SCHEME.<TABLE_NAME> t4 ON (t4.<VALUE> = t3.<VALUE>) 
    LEFT OUTER JOIN SCHEME.<TABLE_NAME> t5 ON (t5.<VALUE> = t4.<VALUE>), 
    SCHEME.<TABLE_NAME> t11, 
    SCHEME.<TABLE_NAME> t10, 
    SCHEME.<TABLE_NAME> t9, 
    SCHEME.<TABLE_NAME> t1 
WHERE 
    (((((((((((((((t0.<VALUE> IN (?)) AND (t0.<VALUE> IN (?))) AND (t6.<VALUE> = ?)) AND (t3.<VALUE> IN (?))) AND ((t2.<VALUE> IS NULL) OR (t2.<VALUE> = t9.<VALUE>))) AND (t0.<VALUE> = ?)) AND (CONTAINS(t1.<VALUE>, ?) > ?)) AND (t0.<VALUE> IN (?))) AND (t1.<VALUE> = ?)) AND t0.<VALUE> IN (SELECT t12.<VALUE> FROM SCHEME.<TABLE> t14 LEFT OUTER JOIN SCHEME.<TABLE> t16 ON (t16.<VALUE> = t14.<VALUE>), SCHEME.<TABLE> t13, SCHEME.<TABLE> t12, SCHEME.<TABLE> t15 WHERE ((((((t13.<VALUE> IN (?)) AND (t14.<VALUE> IN (?))) AND (t14.<VALUE> <= ?)) AND ((t14.<VALUE> IS NULL) OR (t14.<VALUE> >= ?))) AND (t16.<VALUE> = ?)) AND (((t14.<VALUE> = t13.<VALUE>) AND (t12.<VALUE> = t13.<VALUE>)) AND (t15.<VALUE> = t14.<VALUE>))))) AND (t9.<VALUE> IN (?))) AND (t11.<VALUE> IN (?))) AND (t0.<VALUE> = t11.<VALUE>)) AND (t9.<VALUE> <= ?)) AND ((t9.<VALUE> IS NULL) OR (t9.PARO_DA_END_VALIDITY >= ?))) AND (((t9.<VALUE> = t0.<VALUE>) AND (t1.<VALUE> = t0.<VALUE>)) AND (t10.<VALUE> = t9.ROTY_ID_ENGAGED_ROLE_SPEC)))
    
    bind => [1, INDI, 1, 1, 1, DEFINEMERGE(((NEAR(({(name},{name)}),0)),((name AND {name)})), AND, MIN), 0, 1, 1, 1, 1, 2020-09-07 00:00:00.0, 2020-09-07 00:00:00.0, 1, 1, 1, 2020-09-07 00:00:00.0, 2020-09-07 00:00:00.0]
英文:

I have to escape a parameters to avoid SQL injection problems. I have a big CriteriaBuilder SQL where I find the next:

Expression&lt;Integer&gt; containsFunction = cb.function(&quot;CONTAINS&quot;, Integer.class,
joinParty.get(MyEntity_.name), cb.literal(sb.toString())
);

This "sb" is a StringBuilder where the SQLi are. Anyway, is a uncommon (unknowed to me) sentence in this operación "DEFINEMERGE" that have concated arguments:

StringBuilder sb = new StringBuilder(&quot;DEFINEMERGE(((NEAR((&quot;);
for (int i = 0; i &lt; nameValues.length; i++) {
sb.append(&quot;{&quot;).append(nameValues[i]).append(&quot;}&quot;);
if(i + 1 &lt; nameValues.length) {
sb.append(&quot;,&quot;);
}
}
sb.append(&quot;),0)),(&quot;).append(nameValues[0]).append(&quot; AND {&quot;).append(nameValues[1]).append(&quot;})&quot;);
if(nameValues.length &gt; 3) {
sb.append(&quot;,(&quot;).append(nameValues[1]).append(&quot; AND {&quot;).append(nameValues[2]).append(&quot;})&quot;);
if(nameValues.length == 4) {
sb.append(&quot;,(&quot;).append(nameValues[2]).append(&quot; AND {&quot;).append(nameValues[3]).append(&quot;})&quot;);
}
}
sb.append(&quot;), AND, MIN)&quot;);

Problem is that some fo the nameValues have a "(" inside, breaking the SQL. I am not sure what is the best way of securize this values in this case because this CONTAINS works with a SQL String literal, not an criteria object.

This is the generated SQL expected:

CONTAINS(
table.name, 
&#39;DEFINEMERGE (
(
(NEAR( (?, {?},{?}), 0, FALSE)),
(? AND {?} and {?})
),
AND,
min 
)&#39;
,
1
)

This is an example of the generated query (hidden real names of tables and fields for company privacy and security) This have the parameter bind array too:

SELECT 
COUNT(t0.&lt;VALUE&gt;) 
FROM 
SCHEME.&lt;TABLE_NAME&gt; t0 
LEFT OUTER JOIN SCHEME.&lt;TABLE_NAME&gt; t2 ON (t2.&lt;VALUE&gt; = t0.&lt;VALUE&gt;) 
LEFT OUTER JOIN SCHEME.&lt;TABLE_NAME&gt; t6 ON (t6.&lt;VALUE&gt; = t2.&lt;VALUE&gt;) 
LEFT OUTER JOIN SCHEME.&lt;TABLE_NAME&gt; t7 ON (t7.&lt;VALUE&gt; = t6.&lt;VALUE&gt;) 
LEFT OUTER JOIN SCHEME.&lt;TABLE_NAME&gt; t8 ON (t8.&lt;VALUE&gt; = t7.&lt;VALUE&gt;) 
LEFT OUTER JOIN SCHEME.&lt;TABLE_NAME&gt; t3 ON (t3.&lt;VALUE&gt; = t2.&lt;VALUE&gt;) 
LEFT OUTER JOIN SCHEME.&lt;TABLE_NAME&gt; t4 ON (t4.&lt;VALUE&gt; = t3.&lt;VALUE&gt;) 
LEFT OUTER JOIN SCHEME.&lt;TABLE_NAME&gt; t5 ON (t5.&lt;VALUE&gt; = t4.&lt;VALUE&gt;), 
SCHEME.&lt;TABLE_NAME&gt; t11, 
SCHEME.&lt;TABLE_NAME&gt; t10, 
SCHEME.&lt;TABLE_NAME&gt; t9, 
SCHEME.&lt;TABLE_NAME&gt; t1 
WHERE 
((((((((((((((((t0.&lt;VALUE&gt; IN (?)) AND (t0.&lt;VALUE&gt; IN (?))) AND (t6.&lt;VALUE&gt; = ?)) AND (t3.&lt;VALUE&gt; IN (?))) AND ((t2.&lt;VALUE&gt; IS NULL) OR (t2.&lt;VALUE&gt; = t9.&lt;VALUE&gt;))) AND (t0.&lt;VALUE&gt; = ?)) AND (CONTAINS(t1.&lt;VALUE&gt;, ?) &gt; ?)) AND (t0.&lt;VALUE&gt; IN (?))) AND (t1.&lt;VALUE&gt; = ?)) AND t0.&lt;VALUE&gt; IN (SELECT t12.&lt;VALUE&gt; FROM SCHEME.&lt;TABLE&gt; t14 LEFT OUTER JOIN SCHEME.&lt;TABLE&gt; t16 ON (t16.&lt;VALUE&gt; = t14.&lt;VALUE&gt;), SCHEME.&lt;TABLE&gt; t13, SCHEME.&lt;TABLE&gt; t12, SCHEME.&lt;TABLE&gt; t15 WHERE ((((((t13.&lt;VALUE&gt; IN (?)) AND (t14.&lt;VALUE&gt; IN (?))) AND (t14.&lt;VALUE&gt; &lt;= ?)) AND ((t14.&lt;VALUE&gt; IS NULL) OR (t14.&lt;VALUE&gt; &gt;= ?))) AND (t16.&lt;VALUE&gt; = ?)) AND (((t14.&lt;VALUE&gt; = t13.&lt;VALUE&gt;) AND (t12.&lt;VALUE&gt; = t13.&lt;VALUE&gt;)) AND (t15.&lt;VALUE&gt; = t14.&lt;VALUE&gt;))))) AND (t9.&lt;VALUE&gt; IN (?))) AND (t11.&lt;VALUE&gt; IN (?))) AND (t0.&lt;VALUE&gt; = t11.&lt;VALUE&gt;)) AND (t9.&lt;VALUE&gt; &lt;= ?)) AND ((t9.&lt;VALUE&gt; IS NULL) OR (t9.PARO_DA_END_VALIDITY &gt;= ?))) AND (((t9.&lt;VALUE&gt; = t0.&lt;VALUE&gt;) AND (t1.&lt;VALUE&gt; = t0.&lt;VALUE&gt;)) AND (t10.&lt;VALUE&gt; = t9.ROTY_ID_ENGAGED_ROLE_SPEC)))
bind =&gt; [1, INDI, 1, 1, 1, DEFINEMERGE(((NEAR(({(name},{name)}),0)),((name AND {name)})), AND, MIN), 0, 1, 1, 1, 1, 2020-09-07 00:00:00.0, 2020-09-07 00:00:00.0, 1, 1, 1, 2020-09-07 00:00:00.0, 2020-09-07 00:00:00.0]

答案1

得分: 2

很遗憾,没有一种确切的方法可以完全避免SQL问题,因为实际上并不存在所谓的“SQL”。有的是不同的方言,对于正确地进行转义,了解所用的方言是必要的信息。那么,你想要转义什么?“SQL”不是一个可行的答案。可行的答案可能是“PostgreSQL的SQL”、“MySQL的SQL”、“Oracle的SQL”等。SQL更像是一个概念,而不是一个直接的规范(虽然实际上有一份规范,但它包含的内容要比你想象的少得多,而且每种SQL方言都违反了那个规范并添加了许多内容)。

这就是为什么通常的建议是:实际上,不,你不能按你的方式做,如果你想要转义这些内容,你必须通过PreparedStatement上的.setX方法来实现。

从你的问题看,你的数据库中有实际的SQL语句作为字符串字面量,这本身就是一个奇怪的情况,而且很容易导致严重的安全问题,所以尽管你可能不想听,但这种设计需要完全重做,而且听起来其他代码会获取这个SQL,然后逐字运行它。

JavaScript 曾经有类似的情况(eval),而由于 eval 存在的安全漏洞数量之多令人震惊。现在也有办法使用标头禁止在你的站点上使用 eval,并且浏览器会在 URL 中出现的某些部分周围进行处理,情况非常糟糕——现代安全指南告诉你要完全禁用这个功能,如果你想要它正常工作,那基本上是不可靠的。

考虑到自己手动编写SQL转义是一个糟糕的主意,我怀疑是否存在任何可用的库。有一个很小的机会,你的JDBC驱动程序(深入研究这些类并检查网站!)有一个实用方法可以做到这一点;考虑到每个方言都有不同的规则,这是有道理的:每个SQL引擎都有不同的JDBC驱动程序。很明显,如果你的JDBC驱动程序附带了SQL转义工具,那么它就适用于那个特定的引擎。

如果你找不到,大多数SQL方言中最简单的转义方法是创建一个允许字符的白名单,然后转义不在白名单上的每个字符。白名单只应包含绝对安全的内容(a-z、0-9,也许是_、-)。任何看起来像引号或反斜杠的内容都不应包含在列表中,我建议避免使用$,因为它通常用于变量替换,虽然在SQL中不是一个问题,但相比让你的生产服务器受到攻击,还是保险一些。

其余的内容可以进行转义。例如,在PostgreSQL中,你会将字符串:

Joe's bar & Grill

转义为

E'Joe\u0027s bar \u0026 Grill'

这里的E表示:带有转义的字符串。该算法会检查每个字符,并将白名单上的所有内容复制过去。引号和与号都不在白名单上,因此将它们替换为\u0000,其中零是字符的十六进制编码(s.charAt(i)的值,强制转换为整数,打印为十六进制数)。

这应该涵盖所有情况,但请注意(叹息)转义字符串完全不在SQL规范之内,它是PostgreSQL的一种做法。

英文:

Unfortunately, there is no guaranteed way to escape SQL, because there really is no such thing as 'SQL'. There's the dialect, and it is a required bit of info to do the job of escaping properly. So, what do you want to escape? 'SQL' is not a workable answer. Workable answers are 'postgres's SQL', 'MySQL's SQL', 'Oracle's SQL', etc. SQL is more like a notion than a direct spec (there IS an actual spec but it contains a lot less than you think it does, and every dialect of SQL out there breaks that spec and adds a ton of things to it).

That's why the normal advice is: really, no, you can't do what you want, you must go via .setX methods on PreparedStatement if you want to escape this stuff.

From your question it looks like you have actual SQL statements as string literals inside your database which is itself a bizarre scenario, and something that will rather easily lead to severe security issues, so whilst you probably don't want to hear it, this design is in need of a complete rework, and it sounds like other code will grab this SQL and then run it verbatim.

Javascript had something like this for a while (eval) and the sheer number of security leaks that occurred because eval is a thing is staggering. There are also now ways to use headers to ban eval in the javascript on your site, and browsers working around bits in the URL appearing inside these, it's that bad - modern security guidelines tell you to entirely disable this feature, and you can't really rely on it working properly if you want to.

Given that it's such a bad idea to handroll SQL escaping, I doubt there are any libraries out there you can use. There is a small chance your JDBC driver (dig into those classes and check the website!) has a utility method that can do it; given that every dialect has different rules, this makes sense: Every SQL engine has a different JDBC driver. Clearly if your JDBC driver ships with an SQL escaping tool, it is the right one for that specific engine.

If you can't find that, the simplest way to escape things in most SQL dialects is to make a whitelist of allowed characters, and to escape every character that isn't on the whitelist. The whitelist should only contain definitely, absolutely safe things (a-z, 0-9, maybe _, -. Nothing that looks like a quote or a backslash should be on the list, and I'd avoid $ just because it's commonly used for variable substitution, which is not a thing in SQL but better safe than your production servers p0wned).

The rest can then be escaped. For example, in postgres, you'd turn the string:

Joe&#39;s bar &amp; Grill

into

E&#39;Joe\u0027s bar \u0026 Grill&#39;

The E means: String with escapes. The algorithm checks every character and copied over everything on the whitelist. the quote and ampersand weren't on it, so those were replaced by \u0000 where the zeroes are the hexadecimal encoding of the character (the value of s.charAt(i), cast to an int, printed as hex number)>

That should cover all your bases, but note that (sigh) escape strings are entirely outside of the SQL spec, it's a postgresism.

huangapple
  • 本文由 发表于 2020年9月7日 18:19:29
  • 转载请务必保留本文链接:https://go.coder-hub.com/63775643.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定