英文:
How to build a query with an optional filter resistant to SQL injections?
问题
有这样一个数据库:
还有这样一个存储函数:
CREATE FUNCTION fetch_mentor(direction_type_list_in text, education_type_list_in text, name_in text, city_id_in integer) RETURNS json
LANGUAGE plpgsql
AS
$$
DECLARE
direction_type_list_inner INT[];
education_type_list_inner INT[];
BEGIN
direction_type_list_inner = STRING_TO_ARRAY(direction_type_list_in, ',');
education_type_list_inner = STRING_TO_ARRAY(education_type_list_in, ',');
RETURN (SELECT JSON_AGG(rows)
FROM (SELECT m.id AS "ID"
, FORMAT('%s %s', m.firstname, m.lastname) AS "fullName"
, m.photo_url AS "photoURL"
, m.video_url AS "videoURL"
, (SELECT JSON_AGG(mc.name)
FROM mentors.mentor_competence mc
WHERE mc.mentor_id = m.id) AS "competenceList"
, (SELECT JSON_AGG(el.name)
FROM mentors.mentor_employment me
INNER JOIN lists.employment_list el ON el.id = me.employment_id
WHERE me.mentor_id = m.id) AS "competenceList"
, m.certified AS "certified"
, cl.display_name AS "cityName"
, (SELECT JSON_AGG(dtl.display_name)
FROM mentors.mentor_direction_type mdt
INNER JOIN lists.direction_type_list dtl ON mdt.direction_type_id = dtl.id
WHERE mdt.mentor_id = m.id) "directionList"
, (SELECT JSON_AGG(etl.display_name)
FROM mentors.mentor_education_type met
INNER JOIN lists.education_type_list etl ON met.education_type_id = etl.id
WHERE met.mentor_id = m.id) "educationList"
FROM mentors.mentor m
LEFT JOIN lists.city_list cl ON m.city_id = cl.id
WHERE approved = TRUE
AND (direction_type_list_in ISNULL OR
m.id IN (SELECT m.id
FROM mentors.mentor_direction_type mdt
WHERE mdt.direction_type_id = ANY (direction_type_list_inner)))
AND (education_type_list_inner ISNULL OR
m.id IN (SELECT m.id
FROM mentors.mentor_education_type met
WHERE met.education_type_id = ANY (education_type_list_inner)))
AND (name_in ISNULL OR
(m.firstname LIKE FORMAT('%%%s%%', $3) OR m.lastname LIKE FORMAT('%%%s%%', $3)))
AND (city_id_in ISNULL OR m.city_id = city_id_in)
) rows);
END;
$$;
ALTER FUNCTION fetch_mentor(TEXT, TEXT, TEXT, INTEGER) OWNER TO postgres;
它返回这样的 JSON:
[
{
"ID": 3,
"fullName": "fsafd 413",
"photoURL": "sadf",
"videoURL": "dsa",
"competenceList": [
"a",
"s",
"f"
],
"employmentList": [
"a",
"b",
"c"
],
"certified": false,
"cityName": null,
"directionList": [
"x",
"z"
],
"educationList": [
"offline"
]
}
]
以下是输入的参数:
type FetchMentorsParams struct {
DirectionTypeList []int `json:"directionTypeList"`
EducationTypeList []int `json:"educationTypeList"`
MentorName *string `json:"mentorName"`
}
所有过滤器都是可选的。也就是说,如果没有过滤器,那么将输出所有记录。例如,如果 Direction Type List = [1,2]
,则只返回 mentor_direction_type
表中 direction_type_id = 1
或 2
的记录。以此类推。
但是有没有一种方法可以形成一个抵御 SQL 注入的查询,而不使用存储函数?如果像 len(direction Type List) != 0
这样做,然后在循环中添加子查询,那么查询将容易受到 SQL 注入攻击。
总的来说,在一个查询中尝试做所有事情有多好?我看到一个选项:在主查询之前,对 mentor_direction_type
和 mentor_education_type
进行查询,然后执行 where mentor_id in (查询结果)
。
像这样的查询 (SELECT JSON_AGG(el.name) FROM mentors.mentor_employment me INNER JOIN lists.employment_list el ON el.id = me.employment_id WHERE me.mentor_id = m.id) AS "competenceList"
是正常的吗?我不确定它是否经过了优化,但我没有看到其他的选择。
我正在使用 Go、SQLX 和 PGX。
英文:
There is such a database:
And such a stored function:
CREATE FUNCTION fetch_mentor(direction_type_list_in text, education_type_list_in text, name_in text, city_id_in integer) RETURNS json
LANGUAGE plpgsql
AS
$$
DECLARE
direction_type_list_inner INT[];
education_type_list_inner INT[];
BEGIN
direction_type_list_inner = STRING_TO_ARRAY(direction_type_list_in, ',');
education_type_list_inner = STRING_TO_ARRAY(education_type_list_in, ',');
RETURN (SELECT JSON_AGG(rows)
FROM (SELECT m.id AS "ID"
, FORMAT('%s %s', m.firstname, m.lastname) AS "fullName"
, m.photo_url AS "photoURL"
, m.video_url AS "videoURL"
, (SELECT JSON_AGG(mc.name)
FROM mentors.mentor_competence mc
WHERE mc.mentor_id = m.id) AS "competenceList"
, (SELECT JSON_AGG(el.name)
FROM mentors.mentor_employment me
INNER JOIN lists.employment_list el ON el.id = me.employment_id
WHERE me.mentor_id = m.id) AS "competenceList"
, m.certified AS "certified"
, cl.display_name AS "cityName"
, (SELECT JSON_AGG(dtl.display_name)
FROM mentors.mentor_direction_type mdt
INNER JOIN lists.direction_type_list dtl ON mdt.direction_type_id = dtl.id
WHERE mdt.mentor_id = m.id) "directionList"
, (SELECT JSON_AGG(etl.display_name)
FROM mentors.mentor_education_type met
INNER JOIN lists.education_type_list etl ON met.education_type_id = etl.id
WHERE met.mentor_id = m.id) "educationList"
FROM mentors.mentor m
LEFT JOIN lists.city_list cl ON m.city_id = cl.id
WHERE approved = TRUE
AND (direction_type_list_in ISNULL OR
m.id IN (SELECT m.id
FROM mentors.mentor_direction_type mdt
WHERE mdt.direction_type_id = ANY (direction_type_list_inner)))
AND (education_type_list_inner ISNULL OR
m.id IN (SELECT m.id
FROM mentors.mentor_education_type met
WHERE met.education_type_id = ANY (education_type_list_inner)))
AND (name_in ISNULL OR
(m.firstname LIKE FORMAT('%%%s%%', $3) OR m.lastname LIKE FORMAT('%%%s%%', $3)))
AND (city_id_in ISNULL OR m.city_id = city_id_in)
) rows);
END;
$$;
ALTER FUNCTION fetch_mentor(TEXT, TEXT, TEXT, INTEGER) OWNER TO postgres;
Which returns such a json:
[
{
"ID": 3,
"fullName": "fsafd 413",
"photoURL": "sadf",
"videoURL": "dsa",
"competenceList": [
"a",
"s",
"f"
],
"employmentList": [
"a",
"b",
"c"
],
"certified": false,
"cityName": null,
"directionList": [
"x",
"z"
],
"educationList": [
"offline"
]
}
]
The following parameters come to the input:
type FetchMentorsParams struct {
DirectionTypeList []int `json:"directionTypeList"`
EducationTypeList []int `json:"educationTypeList"`
MentorName *string `json:"mentorName"`
}
All filters are optional. That is, if there are no filters, then all records are output. If, for example, Direction Type List = [1,2]
, then only those records should be returned where mentor_id in the table mentor_direction_type
is in the column where direction_type_id = 1 or 2
. And so on.
But is there a way to form a query without a stored function that is resistant to SQL Injection? If you do something like len(direction Type List) != 0
and then add subqueries in the for loop, then the query will be vulnerable to sql injection.
And in general, how good is it to try to do everything in one query? I see an option: before the main query, make query to mentor_direction_type
and to mentor_education_type
and then do where mentor_id in (the result of querys)
.
And it is normal to make such querys as - (SELECT JSON_AGG(el.name) FROM mentors.mentor_employment me INNER JOIN lists.employment_list el ON el.id = me.employment_id WHERE me.mentor_id = m.id) AS "competenceList"
, I'm not sure if it's optimized, but I don't see any other option.
Im using Go, SQLX, PGX.
答案1
得分: 1
为了分解你的问题。
函数
这样一个存储函数:
尽管在其他编程语言中编写plpgsql
函数似乎是合乎逻辑的,但在Postgresql中,这些函数有一个注意事项,即查询规划器无法正确评估函数内部运行的成本。这意味着随着数据的增长,查询规划器的准确性将逐渐降低,认为函数的得分相对较低,而实际上可能是潜在的高得分。
除非你的函数可以使用其中一个不稳定性修饰符,否则我会始终将其编写为标准查询语句。
如果优化是一个问题,你可以使用embed
Go包,并使用漂亮整洁的.sql文件进行操作
过滤器和SQL注入
所有过滤器都是可选的。
我也欣赏你想要一个能够接受任何一组参数并输出正确结果的查询的想法。
然而,这可能会陷入评估所有可能的组合和场景的兔子洞。
有时候,最好有一点冗余,同时也有更小、更好定义意图的查询。
而且制作这样的查询是正常的 - (SELECT JSON_AGG(el.name) FROM mentors.mentor_employment me INNER JOIN lists.employment_list el ON el.id = me.employment_id WHERE me.mentor_id = m.id) AS "competenceList",我不确定它是否被优化了,但我没有看到其他选择。
嵌套查询并不一定是坏事,请记住,对我们来说清晰的SQL并不总是对查询规划器来说清晰。
在评估子查询之前,我建议你使用EXPLAIN ANALYZE
运行fetch_mentor
函数的内容,这将告诉你查询规划器将如何处理它,并帮助你了解是否存在任何瓶颈。
如果你对EXPLAIN ANALYZE
不熟悉,你可以复制其输出并将其粘贴到诸如https://flame-explain.com/visualize/input之类的工具中进行可视化分析。
英文:
To break down your question.
Functions
> And such a stored function:
Despite writing plpgsql
functions seems logical coming from other programming languages, in Postgresql these functions have a caveat, that is the planner will not be able to properly evaluate the cost of what is running inside the function. This means that as data grows, your planner will become increasingly off, thinking that your function has a somewhat low score while instead it can be potentially high.
Unless your function can use one of the volatility modifiers then I would always write it as a standard query statement.
If optimisation is a concern, you can use the embed
go package and work with nice and tidy .sql files
Filters and SQL injection
> All filters are optional.
I also appreciate the fact that you want a query "to rule them all" which is capable of accepting any set of arguments and spit out the correct output.
However this can turn into a rabbit hole of evaluating all the possible combinations and scenarios.
Sometimes it is better to have a little bit of redundancy while also having smaller, better intent-defined, queries.
> And it is normal to make such querys as - (SELECT JSON_AGG(el.name) FROM mentors.mentor_employment me INNER JOIN lists.employment_list el ON el.id = me.employment_id WHERE me.mentor_id = m.id) AS "competenceList", I'm not sure if it's optimized, but I don't see any other option.
Nested queries are not necessarily bad, remember that SQL that is clear to us is not always clear to the planner.
Before judging the subquery, I would recommend you to run the content of your fetch_mentor
function with an EXPLAIN ANALYZE
, this will tell you how the planner will tackle it and it will help you understanding if there are any bottlenecks.
If you are new to EXPLAIN ANALYZE
, you can copy the output of it and paste it in tools such as https://flame-explain.com/visualize/input
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论