英文:
How to validate a JSON with jq?
问题
I can provide you with the translated text:
我正在编写一个Bash库,在其中需要读取一个包含用户输入的JSON文件;JSON不会被伪造,但根据上游如何验证表单,它可能包含对Bash脚本有害的值。
在将JSON加载到Bash数组的函数中,我还不知道必须存在的键;不过,我希望在这里执行一些规则:
1. JSON必须包含一个单一对象。
2. 所有键都应该命名为Shell变量。
3. 所有值必须是标量。
4. 值中不能包含`NUL`字节。
我可以强制执行**#2**和**#4**,但我不知道如何执行**#1**和**#3**:
```sh
jq -r '
to_entries[] |
if (.key | test("^[_a-zA-Z]\\w*$") | not)
then
("错误:非法键 " + (.key | tojson) + "\n" |
halt_error(1))
else
if (.value | tostring | test("\u0000"))
then
("错误:值中包含NUL字节 " + (.value | tojson) + "\n" |
halt_error(1))
else
# 虚拟输出
.key + "=" + (.value | @sh)
end
end
'
结果相当意外:halt_error
函数半正常工作,但我得到了一个核心转储...
错误:非法键 ""
task_uuid='5d7ea654-b649-452b-a8fd-002b56be9a59'
task_name='hello world'
user_email=null
min=7
max=100
ave=2.5
file_in='data.txt'
file_out='results.txt'
*** jq 中的错误:受损的双向链表 ***
...
中止 (核心已转储)
使用jq
是否可能执行所有这些验证,还是我需要切换到其他工具?
<details>
<summary>英文:</summary>
I'm writing a bash library where I need to read a JSON file that contains user input; the JSON won't be forged but it might contain armful values for bash scripts, depending on how the form was validated upstream.
In the function that loads the JSON into a bash array, I'm not aware of the keys that must exist yet; still, there are some rules that I would like to enforce here:
1. The JSON must contain a single object
2. All keys shall be named like shell variables
3. All values must be scalars
4. No `NUL` byte in values
```json
{
"": "empty key shall fail",
"illegal key shall fail": "keys shall follow the naming convention of shell variables",
"non_scalar_shall_fail_1": [1,2],
"non_scalar_shall_fail_2": {"k":"v"},
"nul_byte_shall_fail": "file.php\u0000.txt"
}
{
"task_uuid": "5d7ea654-b649-452b-a8fd-002b56be9a59",
"task_name": "hello world",
"user_email": null,
"min": 7,
"max": 100,
"ave": 2.5,
"file_in": "data.txt",
"file_out": "results.txt"
}
I can enforce #2 and #4 but I don't know how to do it for #1 and #3:
jq -r '
to_entries[] |
if (.key | test("\\A[_a-zA-Z]\\w*\\z") | not)
then
("error: illegal key " + (.key | tojson) + "\n" |
halt_error(1))
else
if (.value | tostring | test("\u0000"))
then
("error: NUL byte in value " + (.value | tojson) + "\n" |
halt_error(1))
else
# dummy output
.key + "=" + (.value | @sh)
end
end
'
The result is quite unexpected: the halt_error
half-works and I get a coredump...
error: illegal key ""
task_uuid='5d7ea654-b649-452b-a8fd-002b56be9a59'
task_name='hello world'
user_email=null
min=7
max=100
ave=2.5
file_in='data.txt'
file_out='results.txt'
*** Error in `jq': corrupted double-linked list ***
...
Aborted (core dumped)
Is it even possible to do all those validations with jq
or do I need to switch to an other tool?
答案1
得分: 4
是的,JQ可以做到这一切。以下是我将如何编写它:
jq -nr '
def valid_key: test("^[A-Za-z_][0-9A-Za-z_]*$");
def valid_value:
type != "array" and type != "object" and (
type != "string" or (test("\u0000") | not)
);
def valid_input:
type == "object" and (
to_entries | all(
(.key | valid_key) and (.value | valid_value)
)
);
input |
if valid_input and isempty(inputs) then
to_entries[] | "\(.key)=\(.value | @sh)"
else
"bad input\n" | halt_error(1)
end'
英文:
Yes, JQ can do all that. Here is how I'd write it:
jq -nr '
def valid_key: test("^[A-Za-z_][0-9A-Za-z_]*$");
def valid_value:
type != "array" and type != "object" and (
type != "string" or (test("\u0000") | not)
);
def valid_input:
type == "object" and (
to_entries | all(
(.key | valid_key) and (.value | valid_value)
)
);
input |
if valid_input and isempty(inputs) then
to_entries[] | "\(.key)=\(.value | @sh)"
else
"bad input\n" | halt_error(1)
end'
答案2
得分: 2
这是一个演示如何进行测试的示例。然而,对于生产环境,显然需要更多的优化。
"" 不合法 (#2)
"illegal key shall fail" 不合法 (#2)
[1,2] 不是标量 (#3)
{"k":"v"} 不是标量 (#3)
".txt" 前面有NUL字符 (#4)
task_uuid='5d7ea654-b649-452b-a8fd-002b56be9a59'
task_name='hello world'
user_email=null
min=7
max=100
ave=2.5
file_in='data.txt'
file_out='results.txt'
英文:
Here's a demo how to do the testing. For production, however, it'd obviously need more refinement.
if type != "object" then "\(@json) is not an object (#1)"
else to_entries[] |
if .key | test("\\A[_a-zA-Z]\\w*\\z") | not then "\(.key | @json) is illegal (#2)"
else .value |
if isempty(scalars) then "\(.) is not a scalar (#3)"
else tostring |
if test("\u0000") then "\(.[index("\u0000")+1:] | @json) is preceded by NUL (#4)"
else empty
end
end
end
end // (to_entries[] | .key + "=" + (.value | @sh))
"" is illegal (#2)
"illegal key shall fail" is illegal (#2)
[1,2] is not a scalar (#3)
{"k":"v"} is not a scalar (#3)
".txt" is preceded by NUL (#4)
task_uuid='5d7ea654-b649-452b-a8fd-002b56be9a59'
task_name='hello world'
user_email=null
min=7
max=100
ave=2.5
file_in='data.txt'
file_out='results.txt'
答案3
得分: 2
以下是@oguzismail答案的一个变体。主要区别在于“driver”程序会识别输入中的所有验证错误,只要它是有效的JSON实体流。
假设1.json和2.json是对应于问题中两个示例JSON对象的文件:
(cat 1.json 2.json 1.json1) | jq -n '
# 如果条件为假,那么(msg|debug);在所有情况下,都会发出。
def verify(cond; msg):
. as $in
| if cond then . else (msg | debug) | $in end;
def valid_key: test("^[A-Za-z_][0-9A-Za-z_]*$");
def valid_value:
type != "array" and
type != "object" and
(type != "string" or (test("\u0000") | not) );
# $count用于消息和返回值
def validate($count):
if type == "object"
then to_entries[]
| verify(.key | valid_key; "实体 #\($count) 中的无效键: \(.key)")
| verify(.value | valid_value; "实体 #\($count) 中的无效值: \(.value)")
else "实体 #\($count) 不是JSON对象" | debug
end
| $count;
reduce inputs as $in (0;
.+1
| . as $count
| $in | validate($count) )
| verify(.==1; "只允许一个实体")
| empty
'
生成的结果:
["DEBUG:","实体 #1 中的无效键: "]
["DEBUG:","实体 #1 中的无效键: illegal key shall fail"]
["DEBUG:","实体 #1 中的无效值: [1,2]"]
["DEBUG:","实体 #1 中的无效值: {"k":"v"}"]
["DEBUG:","实体 #1 中的无效值: file.php\u0000.txt"]
["DEBUG:","实体 #3 中的无效键: "]
["DEBUG:","实体 #3 中的无效键: illegal key shall fail"]
["DEBUG:","实体 #3 中的无效值: [1,2]"]
["DEBUG:","实体 #3 中的无效值: {"k":"v"}"]
["DEBUG:","实体 #3 中的无效值: file.php\u0000.txt"]
["DEBUG:","只允许一个实体"]
英文:
The following is a variant of @oguzismail's answer. The main difference is that the "driver" program identifies all validation errors in the input, so long as it is a valid stream of JSON entities.
Assuming 1.json and 2.json are the files corresponding to the two sample JSON objects in the Q:
(cat 1.json 2.json 1.json1) | jq -n '
# If cond is falsey, then (msg|debug); in all cases, emit .
def verify(cond; msg):
. as $in
| if cond then . else (msg | debug) | $in end;
def valid_key: test("^[A-Za-z_][0-9A-Za-z_]*$");
def valid_value:
type != "array" and
type != "object" and
(type != "string" or (test("\u0000") | not) );
# $count is for the messages and for the return value
def validate($count):
if type == "object"
then to_entries[]
| verify(.key | valid_key; "invalid key in entity #\($count): \(.key)")
| verify(.value | valid_value; "invalid value in entity #\($count): \(.value)")
else "entity #\($count) is not a JSON object" | debug
end
| $count;
reduce inputs as $in (0;
.+1
| . as $count
| $in | validate($count) )
| verify(.==1; "only one entity is allowed")
| empty
'
produces:
["DEBUG:","invalid key in entity #1: "]
["DEBUG:","invalid key in entity #1: illegal key shall fail"]
["DEBUG:","invalid value in entity #1: [1,2]"]
["DEBUG:","invalid value in entity #1: {\"k\":\"v\"}"]
["DEBUG:","invalid value in entity #1: file.php\u0000.txt"]
["DEBUG:","invalid key in entity #3: "]
["DEBUG:","invalid key in entity #3: illegal key shall fail"]
["DEBUG:","invalid value in entity #3: [1,2]"]
["DEBUG:","invalid value in entity #3: {\"k\":\"v\"}"]
["DEBUG:","invalid value in entity #3: file.php\u0000.txt"]
["DEBUG:","only one entity is allowed"]
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论