How to validate a JSON with jq?

Question

I'm writing a bash library where I need to read a JSON file that contains user input; the JSON won't be forged, but it might contain values that are harmful to bash scripts, depending on how the form was validated upstream.

In the function that loads the JSON into a bash array, I don't yet know which keys must exist; still, there are some rules that I would like to enforce here:

1. The JSON must contain a single object
2. All keys shall be named like shell variables
3. All values must be scalars
4. No `NUL` byte in values

```json
{
  "": "empty key shall fail",
  "illegal key shall fail": "keys shall follow the naming convention of shell variables",
  "non_scalar_shall_fail_1": [1,2],
  "non_scalar_shall_fail_2": {"k":"v"},
  "nul_byte_shall_fail": "file.php\u0000.txt"
}
{
  "task_uuid": "5d7ea654-b649-452b-a8fd-002b56be9a59",
  "task_name": "hello world",
  "user_email": null,
  "min": 7,
  "max": 100,
  "ave": 2.5,
  "file_in": "data.txt",
  "file_out": "results.txt"
}
```

I can enforce #2 and #4 but I don't know how to do it for #1 and #3:

```sh
jq -r '
    to_entries[] |
    if (.key | test("\\A[_a-zA-Z]\\w*\\z") | not)
    then
        ("error: illegal key " + (.key | tojson) + "\n" |
        halt_error(1))
    else
        if (.value | tostring | test("\u0000"))
        then
            ("error: NUL byte in value " + (.value | tojson) + "\n" |
            halt_error(1))
        else
            # dummy output
            .key + "=" + (.value | @sh)
        end
    end
'
```

The result is quite unexpected: the halt_error half-works and I get a coredump...

```
error: illegal key ""
task_uuid='5d7ea654-b649-452b-a8fd-002b56be9a59'
task_name='hello world'
user_email=null
min=7
max=100
ave=2.5
file_in='data.txt'
file_out='results.txt'
*** Error in `jq': corrupted double-linked list ***
...
Aborted                 (core dumped)
```

Is it even possible to do all those validations with jq, or do I need to switch to another tool?

Answer 1

Score: 4

Yes, jq can do all that. Here is how I'd write it:

```sh
jq -nr '
def valid_key: test("^[A-Za-z_][0-9A-Za-z_]*$");
def valid_value:
    type != "array" and type != "object" and (
        type != "string" or (test("\u0000") | not)
    );
def valid_input:
    type == "object" and (
        to_entries | all(
            (.key | valid_key) and (.value | valid_value)
        )
    );
input |
    if valid_input and isempty(inputs) then
        to_entries[] | "\(.key)=\(.value | @sh)"
    else
        "bad input\n" | halt_error(1)
    end'
```
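
Since the question is about loading the JSON into a bash array, here is a minimal sketch of how the validated output might be consumed. It reuses the filter above, changing only the output line to emit `[key]=value` pairs. The names `json_to_assignments`, `load_config` and `config` are hypothetical, and the sketch assumes bash 4.2 or later and jq 1.6 or later.

```sh
#!/usr/bin/env bash
# Sketch only: json_to_assignments, load_config and config are
# hypothetical names. Assumes bash 4.2+ and jq 1.6+ (halt_error).

json_to_assignments() {
    jq -nr '
        def valid_key: test("^[A-Za-z_][0-9A-Za-z_]*$");
        def valid_value:
            type != "array" and type != "object" and (
                type != "string" or (test("\u0000") | not)
            );
        def valid_input:
            type == "object" and (
                to_entries | all(
                    (.key | valid_key) and (.value | valid_value)
                )
            );
        input |
            if valid_input and isempty(inputs) then
                # Emit [key]=value pairs for a compound array assignment.
                to_entries[] | "[\(.key)]=\(.value | @sh)"
            else
                "bad input\n" | halt_error(1)
            end' "$1"
}

load_config() {
    local assignments
    assignments=$(json_to_assignments "$1") || return 1
    # Keys matched the shell-name pattern and values were quoted by @sh,
    # so the text handed to eval consists of literal words only.
    declare -gA config=()
    eval "config=( $assignments )"
}
```

After `load_config input.json` succeeds, values can be read as `${config[task_name]}`. JSON `null`, `true` and `false` arrive as those literal words, because `@sh` leaves them unquoted.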

Answer 2

Score: 2

Here's a demo of how to do the testing. For production, however, it would obviously need more refinement.

```
if type != "object" then "\(@json) is not an object (#1)"
else to_entries[] |
  if .key | test("\\A[_a-zA-Z]\\w*\\z") | not then "\(.key | @json) is illegal (#2)"
  else .value |
    if isempty(scalars) then "\(.) is not a scalar (#3)"
    else tostring |
      if test("\u0000") then "\(.[index("\u0000")+1:] | @json) is preceded by NUL (#4)"
      else empty
      end
    end
  end
end // (to_entries[] | .key + "=" + (.value | @sh))
```

Output:

```
"" is illegal (#2)
"illegal key shall fail" is illegal (#2)
[1,2] is not a scalar (#3)
{"k":"v"} is not a scalar (#3)
".txt" is preceded by NUL (#4)
task_uuid='5d7ea654-b649-452b-a8fd-002b56be9a59'
task_name='hello world'
user_email=null
min=7
max=100
ave=2.5
file_in='data.txt'
file_out='results.txt'
```

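
The filter above runs once per input, so a stream of two objects is not rejected by the `type` check alone. One possible way to enforce rule #1 separately is to slurp the whole file and test its shape; this is a sketch, and `is_single_object` is a hypothetical helper name.

```sh
# Sketch only: is_single_object is a hypothetical helper name.
is_single_object() {
    # -s (--slurp) gathers every top-level value into one array;
    # -e (--exit-status) makes jq exit non-zero when the result is
    # false or null. Invalid JSON also yields a non-zero status.
    jq -e -s 'length == 1 and (.[0] | type == "object")' "$1" >/dev/null 2>&1
}
```

A caller could then write `is_single_object "$file" || return 1` before running the per-key checks.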

Answer 3

Score: 2

The following is a variant of @oguzismail's answer. The main difference is that the "driver" program identifies all validation errors in the input, so long as it is a valid stream of JSON entities.

Assuming 1.json and 2.json are the files corresponding to the two sample JSON objects in the question:

```sh
(cat 1.json 2.json 1.json) | jq -n '

# If cond is falsey, then (msg|debug); in all cases, emit .
def verify(cond; msg):
  . as $in
  | if cond then . else (msg | debug) | $in end;

def valid_key: test("^[A-Za-z_][0-9A-Za-z_]*$");

def valid_value:
  type != "array" and
  type != "object" and
  (type != "string" or (test("\u0000") | not) );

# $count is for the messages and for the return value
def validate($count):
   if type == "object"
   then to_entries[]
   | verify(.key | valid_key; "invalid key in entity #\($count): \(.key)")
   | verify(.value | valid_value; "invalid value in entity #\($count): \(.value)")
   else "entity #\($count) is not a JSON object" | debug
   end
   | $count;

reduce inputs as $in (0;
  .+1
  | . as $count
  | $in | validate($count) )
| verify(.==1; "only one entity is allowed")
| empty
'
```

produces:

```
["DEBUG:","invalid key in entity #1: "]
["DEBUG:","invalid key in entity #1: illegal key shall fail"]
["DEBUG:","invalid value in entity #1: [1,2]"]
["DEBUG:","invalid value in entity #1: {\"k\":\"v\"}"]
["DEBUG:","invalid value in entity #1: file.php\u0000.txt"]
["DEBUG:","invalid key in entity #3: "]
["DEBUG:","invalid key in entity #3: illegal key shall fail"]
["DEBUG:","invalid value in entity #3: [1,2]"]
["DEBUG:","invalid value in entity #3: {\"k\":\"v\"}"]
["DEBUG:","invalid value in entity #3: file.php\u0000.txt"]
["DEBUG:","only one entity is allowed"]
```
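
Because `debug` writes to stderr and the program ends with `empty`, jq exits with status 0 even when problems were reported. A calling script may want a failure status instead. The wrapper below is a sketch of one way to get that: it treats any stderr output as a failed validation. `run_validator` is a hypothetical name, and the program text is passed in as an argument rather than repeated here.

```sh
# Sketch only: run_validator is a hypothetical helper name.
# $1 is a jq program that reports problems via debug, $2 is the JSON file.
run_validator() {
    local program=$1 file=$2 report
    # Capture stderr (where debug and jq parse errors go), discard stdout.
    report=$(jq -n "$program" "$file" 2>&1 >/dev/null)
    if [ -n "$report" ]; then
        printf '%s\n' "$report" >&2
        return 1
    fi
}
```

With the validation program stored in a variable, `run_validator "$program" input.json || exit 1` stops the script on the first invalid file while still printing every message for that file.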

Posted by huangapple on June 15, 2023, 19:19:44
When reposting, please keep the link to this article: https://go.coder-hub.com/76481941.html