Match rules JSON against data JSON to find a value in Python 3

Question


Python 3

I need to find the article ID by applying rules to data.

I am given a set of rules in JSON format, which includes article IDs and their corresponding property rules.

E.g. rules.json:

{
  "rules": [
    {
      "articleId": "art1",
      "properties_rule": [
        {
          "condition": "EQ",
          "logicalOperator": "AND",
          "propertyId": 487,
          "value": "aaaa"
        },
        {
          "condition": "EQ",
          "logicalOperator": "",
          "propertyId": 487,
          "value": "zzzz"
        }
      ]
    },
    {
      "articleId": "art2",
      "properties_rule": [
        {
          "condition": "GTE",
          "logicalOperator": "AND",
          "propertyId": 487,
          "value": "bbbb"
        },
        {
          "condition": "LTE",
          "logicalOperator": "",
          "propertyId": 487,
          "value": "eeee"
        }
      ]
    },
    {
      "articleId": "art3",
      "properties_rule": [
        {
          "condition": "GTE",
          "logicalOperator": "",
          "propertyId": 487,
          "value": "ffff"
        }
      ]
    }
  ]
}

I am also given a set of data in JSON format, which includes values for certain properties.
E.g. data.json:

{
  "data": {
    "1": {
      "properties_values": [
        {
          "value": {
            "property_id": 487,
            "property_value": "aaaa",
            "response_id": 1
          }
        }
      ]
    },
    "2": {
      "properties_values": [
        {
          "value": {
            "property_id": 487,
            "property_value": "bbbb",
            "response_id": 2
          }
        }
      ]
    },
    "3": {
      "properties_values": [
        {
          "value": {
            "property_id": 487,
            "property_value": "eeee",
            "response_id": 3
          }
        }
      ]
    }
  }
}

The task is to apply the rules to the data and determine the article ID whose rules match.

How can we use the provided rules and data JSON to determine the corresponding "articleId" for each entry in the "data" JSON, based on the conditions specified in the "properties_rule" arrays of the rules JSON? The "property_id" field in the "data" JSON corresponds to the "propertyId" field in the "rules" JSON, and the "property_value" field in the "data" JSON corresponds to the "value" field in the "rules" JSON.


I tried the following to solve this, but I didn't like having so many for loops.

To determine the corresponding "articleId" for each entry in the "data" JSON based on the conditions specified in the "properties_rule" arrays of the rules JSON, we can follow these steps in Python:

  1. Load the rules and data JSON into Python dictionaries using the json module.
  2. Loop through each entry in the "data" JSON.
  3. For each entry, loop through the "rules" JSON to find a matching article ID based on the conditions specified in the "properties_rule" arrays.
  4. For each rule, loop through the "properties_values" array in the data entry to find a matching property ID.
  5. If a matching property ID is found, check if the property value meets the condition specified in the rule. If it does not meet the condition, move on to the next rule.
  6. If all conditions are met, return the article ID associated with the rule.
  7. If no matching article ID is found, return a default value or raise an error.

Here is sample code that implements this logic:

import json

# Load the rules and data JSON files into memory
with open('rules.json', 'r') as f:
    rules_data = json.load(f)
rules = rules_data['rules']

with open('data.json', 'r') as f:
    data = json.load(f)
data = data['data']

# Stays False if the rules list is empty
matched = False

# Loop through each rule and check if it matches the data
for rule in rules:
    properties_rule = rule['properties_rule']
    articleId = rule['articleId']

    # Check if all the conditions in the properties_rule array are satisfied
    matched = True
    for prop_rule in properties_rule:
        propertyId = prop_rule['propertyId']
        value = prop_rule['value']
        condition = prop_rule['condition']

        # Check if any of the survey responses match the condition
        response_matched = False
        for key, value_dict in data.items():
            # The key in data.json is 'properties_values' (plural)
            properties_value = value_dict['properties_values']
            for prop_value in properties_value:
                # Each item wraps the response under the 'value' key
                survey_response = prop_value['value']
                if survey_response['property_id'] == propertyId:
                    property_value = survey_response['property_value']
                    if condition == 'EQ' and property_value == value:
                        response_matched = True
                    elif condition == 'GTE' and property_value >= value:
                        response_matched = True
                    elif condition == 'LTE' and property_value <= value:
                        response_matched = True
        # If none of the survey responses match the condition, set matched to False
        if not response_matched:
            matched = False
            break

    # If all the conditions are satisfied, return the articleId of that rule
    if matched:
        print("Article ID: ", articleId)
        break

# If none of the rules match the data, return a default articleId
if not matched:
    print("Default Article ID")

But I am looking for a more efficient way of doing this. I don't want so many for loops.

Answer 1

Score: 1


> Looking for efficient way of doing it. Don't want so many for loops

You could convert each properties_rule list into a function or lambda expression (accompanied by a list of propertyIds to extract the arguments and an articleId to return).

def rule_to_lambda(rule_dict):
    opRef = {'EQ': '==', 'GTE': '>=', 'LTE': '<=', 'GT': '>', 'LT': '<'}

    lStr1, lStr2, prop_ids, andOr = 'lambda ', ':', [], ''
    for ri, pr in enumerate(rule_dict['properties_rule'], 1):
        prop_ids.append(pr['propertyId'])
        prVal, lStr1 = pr['value'], f'{lStr1}{"" if ri == 1 else ", "}p{ri}'

        if ri != 1:
            lStr2 += f' {andOr}'
        andOr = 'or' if pr.get('logicalOperator') == 'OR' else 'and'

        cond = f'{repr(prVal)} {opRef.get(pr["condition"])} p{ri}'
        lStr2 += f' (p{ri} is not None and {cond})'

    # no rules to check --> return True by default
    if not prop_ids:
        lStr1, lStr2 = 'lambda *x', ': True'

    return rule_dict['articleId'], prop_ids, eval(lStr1 + lStr2)

Note that using eval is often considered unsafe, but I think it's alright in this case since you know exactly what can go into lStr1+lStr2 (except for pr['value'], but that's coming from parsed JSON, so it won't be an expression).
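If eval is still a concern, roughly the same triple can be built without generating source text. The following is only a sketch under some assumptions: the names OPS and rule_to_predicate are hypothetical (they are not part of the answer), the operand order copies the generated lambdas (rule value on the left), and the conditions are combined strictly left to right, which differs from Python's and-before-or precedence when a rule mixes AND with OR.

```python
import operator

# Comparison functions keyed by the condition names used in rules.json.
OPS = {
    'EQ': operator.eq,
    'GTE': operator.ge,
    'LTE': operator.le,
    'GT': operator.gt,
    'LT': operator.lt,
}


def rule_to_predicate(rule_dict):
    """Return (articleId, prop_ids, predicate) without using eval."""
    checks = []    # (comparison function, rule value, joiner to the previous check)
    prop_ids = []
    joiner = 'and'
    for pr in rule_dict['properties_rule']:
        prop_ids.append(pr['propertyId'])
        # An unknown condition name raises KeyError here.
        checks.append((OPS[pr['condition']], pr['value'], joiner))
        joiner = 'or' if pr.get('logicalOperator') == 'OR' else 'and'

    def predicate(*pvals):
        result = True  # a rule with no conditions matches by default
        for i, ((compare, rule_value, join), pval) in enumerate(zip(checks, pvals)):
            # Same operand order as the generated lambdas: rule value on the left.
            ok = pval is not None and compare(rule_value, pval)
            if i == 0:
                result = ok
            elif join == 'or':
                result = result or ok
            else:
                result = result and ok
        return result

    return rule_dict['articleId'], prop_ids, predicate


sample_rule = {
    'articleId': 'art1',
    'properties_rule': [
        {'condition': 'EQ', 'logicalOperator': 'AND', 'propertyId': 487, 'value': 'aaaa'},
        {'condition': 'EQ', 'logicalOperator': '', 'propertyId': 487, 'value': 'zzzz'},
    ],
}
article_id, prop_ids, predicate = rule_to_predicate(sample_rule)
print(article_id, prop_ids, predicate('aaaa', 'aaaa'))  # art1 [487, 487] False
```

The returned triple has the same shape as the one from rule_to_lambda, so it should be usable in the same places.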

If used on the sample of rules.json in your question,

# import json
# with open('rules.json', 'r') as f: rules = json.load(f)

rules_list = [rule_to_lambda(r) for r in rules['rules']]

then rules_list would [effectively] be the same as

[
    ('art1', [487, 487], lambda p1, p2: (p1 is not None and 'aaaa' == p1) and (p2 is not None and 'zzzz' == p2)),
    ('art2', [487, 487], lambda p1, p2: (p1 is not None and 'bbbb' >= p1) and (p2 is not None and 'eeee' <= p2)),
    ('art3', [487], lambda p1: (p1 is not None and 'ffff' >= p1))
]

[You'll still have to loop through every rule for every properties_values in data, but this way you only have to loop through every properties_rule dictionary once, and then you can use the lambda expressions created.]


Since you need the property_value (that corresponds to the property_ids) as arguments for the lambda expressions, you might also want to use a function to find the property_value of a data value based on a property_id:

def get_property_value(prop_id, prop_vals: list):
    for p in prop_vals:
        if prop_id == p['value']['property_id']:
            return p['value']['property_value']

# {k:get_property_value(487,v['properties_values']) for k,v in data['data'].items()}
# returns {'1': 'aaaa', '2': 'bbbb', '3': 'eeee'}
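get_property_value scans the properties_values list once per requested property_id. If an entry has many properties, one option is to index the entry a single time and then look values up by id. This is a minimal sketch; index_properties is a hypothetical helper name, and it assumes that keeping the first value seen for a repeated property_id is the desired behaviour (which is what get_property_value does).

```python
def index_properties(prop_vals):
    """Map property_id -> property_value for one data entry."""
    index = {}
    for p in prop_vals:
        # setdefault keeps the first value seen for an id, like get_property_value.
        index.setdefault(p['value']['property_id'], p['value']['property_value'])
    return index


# Entry "1" from the sample data.json in the question.
entry = {
    'properties_values': [
        {'value': {'property_id': 487, 'property_value': 'aaaa', 'response_id': 1}}
    ]
}

index = index_properties(entry['properties_values'])
# dict.get returns None for a missing id, which the rule functions already handle.
pvals = [index.get(pid) for pid in [487, 487]]
print(pvals)  # ['aaaa', 'aaaa']
```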

Now you can loop through data like this:

# import json
# with open('rules.json', 'r') as f: rules = json.load(f)
# with open('data.json', 'r') as f: data = json.load(f)

# rules_list = [rule_to_lambda(r) for r in rules['rules']]
for k, v in data['data'].items():
    for a, prop_ids, ruleFunc in rules_list:
        pvals = v['properties_values']
        pvals = [get_property_value(pi, pvals) for pi in prop_ids]
        if ruleFunc(*pvals):
            print(k, ':', pvals, '---> Article ID:', a)
            break  ## print only for the first rule that matches
    else:
        print(k, '---> Default Article ID')

With your sample data.json and rules.json, this prints Article ID: art3 on all 3 lines, but that's unsurprising since the art1 and art2 rules are impossible to satisfy unless p1 and p2 are different (and they aren't, since prop_ids for both rules is [487, 487]).

1 : ['aaaa'] ---> Article ID: art3
2 : ['bbbb'] ---> Article ID: art3
3 : ['eeee'] ---> Article ID: art3
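One detail worth checking against your intent: the generated lambdas put the rule value on the left ('ffff' >= p1), whereas the loop in the question tests property_value >= value. The sketch below compares both readings on the sample data. It is an illustration only: first_match, RULES and DATA are hypothetical names, the rules are reduced to (condition, value) pairs for property 487, and all conditions are joined with AND, which is true of the sample rules but not of rules in general.

```python
import operator

OPS = {'EQ': operator.eq, 'GTE': operator.ge, 'LTE': operator.le}

# Sample rules and data from the question, reduced to the fields used here.
RULES = [
    ('art1', [('EQ', 'aaaa'), ('EQ', 'zzzz')]),
    ('art2', [('GTE', 'bbbb'), ('LTE', 'eeee')]),
    ('art3', [('GTE', 'ffff')]),
]
DATA = {'1': 'aaaa', '2': 'bbbb', '3': 'eeee'}  # entry key -> value of property 487


def first_match(pval, value_on_left):
    """Return the first articleId whose conditions all hold, else None."""
    for article_id, conditions in RULES:
        if all(
            OPS[cond](value, pval) if value_on_left else OPS[cond](pval, value)
            for cond, value in conditions
        ):
            return article_id
    return None


# Rule value on the left, as in the generated lambdas.
answer_order = {k: first_match(v, value_on_left=True) for k, v in DATA.items()}
# Property value on the left, as in the loop from the question.
question_order = {k: first_match(v, value_on_left=False) for k, v in DATA.items()}

print(answer_order)    # {'1': 'art3', '2': 'art3', '3': 'art3'}
print(question_order)  # {'1': None, '2': 'art2', '3': 'art2'}
```

If the second reading is the intended one, swapping the two operands in the cond string inside rule_to_lambda should give that behaviour.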

huangapple
  • Posted on March 12, 2023, 11:39:39
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75710936.html