Match rule JSON with data JSON to find value in Python 3
Question
Python 3
I need to find the article ID by applying rules to data.
Given is a set of rules in JSON format, which includes article IDs and the corresponding property rules.
E.g. rules.json:
{
  "rules": [
    {
      "articleId": "art1",
      "properties_rule": [
        {
          "condition": "EQ",
          "logicalOperator": "AND",
          "propertyId": 487,
          "value": "aaaa"
        },
        {
          "condition": "EQ",
          "logicalOperator": "",
          "propertyId": 487,
          "value": "zzzz"
        }
      ]
    },
    {
      "articleId": "art2",
      "properties_rule": [
        {
          "condition": "GTE",
          "logicalOperator": "AND",
          "propertyId": 487,
          "value": "bbbb"
        },
        {
          "condition": "LTE",
          "logicalOperator": "",
          "propertyId": 487,
          "value": "eeee"
        }
      ]
    },
    {
      "articleId": "art3",
      "properties_rule": [
        {
          "condition": "GTE",
          "logicalOperator": "",
          "propertyId": 487,
          "value": "ffff"
        }
      ]
    }
  ]
}
As well as a set of data in JSON format, which includes values for certain properties.
E.g. data.json:
{
  "data": {
    "1": {
      "properties_values": [
        {
          "value": {
            "property_id": 487,
            "property_value": "aaaa",
            "response_id": 1
          }
        }
      ]
    },
    "2": {
      "properties_values": [
        {
          "value": {
            "property_id": 487,
            "property_value": "bbbb",
            "response_id": 2
          }
        }
      ]
    },
    "3": {
      "properties_values": [
        {
          "value": {
            "property_id": 487,
            "property_value": "eeee",
            "response_id": 3
          }
        }
      ]
    }
  }
}
The task is to apply the rules to the data and determine the article ID that matches the rules.
How can we use the provided rules and data JSON to determine the corresponding "articleId" for each entry in the "data" JSON, based on the conditions specified in the "properties_rule" arrays of the rules JSON? The "property_id" field in the "data" JSON corresponds to the "propertyId" field in the "rules" JSON, and the "property_value" field in the "data" JSON corresponds to the "value" field in the "rules" JSON.
I tried the approach below to solve this, but I don't like having so many for loops.
To determine the corresponding "articleId" for each entry in the "data" JSON based on the conditions specified in the "properties_rule" arrays of the rules JSON, we can follow these steps in Python:
- Load the rules and data JSON into Python dictionaries using the json module.
- Loop through each entry in the "data" JSON.
- For each entry, loop through the "rules" JSON to find a matching article ID based on the conditions specified in the "properties_rule" arrays.
- For each rule, loop through the "properties_values" array in the data entry to find a matching property ID.
- If a matching property ID is found, check if the property value meets the condition specified in the rule. If it does not meet the condition, move on to the next rule.
- If all conditions are met, return the article ID associated with the rule.
- If no matching article ID is found, return a default value or raise an error.
Here is sample code that implements this logic:
import json

# Load the rules and data JSON files into memory
with open('rules.json', 'r') as f:
    rules_data = json.load(f)
rules = rules_data['rules']
with open('data.json', 'r') as f:
    data = json.load(f)
data = data['data']

# Loop through each rule and check if it matches the data
for rule in rules:
    properties_rule = rule['properties_rule']
    articleId = rule['articleId']
    # Check if all the conditions in the properties_rule array are satisfied
    matched = True
    for prop_rule in properties_rule:
        propertyId = prop_rule['propertyId']
        value = prop_rule['value']
        condition = prop_rule['condition']
        # Check if any of the survey responses match the condition
        response_matched = False
        for key, value_dict in data.items():
            # Key names follow the data.json sample: 'properties_values' -> 'value'
            properties_value = value_dict['properties_values']
            for prop_value in properties_value:
                survey_response = prop_value['value']
                if survey_response['property_id'] == propertyId:
                    property_value = survey_response['property_value']
                    if condition == 'EQ' and property_value == value:
                        response_matched = True
                    elif condition == 'GTE' and property_value >= value:
                        response_matched = True
                    elif condition == 'LTE' and property_value <= value:
                        response_matched = True
        # If none of the survey responses match the condition, set matched to False
        if not response_matched:
            matched = False
            break
    # If all the conditions are satisfied, print the articleId of that rule
    if matched:
        print("Article ID: ", articleId)
        break

# If none of the rules match the data, print a default articleId
if not matched:
    print("Default Article ID")
But I am looking for a more efficient way of doing it. I don't want so many for loops.
Answer 1
Score: 1
> Looking for efficient way of doing it. Don't want so many for loops
You could convert each properties_rule list into a function or lambda expression (accompanied by a list of propertyIds to extract the arguments and an articleId to return).
def rule_to_lambda(rule_dict):
    opRef = {'EQ': '==', 'GTE': '>=', 'LTE': '<=', 'GT': '>', 'LT': '<'}
    lStr1, lStr2, prop_ids, andOr = 'lambda ', ':', [], ''
    for ri, pr in enumerate(rule_dict['properties_rule'], 1):
        prop_ids.append(pr['propertyId'])
        prVal, lStr1 = pr['value'], f'{lStr1}{"" if ri == 1 else ", "}p{ri}'
        if ri != 1:
            lStr2 += f' {andOr}'
        andOr = 'or' if pr.get('logicalOperator') == 'OR' else 'and'
        cond = f'{repr(prVal)} {opRef.get(pr["condition"])} p{ri}'
        lStr2 += f' (p{ri} is not None and {cond})'
    # no rules to check --> return True by default
    if not prop_ids:
        lStr1, lStr2 = 'lambda *x', ': True'
    return rule_dict['articleId'], prop_ids, eval(lStr1 + lStr2)
Note that using eval is often considered unsafe, but I think it's alright in this case since you know exactly what can go into lStr1+lStr2 (except for pr['value'], but that's coming from parsed JSON, so it won't be an expression).
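If you would rather avoid eval altogether, the same idea can be sketched with closures and the standard operator module. This is only an illustration: the names OPS and rule_to_predicate are made up here, the operand order copies the generated lambda (rule value on the left), and the conditions are combined left to right, which can differ from Python's and/or precedence when a single rule mixes AND and OR.

```python
import operator

# Hypothetical lookup table: condition codes from rules.json -> comparison functions.
OPS = {
    "EQ": operator.eq,
    "GTE": operator.ge,
    "LTE": operator.le,
    "GT": operator.gt,
    "LT": operator.lt,
}


def rule_to_predicate(rule_dict):
    """Return (articleId, prop_ids, predicate) for one rule, without eval."""
    prop_rules = rule_dict["properties_rule"]
    prop_ids = [pr["propertyId"] for pr in prop_rules]

    def predicate(*pvals):
        result = True  # a rule with no conditions matches by default
        joiner = "and"
        for i, (pr, pval) in enumerate(zip(prop_rules, pvals)):
            # Same operand order as the generated lambda: rule value on the left.
            ok = pval is not None and OPS[pr["condition"]](pr["value"], pval)
            if i == 0:
                result = ok
            elif joiner == "or":
                result = result or ok
            else:
                result = result and ok
            # The operator stored on this condition joins it to the next one.
            joiner = "or" if pr.get("logicalOperator") == "OR" else "and"
        return result

    return rule_dict["articleId"], prop_ids, predicate
```

The returned tuples have the same shape as the ones from rule_to_lambda, so they can be used in its place when building rules_list.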
If used on the sample rules.json in your question,
# import json
# with open('rules.json', 'r') as f: rules = json.load(f)
rules_list = [rule_to_lambda(r) for r in rules['rules']]
then rules_list would [effectively] be the same as
[
    ('art1', [487, 487], lambda p1, p2: (p1 is not None and 'aaaa' == p1) and (p2 is not None and 'zzzz' == p2)),
    ('art2', [487, 487], lambda p1, p2: (p1 is not None and 'bbbb' >= p1) and (p2 is not None and 'eeee' <= p2)),
    ('art3', [487], lambda p1: (p1 is not None and 'ffff' >= p1))
]
[You'll still have to loop through every rule for every properties_values in data, but this way you only have to loop through every properties_rule dictionary once, and then you can use the lambda expressions created.]
Since you need the property_values (that correspond to the property_ids) as arguments for the lambda expressions, you might also want to use a function to find the property_value of a data value based on a property_id:
def get_property_value(prop_id, prop_vals: list):
    for p in prop_vals:
        if prop_id == p['value']['property_id']:
            return p['value']['property_value']

# {k: get_property_value(487, v['properties_values']) for k, v in data['data'].items()}
# returns {'1': 'aaaa', '2': 'bbbb', '3': 'eeee'}
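When a rule refers to several property IDs, each call to get_property_value scans the list again. One possible variation, sketched here with a hypothetical helper named index_properties, is to build a dictionary once per data entry and look values up from it. Like get_property_value, it keeps the first value found for each property_id.

```python
def index_properties(prop_vals):
    """Map property_id -> property_value, keeping the first value seen per id."""
    index = {}
    for p in prop_vals:
        v = p["value"]
        # setdefault leaves an existing entry untouched, so the first match wins.
        index.setdefault(v["property_id"], v["property_value"])
    return index


# Usage with one entry shaped like the data.json sample:
sample_vals = [
    {"value": {"property_id": 487, "property_value": "aaaa", "response_id": 1}}
]
index = index_properties(sample_vals)
pvals = [index.get(pi) for pi in [487, 487]]  # .get yields None for a missing id
print(pvals)  # ['aaaa', 'aaaa']
```

Using index.get keeps the behavior of returning None for a property ID that is absent, which the "is not None" checks in the rule functions rely on.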
Now you can loop through data like
# import json
# with open('rules.json', 'r') as f: rules = json.load(f)
# with open('data.json', 'r') as f: data = json.load(f)
# rules_list = [rule_to_lambda(r) for r in rules['rules']]
for k, v in data['data'].items():
    for a, prop_ids, ruleFunc in rules_list:
        pvals = v['properties_values']
        pvals = [get_property_value(pi, pvals) for pi in prop_ids]
        if ruleFunc(*pvals):
            print(k, ':', pvals, '---> Article ID:', a)
            break  ## print only for the first rule that matches
    else:
        print(k, '---> Default Article ID')
With your sample data.json and rules.json, this prints Article ID: art3 on all 3 lines, but that's unsurprising since the art1 and art2 rules are impossible unless p1 and p2 are different [and they aren't, since prop_ids for both rules is [487, 487]].
1 : ['aaaa'] ---> Article ID: art3
2 : ['bbbb'] ---> Article ID: art3
3 : ['eeee'] ---> Article ID: art3
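One thing worth checking against your intent: the generated lambdas put the rule value on the left of the operator ('ffff' >= p1), whereas the code in the question tests property_value >= value. If the second reading is the one you want, the operands need to be swapped, and the sample then gives a different result. The sketch below assumes that reading and also assumes every condition in a rule is joined with AND (true for the sample rules); OPS, first_value, rule_matches and match_articles are names invented for the illustration.

```python
import operator

OPS = {"EQ": operator.eq, "GTE": operator.ge, "LTE": operator.le,
       "GT": operator.gt, "LT": operator.lt}


def first_value(prop_id, prop_vals):
    """Return the first property_value recorded for prop_id, or None."""
    for p in prop_vals:
        if p["value"]["property_id"] == prop_id:
            return p["value"]["property_value"]
    return None


def rule_matches(rule, prop_vals):
    # Assumption: conditions are AND-ed, and compared as property_value OP rule value.
    for pr in rule["properties_rule"]:
        pval = first_value(pr["propertyId"], prop_vals)
        if pval is None or not OPS[pr["condition"]](pval, pr["value"]):
            return False
    return True


def match_articles(rules, data, default=None):
    """Map each data key to the articleId of the first matching rule."""
    return {
        key: next((r["articleId"] for r in rules["rules"]
                   if rule_matches(r, entry["properties_values"])), default)
        for key, entry in data["data"].items()
    }


def cond(condition, logical_operator, value):
    # Helper to rebuild the sample rule dictionaries compactly.
    return {"condition": condition, "logicalOperator": logical_operator,
            "propertyId": 487, "value": value}


rules = {"rules": [
    {"articleId": "art1",
     "properties_rule": [cond("EQ", "AND", "aaaa"), cond("EQ", "", "zzzz")]},
    {"articleId": "art2",
     "properties_rule": [cond("GTE", "AND", "bbbb"), cond("LTE", "", "eeee")]},
    {"articleId": "art3", "properties_rule": [cond("GTE", "", "ffff")]},
]}
data = {"data": {
    key: {"properties_values": [{"value": {
        "property_id": 487, "property_value": val, "response_id": int(key)}}]}
    for key, val in [("1", "aaaa"), ("2", "bbbb"), ("3", "eeee")]
}}

result = match_articles(rules, data, default="default")
print(result)  # {'1': 'default', '2': 'art2', '3': 'art2'}
```

Under this reading, entries 2 and 3 fall inside the 'bbbb' to 'eeee' range of art2, and entry 1 matches no rule.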