How to write a REGEX search that will search through blocks of multiline patterns and only match on a block that contains a specified string?
Question
I'm writing a regex pattern to search through multiple YARA rules within the same file. The pattern I have so far matches each YARA rule individually, from beginning to end, across multiple lines. Now I want to match entire YARA rules, one match per rule, but only those that contain the string "BANANAS" somewhere within the rule.
The problem I'm having is that my regex matches from the beginning of a YARA rule all the way to the end of the YARA rule that contains the string "BANANAS", but it also grabs every YARA rule between those start and end points that does NOT contain "BANANAS". What am I missing in order to grab only the rules that contain my specified string?
These are the current regex patterns I'm using:
^rule\s[\s\S]*?^\}$
^rule\s[\s\S]*?(?=BANANAS)[\s\S]*?^\}$
The first pattern matches each individual YARA rule from beginning to end.
The second pattern contains the lookahead and is attempting to match each YARA rule only if it contains the specified string.
To clarify, I want to avoid using any built-in app functions for multiline matching, which is why I'm using [\s\S]* instead of .*
I'm using the above regex patterns to match against the example text below. The string "BANANAS" that I'm specifying is located in the <description = "foo"> field of the YARA rules below.
rule RULENAME
{
meta:
author = "abcdef"
last_update = "abcdef"
description = "TURKEY"
hash = "abcdef" //dumped
strings:
$mz = "MZ"
$low0 = "malware" ascii wide
$low1 = "hello world" ascii wide
$low2 = "sus" wide
$low3 = "keyLogger" wide
$low4 = "bot" wide
$low5 = "usb" wide
condition:
$mz at 0 and ((3 of ($low*))
}
rule RULENAME
{
meta:
author = "abcdef"
last_update = "abcdef"
description = "BANANAS"
hash = "abcdef" //dumped
strings:
$mz = "MZ"
$low0 = "malware" ascii wide
$low1 = "hello world" ascii wide
$low2 = "sus" wide
$low3 = "keyLogger" wide
$low4 = "bot" wide
$low5 = "usb" wide
condition:
$mz at 0 and ((3 of ($low*))
}
rule RULENAME
{
meta:
author = "abcdef"
last_update = "abcdef"
description = "CHICKEN"
hash = "abcdef" //dumped
strings:
$mz = "MZ"
$low0 = "malware" ascii wide
$low1 = "hello world" ascii wide
$low2 = "sus" wide
$low3 = "keyLogger" wide
$low4 = "bot" wide
$low5 = "usb" wide
condition:
$mz at 0 and ((3 of ($low*))
}
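For reference, the over-matching described above can be reproduced with a small sketch. Python and its re module are an assumption here (the question does not name a tool), and RULES is a shortened, hypothetical stand-in for the three sample rules. re.MULTILINE is used only so that ^ and $ anchor at line boundaries.

```python
import re

# Shortened stand-ins for the three sample rules; only the description differs.
RULES = "\n".join(
    'rule RULENAME\n{\nmeta:\ndescription = "%s"\ncondition:\n$mz at 0\n}' % name
    for name in ("TURKEY", "BANANAS", "CHICKEN")
)

# First pattern from the question: one match per rule.
PER_RULE = re.compile(r"^rule\s[\s\S]*?^\}$", re.MULTILINE)
# Second pattern from the question: the lazy [\s\S]*? before the lookahead
# is free to walk across earlier rules until it reaches BANANAS.
WITH_LOOKAHEAD = re.compile(r"^rule\s[\s\S]*?(?=BANANAS)[\s\S]*?^\}$", re.MULTILINE)

per_rule = PER_RULE.findall(RULES)
spanning = WITH_LOOKAHEAD.findall(RULES)

print(len(per_rule))             # number of rules matched individually
print(len(spanning))             # number of matches from the lookahead pattern
print("TURKEY" in spanning[0])   # whether the match swallowed the TURKEY rule
```

The lazy quantifier only makes the match as short as possible from a given starting point. It does not stop the match from starting at an earlier rule, which is why the TURKEY rule ends up inside the match.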
Answer 1
Score: 1
I think this could work:
^rule\s[^}]*BANANAS[^}]*?^}$
I couldn't reproduce your screenshot, but it looks like it matches two rules because a single match can span multiple rules: it started at the first rule and then matched up to the end of the rule with BANANAS in it. If BANANAS were in the bottom rule, you would probably see it match all 3 rules in your example. I replaced [\s\S] with [^}] to prevent this: [^}] cannot match a closing brace, so a match can no longer run past the end of one rule into the next.
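A minimal sketch of how this could be checked, assuming Python's re module (the question does not name a tool) and a shortened, hypothetical stand-in for the sample rules. One caveat: the pattern relies on no } appearing inside a rule body, and YARA hex strings and some regex strings do use braces, so the helper rules_containing (a hypothetical name) shows an alternative that matches each rule first and then filters with a plain substring test.

```python
import re

# Shortened stand-ins for the three sample rules; only the description differs.
RULES = "\n".join(
    'rule RULENAME\n{\nmeta:\ndescription = "%s"\ncondition:\n$mz at 0\n}' % name
    for name in ("TURKEY", "BANANAS", "CHICKEN")
)

# [^}] cannot cross a closing brace, so a match cannot run into the next rule.
# re.MULTILINE only makes ^ and $ anchor at line boundaries.
BRACE_BOUNDED = re.compile(r"^rule\s[^}]*BANANAS[^}]*?^}$", re.MULTILINE)


def rules_containing(text, needle):
    """Alternative: match every rule individually, then keep those with needle."""
    per_rule = re.compile(r"^rule\s[\s\S]*?^\}$", re.MULTILINE)
    return [m.group(0) for m in per_rule.finditer(text) if needle in m.group(0)]


matches = BRACE_BOUNDED.findall(RULES)
filtered = rules_containing(RULES, "BANANAS")

print(len(matches))            # number of rules matched by the answer's pattern
print(matches == filtered)     # whether both approaches agree on this sample
```

The two-step filter still assumes that a line consisting only of } marks the end of a rule, but it tolerates braces elsewhere in the rule body.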