July 13, 2023 22:58:17 · go

How to write a REGEX search that will search through blocks of multiline patterns and only match on a block that contains a specified string?

Question

I'm working to write a regex pattern that will search through multiple YARA rules within the same file. The pattern I've come up with already matches each YARA rule individually from beginning to end across multiple lines. Now I want to match the entire YARA rule, and each one individually, but only if it contains the string "BANANAS" somewhere within the rule.

The problem I'm now having is that my regex matches from the beginning of a YARA rule all the way to the end of the YARA rule that does contain the string "BANANAS", but it also grabs every YARA rule between the start and end points that does NOT contain "BANANAS". What am I missing to only grab the rules that contain my specified string?

These are the current regex patterns I'm using:

^rule\s[\s\S]*?^\}$
^rule\s[\s\S]*?(?=BANANAS)[\s\S]*?^\}$

The first pattern matches each individual YARA rule from beginning to end.
The second pattern contains the lookahead and is attempting to match each YARA rule only if it contains the specified string.

To clarify, I want to avoid using any built-in app functions for multiline matching, which is why I'm using [\s\S]* instead of .*

I'm using the above regex pattern to match on the text below as an example. The string "BANANAS" that I'm specifying is located in the <description = "foo"> field within the YARA rules below.

Picture of Failed results

rule RULENAME
{
    meta:
        author = "abcdef"
        last_update = "abcdef"
        description = "TURKEY"
        hash = "abcdef" //dumped
    strings:
        $mz = "MZ"
        $low0 = "malware" ascii wide
        $low1 = "hello world" ascii wide
        $low2 = "sus" wide
        $low3 = "keyLogger" wide
        $low4 = "bot" wide
        $low5 = "usb" wide
    condition:
        $mz at 0 and ((3 of ($low*))
}
rule RULENAME
{
    meta:
        author = "abcdef"
        last_update = "abcdef"
        description = "BANANAS"
        hash = "abcdef" //dumped
    strings:
        $mz = "MZ"
        $low0 = "malware" ascii wide
        $low1 = "hello world" ascii wide
        $low2 = "sus" wide
        $low3 = "keyLogger" wide
        $low4 = "bot" wide
        $low5 = "usb" wide
    condition:
        $mz at 0 and ((3 of ($low*))
}
rule RULENAME
{
    meta:
        author = "abcdef"
        last_update = "abcdef"
        description = "CHICKEN"
        hash = "abcdef" //dumped
    strings:
        $mz = "MZ"
        $low0 = "malware" ascii wide
        $low1 = "hello world" ascii wide
        $low2 = "sus" wide
        $low3 = "keyLogger" wide
        $low4 = "bot" wide
        $low5 = "usb" wide
    condition:
        $mz at 0 and ((3 of ($low*))
}
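
The behavior described in the question can be reproduced with a short script. This is only a sketch under assumptions that are not in the original post: it uses Python's re module with re.MULTILINE (so that ^ and $ match at line boundaries), and the rules are trimmed down and renamed A, B, and C purely for illustration.

```python
import re

# Hypothetical, trimmed-down stand-ins for the three sample rules.
SAMPLE = '''rule A
{
    meta:
        description = "TURKEY"
}
rule B
{
    meta:
        description = "BANANAS"
}
rule C
{
    meta:
        description = "CHICKEN"
}
'''

# re.MULTILINE only changes where ^ and $ may match; "." is not involved.
EACH_RULE = re.compile(r"^rule\s[\s\S]*?^\}$", re.MULTILINE)
WITH_LOOKAHEAD = re.compile(
    r"^rule\s[\s\S]*?(?=BANANAS)[\s\S]*?^\}$", re.MULTILINE
)

# The first pattern yields one match per rule.
each = EACH_RULE.findall(SAMPLE)

# The second pattern starts at rule A, and the lazy [\s\S]*? keeps growing
# past rule A's closing brace until the lookahead finally sees BANANAS.
spanning = WITH_LOOKAHEAD.findall(SAMPLE)

print(len(each), "individual rules")
print(len(spanning), "match with the lookahead pattern:")
print(spanning[0])
```

A lazy quantifier only prefers the shortest expansion from a fixed starting point; it does not move the starting point forward, which is why the match begins at the first rule.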

Answer 1

Score: 1

I think this could work:

^rule\s[^}]*BANANAS[^}]*?^}$

I couldn't reproduce your screenshot, but it looks like it's matching two rules because a single match can span multiple rules: it started at the first rule and then matched up to the end of the rule with BANANAS in it. If BANANAS were in the bottom rule, you would probably see it match all 3 of the rules in your example. I replaced [\s\S] with [^}] to prevent this.
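
As a rough check of the pattern above, here is a minimal sketch in Python. The re.MULTILINE flag, the helper name rules_containing, and the trimmed rules named A, B, and C are assumptions made for illustration; they do not come from the original post.

```python
import re

# Hypothetical, trimmed-down stand-ins for the three sample rules.
SAMPLE = '''rule A
{
    meta:
        description = "TURKEY"
}
rule B
{
    meta:
        description = "BANANAS"
}
rule C
{
    meta:
        description = "CHICKEN"
}
'''


def rules_containing(text, needle):
    """Return every rule block in text whose body contains needle.

    Hypothetical helper: it builds the answer's pattern around an
    escaped search string. [^}] cannot cross a closing brace, so a
    match cannot run from one rule into the next.
    """
    pattern = re.compile(
        r"^rule\s[^}]*" + re.escape(needle) + r"[^}]*?^}$",
        re.MULTILINE,  # lets ^ and $ match at line boundaries
    )
    return pattern.findall(text)


matches = rules_containing(SAMPLE, "BANANAS")
print(len(matches), "matching rule:")
print(matches[0])
```

One caveat with this approach: because [^}] stops at the first closing brace, a rule whose body itself contains a } (for example a YARA hex string such as { 4D 5A }) would not be matched as a whole, so the pattern is best suited to rules shaped like the sample.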
