Preg_replace finds a match where there shouldn't be one
Question
So I'm writing my own simple Markdown formatter. I was fixing the last of the issues when I ran into a problem with my code block formatter. For some reason it matches one extra time where there shouldn't be anything left to match.
$matches = [
    "```\ncode block \n```",
    "code block \n"
];
private function code_block_format($matches): string
{
    // get a line
    $regex = '/([^\n]*)\n?/';
    // wrap that line into <code> elem + new line
    $repl = '<code>$1</code>' . "\n";
    // remove trailing linebreaks + spaces
    $matches[1] = trim($matches[1]);
    $ret = preg_replace($regex, $repl, $matches[1]); // this returns the badly formatted string
    $ret = "<pre>\n" . $ret . "</pre>";
    return $ret;
}
I expect the preg_replace to return just <code>code block</code>\n, but for some reason I get an extra element: <code>code block</code>\n<code></code>\n
Any help on what in the world could be causing it to latch onto a "" string somewhere in there?
Edit
My goal is to make a codeblock element similar to what you can write here where there can be empty lines between the ``` tags, so lines with simply \n should be matched as well.
Answer 1
Score: 0
$regex = '/([^\n]*)\n?/';
matches zero or more characters that are not \n, so it can match basically anything, including the empty string.
Change * to +, which requires one or more such characters.
$regex = '/([^\n]+)\n?/';
I actually can't figure out exactly why * is returning the second match. /[^a]*/g returns two matches for any text that doesn't include an a, and I would expect one.
That said, your code seems needlessly complex. Are you just trying to remove white space from $matches[1] with trim(), then surround it with <code></code>?
You can just concatenate the tags onto the trimmed $matches[1]:
return '<code>' . $matches[1] . '</code>';
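On the open question of why * produces a second match: after the first match consumes code block, the regex engine tries once more at the end of the string, and because both ([^\n]*) and \n? are allowed to match nothing, an empty match succeeds there. preg_replace replaces that empty match too, which gives the extra <code></code>. A minimal sketch that makes this visible, using preg_match_all with PREG_OFFSET_CAPTURE (the variable names are only for illustration):

```php
<?php
// The trimmed capture from the question.
$subject = 'code block';
$regex   = '/([^\n]*)\n?/';

// Collect every full match together with its byte offset.
preg_match_all($regex, $subject, $found, PREG_OFFSET_CAPTURE);

foreach ($found[0] as [$text, $offset]) {
    // var_export makes the empty string visible as ''.
    echo var_export($text, true), ' at offset ', $offset, "\n";
}
// Prints:
// 'code block' at offset 0
// '' at offset 10

// preg_replace substitutes both matches, hence the extra element.
$replaced = preg_replace($regex, '<code>$1</code>' . "\n", $subject);
```

The second match has length zero and sits at offset 10, the very end of the subject, which is where the stray empty element comes from.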
Answer 2
Score: 0
You can try this, since your original pattern accepts ZERO or more characters and can therefore produce a match whose group is empty (for example where there is just a newline)...
$regex = '/([^\n]+)\n?/';
It should output:
<pre>
<code>code block</code>
</pre>
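A quick check of that output, with one caveat. With + the pattern can no longer match the empty string, so the extra element disappears, but empty lines inside the block are no longer wrapped either, which conflicts with the goal stated in the question's edit. The sketch below uses a standalone function with a hypothetical name rather than the question's private method:

```php
<?php
// Hypothetical standalone version of the question's method, using + instead of *.
function code_block_format_plus(string $block): string
{
    $regex = '/([^\n]+)\n?/';
    $repl  = '<code>$1</code>' . "\n";
    return "<pre>\n" . preg_replace($regex, $repl, trim($block)) . "</pre>";
}

echo code_block_format_plus("code block \n"), "\n";
// <pre>
// <code>code block</code>
// </pre>

// The empty line is left as a bare newline instead of becoming <code></code>.
echo code_block_format_plus("first\n\nlast"), "\n";
```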
Answer 3
Score: 0
Okay, I got an idea from the answers and found a regex that works as I want. ([^\n]+\n?|\n) allows me to capture either a line with text or an empty line, as I wanted.
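One thing to watch with this pattern: the capture group now includes the line's trailing \n, so with the replacement string from the question the newline ends up inside the <code> element. If that is not wanted, a possible alternative that avoids the regex entirely is to split the block on \n and wrap each line. This is a swapped-in technique, not the asker's solution; the function name is hypothetical and the arrow function assumes PHP 7.4 or later.

```php
<?php
// Hypothetical helper: wrap every line of a code block, including empty ones.
function format_code_block(string $block): string
{
    // Remove surrounding whitespace, then split into individual lines.
    $lines = explode("\n", trim($block));

    // Each line, empty or not, becomes its own <code> element.
    $wrapped = array_map(
        static fn(string $line): string => '<code>' . $line . '</code>',
        $lines
    );

    return "<pre>\n" . implode("\n", $wrapped) . "\n</pre>";
}

echo format_code_block("first line\n\nlast line \n"), "\n";
// <pre>
// <code>first line</code>
// <code></code>
// <code>last line</code>
// </pre>
```

Because the split happens on the separator rather than on a pattern that may match nothing, there is no empty match at the end of the string to worry about.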