htaccess vs virtualhost - Caret match ^

Question

I've been writing some redirects for a client. They needed domain.com to redirect to the hostname with a leading www, and sub.another.com, which has been aliased to domain.com, to not have a leading www. I've found a solution, which kinda-worked.

My senior told me that the caret-only match (e.g. the RewriteRule ^ part) does not work in .htaccess, apparently only in virtualhost. I know that there are some minor differences between the two, but I was unable to find any source (and he wasn't able to provide one either) that says the ^ match doesn't work in .htaccess.

Apparently, the correct rule would be RewriteRule ^(.*) https://sub.another.com/$1 [R=301,L]. Can anyone please explain why the ^ match doesn't work in .htaccess? If the internet is to be believed, both ^ and ^(.*) should match the entire line. The latter explicitly matches anything after the beginning of the line, while the former matches the beginning of the line itself (so it should also match an empty string, see this answer).

Answer 1

Score: 1

> My senior told me, that the caret-only match (e.g. the RewriteRule ^ part) does not work in .htaccess, apparently only in virtualhost

That is nonsense. The ^ (regex syntax) simply asserts the start of the string, nothing more. So wherever an argument takes a regex, regardless of context, the ^ is always going to "work"; whether the pattern as a whole matches depends on what you are trying to match. (I'm wondering whether you really mean the slash prefix on the URL-path?)

> apparently only in virtualhost

You could argue the opposite is true. For a request to the root directory (the homepage), ^$ will match in .htaccess, but not in a virtualhost context.

This is simply due to what the pattern is matched against in these different contexts. (There are 4 contexts in the Apache config: .htaccess, directory, virtualhost and server.)

In a virtualhost context, the RewriteRule pattern matches against the full root-relative URL-path, starting with a slash. So, to match the root directory, you would need ^/$ instead of ^$.

Whereas in .htaccess (and directory) context, the URL-path that is matched is relative to the directory that contains the .htaccess file (the directory-prefix has been removed) and consequently does not start with a slash.
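To make the ^$ versus ^/$ difference concrete, here is a small Python sketch. It uses Python's re module as a stand-in for Apache's regex engine (the two agree on patterns this simple), and the two path strings are assumptions that model by hand what the RewriteRule pattern is tested against for a request to the homepage. Nothing here talks to Apache.

```python
import re

# Modelled by hand for a request to https://example.com/ (an assumption
# of this sketch, not something read from a running server):
VHOST_PATH = "/"     # virtualhost context: full URL-path, leading slash kept
HTACCESS_PATH = ""   # root .htaccess: the directory prefix "/" is stripped


def matches(pattern: str, path: str) -> bool:
    """Return True if the pattern matches somewhere in the path."""
    return re.search(pattern, path) is not None


results = {
    ("^$", "htaccess"): matches("^$", HTACCESS_PATH),
    ("^$", "vhost"): matches("^$", VHOST_PATH),
    ("^/$", "htaccess"): matches("^/$", HTACCESS_PATH),
    ("^/$", "vhost"): matches("^/$", VHOST_PATH),
}

for (pattern, context), ok in results.items():
    print(f"{pattern!r:6} in {context:8} -> {ok}")
```

Each pattern matches in exactly one of the two contexts, which is the whole difference: the regex behaves the same, only the input string changes.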

For example, given the URL https://example.com/foo/bar/baz...

  • In a virtualhost context, the RewriteRule pattern matches against the URL-path /foo/bar/baz (the full URL-path, starting with a slash).

  • In a .htaccess file in the root directory, the RewriteRule pattern matches against foo/bar/baz (no slash prefix). But if the .htaccess file is located in the /foo subdirectory (not the root), the URL-path that is matched is bar/baz only (/foo/ is omitted).
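The two bullets above can be modelled as a small Python function. This is only a sketch of the prefix stripping described here: rule_input and its context labels ("vhost", "htaccess") are hypothetical names invented for the illustration, not part of Apache.

```python
def rule_input(url_path: str, context: str, htaccess_dir: str = "/") -> str:
    """Model the string a RewriteRule pattern is matched against.

    context is "vhost" or "htaccess" (labels made up for this sketch);
    htaccess_dir is the URL-path of the directory holding the .htaccess file.
    """
    if context == "vhost":
        return url_path  # full root-relative URL-path, leading slash kept
    if context != "htaccess":
        raise ValueError(f"unknown context: {context!r}")
    # The stripped directory prefix always ends in a slash: "/" or "/foo/".
    prefix = htaccess_dir.rstrip("/") + "/"
    if not url_path.startswith(prefix):
        raise ValueError(f"{url_path!r} is not under {prefix!r}")
    return url_path[len(prefix):]


print(rule_input("/foo/bar/baz", "vhost"))             # /foo/bar/baz
print(rule_input("/foo/bar/baz", "htaccess", "/"))     # foo/bar/baz
print(rule_input("/foo/bar/baz", "htaccess", "/foo"))  # bar/baz
```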

> Apparently, the correct rule would be RewriteRule ^(.*) https://sub.another.com/$1 [R=301,L]. Can anyone please explain why the match doesn't work in .htaccess?

This does work in .htaccess. In fact, this rule would only work properly in .htaccess: in a virtualhost context the captured URL-path already starts with a slash, so the slash in the substitution string would give you a double-slash at the start of the URL-path in the redirected request. (Of course, other rules can affect this behaviour, and the location of the .htaccess file, if it is in a subdirectory, will change this behaviour also.)
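The double-slash can be reproduced with a few lines of Python. This is a sketch under the same assumption as before: Python's re stands in for Apache's regex engine, and redirect_target is a hypothetical helper that mimics the substitution https://sub.another.com/$1.

```python
import re


def redirect_target(path_seen_by_rule: str) -> str:
    """Mimic: RewriteRule ^(.*) https://sub.another.com/$1"""
    match = re.search(r"^(.*)", path_seen_by_rule)
    # ^(.*) can match the empty string, so it always finds a match.
    return "https://sub.another.com/" + match.group(1)


htaccess_target = redirect_target("foo/bar")  # root .htaccess: no leading slash
vhost_target = redirect_target("/foo/bar")    # virtualhost: leading slash kept

print(htaccess_target)  # https://sub.another.com/foo/bar
print(vhost_target)     # https://sub.another.com//foo/bar
```

In a virtualhost you would typically match the slash outside the capture group, for example ^/(.*), so that $1 does not start with one.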

> If the internet is to be believed, both ^ and ^(.*) should match the entire line.

^ by itself does not match anything (it consumes no characters); it simply asserts the start of the string, so it is always "successful". Whereas ^(.*) asserts the start of the string and then matches the rest of the URL-path and captures this in a backreference. In fact, the ^ prefix in ^(.*) is entirely optional (when matching URL-paths), since an unanchored regex is tried from the start of the string first and .* is greedy. ^(.*) and (.*) are "the same".

Since you are using a $1 backreference in the substitution string, you need to actually capture something, so you need (.*) (since ^ does not match/capture anything, it's simply an assertion). But this is the same regardless of where you are using this directive: .htaccess or virtualhost.
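A quick way to see the difference between asserting and capturing is to run the three patterns in Python. Again, re is only a stand-in for Apache's regex engine, and the sample path is an arbitrary assumption.

```python
import re

path = "foo/bar/baz"

caret_only = re.search(r"^", path)     # assertion only
anchored = re.search(r"^(.*)", path)   # assertion plus greedy capture
unanchored = re.search(r"(.*)", path)  # greedy capture, no anchor
empty_match = re.search(r"^", "")      # ^ also succeeds on an empty string

print(repr(caret_only.group(0)), caret_only.span())  # '' (0, 0)
print(anchored.group(1))                             # foo/bar/baz
print(unanchored.group(1))                           # foo/bar/baz
print(empty_match is not None)                       # True

# The pattern ^ has no capture groups, so it gives $1 nothing to refer to.
print(re.compile(r"^").groups)                       # 0
```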

In this particular rule, you could use ^ only and use the REQUEST_URI server variable in the substitution string instead. For example:

RewriteRule ^ https://sub.another.com%{REQUEST_URI} [R=301,L]

Note that the REQUEST_URI variable contains the entire URL-path, including the slash prefix, so there is no slash between the hostname and %{REQUEST_URI} in the substitution string. This rule is arguably "better" than using the backreference (your rule above), since it works anywhere (any directory), regardless of context, and uses a more efficient regex.
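To compare the two rules side by side, the sketch below models both substitutions for the same request in three locations. The helper names and the hand-written "path seen by the rule" values are assumptions of this illustration, and Python's re again stands in for Apache's regex engine. The query string is not modelled.

```python
import re
from typing import Optional

HOST = "https://sub.another.com"


def with_backreference(path_seen_by_rule: str) -> Optional[str]:
    """Mimic: RewriteRule ^(.*) https://sub.another.com/$1"""
    m = re.search(r"^(.*)", path_seen_by_rule)
    return f"{HOST}/{m.group(1)}" if m else None


def with_request_uri(path_seen_by_rule: str, request_uri: str) -> Optional[str]:
    """Mimic: RewriteRule ^ https://sub.another.com%{REQUEST_URI}"""
    m = re.search(r"^", path_seen_by_rule)
    return f"{HOST}{request_uri}" if m else None


request_uri = "/foo/bar/baz"
# (location of the rule, string the pattern is matched against there)
locations = [
    ("virtualhost", "/foo/bar/baz"),
    ("root .htaccess", "foo/bar/baz"),
    ("/foo/.htaccess", "bar/baz"),
]

for label, seen in locations:
    print(f"{label:15} backref: {with_backreference(seen)}")
    print(f"{label:15} uri:     {with_request_uri(seen, request_uri)}")
```

In this model the REQUEST_URI version produces the same target in all three locations, while the backreference version produces a double-slash in the virtualhost and drops /foo when the rule lives in /foo/.htaccess.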

huangapple
  • Published on May 17, 2023, 16:05:40
  • Please keep this link when reposting: https://go.coder-hub.com/76269831.html