htaccess vs virtualhost - Caret match ^
Question
I've been coming up with some redirects for a client, where they needed a redirect on domain.com to have a leading www, and a sub.another.com that has been aliased to domain.com to not have a leading www. I've found a solution, which kinda-worked.

My senior told me that the caret-only match (e.g. the `RewriteRule ^` part) does not work in `.htaccess`, apparently only in virtualhost. I know that there are some minor differences between these, but I was unable to find any source (and he wasn't able to provide one either) that says that the `^` match doesn't work in `.htaccess`.

Apparently, the correct rule would be `RewriteRule ^(.*) https://sub.another.com/$1 [R=301,L]`. Can anyone please explain why the match doesn't work in `.htaccess`? If the internet is to be believed, both `^` and `^(.*)` should match the entire line. The latter explicitly says to match anything after the beginning of the line, while the former matches the beginning of the line itself (so should also match an empty string, see this answer).
Answer 1
Score: 1
> My senior told me, that the caret-only match (e.g. the `RewriteRule ^` part) does not work in `.htaccess`, apparently only in virtualhost

That is nonsense. The `^` (regex syntax) simply asserts the start-of-string. Nothing more. So, if an argument takes a regex (regardless of context) the `^` is always going to "work", depending on what you are trying to match. (I'm wondering whether you really mean the slash prefix on the URL-path?)
> apparently only in virtualhost

You could argue the opposite is true. `^$` will match in `.htaccess`, but not in a virtualhost context. (This matches the root directory / homepage in a `.htaccess` context.)

This is simply due to what is matched in these different contexts. (There are 4 contexts in the Apache config: `.htaccess`, directory, virtualhost and server.)
In a virtualhost context, the `RewriteRule` pattern matches against the full root-relative URL-path, starting with a slash. So, instead of `^$`, you would need to match `^/$` (to match the root directory).

Whereas in `.htaccess` (and directory) context, the URL-path that is matched is relative to the directory that contains the `.htaccess` file (the directory-prefix has been removed) and consequently does not start with a slash.
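
As a rough sketch of the difference just described, the same "homepage only" rule could be written for each context as follows. The `/home` redirect target and the `example.com` virtualhost are placeholders chosen for illustration, not taken from the question.

```apache
# --- In the document root's .htaccess file ---
# The matched URL-path for the homepage is the empty string.
RewriteEngine On
RewriteRule ^$ /home [R=302,L]

# --- In the server config, inside a virtualhost (hypothetical host) ---
<VirtualHost *:80>
    ServerName example.com
    RewriteEngine On
    # Here the matched URL-path for the homepage is "/".
    RewriteRule ^/$ /home [R=302,L]
</VirtualHost>
```

Swapping the two patterns between contexts would leave both rules never matching the homepage.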
For example, given the URL `https://example.com/foo/bar/baz` ...

- In a virtualhost context, the `RewriteRule` pattern matches against the URL-path `/foo/bar/baz` (the full URL-path, starting with a slash).
- In `.htaccess` in the root directory, the `RewriteRule` pattern matches against `foo/bar/baz` (no slash prefix). But if the `.htaccess` file is located in the `/foo` subdirectory (not the root), then the URL-path that is matched is `bar/baz` only (`/foo` is omitted).
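
To make the three cases above concrete, here is a minimal sketch of a pattern that would match the request for `/foo/bar/baz` in each location. The `-` substitution (no rewrite) is used only so the rules have no side effects; each rule belongs in a different file, as noted in the comments.

```apache
# Request: https://example.com/foo/bar/baz

# 1. Inside <VirtualHost>: the pattern is tested against "/foo/bar/baz"
RewriteRule ^/foo/bar/baz$ - [L]

# 2. In /.htaccess (document root): tested against "foo/bar/baz"
RewriteRule ^foo/bar/baz$ - [L]

# 3. In /foo/.htaccess: tested against "bar/baz"
RewriteRule ^bar/baz$ - [L]
```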
> Apparently, the correct rule would be `RewriteRule ^(.*) https://sub.another.com/$1 [R=301,L]`. Can anyone please explain why the match doesn't work in `.htaccess`?

This does work in `.htaccess`. In fact, this would only work properly in `.htaccess`, since if you used this rule in a virtualhost context then you would end up with a double-slash at the start of the URL-path in the redirected request. (Of course, other rules can affect this behaviour, and the location of the `.htaccess` file, if in a subdirectory, will change this behaviour also.)
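
For comparison, one way the same redirect might be written in a virtualhost without producing the double slash is to consume the leading slash in the pattern. This is a sketch only: the `ServerName` is an assumption about which hostname is being redirected, and any SSL directives are omitted.

```apache
<VirtualHost *:80>
    # Assumed hostname, for illustration only
    ServerName www.sub.another.com
    RewriteEngine On
    # The matched URL-path already starts with "/", so match it outside the
    # capture group instead of letting it end up in $1.
    RewriteRule ^/(.*) https://sub.another.com/$1 [R=301,L]
</VirtualHost>
```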
> If the internet is to be believed, both `^` and `^(.*)` should match the entire line.

`^` by itself does not match anything, it simply asserts the start-of-string, so is always "successful". Whereas `^(.*)` asserts the start-of-string and then matches the rest of the URL-path and captures this in a backreference. In fact, the `^` prefix in `^(.*)` is entirely optional (when matching URL-paths) since the regex is greedy. `^(.*)` and `(.*)` are "the same".
Since you are using a `$1` backreference in the substitution string you need to actually capture something, so you need `(.*)` (since `^` does not match/capture anything, it's simply an assertion). But this is the same, regardless of where you are using this directive: `.htaccess` or virtualhost.
In this particular rule, you could use `^` only and use the `REQUEST_URI` server variable in the substitution string instead. For example:

    RewriteRule ^ https://sub.another.com%{REQUEST_URI} [R=301,L]

Note that the `REQUEST_URI` variable contains the entire URL-path, including the slash prefix, so the slash is omitted in the substitution string. This rule is arguably "better" than using the backreference (your rule above) since it works anywhere (any directory), regardless of context, and uses a more efficient regex.
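
Putting this together for the scenario in the question, a context-independent version of both redirects might look roughly like the sketch below. The `RewriteCond` host checks are an assumption about how the two hostnames would be told apart; they do not appear in the original post and would need adjusting to the real setup.

```apache
RewriteEngine On

# Assumed condition: strip the leading "www." from the aliased subdomain
RewriteCond %{HTTP_HOST} ^www\.sub\.another\.com$ [NC]
RewriteRule ^ https://sub.another.com%{REQUEST_URI} [R=301,L]

# Assumed condition: add the leading "www." to the main domain
RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
RewriteRule ^ https://www.domain.com%{REQUEST_URI} [R=301,L]
```

Because neither rule relies on a backreference, the same lines should behave identically in `.htaccess` or inside a virtualhost.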