.htaccess no .php and no trailing slash

Question

I have a simple website made out of .php files, no database. I used .htaccess to enforce the following rules:

  • force https
  • force www at beginning
  • remove .php extension (https://www.example.com/page.php -> https://www.example.com/page)
  • no trailing slash

This last rule, no trailing slash, does not work. Instead it leads to an error 404. That is the problem I am trying to solve. If someone opens https://www.example.com/page/ I want it to redirect to https://www.example.com/page and not give a 404.

Here are the relevant .htaccess lines I'm currently using. They are based on the HTML5 Boilerplate .htaccess with added copy-paste snippets, because I have no .htaccess knowledge.

ErrorDocument 404 /errors/404.php

Options -MultiViews

<IfModule mod_rewrite.c>

    # (1)
    RewriteEngine On

    # (2)
    Options +FollowSymlinks

</IfModule>

# to https

<IfModule mod_rewrite.c>
  RewriteEngine On
  RewriteCond %{HTTPS} !=on
  RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R] # to 301
</IfModule>

# force www at beginning

<IfModule mod_rewrite.c>

  RewriteEngine On

#     # (1)
  RewriteCond %{HTTPS} =on
  RewriteRule ^ - [E=PROTO:https]
  RewriteCond %{HTTPS} !=on
  RewriteRule ^ - [E=PROTO:http]

#     # (2)
#     # RewriteCond %{HTTPS} !=on

  RewriteCond %{HTTP_HOST} !^www\. [NC]
  RewriteCond %{SERVER_ADDR} !=127.0.0.1
  RewriteCond %{SERVER_ADDR} !=::1
  RewriteRule ^ %{ENV:PROTO}://www.%{HTTP_HOST}%{REQUEST_URI} [L,R] # <- for test, for prod use [L,R=301]

</IfModule>

# remove .php

<IfModule mod_rewrite.c>
  RewriteEngine On
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteRule ^([^\.]+)$ $1.php [NC,L]
</IfModule>

# remove trailing /

<IfModule mod_rewrite.c>
  RewriteEngine On
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_URI} (.*)/$
  RewriteRule ^(.*)/$ $1 [L,R] # <- for test, for prod use [L,R=301]
</IfModule>

Answer 1

Score: 0

> # remove .php
>
> <IfModule mod_rewrite.c>
> RewriteEngine On
> RewriteCond %{REQUEST_FILENAME} !-f
> RewriteRule ^([^\.]+)$ $1.php [NC,L]
> </IfModule>
>
> # remove trailing /
>
> <IfModule mod_rewrite.c>
> RewriteEngine On
> RewriteCond %{REQUEST_FILENAME} !-f
> RewriteCond %{REQUEST_URI} (.*)/$
> RewriteRule ^(.*)/$ $1 [L,R] # <- for test, for prod use [L,R=301]
> </IfModule>

These rules are in the wrong order. A request for /page/ is first rewritten to /page/.php (which naturally results in a 404) before the rule that removes the trailing slash gets a chance to run. By then the URL no longer has a trailing slash, since it ends in .php.

However, your rule to remove the trailing slash should be checking that the request is not a directory, not that it is not a file. And the second condition that checks against REQUEST_URI is superfluous. You are also missing a slash prefix on the substitution string (and there is no RewriteBase defined), so this would have resulted in a malformed redirect.

Beyond that, your rules can be greatly simplified. There is no need for the <IfModule> wrappers or multiple RewriteEngine On directives, and the non-www to www redirect unnecessarily preserves the scheme (HTTP or HTTPS) when it is always HTTPS.

Your rules could be written more succinctly like this:

ErrorDocument 404 /errors/404.php

Options +FollowSymLinks -MultiViews

RewriteEngine On

# to https
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

# force www at beginning
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteCond %{SERVER_ADDR} !=127.0.0.1
RewriteCond %{SERVER_ADDR} !=::1
RewriteRule ^ https://www.%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

# remove trailing /
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ /$1 [R=301,L]

# remove .php (Actually, this "appends" .php, it doesn't "remove" anything)
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^([^.]+)$ $1.php [L]

I would question whether the two conditions that check against SERVER_ADDR are really necessary here, since you don't have something similar on the HTTP to HTTPS rule.

No need to backslash-escape literal dots when used inside a regex character class.

The rule that you have labelled "remove .php" doesn't actually "remove" anything. It appends the .php extension to requests where the .php extension has already been removed. It is better to first test that the corresponding .php file exists before attempting to rewrite the request, rather than unconditionally appending .php in the hope that the file exists (this can result in unexpected errors in some scenarios and at the very least logs the 404 on the .php request, rather than the URL that was actually requested).

This rule block (as per your original rule block) will also result in two redirects if requesting HTTP + non-www since you are redirecting HTTP to HTTPS on the same host first. This is actually a requirement if implementing HSTS, but otherwise you can avoid this double redirect by reversing the first two rules. (However, the redirect that removes the trailing slash would also result in an additional redirect as written. This can be changed if so desired, but otherwise does not cause an immediate issue.)
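As a rough sketch of what reversing the first two rules might look like (assuming HSTS is not a requirement), the non-www to www rule goes first and redirects straight to HTTPS, so an HTTP + non-www request is canonicalised in a single hop. The SERVER_ADDR conditions are carried over from the block above, and 302 is used here only because this is meant for testing.

```apache
# non-www to www, going directly to HTTPS (covers HTTP + non-www in one redirect)
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteCond %{SERVER_ADDR} !=127.0.0.1
RewriteCond %{SERVER_ADDR} !=::1
RewriteRule ^ https://www.%{HTTP_HOST}%{REQUEST_URI} [R=302,L]

# HTTP to HTTPS (only reached when the host already starts with www,
# or when the request hits one of the excluded local addresses)
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=302,L]
```

The trailing-slash rule is left unchanged in this sketch, so a request such as http://example.com/page/ would still take a second redirect to lose the slash.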

NB: Be careful with line-end comments (I've removed them). They are not supported by Apache. They might appear to work just because of the way config directives are parsed, but if you have omitted any optional arguments then you'll get a 500 Internal Server Error due to invalid syntax. (But yes, always test first with 302 - temporary - redirects.)
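To illustrate, a comment like the "for test" note in the question can be moved onto its own line above the directive. This is a minimal sketch using the trailing-slash rule from the block above, with the bare R flag (a 302) while testing:

```apache
# Temporary (302) redirect while testing; change the flags to [R=301,L] for production
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ /$1 [R,L]
```

Written this way, the RewriteRule line contains only the pattern, the substitution and the flags, so nothing extra can be misread as an argument.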


UPDATE:

> everything works as it should except this: example.com/nonexistingfile.php. This causes a 404 but weirdly not my custom 404 (ErrorDocument 404 /errors/404.php) but a 'server backup 404'. Just a plain text "File not found." Any ideas on that?

This is likely related to the way PHP is installed on your server (nothing related to the above directives). It's possible that your server is ultimately proxying all .php requests to the backend PHP engine, essentially bypassing your .htaccess/ErrorDocument directive.

As a workaround you could try forcing a 404 using mod_rewrite for any request that contains a .php extension (since - I assume - no client should be making a direct request to a .php file).

For example, try adding the following immediately after the RewriteEngine On directive (or after the "to https" rule to ensure the 404 response is always over HTTPS):

# Force any "direct" request to a ".php" file to return a 404
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule \.php$ - [R=404]

The check against the REDIRECT_STATUS env var ensures this rule only applies to direct requests from the client and not requests that have been internally rewritten on the server (by the last rule).

huangapple
  • Published on March 31, 2023, 22:12:12
  • When reposting, please retain the link to this article: https://go.coder-hub.com/75899527.html