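Apache Module mod_rewrite

Description: Provides a rule-based rewriting engine to rewrite requested URLs on the fly
Status: Extension
Module Identifier: rewrite_module
Source File: mod_rewrite.c
Compatibility: Available in Apache 1.3 and later

Summary

This module uses a rule-based rewriting engine (based on a regular-expression parser) to rewrite requested URLs on the fly. It supports an unlimited number of rules, and an unlimited number of attached rule conditions for each rule, to provide a flexible and powerful URL manipulation mechanism. The URL manipulations can depend on various tests, for instance on server variables, environment variables, HTTP headers, or time stamps. Even lookups in external databases of various formats can be used to match parts of a URL.

This module operates on the full URL (including the path-info part), both in per-server context (httpd.conf) and in per-directory context (.htaccess), and can also generate query-string parts in the result. The rewritten result can lead to internal sub-processing, to an external request redirection, or even to internal proxy throughput.

All this functionality and flexibility has a drawback: complexity. Do not expect to understand this entire module in a single day.

For further discussion, details, and examples, see the detailed URL rewriting documentation.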

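Quoting Special Characters

As of Apache 1.3.20, special characters in TestString and Substitution strings can be escaped (that is, treated as normal characters without their usual special meaning) by prefixing them with a backslash ('\') character. For example, a Substitution string can include a literal dollar sign by writing '\$'; this keeps mod_rewrite from treating it as a backreference.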
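The escaping rule can be illustrated with a minimal sketch in per-server context. The URL paths and the display.cgi script name are hypothetical and chosen only for illustration.

```apache
RewriteEngine On

# "\$" emits a literal dollar sign, while the "$1" that follows it is still
# expanded as a backreference to the first group of the pattern.
# A request for /price/42 is thus meant to become /display.cgi?amount=$42
RewriteRule ^/price/([0-9]+)$ /display.cgi?amount=\$$1 [L]
```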

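Environment Variables

This module keeps track of two additional (non-standard) CGI/SSI environment variables named SCRIPT_URL and SCRIPT_URI. These contain the logical Web-view of the current resource, while the standard CGI/SSI variables SCRIPT_NAME and SCRIPT_FILENAME contain the physical System-view.

Notice: These variables hold the URI/URL as it was initially requested, that is, before any rewriting. This matters because the rewriting process is primarily used to rewrite logical URLs to physical pathnames.

Example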

SCRIPT_NAME=/sw/lib/w3s/tree/global/u/rse/.www/index.html
SCRIPT_FILENAME=/u/rse/.www/index.html
SCRIPT_URL=/u/rse/
SCRIPT_URI=http://en1.engelschall.com/u/rse/
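
Practical Solutions

The URL Rewriting Guide and the Advanced URL Rewriting Guide collect practical solutions to many URL-based problems. There you can find real-life rulesets.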

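RewriteBase Directive

Description: Sets the base URL for per-directory rewrites
Syntax: RewriteBase URL-path
Default: See usage for information.
Context: directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_rewrite

The RewriteBase directive explicitly sets the base URL for per-directory rewrites. As you will see below, RewriteRule can be used in per-directory config files (.htaccess). There it acts locally: the local directory prefix is stripped before processing, and the rewriting rules are applied only to the remainder. When processing is complete, the prefix is automatically added back to the path. The default setting is "RewriteBase physical-directory-path".

When a substitution occurs for a new URL, this module has to re-inject the URL into the server processing. To do this, it needs to know the corresponding URL-prefix, or URL-base. By default this prefix is the corresponding filepath itself. However, for most websites, URLs are not directly related to physical filename paths, so this assumption will often be wrong. In that case, use the RewriteBase directive to specify the correct URL-prefix.

If your webserver's URLs are not directly related to physical file paths, you will need to use RewriteBase in every .htaccess file where you want to use RewriteRule directives.

For example, assume the following per-directory config file: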

#
#  /abc/def/.htaccess -- per-dir config file for directory /abc/def
#  Remember: /abc/def is the physical path of /xyz, i.e., the server
#            has an 'Alias /xyz /abc/def' directive, for example.
#

RewriteEngine On

#  let the server know that we were reached via /xyz and not
#  via the physical path prefix /abc/def
RewriteBase   /xyz

#  now the rewriting rules
RewriteRule   ^oldstuff\.html$  newstuff.html

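In the above example, a request to /xyz/oldstuff.html gets correctly rewritten to the physical file /abc/def/newstuff.html.

For Apache Hackers

The following list gives detailed information about the internal processing steps: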

Request:
  /xyz/oldstuff.html

Internal Processing:
  /xyz/oldstuff.html     -> /abc/def/oldstuff.html  (per-server Alias)
  /abc/def/oldstuff.html -> /abc/def/newstuff.html  (per-dir    RewriteRule)
  /abc/def/newstuff.html -> /xyz/newstuff.html      (per-dir    RewriteBase)
  /xyz/newstuff.html     -> /abc/def/newstuff.html  (per-server Alias)

Result:
  /abc/def/newstuff.html

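This seems very complicated, but it is in fact how Apache processes the request internally. Because per-directory rewriting comes too late in the process, the rewritten request has to be re-injected into the Apache kernel. The overhead is not as serious as it may seem: the re-injection happens entirely inside the Apache server, and the same procedure is used by many other operations within Apache.

RewriteCond Directive

Description: Defines a condition under which rewriting will take place
Syntax: RewriteCond TestString CondPattern
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_rewrite

The RewriteCond directive defines a rule condition. One or more RewriteCond directives can precede a RewriteRule directive. The rule that follows is then used only if the current state of the URI matches its pattern and these conditions are met.

TestString is a string which can contain the following expanded constructs in addition to plain text: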

Special Notes:

  1. The variables SCRIPT_FILENAME and REQUEST_FILENAME contain the same value, i.e., the value of the filename field of the internal request_rec structure of the Apache server. The first name is just the commonly known CGI variable name while the second is the consistent counterpart to REQUEST_URI (which contains the value of the uri field of request_rec).
  2. There is the special format: %{ENV:variable} where variable can be any environment variable. This is looked-up via internal Apache structures and (if not found there) via getenv() from the Apache server process.
  3. There is the special format: %{SSL:variable} where variable is the name of an SSL environment variable; this can be used whether or not mod_ssl is loaded, but will always expand to the empty string if it is not. Example: %{SSL:SSL_CIPHER_USEKEYSIZE} may expand to 128.
  4. There is the special format: %{HTTP:header} where header can be any HTTP MIME-header name. This is looked-up from the HTTP request. Example: %{HTTP:Proxy-Connection} is the value of the HTTP header "Proxy-Connection:".
  5. There is the special format %{LA-U:variable} for look-aheads which perform an internal (URL-based) sub-request to determine the final value of variable. Use this when you want to use a variable for rewriting which is actually set later in an API phase and thus is not available at the current stage. For instance, when you want to rewrite according to the REMOTE_USER variable from within the per-server context (httpd.conf file), you have to use %{LA-U:REMOTE_USER}, because this variable is set by the authorization phases which come after the URL translation phase where mod_rewrite operates. On the other hand, because mod_rewrite implements its per-directory context (.htaccess file) via the Fixup phase of the API, and because the authorization phases come before this phase, you can just use %{REMOTE_USER} there.
  6. There is the special format: %{LA-F:variable} which performs an internal (filename-based) sub-request to determine the final value of variable. Most of the time this is the same as LA-U above.
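As a rough sketch of notes 4 and 5, the fragment below uses the %{HTTP:header} and %{LA-U:variable} formats in per-server context. The /private/ and /home/ paths are hypothetical, and the second rule assumes that authentication is configured for the URL in question so that REMOTE_USER can be determined by the look-ahead sub-request.

```apache
RewriteEngine On

# %{HTTP:header}: refuse requests that carry a non-empty Proxy-Connection header
RewriteCond %{HTTP:Proxy-Connection} !^$
RewriteRule ^/ - [F]

# %{LA-U:variable}: in per-server context REMOTE_USER is not yet set,
# so its final value is looked up through an internal sub-request
RewriteCond %{LA-U:REMOTE_USER} ^(.+)$
RewriteRule ^/private/(.*)$ /home/%1/$1
```

In a .htaccess file the second condition could test %{REMOTE_USER} directly, since the authorization phases have already run by then.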

CondPattern is the condition pattern, i.e., a regular expression which is applied to the current instance of the TestString, i.e., TestString is evaluated and then matched against CondPattern.

Remember: CondPattern is a perl compatible regular expression with some additions:

  1. You can prefix the pattern string with a '!' character (exclamation mark) to specify a non-matching pattern.
  2. There are some special variants of CondPatterns. Instead of real regular expression strings you can also use one of the following:
    • '<CondPattern' (is lexically lower)
      Treats the CondPattern as a plain string and compares it lexically to TestString. True if TestString is lexically lower than CondPattern.
    • '>CondPattern' (is lexically greater)
      Treats the CondPattern as a plain string and compares it lexically to TestString. True if TestString is lexically greater than CondPattern.
    • '=CondPattern' (is lexically equal)
      Treats the CondPattern as a plain string and compares it lexically to TestString. True if TestString is lexically equal to CondPattern, i.e., the two strings are exactly equal (character by character). If CondPattern is just "" (two quotation marks) this compares TestString to the empty string.
    • '-d' (is directory)
      Treats the TestString as a pathname and tests if it exists and is a directory.
    • '-f' (is regular file)
      Treats the TestString as a pathname and tests if it exists and is a regular file.
    • '-s' (is regular file with size)
      Treats the TestString as a pathname and tests if it exists and is a regular file with size greater than zero.
    • '-l' (is symbolic link)
      Treats the TestString as a pathname and tests if it exists and is a symbolic link.
    • '-x' (has executable permissions)
      Treats the TestString as a pathname and tests if it exists and has execution permissions. These permissions are determined depending on the underlying OS.
    • '-F' (is existing file via subrequest)
      Checks if TestString is a valid file and accessible via all the server's currently-configured access controls for that path. This uses an internal subrequest to determine the check, so use it with care because it decreases your server's performance!
    • '-U' (is existing URL via subrequest)
      Checks if TestString is a valid URL and accessible via all the server's currently-configured access controls for that path. This uses an internal subrequest to determine the check, so use it with care because it decreases your server's performance!

    Notice

    All of these tests can also be prefixed by an exclamation mark ('!') to negate their meaning.
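The file tests and their negation might be combined as in the following sketch, intended for a per-directory (.htaccess) file. The fallback document index.html is a hypothetical name.

```apache
RewriteEngine On

# Apply the rule only when the requested path is neither an existing
# regular file (-f) nor an existing directory (-d); the leading '!'
# negates each test
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.html [L]
```

Because -f and -d are plain filesystem checks, they avoid the subrequest cost that -F and -U incur.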

Additionally you can set special flags for CondPattern by appending

[flags]

as the third argument to the RewriteCond directive. Flags is a comma-separated list of the following flags:

Example:

To rewrite the Homepage of a site according to the "User-Agent:" header of the request, you can use the following:

RewriteCond  %{HTTP_USER_AGENT}  ^Mozilla.*
RewriteRule  ^/$                 /homepage.max.html  [L]

RewriteCond  %{HTTP_USER_AGENT}  ^Lynx.*
RewriteRule  ^/$                 /homepage.min.html  [L]

RewriteRule  ^/$                 /homepage.std.html  [L]

Interpretation: If you use Netscape Navigator as your browser (which identifies itself as 'Mozilla'), then you get the max homepage, which includes Frames, etc. If you use the Lynx browser (which is Terminal-based), then you get the min homepage, which contains no images, no tables, etc. If you use any other browser you get the standard homepage.

RewriteEngine Directive

Description: Enables or disables runtime rewriting engine
Syntax: RewriteEngine on|off
Default: RewriteEngine off
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_rewrite

The RewriteEngine directive enables or disables the runtime rewriting engine. If it is set to off this module does no runtime processing at all. It does not even update the SCRIPT_URx environment variables.

Use this directive to disable the module instead of commenting out all the RewriteRule directives!

Note that, by default, rewrite configurations are not inherited. This means that you need to have a RewriteEngine on directive for each virtual host in which you wish to use it.
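Since rewrite configurations are not inherited by default, each virtual host needs its own switch. A minimal sketch, assuming a hypothetical host name and hypothetical URL paths:

```apache
<VirtualHost *:80>
    ServerName www.example.com

    # The engine must be enabled here even if the main server enables it too
    RewriteEngine On
    RewriteRule ^/old/(.*)$ /new/$1 [L]
</VirtualHost>
```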

RewriteLock Directive

Description: Sets the name of the lock file used for RewriteMap synchronization
Syntax: RewriteLock file-path
Context: server config
Status: Extension
Module: mod_rewrite

This directive sets the filename for a synchronization lockfile which mod_rewrite needs to communicate with RewriteMap programs. Set this lockfile to a local path (not on a NFS-mounted device) when you want to use a rewriting map-program. It is not required for other types of rewriting maps.
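A possible pairing of the lockfile with a map program is sketched below. Both paths and the map name usermap are hypothetical; the lockfile location only needs to be on a local filesystem.

```apache
# Lockfile on a local disk (not NFS), needed only because a prg: map is used
RewriteLock /var/lock/apache/rewrite.lock

# External rewriting program that answers lookups for the map "usermap"
RewriteMap usermap prg:/usr/local/bin/usermap.pl
```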

RewriteLog Directive

Description: Sets the name of the file used for logging rewrite engine processing
Syntax: RewriteLog file-path
Context: server config, virtual host
Status: Extension
Module: mod_rewrite

The RewriteLog directive sets the name of the file to which the server logs any rewriting actions it performs. If the name does not begin with a slash ('/') then it is assumed to be relative to the Server Root. The directive should occur only once per server config.

To disable the logging of rewriting actions it is not recommended to set Filename to /dev/null, because although the rewriting engine does not then output to a logfile it still creates the logfile output internally. This will slow down the server with no advantage to the administrator! To disable logging either remove or comment out the RewriteLog directive or use RewriteLogLevel 0!

Security

See the Apache Security Tips document for details on why your security could be compromised if the directory where logfiles are stored is writable by anyone other than the user that starts the server.

Example

RewriteLog "/usr/local/var/apache/logs/rewrite.log"

RewriteLogLevel Directive

Description: Sets the verbosity of the log file used by the rewrite engine
Syntax: RewriteLogLevel Level
Default: RewriteLogLevel 0
Context: server config, virtual host
Status: Extension
Module: mod_rewrite

The RewriteLogLevel directive sets the verbosity level of the rewriting logfile. The default level 0 means no logging, while 9 or more means that practically all actions are logged.

To disable the logging of rewriting actions simply set Level to 0. This disables all rewrite action logs.

Using a high value for Level will slow down your Apache server dramatically! Use the rewriting logfile at a Level greater than 2 only for debugging!

Example

RewriteLogLevel 3

RewriteMap Directive

Description: Defines a mapping function for key-lookup
Syntax: RewriteMap MapName MapType:MapSource
Context: server config, virtual host
Status: Extension
Module: mod_rewrite
Compatibility: The choice of different dbm types is available in Apache 2.0.41 and later

The RewriteMap directive defines a Rewriting Map which can be used inside rule substitution strings by the mapping-functions to insert/substitute fields through a key lookup. The source of this lookup can be of various types.

MapName is the name of the map and will be used to specify a mapping-function for the substitution strings of a rewriting rule via one of the following constructs:

${ MapName : LookupKey }
${ MapName : LookupKey | DefaultValue }

When such a construct occurs the map MapName is consulted and the key LookupKey is looked-up. If the key is found, the map-function construct is substituted by SubstValue. If the key is not found then it is substituted by DefaultValue or by the empty string if no DefaultValue was specified.

For example, you might define a RewriteMap as:

RewriteMap examplemap txt:/path/to/file/map.txt

You would then be able to use this map in a RewriteRule as follows:

RewriteRule ^/ex/(.*) ${examplemap:$1}
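The second construct, with a DefaultValue, could be used as in this sketch. It reuses the examplemap definition from above; the fallback path /notfound.html is a hypothetical name.

```apache
RewriteEngine On
RewriteMap examplemap txt:/path/to/file/map.txt

# If the key captured in $1 is not present in the map, the construct
# expands to the default value after the '|' instead of the empty string
RewriteRule ^/ex/(.*)$ ${examplemap:$1|/notfound.html}
```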

The following combinations for MapType and MapSource can be used:

The RewriteMap directive can occur more than once. For each mapping-function use one RewriteMap directive to declare its rewriting mapfile. While you cannot declare a map in per-directory context it is of course possible to use this map in per-directory context.

Note

For plain text and DBM format files the looked-up keys are cached in-core until the mtime of the mapfile changes or the server does a restart. This way you can have map-functions in rules which are used for every request. This is no problem, because the external lookup only happens once!

RewriteOptions Directive

Description: Sets some special options for the rewrite engine
Syntax: RewriteOptions Options
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_rewrite
Compatibility: MaxRedirects is no longer available in version 2.1 and later

The RewriteOptions directive sets some special options for the current per-server or per-directory configuration. Currently the Option string can only be:

inherit
This forces the current configuration to inherit the configuration of the parent. In per-virtual-server context this means that the maps, conditions and rules of the main server are inherited. In per-directory context this means that conditions and rules of the parent directory's .htaccess configuration are inherited.
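
A minimal sketch of inheritance in per-virtual-server context follows. The host name and the URL paths are hypothetical.

```apache
# Main server configuration
RewriteEngine On
RewriteRule ^/legacy/(.*)$ /current/$1

<VirtualHost *:80>
    ServerName www.example.com
    RewriteEngine On

    # Pull in the maps, conditions and rules defined for the main server
    RewriteOptions inherit
</VirtualHost>
```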

RewriteRule Directive

Description: Defines rules for the rewriting engine
Syntax: RewriteRule Pattern Substitution
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_rewrite
Compatibility: The cookie-flag is available in Apache 2.0.40 and later

The RewriteRule directive is the real rewriting workhorse. The directive can occur more than once. Each directive then defines one single rewriting rule. The definition order of these rules is important, because this order is used when applying the rules at run-time.

Pattern is a perl compatible regular expression which gets applied to the current URL. Here "current" means the value of the URL when this rule gets applied. This may not be the originally requested URL, because any number of rules may already have matched and made alterations to it.

Some hints about the syntax of regular expressions:

Text:
  .           Any single character
  [chars]     Character class: One  of chars
  [^chars]    Character class: None of chars
  text1|text2 Alternative: text1 or text2

Quantifiers:
  ?           0 or 1 of the preceding text
  *           0 or N of the preceding text (N > 0)
  +           1 or N of the preceding text (N > 1)

Grouping:
  (text)      Grouping of text
              (either to set the borders of an alternative or
              for making backreferences where the Nth group can 
              be used on the RHS of a RewriteRule with $N)

Anchors:
  ^           Start of line anchor
  $           End   of line anchor

Escaping:
  \char       escape that particular char
              (for instance to specify the chars ".[]()" etc.)

For more information about regular expressions have a look at the perl regular expression manpage ("perldoc perlre"). If you are interested in more detailed information about regular expressions and their variants (POSIX regex etc.) have a look at the following dedicated book on this topic:

Mastering Regular Expressions, 2nd Edition
Jeffrey E.F. Friedl
O'Reilly & Associates, Inc. 2002
ISBN 0-596-00289-0

Additionally in mod_rewrite the NOT character ('!') is a possible pattern prefix. This gives you the ability to negate a pattern; to say, for instance: "if the current URL does NOT match this pattern". This can be used for exceptional cases, where it is easier to match the negative pattern, or as a last default rule.

Notice

When using the NOT character to negate a pattern you cannot have grouped wildcard parts in the pattern. This is impossible because when the pattern does NOT match, there are no contents for the groups. In consequence, if negated patterns are used, you cannot use $N in the substitution string!
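
A negated pattern used as a catch-all might look like the following sketch in per-server context. The /static/ prefix and the dispatch.cgi script are hypothetical names.

```apache
RewriteEngine On

# Matches every URL that does NOT begin with /static/.
# The substitution is a fixed string, since $N is unavailable
# when the pattern is negated
RewriteRule !^/static/ /dispatch.cgi [L]
```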

Substitution of a rewriting rule is the string which is substituted for (or replaces) the original URL for which Pattern matched. Beside plain text you can use

  1. back-references $N to the RewriteRule pattern
  2. back-references %N to the last matched RewriteCond pattern
  3. server-variables as in rule condition test-strings (%{VARNAME})
  4. mapping-function calls (${mapname:key|default})

Back-references are $N (N=0..9) identifiers which will be replaced by the contents of the Nth group of the matched Pattern. The server-variables are the same as for the TestString of a RewriteCond directive. The mapping-functions come from the RewriteMap directive and are explained there. These three types of variables are expanded in the order of the above list.

As already mentioned above, all the rewriting rules are applied to the Substitution (in the order of definition in the config file). The URL is completely replaced by the Substitution and the rewriting process goes on until there are no more rules unless explicitly terminated by a L flag - see below.

There is a special substitution string named '-' which means: NO substitution! Sounds silly? No, it is useful to provide rewriting rules which only match some URLs but do no substitution, for instance in conjunction with the C (chain) flag, to be able to have more than one pattern applied before a substitution occurs.
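
The combination of '-' with the C flag can be sketched as follows, in per-server context. The /docs/ prefix and the render.cgi script are hypothetical.

```apache
RewriteEngine On

# First pattern: matches URLs under /docs/ but substitutes nothing.
# [C] chains it to the next rule, which is skipped if this one fails
RewriteRule ^/docs/ - [C]

# Second pattern: tested against the still unchanged URL; the
# substitution happens only when both patterns have matched
RewriteRule \.txt$ /cgi-bin/render.cgi [L]
```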

Query String

Pattern will not match against the query string. Instead, you must use a RewriteCond with the %{QUERY_STRING} variable. You can, however, create URLs in the substitution string containing a query string part. Just use a question mark inside the substitution string to indicate that the following stuff should be re-injected into the query string. When you want to erase an existing query string, end the substitution string with just the question mark. To combine a new query string with an old one, use the [QSA] flag (see below).
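
The three query-string cases described here could be written as in this sketch for per-server context. The script names and URL paths are hypothetical.

```apache
RewriteEngine On

# Pattern sees only the path, so the query string is tested in a RewriteCond;
# %1 refers to the group captured by that condition.
# The trailing '?' erases the original query string
RewriteCond %{QUERY_STRING} ^id=([0-9]+)$
RewriteRule ^/item\.cgi$ /items/%1? [R,L]

# A '?' inside the substitution starts a new query string;
# [QSA] appends the original query string to it
RewriteRule ^/search/(.*)$ /find.cgi?term=$1 [QSA,L]
```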

Substitution of Absolute URLs

There is a special feature: When you prefix a substitution field with http://thishost[:thisport] then mod_rewrite automatically strips it out. This auto-reduction on implicit external redirect URLs is a useful and important feature when used in combination with a mapping-function which generates the hostname part. Have a look at the first example in the example section below to understand this.

Remember: An unconditional external redirect to your own server will not work with the prefix http://thishost because of this feature. To achieve such a self-redirect, you have to use the R-flag (see below).
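
A self-redirect along these lines might be written as below. The file names are hypothetical.

```apache
RewriteEngine On

# A bare http://thishost/new.html substitution would be reduced to /new.html
# and handled internally; the R flag is what forces the external redirect
RewriteRule ^/old\.html$ /new.html [R]
```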

Additionally you can set special flags for Substitution by appending

[flags]

as the third argument to the RewriteRule directive. Flags is a comma-separated list of the following flags:

Note

Never forget that Pattern is applied to a complete URL in per-server configuration files. But in per-directory configuration files, the per-directory prefix (which always is the same for a specific directory!) is automatically removed for the pattern matching and automatically added after the substitution has been done. This feature is essential for many sorts of rewriting, because without this prefix stripping you have to match the parent directory which is not always possible.

There is one exception: If a substitution string starts with "http://" then the directory prefix will not be added and an external redirect or proxy throughput (if flag P is used!) is forced!

Note

To enable the rewriting engine for per-directory configuration files you need to set "RewriteEngine On" in these files, and "Options FollowSymLinks" must be enabled. If your administrator has disabled override of FollowSymLinks for a user's directory, then you cannot use the rewriting engine. This restriction is needed for security reasons.
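
A minimal per-directory file that satisfies both requirements is sketched below. It assumes the server configuration permits these directives in .htaccess files (for example through an AllowOverride setting that includes Options and FileInfo); the file names are hypothetical.

```apache
# .htaccess
Options +FollowSymLinks
RewriteEngine On

# Per-directory pattern: the directory prefix is already stripped,
# so the pattern has no leading slash
RewriteRule ^old\.html$ new.html
```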

Here are all possible substitution combinations and their meanings:

Inside per-server configuration (httpd.conf)
for request "GET /somepath/pathinfo":

Given Rule                                      Resulting Substitution
----------------------------------------------  ----------------------------------
^/somepath(.*) otherpath$1                      not supported, because invalid!

^/somepath(.*) otherpath$1  [R]                 not supported, because invalid!

^/somepath(.*) otherpath$1  [P]                 not supported, because invalid!
----------------------------------------------  ----------------------------------
^/somepath(.*) /otherpath$1                     /otherpath/pathinfo

^/somepath(.*) /otherpath$1 [R]                 http://thishost/otherpath/pathinfo
                                                via external redirection

^/somepath(.*) /otherpath$1 [P]                 not supported, because silly!
----------------------------------------------  ----------------------------------
^/somepath(.*) http://thishost/otherpath$1      /otherpath/pathinfo

^/somepath(.*) http://thishost/otherpath$1 [R]  http://thishost/otherpath/pathinfo
                                                via external redirection

^/somepath(.*) http://thishost/otherpath$1 [P]  not supported, because silly!
----------------------------------------------  ----------------------------------
^/somepath(.*) http://otherhost/otherpath$1     http://otherhost/otherpath/pathinfo
                                                via external redirection

^/somepath(.*) http://otherhost/otherpath$1 [R] http://otherhost/otherpath/pathinfo
                                                via external redirection
                                                (the [R] flag is redundant)

^/somepath(.*) http://otherhost/otherpath$1 [P] http://otherhost/otherpath/pathinfo
                                                via internal proxy

Inside per-directory configuration for /somepath
(i.e., file .htaccess in dir /physical/path/to/somepath containing RewriteBase /somepath)
for request "GET /somepath/localpath/pathinfo":

Given Rule                                      Resulting Substitution
----------------------------------------------  ----------------------------------
^localpath(.*) otherpath$1                      /somepath/otherpath/pathinfo

^localpath(.*) otherpath$1  [R]                 http://thishost/somepath/otherpath/pathinfo
                                                via external redirection

^localpath(.*) otherpath$1  [P]                 not supported, because silly!
----------------------------------------------  ----------------------------------
^localpath(.*) /otherpath$1                     /otherpath/pathinfo

^localpath(.*) /otherpath$1 [R]                 http://thishost/otherpath/pathinfo
                                                via external redirection

^localpath(.*) /otherpath$1 [P]                 not supported, because silly!
----------------------------------------------  ----------------------------------
^localpath(.*) http://thishost/otherpath$1      /otherpath/pathinfo

^localpath(.*) http://thishost/otherpath$1 [R]  http://thishost/otherpath/pathinfo
                                                via external redirection

^localpath(.*) http://thishost/otherpath$1 [P]  not supported, because silly!
----------------------------------------------  ----------------------------------
^localpath(.*) http://otherhost/otherpath$1     http://otherhost/otherpath/pathinfo
                                                via external redirection

^localpath(.*) http://otherhost/otherpath$1 [R] http://otherhost/otherpath/pathinfo
                                                via external redirection
                                                (the [R] flag is redundant)

^localpath(.*) http://otherhost/otherpath$1 [P] http://otherhost/otherpath/pathinfo
                                                via internal proxy

Example:

We want to rewrite URLs of the form

/ Language /~ Realname /.../ File

into

/u/ Username /.../ File . Language

We take the rewrite mapfile from above and save it under /path/to/file/map.txt. Then we only have to add the following lines to the Apache server configuration file:

RewriteLog   /path/to/file/rewrite.log
RewriteMap   real-to-user               txt:/path/to/file/map.txt
RewriteRule  ^/([^/]+)/~([^/]+)/(.*)$   /u/${real-to-user:$2|nobody}/$3.$1