.htaccess Files
Table of Contents
- Introduction
- Error Documents
- Limiting Access by IP
- Unmasking Proxy Users
- GeoIP Data
- Classic Authentication
- Enhanced Authentication
- MySQL Based Authentication
- Hiding Files
- Redirections
- Adding MIME Types
- Preventing Image Theft
- Speed Limits
- Preventing Prefetching
- Preventing Caching
- Overriding PHP Options
- Rewriting URLs (mod_rewrite)
- User Tracking (mod_usertrack)
- Auto Indexing (mod_autoindex)
- WebDAV Support
- Content Negotiation
- Treating .htm* Files as PHP Files
Introduction
An .htaccess file (pronounced "H-T-Access", or sometimes "Dot-H-T-Access") is a web server configuration file that lets you change the way the web server works for certain parts of your web site. It is a plain text file and is always named ".htaccess" (note the dot at the front).
.htaccess files can be used to control many things, like restricting access to part of your web site based on a username/password or IP address, providing custom error messages, controlling bandwidth consumption, and more.
The web server looks for an .htaccess file in the same directory as the requested file. If one is not found, it then moves up to the parent directory and checks for one there. It continues backing up until it either finds an .htaccess file or reaches the root of your web space. So if you place a single .htaccess file in your root directory it will apply to all files in your entire web site and there is no need to replicate it in each directory. If you then place an .htaccess file within one particular subdirectory, it will apply to those files only and the higher level .htaccess file is ignored for them.
Be careful when making changes to your .htaccess files. Adding an unrecognised command will result in a "500 Internal Server Error" page for all files in the directory until you fix it.
NOTE: Many of these features were created by us and can only be found at Islandnet.com. These features are marked with an Islandnet.com logo.
Error Documents
The most common use for .htaccess files is to specify an alternate message for various errors. For example, if someone tries to access a file that doesn't exist, the web server will display the classic (and generally useless and ugly) "404 file not found" error. It is nice to be able to provide a better looking error page that has your site's look and feel along with a more descriptive error, and possibly some links to a search tool or site map.
Different types of error conditions have different error code numbers. A "file not found" error is number 404. An internal server error (like a CGI script that exits prematurely) is number 500. Here is a list of the more common ones that you might want to provide an error page for (there are many others, but in general there is little or no point in specifying alternate error pages for them):
| 400 | Bad Request | This is generally only seen if someone is using something other than a legitimate web browser to access your site. The request they made does not conform to the proper HTTP format. |
| 403 | Forbidden | This one means that the request cannot be serviced because the server configuration prohibits it. For example, if you were to create a web page that had invalid or missing file permissions, the web server would be unable to read it. |
| 404 | Not Found | This one is quite common. It occurs when someone asks for a file that does not exist. This could be due to a bad link, or a typo on the user's part when entering your URL, etc. |
| 405 | Method Not Allowed | Web servers can accept a variety of commands, with the most common being GET, POST, and HEAD (to get a document, post to a form, or get the headers for a document). There are others that are less common or not supported. This error is given whenever an unsupported (or nonexistent) request type is encountered. |
| 500 | Internal Server Error | Despite the name, this is seldom caused by a server problem (although it can be). It is usually caused by a CGI script that exits without generating any output, or an .htaccess file that contains invalid commands. |
| 503 | Service Unavailable | This error is generated in the event that the web server is overloaded or unable to access certain resources. It happens if the server load exceeds a certain threshold, or if the web server sees too many errors from a specific client, or when a bandwidth or hit limit that you've configured is reached, etc. |
To provide an alternate error message for any particular error type, you simply need to add one line to your .htaccess file that looks like this:
ErrorDocument 404 /404.php
Instead of specifying a file you could provide a URL (which could be anywhere; it doesn't even need to be part of your site):
ErrorDocument 404 http://some.other.site.com
You can also provide the raw text of the error message:
ErrorDocument 404 "<h1>404 Error</h1> The file you asked for does not exist
Note the quote at the beginning of the text. It is necessary; otherwise the web server will try to treat the text as a filename. There should not be a matching quote at the end of the text.
A useful technique is to point error documents at a PHP script which can in turn generate a custom error message that takes into account the actual filename that was requested (it's available via various global variables and/or environment variables). The script may even be able to figure out the real document and redirect the visitor to it.
Some versions of Microsoft's Internet Explorer will ignore your custom error document if it's less than 512 bytes in size, and display their own error screen instead. To get around this, make sure your error page is at least 512 bytes in size.
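As a rough sketch of how this might look in practice, the following lines point several common errors at custom pages. The paths /errors/notfound.php, /errors/forbidden.html, and /errors/oops.html are placeholders chosen for illustration, not files that already exist on your site.

```apache
# Send "not found" errors to a script that can inspect the requested URL.
ErrorDocument 404 /errors/notfound.php
# Static pages are enough for errors that need no special handling.
ErrorDocument 403 /errors/forbidden.html
ErrorDocument 500 /errors/oops.html
```

Stock Apache normally passes the originally requested path to a local error document in the REDIRECT_URL environment variable, but it is worth checking what your script actually receives.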
Limiting Access by IP
It is sometimes necessary to block certain visitors from accessing your site. You can do this by adding access rules to your .htaccess file. For example, if you wanted to block accesses from 1.2.3.4 specifically, as well as 5.6.7.*, but allow everyone else, you could add these lines:
order deny,allow
deny from 1.2.3.4
deny from 5.6.7.
Or, if you wanted to allow access only from 5.6.7.* and deny everyone else, you could do this:
order allow,deny
allow from 5.6.7.
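The actual order of the individual "allow" and "deny" lines is meaningless. The "order" line tells the server whether to evaluate all the "deny" rules first or all the "allow" rules first. If a visitor matches both kinds of rule, the kind evaluated last "wins". If a visitor matches neither, "order deny,allow" lets them in and "order allow,deny" keeps them out.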
Note that the IP address used for this test is that of the machine that connects to the server. In the event that the visitor is behind one or more proxy servers, this may not have the effect you want. For example, blocking a proxy server will certainly stop a specific computer from accessing your site, but it will also block anyone else who happens to use that same proxy server. For a way to block the visitor's "real" IP address, read the next section.
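To illustrate how the "order" line interacts with overlapping rules, here is a sketch that blocks the 5.6.7.* range but lets one address inside it through. It assumes standard Apache evaluation, where "order deny,allow" checks the deny rules first and lets a matching allow rule override them. The address 5.6.7.8 is made up for the example.

```apache
# Evaluate the deny rules first, then the allow rules.
order deny,allow
# Block the whole 5.6.7.* range...
deny from 5.6.7.
# ...except this one (hypothetical) address, which the allow rule lets back in.
allow from 5.6.7.8
```

Visitors that match neither rule are still allowed, because "order deny,allow" permits anything that is not explicitly denied.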
Unmasking Proxy Users 
Whenever you refer to the visitor's IP address, by using the REMOTE_ADDR environment variable for example, you often don't get the visitor's real IP address. If they are accessing your site through a proxy server then the IP you actually see on this side is the proxy's, not the user's.
Consider this fairly common scenario: you have a discussion forum on your web site and someone keeps signing up and making abusive postings. You finally ban their IP address, but find that you've actually wound up banning all users from a particular ISP because the "bad guy" was using their proxy server and your forum software isn't smart enough to figure out their real IP address.
While the "correct" thing to do is have the forum software do its own IP sleuthing, this isn't always possible. So we've created an option that causes the web server to replace the IP address, when possible, with the "real" address. To do this you need only add one line to your .htaccess file:
UseRealIP on
This will affect all web pages and scripts that rely on the REMOTE_ADDR environment variable, and it also affects your web logs. It also adds a new environment variable called PROXY_ADDR which contains the IP address that was replaced.
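As a sketch of how this could be combined with the IP limiting rules from the previous section, the lines below ban a single visitor rather than their whole proxy. This assumes the access rules are checked against the replaced address, as the previous section suggests; 1.2.3.4 is just the sample address used earlier.

```apache
# Replace the proxy's address with the visitor's own address when it can be determined.
UseRealIP on
# The ban below is then checked against the visitor's address rather than the proxy's.
order deny,allow
deny from 1.2.3.4
```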
GeoIP Data 
If you enable this feature, it will look up the visitor's IP address in the GeoIP database and set some environment variables that you can then use in your web pages and scripts. To enable GeoIP lookups simply add this line to your .htaccess file:
GeoIP on
You will now have several extra environment variables available to you:
GEOIP_COUNTRY_CODE is the two character country code (CA for Canada, JP for Japan, and so on).
GEOIP_COUNTRY_CODE3 is the three character country code (CAN for Canada, JPN for Japan, and so on).
GEOIP_COUNTRY_NAME is the full name of the country (e.g. "Canada", "Japan", "United States", etc.).
GEOIP_REGION is the region (province, state, etc.) if it is known, or blank if it is not known.
GEOIP_CITY is the city name if it is known, or blank if it is not known.
GEOIP_POSTALCODE is the postal code if it is known, or blank if it is not known.
GEOIP_METROCODE is the metro code number if it is known, or 0 if it is not known.
GEOIP_AREACODE is the area code number if it is known, or 0 if it is not known.
GEOIP_LATITUDE is the latitude in decimal form if it is known, or 0 if it is not known.
GEOIP_LONGITUDE is the longitude in decimal form if it is known, or 0 if it is not known.
Classic Authentication
It is possible to require a valid username and password to be entered by the visitor before allowing access to parts of your site. The standard way of doing this is a little complicated, but it's supported to maintain compatibility. See the next section for some easier ways to do the same thing.
To protect a directory of files you need to add these lines to your .htaccess file:
AuthType Basic
AuthName "Realm Name"
AuthUserFile .htpass
Require valid-user
The first line says that you want to use "basic" style authentication. There is only one other style, which is "digest", but that is not currently supported.
The second line specifies the "realm". Basically, this is a brief message that will appear as part of the browser's username/password prompt. Unfortunately, this is about the only customizing you can do. Everything else, like the layout, colours, and any other text that appears is entirely up to the browser and will vary quite a bit.
The third line indicates the file that contains a list of username/password pairs. In this example we've named it ".htpass" but it could be anything you like (just make sure it's not a file that people can view with their web browser!). You must create and maintain this file. It is a plain ASCII file, and each line must contain exactly one username and password, separated by a colon. The tricky bit is that the password must be encrypted (and this is where it gets a little complicated, since it means you usually have to have access to a special script to generate the password entries). Normally you must provide the complete path to the file, which isn't something you'll generally know, so we've modified things so that if the filename does not start with a slash, then the file is assumed to be in (or relative to) the root directory of your web site.
The fourth line says that any valid user from the file is allowed access, assuming they enter the right password. Instead of "valid-user" you could put something like "user john bob jane", which means that only the users "john", "bob", or "jane" can access this info, even if some other valid username/password is entered.
Here is a sample user file so you know what it looks like. Note that the passwords are encrypted:
john:6omti8lPO9UOo
sally:ypL4t..gZV3X6
It is also possible to grant access based on "groups". You could, for example, create a group named "staff" and specify which users belong to that group, and another one named "sales" for just the sales staff, and so on. To do this you must first create a plain text file, similar to the user file, but each line starts with a group name, then a colon, then one or more usernames separated by spaces. You would then add this line to your .htaccess file:
AuthGroupFile .htgroup
Here is a sample group file:
staff: john jane sam sally
sales: john sally
support: jane sam
Then in your .htaccess file you would change the Require line:
Require group staff
That would mean that after entering a successful username and password (based on the user file), access is only granted if the username also happens to be in the "staff" group.
Note that groups refer to users that have previously been defined in the user file. You do not need to provide a password for a group.
Note that successful authentications are cached for several minutes to minimize the work the server has to do and speed things up. Changing or deleting a password may not have an immediate effect.
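Putting the pieces of this section together, a complete group-protected .htaccess file might look like the sketch below. The realm name "Staff Only" is an arbitrary choice; the file names and the group come from the samples above.

```apache
AuthType Basic
# Text shown in the browser's username/password prompt.
AuthName "Staff Only"
# Usernames and encrypted passwords, one pair per line.
AuthUserFile .htpass
# Group names and the users that belong to each.
AuthGroupFile .htgroup
# Only users listed in the "staff" group are let in.
Require group staff
```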
Enhanced Authentication 
Classic password protection requires you to maintain a separate file that contains encrypted passwords. While maintaining backward compatibility, we have enhanced this feature by adding several options that make password protection setup even easier.
For starters, it is no longer necessary for passwords in the user file to be encrypted. If you'd prefer, you can leave the passwords in plain text, or encrypt them using other methods besides the classic UNIX style encryption (like Base64, MD5, and SHA1). To use such a user file, all you need to do is add an option to your AuthUserFile line that tells the server what format the passwords are in. For example:
AuthUserFile .htpass plain
You can specify "plain" for plain text (no encryption) passwords, "base64", "crypt", "md5", or "sha1". Except for "plain", all the others require you to encrypt the password. "base64" is the least secure (and in fact isn't really a form of encryption), with "crypt", "md5", and "sha1" increasingly secure.
As an example, here is the same user file from the previous section, but with plain text passwords:
john:secret
sally:letmein
And here it is again, but with MD5 passwords:
john:5ebe2294ecd0e0f08eab7690d2a6ee69
sally:0d107d09f5bbe40cade3de5c71e9e9b7
In most cases people only have a few username/password entries, so it's more convenient to list them right inside the .htaccess file itself instead of in an external file. Instead of having an "AuthUserFile" line you could have one or more "AuthUser" lines like this (with plain text passwords):
AuthUser john secret plain
AuthUser sally letmein plain
Or with MD5 passwords:
AuthUser john 5ebe2294ecd0e0f08eab7690d2a6ee69 md5
AuthUser sally 0d107d09f5bbe40cade3de5c71e9e9b7 md5
You can even combine both methods by using AuthUserFile and AuthUser in the same .htaccess file. In that case, the inline AuthUser entries are searched first, then the AuthUserFile.
In a similar manner you can create groups without an external file using the AuthGroup command. For example:
AuthGroup staff john jane sam sally
AuthGroup sales john sally
AuthGroup support jane sam
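As a sketch of how the inline commands might fit together, here is a self-contained .htaccess file that needs no external user or group file. It assumes the AuthType, AuthName, and Require lines are still needed, as in classic authentication; the realm name "Sales Area" is made up.

```apache
AuthType Basic
AuthName "Sales Area"
# Inline users with plain text passwords (sample values from above).
AuthUser john secret plain
AuthUser sally letmein plain
# Inline group listing which users belong to "sales".
AuthGroup sales john sally
# Only members of the "sales" group are let in.
Require group sales
```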
Note that successful authentications are cached for several minutes to minimize the work the server has to do and speed things up. Changing or deleting a password may not have an immediate effect.
MySQL Based Authentication 
Instead of (or in addition to) storing usernames and passwords in a text file, or right in the .htaccess file, you also have the option of storing them in a MySQL database. This is particularly useful if you already have a MySQL database of users in place for some other part of your web site (like a forum), or if you have a large number of users.
To consult a MySQL database you need to add several lines to your .htaccess file:
AuthMySQLHost sql1.islandnet.com
AuthMySQLPort 3306
AuthMySQLUsername test
AuthMySQLPassword youwish
AuthMySQLDatabase myusers
AuthMySQLQuery SELECT password FROM users WHERE username='{username}'
AuthMySQLType plain
The first five values provide the information necessary to connect and log into a specific MySQL server. The AuthMySQLPort value is optional and defaults to 3306, which is the standard MySQL port.
The AuthMySQLQuery value is the key one here. This is the SQL query that will be sent to the server. It should return a single field, which is the user's password. Any values enclosed in braces will be replaced prior to performing the query. So '{username}' will be replaced with the actual username entered by the user. You could also refer to such values as '{REMOTE_ADDR}' (to restrict based on IP address), '{HTTP_USER_AGENT}' (for the browser agent value), and so on.
The final line indicates what format the returned password will be in. This is optional, and the default is 'plain' for an unencrypted plain text password. If passwords in your SQL table are stored in Base64, Crypt, MD5, or SHA1 format then you must indicate so here with 'base64', 'crypt', 'md5', or 'sha1'.
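To show how the brace values and the password format might be combined, here is a sketch that only accepts a login from the one address recorded for that user. The "allowed_ip" column is hypothetical, as is the assumption that the table stores MD5 passwords and that the AuthType, AuthName, and Require lines are needed as in classic authentication. The connection values are the samples from above, with the optional port left out.

```apache
AuthType Basic
AuthName "Members Only"
AuthMySQLHost sql1.islandnet.com
AuthMySQLUsername test
AuthMySQLPassword youwish
AuthMySQLDatabase myusers
# "allowed_ip" is a hypothetical column holding the address each user may log in from.
AuthMySQLQuery SELECT password FROM users WHERE username='{username}' AND allowed_ip='{REMOTE_ADDR}'
# Passwords in this hypothetical table are stored as MD5 hashes.
AuthMySQLType md5
Require valid-user
```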
If your SQL table contains passwords that are encrypted using MySQL's PASSWORD() function, then you can use a query like this:
AuthMySQLQuery SELECT '{password}' FROM users WHERE username='{username}' AND password=PASSWORD('{password}')
That will return the password as entered by the visitor, but only if the encrypted form of it matches what is in the table.
Note that successful authentications are cached for several minutes to minimize the work the server has to do and speed things up. Changing or deleting a password may not have an immediate effect.
Hiding Files
It is sometimes necessary to place a file in your web space that isn't meant for public viewing. For example, a PHP script may store a database username and password in a file, and you wouldn't want people to load that file directly. You can block access to specific files via your .htaccess file like this:
<Files database.conf>
order allow,deny
deny from all
</Files>
Note the "order" and "deny" commands - this works the same as the IP Limiting feature described above, and you could be selective by adding IP addresses. The key thing to note here though is that the limiting rules are enclosed in a pair of "Files" tags. The opening tag indicates the file(s) that the following block of rules will apply to. In this example, we are blocking access to the file named "database.conf".
This only blocks people "on the outside" from accessing that file. Your CGI scripts are not bound by .htaccess files and can access the files, as can FTP, etc.
It is fairly common to block access to any files that start with ".ht" (like the .htaccess file itself). In fact, the web server itself does this by default using something like this:
<Files .ht*>
order allow,deny
deny from all
</Files>
Note how you can use the '*' wildcard in filenames. More complex pattern matching can be done with the <FilesMatch ...> tag, which works the same way except that it uses regular expressions instead of simple wildcards.
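For instance, a regular expression makes it easy to hide several kinds of files at once. The sketch below blocks anything ending in .conf, .inc, or .bak; those extensions are picked purely for illustration.

```apache
# Block any file ending in .conf, .inc, or .bak (extensions chosen for illustration).
<FilesMatch "\.(conf|inc|bak)$">
order allow,deny
deny from all
</FilesMatch>
```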
Redirections
There may come a time when you need to redirect certain requests to another URL. For example, you've published an ad somewhere and it contains a typo. Or you've renamed or moved part of your web site. To redirect a specific request you only need to add a single line to your .htaccess file:
Redirect /oldpage.html http://mysite.com/newpage.html
Note that the first value after the Redirect command is the requested file, without the domain name, but the second value is a complete URL. In this example, if someone tried to access "http://mysite.com/oldpage.html" they would be redirected to "http://mysite.com/newpage.html".
By default this command will use a "302 temporary" status code to redirect the visitor. You can optionally insert the keyword "permanent", "temp", "seeother", or "gone" after the Redirect command to return a "301 permanent", "302 temporary", "303 see other", or "410 gone" status (when returning a "gone" status you should leave out the new URL).
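Here is a short sketch of the keyword forms. The file /discontinued.html is a made-up name for a page that has been removed on purpose.

```apache
# Tell browsers and search engines that the page has moved for good (301).
Redirect permanent /oldpage.html http://mysite.com/newpage.html
# Report a page as deliberately removed (410); no target URL is given.
Redirect gone /discontinued.html
```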
Adding MIME Types
While we try to maintain a fairly up-to-date list of known MIME types, you may have a need to serve up files of a particular type that isn't (yet) listed in the master MIME type file. If you do this, the browser that downloads the file may not know what to do with it. For example, if the server didn't know that files that end with ".mp3" were of the MIME type "audio/mpeg" then instead of playing the file after downloading it the user's browser might instead give an error message. You can tell the server about new mime types via your .htaccess file like this:
AddType audio/mpeg mp3
That example tells the server that any files that end with ".mp3" should be sent with a MIME type of "audio/mpeg" (note that all our servers already know about mp3 files).
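You can list as many types as you need, one per line. The sketch below adds a few other well-known MIME types as examples; whether the server already knows about them is something we are not assuming either way.

```apache
# Scalable vector graphics.
AddType image/svg+xml svg
# JSON data files.
AddType application/json json
# WOFF2 web fonts.
AddType font/woff2 woff2
```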
Preventing Image Theft
If you have images on your site that are useful to others, they may link them into their own pages without your knowledge. This means that when anyone views their web site, the images themselves will be downloaded from your site! You can prevent this by ensuring that the "referer" (that's the URL of the web page that referenced the images) is your own and not somebody else's. You can do this by adding these lines to your .htaccess file:
SetEnvIfNoCase Referer "^https?://(www\.)?example\.com/" ok=1
SetEnvIfNoCase Referer "^$" ok=1
<FilesMatch "\.(gif|png|jpe?g)$">
order allow,deny
allow from env=ok
</FilesMatch>
The first line checks to see if the referer (yes I know that's spelled wrong, but that's the official name for it) is "www.example.com" or "example.com" (you would, of course, change this to reflect your own domain name). If it is, it sets the variable "ok" to "1". The second line looks for a blank referer and allows that as well. Then we have a set of limit rules enclosed in a <FilesMatch ...> section that only allow access to image files (.gif, .png, .jpg, or .jpeg) if the variable "ok" is set to "1".
Note that it is possible to do something similar by way of the "mod_rewrite" mechanism (also done via .htaccess files), but this is less portable (not all sites support mod_rewrite) and it's also less efficient.
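For comparison, a mod_rewrite version of the same idea might look like the sketch below. It uses standard Apache mod_rewrite directives and the same example.com placeholder domain, and is meant as an illustration rather than a recommendation.

```apache
RewriteEngine On
# Let requests with no referer through (direct visits, some privacy tools).
RewriteCond %{HTTP_REFERER} !^$
# Let requests referred by your own site through.
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
# Everything else asking for an image gets a 403 Forbidden.
RewriteRule \.(gif|png|jpe?g)$ - [F,NC]
```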
Speed Limits 
This feature allows you to specify certain "speed limits" which give you some control over excessive accesses to your site. Currently there are three limits that you can impose. You can specify the maximum number of bytes per time period that the server should send, the maximum number of hits per time period, and the maximum number of pages per time period.
For example, if you wanted to ensure that traffic to your web site never exceeds one megabyte per minute, you would add a line like this:
bytelimit 1m 60s
The first value is the number of bytes (you can add 'k' to turn it into kilobytes, 'm' for megabytes, or 'g' for gigabytes). The second value is the time period in seconds (and you can add 'm' for minutes, 'h' for hours, or 'd' for days).
When the limit is reached the web server will send a 503 error code (service unavailable) until the time period has passed. You might want to use the ErrorDocument command to specify a better looking error message (but if your goal is to cut down on bandwidth consumption, make sure it's lean and clean!).
In a similar manner you can limit the number of hits. For example, to limit traffic to a maximum of 1000 hits per 30 seconds you would do this:
hitlimit 1000 30s
Note that a "hit" is a request for any file. If you just want to limit page hits (ie: requests for *.html, *.htm, *.shtml, and *.php files) and ignore images, style sheets, and other files, you can do this instead:
pagelimit 500 30s
Please note that this is not always 100% precise. Due to the time involved in handling a request, combined with the fact that multiple requests may occur at the same time, multiplied by the number of servers in the cluster, the limits are only approximate (but never less than what you want).
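The limits can be combined with a custom error page. The sketch below caps traffic at 500 megabytes per day and 500 pages per 30 seconds, and sends a small static page when either limit is hit. The numbers and the /busy.html path are illustrative only.

```apache
# At most 500 megabytes sent per day.
bytelimit 500m 1d
# At most 500 page requests per 30 seconds.
pagelimit 500 30s
# Keep this page small, since it is served while the site is over its limit.
ErrorDocument 503 /busy.html
```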
Preventing Prefetching 
Prefetching is a technique used by some browsers to "accelerate" web access by going ahead and downloading pages, images, and anything else that it thinks the user might click on next while they are busy reading the current page. If they do happen to request those files next, they are already loaded and ready to go.
Sounds great, but what if they don't request those files next? The bandwidth used by the prefetching was wasted. If this happens a lot it might noticeably impact your bandwidth consumption, and possibly your costs.
In most cases it is possible to tell the difference between a regular document request and a prefetch request. In these cases you can choose to deny the download by adding this line to your .htaccess file:
Prefetching Off
Prefetch requests will then get a "403 forbidden" error, but users will still get your page when they click on it normally, so they'll never even know the prefetch was declined.
Overriding PHP Options 
While many PHP settings can be altered at run-time via PHP's ini_set() function, many cannot, and sometimes it's not practical to modify your PHP scripts.
For example, you might install a third-party PHP script that expects "magic_quotes" to be enabled. Problem is, it's disabled on the server by default, so the script won't work properly. While you could modify the script to either enable this feature at run time, or so that it doesn't rely on this setting at all, it's often easier to turn the option on inside your .htaccess file. In this example you would do it by adding this line:
PHPOption magic_quotes_gpc on
As you can see, PHP settings start with the "PHPOption" command, followed by the actual PHP setting you wish to change, then the new value. You may optionally put an equal sign between the setting and the value if you'd like.
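Both forms are shown in the sketch below. The two settings are simply examples of options you might want to change.

```apache
# Hide PHP error messages from visitors.
PHPOption display_errors off
# The same syntax with the optional equal sign.
PHPOption short_open_tag = on
```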
Obviously for security reasons we can't allow people to modify every available PHP setting (for example, disabling safe mode or changing the file upload directory), but here are the ones that you can change:
- allow_call_time_pass_reference
- allow_url_include
- always_populate_raw_post_data
- asp_tags
- auto_detect_line_endings
- display_errors
- implicit_flush
- magic_quotes_gpc
- magic_quotes_runtime
- precision
- register_argc_argv
- register_globals
- register_long_arrays
- short_open_tag
If you'd like to see other PHP options available here, let us know. As long as it doesn't compromise security or performance we'll probably add it.
Note that these options can also be configured via the host manager settings on our helpdesk (plus several other PHP options).
Rewriting URLs (mod_rewrite)
The Apache web server (which we use) comes with an optional module called "mod_rewrite" that is disabled by default. Some ISPs enable it, many do not. We do. This module provides a set of very powerful commands that allow you to rewrite URLs on the fly.
Details on how to use this module are really beyond the scope of this document, so we'll refer you to the official Apache documentation (not for the faint of heart!).
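To give a flavour of what the module does, here is a very small sketch that maps a tidy URL onto a script. The names "products" and "product.php" are hypothetical, and the rule assumes the .htaccess file sits in the root directory of the site.

```apache
RewriteEngine On
# Serve /products/42 from product.php?id=42 without the visitor seeing the real URL.
RewriteRule ^products/([0-9]+)$ product.php?id=$1 [L,QSA]
```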
User Tracking (mod_usertrack)
Another optional Apache module is "mod_usertrack", which allows you to track individuals as they navigate your site through the use of cookies. For details on how to use this module, please see the official Apache documentation.
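A minimal sketch using the module's standard directives might look like this. The cookie name "visitor_id" is an arbitrary choice.

```apache
# Issue a tracking cookie to each visitor.
CookieTracking on
# Name of the cookie (the name "visitor_id" is an arbitrary choice).
CookieName visitor_id
```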
Auto Indexing (mod_autoindex)
When a visitor requests a URL without a specific file name, the default action is to look for a file named "homepage.html", "index.html", or "index.php" (in that order; you can also use ".htm" instead of ".html") and display that. If there is no such file, the user gets a 403 Forbidden error.
However, if you enable the auto indexing option, then the server will display a list of the files in the directory. This is not something you would normally want to enable, but it's handy for situations where you want people to be able to browse a set of files in a manner similar to an FTP server.
To enable this feature you only need to add one line to your .htaccess file:
Options +Indexes
There are several customizations that you can apply, like controlling the sort order of the files, adding a header or footer message, and even adding icons to represent different file types. For these kinds of details, please refer to the official Apache documentation.
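As a starting point, the sketch below turns on a few of these customizations using standard mod_autoindex directives. The file names HEADER.html and FOOTER.html are placeholders for files you would create yourself.

```apache
Options +Indexes
# Show icons, sizes, and dates instead of a bare list of names.
IndexOptions FancyIndexing
# Newest files first.
IndexOrderDefault Descending Date
# Contents of these (hypothetical) files appear above and below the listing.
HeaderName HEADER.html
ReadmeName FOOTER.html
```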
WebDAV Support
The WebDAV protocol is an extension to the normal functionality of a web server which allows client programs to save, rename, and delete files in addition to viewing them. Certain web page authoring tools and calendar programs can use WebDAV for publishing content. To enable WebDAV support within a given directory, simply add one line to your .htaccess file:
DAV on
WARNING: by default this will allow anyone to read/write/delete/rename files in the directory. It is always a good idea to configure authentication so that a username and password are required for WebDAV enabled directories!
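A sketch of a WebDAV directory protected with the classic authentication described earlier might look like this. The realm name "WebDAV Area" is made up, and the .htpass user file is the sample name used in the authentication sections.

```apache
# Enable WebDAV for this directory.
DAV on
# Require a username and password before anything can be read or changed.
AuthType Basic
AuthName "WebDAV Area"
AuthUserFile .htpass
Require valid-user
```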
Content Negotiation
Content negotiation is a nifty feature that lets the web server display different content to the visitor depending on things such as the visitor's preferred language.
For example, if a visitor connects to your site and their browser indicates that they would prefer a Japanese version of your page using a specific character set, it will give them that version instead of the default one (assuming you have prepared a version that matches their request of course).
Or a browser may indicate that it will accept both GIF and JPG images, but it prefers JPG if both are available.
This can be done through the use of a special "map file" that maps certain request preferences to specific files (not to be confused with the "map files" that our email system uses to map email addresses within your domains), or through the use of the "MultiViews" option.
This topic is really beyond the scope of this document. For details, please refer to the official Apache documentation.
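For the curious, a minimal MultiViews sketch looks something like the following. It assumes you have prepared files such as index.html.en and index.html.ja (hypothetical names), and the language mappings may already be defined by the server.

```apache
# Let the server pick among files such as index.html.en and index.html.ja.
Options +MultiViews
# Map the filename suffixes to languages (the server may already define these).
AddLanguage en .en
AddLanguage ja .ja
```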
Treating .htm* Files as PHP
When you want to make a pre-existing site more dynamic and add PHP based features to it, you normally have to rename all your .html and .htm files to .php files so that the web server will treat them as PHP scripts. If the site is large, this can be a very painful process, since it also means updating all the links between pages. In this case it may be easier to get the web server to treat all your .html and .htm files as if they were .php files without renaming them. This can be done by adding one line to your .htaccess file:
AddHandler application/x-httpd-php .html .htm