Articles about Rewriting

Apache: Better blocking with common rules

This is a follow on post to the 'Using Apache to block Spammers' post.
It shows how to use Includes in your Apache configuration to re-use useful rules.

Apache rewrites to control access to PHP files

There are certain PHP files that you want access to but don't want to make public.
Common examples of these are:

  • PHPInfo.php
  • APC.php
  • memcache.php

You also don't really want to deploy these on all of your sites on a server nor have them in your git repositories for sites.

A neat way of dealing with this is to use rewriting in your web server config files (e.g. Apache, NGINX, IIS etc) to do the following:

Why are you looking for my crossdomain.xml?

If you are developing commerce sites and review your logs regularly, chances are you will come across 404 errors looking for crossdomain.xml. We get a lot from the plugins that looks for coupons on e-commerce sites (e.g. Drop Down Deals). In fact you are likely to get them on any sites you develop - but we have seen them more frequently on ecommerce sites.

handling 404 errors on hosted CMS

A general housekeeping task for CMS systems such as Wordpress and Drupal and other websites and good practice to keep your site SEO high is to make sure you are gracefully handling missing pages (404 errors).
One of the routine tasks to carryout is checking for crawl errors in Google Webmaster tools. If you see any missing pages in the list it is worth making sure you have some measures in place to handle these and ideally issue a 301 redirect so that Google and other search engines update their indexes.

Safe guard your web site with routine web log analysis and forensics

Whether you are running Drupal,Wordpress, Expression engine, Joomla or in fact any web site one of the regular tasks you should carryout on your web site is a bit of log analysis. It is often left up to modules, plug ins or someone else to protect your web site until it too late.
We all rely on Google Analytics to tell us about visitors and maybe use our log analysis software (AWStats, Webaliser etc) to report on log entries - but it is always worth using tools locally to dig deeper into your logs. These can range from simple reports on accesses to your site to more detailed forensic analysis of site activity.
By doing this we get to know better how visitors are accessing our site and can uncover some interesting answers to questions such as:

  • How often is Google actually spidering my site?
  • How many errors am I getting and what are they?
  • Who is stealing my content?
  • Is anyone trying to crack my site?

In this post I will briefly cover some useful techniques to analyse you logs and see if any one is abusing your hospitality.

Securing access to files on your website

It is easy to forget that the files in your web site are visible to anyone even if they are not linked to or are not files normally requested. In this post we look at how to use the.htaccess file to control access to your site.

To www or not to www

One of the trends these days is to lose the www. from your domain name.
Arguments for are usability (e.g. you don't have as much typing to do - it is easier to not have to say dubyadubyadubya every time you give out your web address).
There are counter arguments of course to do with cookie control and sites that have sub-domains etc.
What is important though is that you choose one and stick to it. In any event you should decide for one and not allow both.
Supporting both will cause duplicate content in Google and you may suffer in SEO terms.

Path Rewriting and changes to Path Auto

Quick note: On a site I worked on recently I made a change to the Pathauto settings and needed to create a load of Redirects for previous URLs. Fortunately there is a quick way to do this using wildcards in the htaccess.