Articles about Apache
I recently came across a great Ansible tip showing how to use the validate option of the template module, a great way to test configuration before it goes live. However, it doesn't work with Apache (apachectl -t -f): you get a validation error like "Invalid command 'Order', perhaps misspelled or defined by a module not included in the server configuration" for syntax that is perfectly valid. A bit of investigation shows that this is expected and that you have to go a bit further, but I came up with a solution.
I regularly encounter issues arising from server security reviews, log reviews, etc. that provide good examples of how Ansible can be used to respond to an issue.
If you use a utility like Logwatch to keep an eye on activity on your servers, you may occasionally see an entry such as 'Connection attempts using mod_proxy'. Addressing this with Ansible is simple and maintains the state of your security measures.
This is a follow-on post to the 'Using Apache to block Spammers' post.
It shows how to use Includes in your Apache configuration to re-use useful rules.
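A minimal sketch of the idea, assuming Apache 2.4. The file path, host name and address range below are placeholders for illustration and are not taken from the post:

```apache
# --- /etc/apache2/includes/block-spammers.conf (hypothetical file) ---
# Allow everyone except the listed address range.
# 192.0.2.0/24 is a documentation-only range used here as a placeholder.
<Location "/">
    <RequireAll>
        Require all granted
        Require not ip 192.0.2.0/24
    </RequireAll>
</Location>

# --- in each virtual host that should share the rules ---
<VirtualHost *:80>
    ServerName www.example.com
    Include /etc/apache2/includes/block-spammers.conf
</VirtualHost>
```

Keeping the rules in one included file means a new address only has to be added in one place for every site that includes it.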
Log files can fill up with repeated requests for files such as favicon, robots.txt, images, CSS, JS, etc.
This can be a pain when you need to scan the logs for issues and they are full of unimportant requests.
This is especially so if you use Ultimate Cron in Drupal and run cron every minute - the logs get swamped with the cron calls.
Mostly you want to log the initial request for a page and not all of the resources subsequently requested.
When troubleshooting other issues you may need to log requests for files such as favicon and images, but generally they needlessly fill up your logs.
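One common way to do this in Apache is conditional logging. The sketch below assumes the "combined" log format is already defined in the server configuration. The variable name dontlog, the extension list and the log path are illustrative choices, not necessarily the ones used in the post:

```apache
# Tag requests for static resources with an environment variable.
SetEnvIf Request_URI "\.(gif|jpe?g|png|ico|css|js)$" dontlog
SetEnvIf Request_URI "^/robots\.txt$" dontlog

# Write the access log only for requests that were not tagged.
CustomLog "logs/access_log" combined env=!dontlog
```

When you do need the full picture for troubleshooting, removing the env=!dontlog condition restores logging of every request.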
There are certain PHP files that you want access to but don't want to make public.
Common examples of these are:
You also don't really want to deploy these on every site on a server, nor keep them in your sites' Git repositories.
A neat way of dealing with this is to use rewriting in your web server config files (e.g. Apache, NGINX, IIS) to do the following:
One of the housekeeping tasks that I undertake is to review the activity of comment spammers on our websites.
All of our Drupal sites use Mollom to keep us almost spam-free (big thumbs up to Mollom!).
But if you review the logs you can see that Mollom is protecting you from an alarming rate of attack, and it would be good not to bother ourselves or Mollom with such traffic if possible. So the solution is to drop the traffic upstream of our websites.
There are many ways of doing this, from firewalls to Drupal modules.
If you develop e-commerce sites and review your logs regularly, chances are you will come across 404 errors for requests looking for crossdomain.xml. We get a lot from plugins that look for coupons on e-commerce sites (e.g. Drop Down Deals). In fact you are likely to get them on any site you develop, but we have seen them more frequently on e-commerce sites.
A general housekeeping task for CMSs such as WordPress and Drupal, and for other websites, is to make sure you are gracefully handling missing pages (404 errors). It is also good practice for keeping your site's SEO high.
One of the routine tasks to carry out is checking for crawl errors in Google Webmaster Tools. If you see any missing pages in the list, it is worth making sure you have some measures in place to handle these and ideally issue a 301 redirect so that Google and other search engines update their indexes.
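As a rough sketch, a 301 redirect can be issued with mod_alias. The paths and host name here are placeholders:

```apache
# Permanently redirect a retired URL to its replacement.
Redirect 301 "/old-page" "https://www.example.com/new-page"

# Redirect a whole retired section, keeping the rest of the path.
RedirectMatch 301 "^/old-section/(.*)$" "https://www.example.com/new-section/$1"
```

Note that Redirect matches by path prefix, so anything below /old-page is redirected as well, with the remainder of the path appended to the target.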
Update: we have now combined this site with our main site, and all the articles are available in one place.
We wanted to create a home for our knowledge base and created the website technology.blue-bag.com. Here we provide a range of articles and posts covering issues from using CMSs such as Drupal through to security articles on securing access to your website.
Whether you are running Drupal, WordPress, ExpressionEngine, Joomla or in fact any website, one of the regular tasks you should carry out is a bit of log analysis. Protecting your website is often left to modules, plugins or someone else until it is too late.
We all rely on Google Analytics to tell us about visitors and maybe use our log analysis software (AWStats, Webalizer, etc.) to report on log entries, but it is always worth using tools locally to dig deeper into your logs. These can range from simple reports on accesses to your site to more detailed forensic analysis of site activity.
By doing this we get to know better how visitors are accessing our site and can uncover some interesting answers to questions such as:
- How often is Google actually spidering my site?
- How many errors am I getting and what are they?
- Who is stealing my content?
- Is anyone trying to crack my site?
In this post I will briefly cover some useful techniques to analyse your logs and see if anyone is abusing your hospitality.
It is easy to forget that the files in your website are visible to anyone, even if they are not linked to or are not files normally requested. In this post we look at how to use the .htaccess file to control access to your site.
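For illustration, a minimal .htaccess sketch along these lines, assuming Apache 2.4 and that the host's AllowOverride setting permits these directives. The extension list is only an example:

```apache
# Stop Apache generating directory listings (needs AllowOverride Options).
Options -Indexes

# Refuse requests for files that should never be served
# (needs AllowOverride AuthConfig).
<FilesMatch "\.(bak|inc|log|sql)$">
    Require all denied
</FilesMatch>
```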
I use the .htaccess file a lot on hosted servers. On our own servers I prefer to use httpd.conf, as it performs better and is not re-evaluated on every request. But if you are on a hosted server, .htaccess is your first port of call for handling incoming traffic and can be more efficient than using modules for certain tasks. One common gotcha is how to discard the query string for a redirect.
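A short sketch of the query string gotcha using mod_rewrite in .htaccess. The paths /old-page.php and /new-page are placeholders:

```apache
RewriteEngine On

# The trailing "?" on the target replaces the incoming query string with
# an empty one, so /old-page.php?id=5 redirects to /new-page.
RewriteRule ^old-page\.php$ /new-page? [R=301,L]

# On Apache 2.4 the QSD (query string discard) flag does the same job:
# RewriteRule ^old-page\.php$ /new-page [R=301,L,QSD]
```

Without the trailing "?" or the QSD flag, mod_rewrite passes the original query string through to the redirect target unchanged.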