Articles about Performance
Log files can get filled up with repeated calls to files such as favicon, robots.txt, images, css js etc
This can be a pain when you need to scan the logs for issues and they are full of unimportant requests.
This is especially so if you use Ultimate Cron in Drupal and run cron every minute - the logs get swamped with the cron calls.
Mostly you want to log the initial request for a page and not all of the resources subsequently requested.
Troubleshooting other issues may mean you would log files such as favicon, images etc - but generally they needlessly fill up your logs.
One of the housekeeping tasks that I undertake is to review the activity of comment spammers on our websites.
All of our Drupal sites use Mollom to keep us almost Spam free (big thumbs up to Mollom!)
But if you review the logs you can see that Mollom is protecting you from an alarming rate of attack and it would be good to not bother ourselves or Mollom with such traffic is possible. So the solution is to drop the traffic upstream of our web sites.
There are many ways of doing this from Firewalls to Drupal modules.
We were engaged by the History of Parliament Trust to work on their flagship website that publishes the results of research into the members and constituencies of all British parliaments since 1386. This data rich web site is managed by the Drupal CMS and has tens of thousands of 'nodes'. We were tasked with solving various issues with the site's performance and importing data from the DTP files used to create the published volumes.
Whether you are running Drupal,Wordpress, Expression engine, Joomla or in fact any web site one of the regular tasks you should carryout on your web site is a bit of log analysis. It is often left up to modules, plug ins or someone else to protect your web site until it too late.
We all rely on Google Analytics to tell us about visitors and maybe use our log analysis software (AWStats, Webaliser etc) to report on log entries - but it is always worth using tools locally to dig deeper into your logs. These can range from simple reports on accesses to your site to more detailed forensic analysis of site activity.
By doing this we get to know better how visitors are accessing our site and can uncover some interesting answers to questions such as:
- How often is Google actually spidering my site?
- How many errors am I getting and what are they?
- Who is stealing my content?
- Is anyone trying to crack my site?
In this post I will briefly cover some useful techniques to analyse you logs and see if any one is abusing your hospitality.