Ubuntu's garbage collection cron job for PHP sessions takes 25 minutes to run, why?

thenickdude asked:

Ubuntu has a cron job set up which looks for and deletes old PHP sessions:

# Look for and purge old sessions every 30 minutes
09,39 *     * * *     root   [ -x /usr/lib/php5/maxlifetime ] \
   && [ -d /var/lib/php5 ] && find /var/lib/php5/ -depth -mindepth 1 \
   -maxdepth 1 -type f -cmin +$(/usr/lib/php5/maxlifetime) ! -execdir \
   fuser -s {} 2> /dev/null \; -delete

My problem is that this process is taking a very long time to run, with lots of disk IO. Here’s my CPU usage graph:

[Image: CPU usage graph]

The teal spikes are the cleanup runs. At the beginning of the period, PHP’s cleanup jobs were scheduled at the default times of 09 and 39 minutes past each hour. At 15:00 I removed the 39-minute entry from cron, so a cleanup job twice the size runs half as often (you can see the peaks get twice as wide and half as frequent).

Here are the corresponding graphs for IO time:

[Image: IO time graph]

And disk operations:

[Image: disk operations graph]

At the peak, when about 14,000 sessions were active, the cleanup ran for a full 25 minutes, apparently using 100% of one CPU core and what seems to be 100% of the disk IO for the entire period. Why is it so resource intensive? An ls of the session directory /var/lib/php5 takes just a fraction of a second, so why does it take a full 25 minutes to trim old sessions? Is there anything I can do to speed this up?

The filesystem for this device is currently ext4, running on Ubuntu Precise 12.04 64-bit.

EDIT: I suspect that the load is due to the unusual process “fuser” (since I expect a simple rm to be a damn sight faster than the performance I’m seeing). I’m going to remove the use of fuser and see what happens.
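
As a rough sketch of the experiment mentioned in the edit: in the stock cron line, -execdir with \; starts one fuser process for every candidate file, so thousands of sessions mean thousands of process launches. The function below drops that check and lets a single find do the deletion. The name purge_old_sessions and its two arguments are made up for illustration and are not part of Ubuntu's php5 package. Without fuser, a file that some process still holds open could be deleted, so treat this as a test variant under that assumption rather than a drop-in replacement.

```shell
# Hypothetical helper, not part of Ubuntu's php5 package.
# $1: session directory, $2: maximum session age in minutes.
purge_old_sessions() {
    session_dir=$1
    max_minutes=$2

    # Refuse to run if the directory does not exist.
    [ -d "$session_dir" ] || return 1

    # A single find process does all the work; no fuser is spawned per file,
    # so files that are still open are NOT protected from deletion.
    find "$session_dir" -mindepth 1 -maxdepth 1 -type f \
        -cmin "+$max_minutes" -delete
}
```

Called with the values from the original cron line, this would look like: purge_old_sessions /var/lib/php5 "$(/usr/lib/php5/maxlifetime)"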

My answer:

Congratulations on having a popular web site and managing to keep it running on a virtual machine for all this time.

If you’re really pulling in two million pageviews per day, then you’re going to stack up a LOT of PHP sessions in the filesystem, and they’re going to take a long time to delete no matter whether you use fuser or rm or a vacuum cleaner.

At this point I’d recommend you look into alternate ways to store your sessions:

  • One option is to store sessions in memcached. This is lightning fast, but if the server crashes or restarts, all your sessions are lost and everyone is logged out.
  • You can also store sessions in a database. This would be a bit slower than memcached, but the database would be persistent, and you could clear old sessions with a simple SQL query. To implement this, though, you have to write a custom session handler.
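
To illustrate what clearing old sessions with a single SQL query might look like from cron, here is a minimal sketch that only builds the statement. It assumes a table named sessions with an integer last_access column holding a Unix timestamp; both names are hypothetical, and a real custom session handler would define its own schema.

```shell
# Hypothetical schema: table "sessions" with an integer column "last_access"
# (Unix timestamp). Neither name comes from the original answer.

# Print a DELETE statement for sessions idle longer than $1 minutes.
# $2 optionally overrides "now" (Unix seconds), which helps with testing.
expired_sessions_sql() {
    max_minutes=$1
    now=${2:-$(date +%s)}
    cutoff=$((now - max_minutes * 60))
    printf 'DELETE FROM sessions WHERE last_access < %s;\n' "$cutoff"
}
```

The output could then be piped to whichever database client is in use, for example: expired_sessions_sql 24 | mysql sessiondb (where sessiondb is a placeholder database name). One statement handled by the database replaces the per-file work that find does on the filesystem.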

View the full question and any other answers on Server Fault.

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.