OddThinking

A blog for odd things and odd thoughts.

Recent Down-Times, An Exploration

Since the move to a new server a few months ago, Somethinkodd.com has gone down for a couple of minutes at a time, many, many times.

If you visit the site during the downtime, you get a message from my web-host ISP explaining that the site has exceeded its 20% CPU quota, and hence has been taken off-line for a few minutes.

It only seems to happen when I am working on my site. (However, it occurred to me that the problem might be occurring just as often when I am not working on my site, and I simply hadn’t noticed, because I wasn’t working on my site at the time…)

I originally guessed that the problem might be related to the use of Unison, and I was going to get around to working out how to “nice” it.

Today, the problem occurred when I hadn’t used Unison, and I realised my original assumption was wrong. Checking the logs, I found that just prior to the downtime, the Atom feed on one of my old test blogs was being hit moderately hard – 300 times in 40 seconds.
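For anyone wanting to run the same kind of check on their own logs, here is a minimal sketch of how such a burst could be spotted automatically. It assumes the access log is in the Apache Common Log Format (my host's actual format may differ), and the names `parse_line` and `find_bursts`, as well as the default window and threshold, are made up for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: Apache Common Log Format, e.g.
# 10.0.0.1 - - [01/Jan/2008:12:00:00 +0000] "GET /feed/ HTTP/1.0" 404 512
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)

def parse_line(line):
    """Return (timestamp, ip, path, status), or None if the line does not match."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
    return ts, m.group("ip"), m.group("path"), int(m.group("status"))

def find_bursts(lines, window_seconds=40, threshold=300):
    """Map (ip, path) to its peak hit count within any window of
    window_seconds, keeping only pairs that reach the threshold."""
    hits = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip lines in an unexpected format
        ts, ip, path, _status = parsed
        hits[(ip, path)].append(ts)

    window = timedelta(seconds=window_seconds)
    bursts = {}
    for key, times in hits.items():
        times.sort()
        start = 0
        best = 0
        # Sliding window over the sorted timestamps.
        for end in range(len(times)):
            while times[end] - times[start] > window:
                start += 1
            best = max(best, end - start + 1)
        if best >= threshold:
            bursts[key] = best
    return bursts
```

Feeding it the lines of an access log (for example `find_bursts(open(path))`, with `path` pointing at wherever the host keeps the log) returns the IP-and-URL pairs that were hit at least 300 times within 40 seconds.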

I looked up the IP address to find out who was peeking at my (unpublished) test blog. Uh oh, I recognize that domain name. That’s my web-server.

At this stage, it seems that there is a nasty denial-of-service bug in WordPress’s (pretend) cron jobs. I almost certainly have a misconfiguration somewhere. (Let’s be clear about that: I have treated my test blogs harshly, and migrated them around several servers and URLs. That’s why I have them.) A scheduled task is trying to fetch a non-existent page, which is returning a 404 error. Either that is causing it to simply try again or (perhaps more likely) it is being added to a list of incomplete jobs that get tried again in the next cron job run.
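To make the second guess concrete, here is a toy model of it. This is not WordPress's code, just a sketch of the behaviour I suspect: the functions `run_cron`, `simulate` and `fake_fetch`, and the two URLs, are all hypothetical.

```python
from collections import deque

def run_cron(queue, fetch):
    """Try every queued job once. Jobs whose fetch does not return 200
    are put on a retry list for the next run."""
    retry = deque()
    requests = 0
    while queue:
        url = queue.popleft()
        requests += 1
        if fetch(url) != 200:
            retry.append(url)  # no retry limit: this is the suspected bug
    return retry, requests

def simulate(runs, urls, fetch):
    """Run the pretend cron `runs` times; return total requests made
    and whatever is still queued at the end."""
    queue = deque(urls)
    total = 0
    for _ in range(runs):
        queue, made = run_cron(queue, fetch)
        total += made
    return total, list(queue)

def fake_fetch(url):
    # Hypothetical stand-in for an HTTP request: one page is permanently missing.
    return 404 if url.endswith("/missing/") else 200

total, leftover = simulate(300, ["/ok/", "/missing/"], fake_fetch)
print(total, leftover)
```

In this model the job for the missing page never leaves the queue, so every cron run costs another request to my own server, for as long as the runs keep coming.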

Hopefully, I will nail the problem soon. Let me know if you see any downtime.


Comment

  1. it is being added to a list of incomplete jobs that get tried again in the next cron job run

    Yay ! a Fibonacci bug !!!
