Subject: Re: bin/10775: cron exits on stat failure
From: Robert Elz <kre@munnari.OZ.AU>
List: tech-userlevel
Date: 08/26/2000 06:16:54
Date: Wed, 09 Aug 2000 17:20:55 +1000
Message-ID: <17699.965805655@mundamutti.cs.mu.OZ.AU>
| The next time cron needs to restart on my web server, it will start a modified
| version which logs the value of errno, and then immediately attempts the
| stat() again, and if that works, just continues (otherwise still exits).
I had almost forgotten about this one...
This is the patch I made to cron (database.c) ...
*** database.c.OK Sun Feb 1 01:40:26 1998
--- database.c Tue Aug 8 11:36:23 2000
***************
*** 34,39 ****
--- 34,40 ----
  #include <fcntl.h>
  #include <sys/stat.h>
  #include <sys/file.h>
+ #include <errno.h>
  #define TMAX(a,b) ((a)>(b)?(a):(b))
***************
*** 62,69 ****
  	 * cached any of the database), we'll see the changes next time.
  	 */
  	if (stat(SPOOL_DIR, &statbuf) < OK) {
  		log_it("CRON", getpid(), "STAT FAILED", SPOOL_DIR);
! 		(void) exit(ERROR_EXIT);
  	}
  	/* track system crontab file
--- 63,76 ----
  	 * cached any of the database), we'll see the changes next time.
  	 */
  	if (stat(SPOOL_DIR, &statbuf) < OK) {
+ 		int err = errno;
+
  		log_it("CRON", getpid(), "STAT FAILED", SPOOL_DIR);
! 		log_it("CRON", getpid(), "STAT ERROR", strerror(err));
! 		if (stat(SPOOL_DIR, &statbuf) == OK)
! 			log_it("CRON", getpid(), "STAT RECOVERED", "one retry");
! 		else
! 			(void) exit(ERROR_EXIT);
  	}
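For anyone who wants to try the same diagnostic outside cron, the retry
logic in the patch can be sketched as a standalone function. This is only
an illustration, not code from cron: the name stat_retry_once() and the
"retried" out-parameter are made up for the example, and it writes to
stderr where the patch calls cron's log_it().

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper, not part of cron: stat() a path and, on failure,
 * report errno and retry exactly once.
 *
 * Returns 0 if either attempt succeeds, -1 if both fail (errno is then
 * whatever the second stat() left behind).  If retried is non-NULL it is
 * set to 1 when the second attempt was needed, 0 otherwise.
 */
static int
stat_retry_once(const char *path, struct stat *sb, int *retried)
{
	if (retried != NULL)
		*retried = 0;

	if (stat(path, sb) == 0)
		return 0;

	/* Save errno at once; fprintf() and strerror() may change it. */
	int err = errno;
	fprintf(stderr, "stat %s failed: %s\n", path, strerror(err));

	if (retried != NULL)
		*retried = 1;

	if (stat(path, sb) == 0) {
		fprintf(stderr, "stat %s recovered after one retry\n", path);
		return 0;
	}
	return -1;
}
```

As in the patch, errno is copied before anything else runs, so the value
reported belongs to the first stat() and not to a later library call.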
And this is what has been happening recently (going back as far as I
still have logs; these are just the relevant lines, of course) ...
Aug 19 23:28:00 muckleshoot cron[212]: (CRON) STAT FAILED (tabs)
Aug 19 23:28:00 muckleshoot cron[212]: (CRON) STAT ERROR (No such file or directory)
Aug 19 23:28:00 muckleshoot cron[212]: (CRON) STAT RECOVERED (one retry)
Aug 20 01:07:00 muckleshoot cron[212]: (CRON) STAT FAILED (tabs)
Aug 20 01:07:00 muckleshoot cron[212]: (CRON) STAT ERROR (No such file or directory)
Aug 20 01:07:00 muckleshoot cron[212]: (CRON) STAT RECOVERED (one retry)
Aug 20 20:43:00 muckleshoot cron[212]: (CRON) STAT FAILED (tabs)
Aug 20 20:43:00 muckleshoot cron[212]: (CRON) STAT ERROR (No such file or directory)
Aug 20 20:43:00 muckleshoot cron[212]: (CRON) STAT RECOVERED (one retry)
Aug 22 10:07:00 muckleshoot cron[212]: (CRON) STAT FAILED (tabs)
Aug 22 10:07:00 muckleshoot cron[212]: (CRON) STAT ERROR (No such file or directory)
Aug 22 10:07:00 muckleshoot cron[212]: (CRON) STAT RECOVERED (one retry)
Aug 23 15:42:00 muckleshoot cron[212]: (CRON) STAT FAILED (tabs)
Aug 23 15:42:00 muckleshoot cron[212]: (CRON) STAT ERROR (No such file or directory)
Aug 23 15:42:00 muckleshoot cron[212]: (CRON) STAT RECOVERED (one retry)
Aug 23 18:58:00 muckleshoot cron[212]: (CRON) STAT FAILED (tabs)
Aug 23 18:58:00 muckleshoot cron[212]: (CRON) STAT ERROR (No such file or directory)
Aug 23 18:58:00 muckleshoot cron[212]: (CRON) STAT RECOVERED (one retry)
Aug 24 11:29:00 muckleshoot cron[212]: (CRON) STAT FAILED (tabs)
Aug 24 11:29:00 muckleshoot cron[212]: (CRON) STAT ERROR (No such file or directory)
Aug 24 11:29:00 muckleshoot cron[212]: (CRON) STAT RECOVERED (one retry)
Aug 24 11:40:00 muckleshoot cron[212]: (CRON) STAT FAILED (tabs)
Aug 24 11:40:00 muckleshoot cron[212]: (CRON) STAT ERROR (No such file or directory)
Aug 24 11:40:00 muckleshoot cron[212]: (CRON) STAT RECOVERED (one retry)
Aug 24 23:40:00 muckleshoot cron[212]: (CRON) STAT FAILED (tabs)
Aug 24 23:40:00 muckleshoot cron[212]: (CRON) STAT ERROR (No such file or directory)
Aug 24 23:40:00 muckleshoot cron[212]: (CRON) STAT RECOVERED (one retry)
Aug 25 02:12:00 muckleshoot cron[212]: (CRON) STAT FAILED (tabs)
Aug 25 02:12:00 muckleshoot cron[212]: (CRON) STAT ERROR (No such file or directory)
Aug 25 02:12:00 muckleshoot cron[212]: (CRON) STAT RECOVERED (one retry)
Cron hasn't exited since the patch was installed (not that I am suggesting
putting that patch into cron; it is a diagnostic, not a cure).
If there was ever any doubt this is a kernel problem, this squelches it.
Someone who is able should move this PR so it is listed as a kernel
problem rather than a cron problem.
| If I can create an environment that will force this to happen...
This one I'm afraid I haven't had time to work on yet. It is still on
my list of things to attempt.
kre