Subject: Re: Newsserver downtime
Newsgroups: lugnet.admin.nntp
Date: Mon, 6 Nov 2000 17:07:34 GMT
Viewed: 94 times

On Mon, Nov 06, 2000 at 04:58:07PM +0000, Todd Lehman wrote:
> As you may have noticed, the LUGNET newsserver stopped displaying new articles
> for approximately 90 minutes today. According to the server logs, there was
> an extremely abnormally high load average for several minutes around 10:00 EST,
> which seems to have resulted in a stale lockfile in one of the newsserver's
> control directories (I suspect some process starved and didn't clean up
> properly upon abnormal exit). This caused about 30-40 articles to be put
> on hold in an incoming spool directory. Removing the lockfile caused all
> the queued-up articles immediately to appear and also caused the normal
> functioning of the newsserver to resume. I'm now looking at ps and httpd
> logs hoping to learn more about the cause.
Have you thought of putting up some sort of watchdog process, to notify
you (if not fix things automagically) when such things happen?
--
Dan Boger / dan@peeron.com / www.peeron.com / ICQ: 1130750
<set:454_2>: 1x4x2 window (LEGO/BASIC/Accessories), '66, 1 pcs

Message has 1 Reply: Re: Newsserver downtime
(...) Yes -- in this case it might be able to be a cron job that checks every 5 minutes and examines the timestamp on the lockfile, if present. If it saw that the lockfile was more than a couple minutes old, it could mv or rm it out of the way. It (...) (24 years ago, 6-Nov-00, to lugnet.admin.nntp)
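
The cron-job idea in that reply could be sketched roughly as follows. This is only an illustration, not the script actually used on the LUGNET server: the lockfile path, the 120-second threshold (one reading of "a couple minutes"), and the function name clear_stale_lock are all assumptions, since the thread gives none of them. It moves the lockfile aside rather than deleting it, so a stale lock can still be inspected afterwards.

```python
import os
import time

# Hypothetical values: the thread names neither the lockfile path
# nor an exact age threshold.
LOCKFILE = "/path/to/newsserver/control/LOCK"
MAX_AGE_SECONDS = 120  # one reading of "a couple minutes"


def clear_stale_lock(path, max_age, now=None):
    """Move the lockfile aside if it is older than max_age seconds.

    Returns the new path if the lockfile was moved, otherwise None.
    """
    now = time.time() if now is None else now
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        return None  # no lockfile present, nothing to do
    if now - mtime <= max_age:
        return None  # lock is recent; assume a live process holds it
    stale_path = f"{path}.stale.{int(now)}"
    try:
        os.rename(path, stale_path)  # the "mv it out of the way" option
    except FileNotFoundError:
        return None  # lock was released between stat and rename
    return stale_path


if __name__ == "__main__":
    moved = clear_stale_lock(LOCKFILE, MAX_AGE_SECONDS)
    if moved:
        # Anything printed by a cron job is normally mailed to its owner,
        # which would cover the "notify" half of the watchdog.
        print(f"moved stale lockfile to {moved}")
```

To run it every 5 minutes as suggested, a crontab entry along the lines of `*/5 * * * * python3 /path/to/clear_stale_lock.py` would do (the script path is again hypothetical). Note that an age check alone cannot tell a stale lock from a legitimately long-held one, so the threshold would need to be longer than any normal hold time.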

5 Messages in This Thread