Subject: Re: Traffic Error
Newsgroups: lugnet.admin.general
Date: Sat, 17 Apr 1999 01:11:19 GMT
In lugnet.admin.general, galliard@shades-of-night.com (James Brown) writes:
In lugnet.admin.general, Tom McDonald writes:
In lugnet.admin.general, Tom McDonald writes:
Today about 2:25 pm PDT I noticed things had gone "All Quiet" on the
traffic page. But about 10 minutes later, I reloaded the page and it
gave me at least 7 starwars messages as well as a smattering of messages
(all less than 7) from a few other groups as being posted within the
last hour. I checked to see if they were new, even though their red
numbers were not blazingly red, and noted that I had read them before.

Immediately after posting this, I refreshed the traffic page again, and
it seemed that the same groups were there with fewer messages posted
within the hour - about what I expected.

Maybe it accidentally burped out an "All Quiet" earlier. Also, I forgot to
mention that when I refreshed a couple of times earlier (to see if indeed
things were quiet) I got the same "AQ" response each time. Just thought you
should know.

And it's not just Tom. :)
I noticed that too, and was wandering over here to ask about it.

Thanks to both of you for reporting & documenting this.

Let's see what we can find out.  2:25pm PDT today equals 5:25pm EDT, equals
924297900 seconds past the epoch, so what's in the activity log snapshots at
that time...?  Hmm, nothing odd there, but let's look backward in time a
bit...
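
The time conversion above can be double-checked with a short Python sketch.
The fixed UTC offsets (-7 for PDT, -4 for EDT) and the variable names are
assumptions made for illustration; this is not the code the server runs.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets, assumed here so the check needs no time zone database.
PDT = timezone(timedelta(hours=-7), "PDT")
EDT = timezone(timedelta(hours=-4), "EDT")

# 2:25pm PDT on 16 Apr 1999, the moment the anomaly was observed.
observed = datetime(1999, 4, 16, 14, 25, tzinfo=PDT)

# Seconds past the Unix epoch (1970-01-01 00:00:00 UTC).
epoch_seconds = int(observed.timestamp())
print(epoch_seconds)  # 924297900

# The same instant expressed in EDT.
print(observed.astimezone(EDT).strftime("%H:%M %Z"))  # 17:25 EDT
```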

924290100
924290400
924290700
924291000
924291300
924291600
924291900
924292200        :
924292500        :
924292800       1:00
924293100        :
924294000        :
924294300        :
924294600        :30
924294900        :
924295200        :
924295500        :45
924295800        :
924296100        :
924296400       2:00
924296700        :
924297000        :
924297300        :15
924297600        :
924297900  <--- 2:25pm PDT <--- anomaly observed here
924298200        :
924298500        :
924298800
924299100
924299400
924299700

HEY -- look at that!  The log entries for 924293400 and 924293700 are
missing.  And these correspond to 1:10pm PDT and 1:15pm PDT, which would
completely explain the confusion if you loaded the page anytime between
2:10pm and 2:19pm (how close to 2:25pm did you see the problem?)
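
Spotting the hole by eye works, but the same check can be sketched in a few
lines of Python.  The function name missing_snapshots and the way the list is
rebuilt here are hypothetical; only the timestamps and the 5-minute (300
second) interval come from the listing above.

```python
from datetime import datetime, timedelta, timezone

STEP = 300  # the logger takes one snapshot every 5 minutes

def missing_snapshots(timestamps, step=STEP):
    """Return the expected timestamps that are absent from the log."""
    present = set(timestamps)
    start, end = min(timestamps), max(timestamps)
    return [t for t in range(start, end + 1, step) if t not in present]

# Rebuild the listing above: every 5-minute mark except the two absent ones.
logged = [t for t in range(924290100, 924299700 + 1, STEP)
          if t not in (924293400, 924293700)]

gaps = missing_snapshots(logged)
print(gaps)  # [924293400, 924293700]

# Show the gaps as wall-clock times, assuming PDT is a fixed UTC-7 offset.
PDT = timezone(timedelta(hours=-7), "PDT")
gap_times = [datetime.fromtimestamp(t, PDT).strftime("%H:%M") for t in gaps]
print(gap_times)  # ['13:10', '13:15']
```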

Let me check my mailbox.  Hey, guess what?  Cron (the program that launches
the logger every 5 minutes) sent mail saying there was a problem at 4:10pm
EDT and again at 4:15pm EDT.  It's not saying what happened (because I
didn't ask it to log these particular details), but I can guess what's going
on.

I have a semaphore in the logging code which prevents multiple simultaneous
invocations of cron-spawned jobs.  This is super-important for things like
sending out periodic digests and stuff like that -- you don't want to start
up a new process to service a pending request until the previous one is
complete.  In theory, the logging code, which takes snapshots of the news
article counts every 5 minutes, should be able to do its work in 1/10 second
and be all done with it.  In this case, however, it took more than 10
minutes.  Why?  Because the snapshot logger is still running an old broken
low-level DB library whose data file gets very bloated (large and slow) after
zillions of additions and deletions.  The file is 75MB right now when it
should only be 8MB.  Until I cut this particular script over to the new &
better DB code,
I have to rebuild its data file every couple of months.  I just did that
now, but this is only a band-aid.  The reason I haven't cut over this module
to the new DB library is because it operates on relatively "live" data
(updated every 5 minutes) so the best time to mess with it would be in the
middle of the night, when I'm usually either sleeping or banging out brand-
new code.  (I also wasn't 100% convinced until now that the problem was
really as bad as it is (was).)
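
For anyone curious what such a guard looks like, here is a minimal Python
sketch of one common approach: a non-blocking advisory file lock taken with
fcntl.flock (Unix only).  The post does not say how the actual semaphore is
implemented, so single_instance, take_snapshot and the lock file path are all
hypothetical names used only for illustration.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def single_instance(lock_path):
    """Yield True if this run holds the lock, False if another run still does."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        try:
            # LOCK_NB makes the call fail at once instead of waiting.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            yield False  # a previous invocation has not finished yet
            return
        try:
            yield True
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)

def take_snapshot(lock_path):
    """Hypothetical cron entry point: skip the run if one is still active."""
    with single_instance(lock_path) as acquired:
        if not acquired:
            return "skipped"
        # ... the real work (recording article counts) would go here ...
        return "logged"

lock_path = os.path.join(tempfile.mkdtemp(), "snapshot.lock")
print(take_snapshot(lock_path))
```

With a guard of this kind, a run that takes longer than 5 minutes causes the
next scheduled run to be skipped rather than started alongside it, which is
one way a snapshot can go missing from the log.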

So the problem -could- recur in another month or so.  I'll post here again
hopefully before then when I've cut over to the newer & far safer DB
routines.  (Whew!)

--Todd



