Subject: Re: Traffic Error
Newsgroups: lugnet.admin.general
Date: Sat, 17 Apr 1999 00:28:10 GMT
  
In lugnet.admin.general, Todd Lehman writes:
In lugnet.admin.general, galliard@shades-of-night.com (James Brown) writes:
In lugnet.admin.general, Tom McDonald writes:
In lugnet.admin.general, Tom McDonald writes:
Today at about 2:25 pm PDT I noticed things had gone "All Quiet" on the
traffic page. But about 10 minutes later, I reloaded the page and it
gave me at least 7 starwars messages as well as a smattering of messages
(all fewer than 7) from a few other groups as being posted within the
last hour. I checked to see if they were new, even though their red
numbers were not blazingly red, and noted that I had read them before.

Immediately after posting this, I refreshed the traffic page again, and
it seemed that the same groups were there with fewer messages posted
within the hour - about what I expected.

Maybe it accidentally burped out an "All Quiet" earlier. Also, I forgot to
mention that when I refreshed a couple of times earlier (to see if indeed
things were quiet) I got the same "AQ" response each time. Just thought you
should know.

And it's not just Tom. :)
I noticed that too, and was wandering over here to ask about it.

Thanks to both of you for reporting & documenting this.

Let's see what we can find out.  2:25pm PDT today equals 5:25pm EDT, equals
924297900 seconds past the epoch, so what's in the activity log snapshots at
that time...?  Hmm, nothing odd there, but let's look backward in time a
bit...

924290100
924290400
924290700
924291000
924291300
924291600
924291900
924292200        :
924292500        :
924292800       1:00
924293100        :
924294000        :
924294300        :
924294600        :30
924294900        :
924295200        :
924295500        :45
924295800        :
924296100        :
924296400       2:00
924296700        :
924297000        :
924297300        :15
924297600        :
924297900  <--- 2:25pm PDT <--- anomaly observed here
924298200        :
924298500        :
924298800
924299100
924299400
924299700

HEY -- look at that!  The log entries for 924293400 and 924293700 are
missing.  And these correspond to 1:10pm PDT and 1:15pm PDT, which would
completely explain the confusion if you loaded the page anytime between
2:10pm and 2:19pm (how close to 2:25pm did you see the problem?)
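
The gap check above can be reproduced mechanically. The following is only a sketch in Python, not the code that runs on the server: the helper names find_missing and to_pdt are made up for illustration, the stamp list is rebuilt from the listing above, and a fixed UTC-7 offset is assumed for PDT.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC-7 offset for Pacific Daylight Time (all stamps are in April).
PDT = timezone(timedelta(hours=-7), "PDT")
STEP = 300  # one snapshot every 5 minutes

def find_missing(stamps, step=STEP):
    """Return the epoch stamps absent from an otherwise regular series."""
    present = set(stamps)
    return [t for t in range(min(stamps), max(stamps) + 1, step)
            if t not in present]

def to_pdt(epoch):
    """Format an epoch stamp as a PDT wall-clock time such as '2:25pm'."""
    text = datetime.fromtimestamp(epoch, PDT).strftime("%I:%M%p")
    return text.lstrip("0").lower()

# The stamps from the listing above: 924290100 through 924299700,
# every 300 seconds, minus the two entries that never got written.
stamps = [t for t in range(924290100, 924299700 + 1, STEP)
          if t not in (924293400, 924293700)]

for t in find_missing(stamps):
    print(t, to_pdt(t))
```

Run against the listing, this reports only 924293400 and 924293700, and to_pdt(924297900) gives 2:25pm, matching the time the anomaly was seen.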

I'm not certain, as I was kinda busy doing that radio thing.


Let me check my mailbox.  Hey, guess what?  Cron (the program that launches
the logger every 5 minutes) sent mail saying there was a problem at 4:10pm
EDT and again at 4:15pm EDT.  It's not saying what happened (because I
didn't ask it to log these particular details), but I can guess what's going
on.

I have a semaphore in the logging code which prevents multiple simultaneous
invocations of cron-spawned jobs.  This is super-important for things like
sending out periodic digests and stuff like that -- you don't want to start
up a new process to service a pending request until the previous one is
complete.  In theory, the logging code, which takes snapshots of the news
article counts every 5 minutes, should be able to do its work in 1/10 second
and be all done with it.  In this case, however, it took more than 10
minutes.  Why?  Because the snapshot logger is still running an old broken
low-level DB library that gets very bloated (large and slow) after zillions
of additions and deletions.  It's 75MB right now when it should only be
8MB.  Until I cut this particular script over to the new & better DB code,
I have to rebuild its data file every couple of months.  I just did that
now, but this is only a band-aid.  The reason I haven't cut over this module
to the new DB library is because it operates on relatively "live" data
(updated every 5 minutes) so the best time to mess with it would be in the
middle of the night, when I'm usually either sleeping or banging out brand-
new code.  (I also wasn't 100% convinced until now that the problem was
really as bad as it is (was).)
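
For readers unfamiliar with the technique, here is roughly what such a guard looks like. This is a hedged sketch in Python using an advisory file lock (fcntl.flock, Unix only), not the actual LUGNET logging code; the function name run_exclusively and the lock file name are invented for the example.

```python
import fcntl
import os
import tempfile

def run_exclusively(lock_path, job):
    """Run job() only if no other invocation holds the lock.

    Returns True if the job ran, False if it was skipped.
    """
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking exclusive lock: fail at once instead of queueing up.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return False  # the previous run is still in progress; skip this one
        try:
            job()
            return True
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)

# Hypothetical lock file location, for illustration only.
lock_path = os.path.join(tempfile.gettempdir(), "snapshot-logger.lock")
ran = run_exclusively(lock_path, lambda: print("taking snapshot"))
```

With a guard of this kind, a run that drags on past 10 minutes causes the next two 5-minute invocations to be skipped, which is the sort of failure that would leave two consecutive snapshots missing from the log.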

So the problem -could- recur in another month or so.  I'll post here again
hopefully before then when I've cut over to the newer & far safer DB
routines.  (Whew!)

--Todd

Umm... okay.   (feeling a bit stupider than usual) <:^,

I'll just mess with this 10-piece Winnie-the-Pooh puzzle until you work it out.


Y'know Todd, I'm surprised that with all your patience and willingness to
familiarize yourself with intricacies far beyond the understanding of mortal
men that you're not an absolute Technic and Mindstorms freak :)

-Tom McD.


