Subject: 
Re: canceled posts are still indexed ?
Newsgroups: 
lugnet.admin.general
Date: 
Tue, 4 Jan 2000 04:04:29 GMT
  
In lugnet.admin.general, Ray Sanders <rsanders@gate.net> writes:
> Todd,
>
> Why do canceled posts still participate in a search match ?
>
> Ray

Well, slap me with a splintered ruler!  That _is_ goofy, eh?

Here's what's going on inside the server:

It's way easier to index every article and never unindex it than it is to go
back and alter the index periodically or whenever an article is cancelled,
because the indexer is a simple WORM (write once, read many) design and
indexes content in real-time (sorta real-time -- always within a minute).
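
As a rough illustration only (this is not the actual server code), a
write-once index can be sketched in Python as below.  The class name, the
method names and the tokenizing rule are all assumptions made up for the
example; the point is just that there is an "add" path and no "remove" path.

```python
import re
from collections import defaultdict


class AppendOnlyIndex:
    """Hypothetical write-once index: postings are only ever appended."""

    def __init__(self):
        # word -> list of article numbers, in the order they were indexed
        self._postings = defaultdict(list)

    def add_article(self, article_no, text):
        # Index each distinct word once per article; nothing is ever removed.
        for word in set(re.findall(r"[a-z0-9]+", text.lower())):
            self._postings[word].append(article_no)

    def search(self, word):
        # Returns every article ever indexed for the word, cancelled or not.
        return list(self._postings.get(word.lower(), []))


index = AppendOnlyIndex()
index.add_article(4063, "canceled posts are still indexed")
index.add_article(4064, "the indexer is a simple WORM design")

# Cancelling 4063 later changes nothing here: there is no remove operation,
# so the article keeps matching.
matches = index.search("indexed")
```

With a structure like this, a cancelled article can only be dealt with
after the match set has been computed.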

So under that type of design, the trick is to leave all the articles
indexed, but then to ignore the cancelled ones when reporting the search
results.

The problem with that -- and why it doesn't do that yet -- is that the
module which calculates the matches has to know whether or not a given
article in the match set is cancelled or still active.  That answer is easy
to find out for each article, but requires a disk access if the article or
its filesystem directory isn't in the filesystem cache.  And that means
super-slow results when there are lots of matches.

The other alternative is not to ask the "is it cancelled?" question until
right before displaying the article -- in the module that displays a list of
articles.  Right now, that module always displays a cancelled article as
"(Cancelled)" -- rather than not displaying anything -- because that keeps
the numbering in the listing displays coherent.  If the higher-level matcher
module thought that it had just passed 20 articles to the display module, and
the display module only displayed 14 of them (hiding 6 completely from the
user), it would look like a really weird bug, and probably would be one,
given that the 'qn=' parameter on the subsequent "Next>>" accesses would be
all wrong.
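
The numbering problem can be shown with a small Python sketch.  Everything
here is hypothetical: the function names, the page size of 20 and the
article numbers are invented, and 'qn' is assumed to be a plain offset into
the match list, which is a guess at how the real parameter works.

```python
PAGE_SIZE = 20


def matcher_page(match_ids, qn):
    """Matcher: return one page of matches plus the qn offset for Next>>."""
    page = match_ids[qn:qn + PAGE_SIZE]
    return page, qn + len(page)


def display_with_placeholders(page, cancelled, qn):
    # Behaviour described above: every match gets a numbered row.
    return [
        f"{qn + i + 1}. " + ("(Cancelled)" if a in cancelled else f"article {a}")
        for i, a in enumerate(page)
    ]


def display_hiding_cancelled(page, cancelled, qn):
    # The problematic alternative: silently drop the cancelled rows.
    visible = [a for a in page if a not in cancelled]
    return [f"{qn + i + 1}. article {a}" for i, a in enumerate(visible)]


match_ids = list(range(1000, 1040))                # 40 hypothetical matches
cancelled = {1001, 1004, 1007, 1010, 1013, 1016}   # 6 of the first 20

page, next_qn = matcher_page(match_ids, 0)
placeholders = display_with_placeholders(page, cancelled, 0)
hidden = display_hiding_cancelled(page, cancelled, 0)

print(len(page), len(placeholders), len(hidden), next_qn)  # prints: 20 20 14 20
```

In the hiding variant the user sees rows 1 through 14, but the matcher
still believes it handed over 20, so the next page would start at row 21
and rows 15 through 20 would never appear.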

So to make a long story short, the cancelled articles still show up in
search results (not the bodies of the articles -- just a placeholder) to
avoid even worse problems.

A robust fix would be to give the upper-level search matcher module some
kind of knowledge about the active-ness of the matching set so that it could
prune the cancelled articles before passing the sublists to the display
module.  Maybe that would involve an extra bit in the indexer which could be
back-accessed with an absolute file-seek and changed in place.  But the
indexer isn't set up for that in its current design.
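
One way such an in-place flag might look, sketched in Python under heavy
assumptions: the fixed-width record layout, the use of a whole status byte
rather than a single bit, and all of the names below are invented for the
example and say nothing about the real index format.

```python
import os
import struct
import tempfile

# Hypothetical record: article number (uint32) followed by one status byte.
RECORD = struct.Struct("<IB")
ACTIVE, CANCELLED = 0, 1


def append_record(f, article_no):
    """Append a record at the end of the file and return its slot number."""
    f.seek(0, os.SEEK_END)
    slot = f.tell() // RECORD.size
    f.write(RECORD.pack(article_no, ACTIVE))
    return slot


def mark_cancelled(f, slot):
    # Absolute seek to the status byte of that slot (4 bytes past the start
    # of the record), then overwrite that one byte in place.
    f.seek(slot * RECORD.size + 4)
    f.write(bytes([CANCELLED]))


def active_articles(f):
    """Read the whole file and keep only records still marked active."""
    f.seek(0)
    data = f.read()
    return [no for no, status in RECORD.iter_unpack(data) if status == ACTIVE]


with tempfile.TemporaryFile() as f:
    slots = {no: append_record(f, no) for no in (4062, 4063, 4064)}
    mark_cancelled(f, slots[4063])
    remaining = active_articles(f)
```

Because the records are fixed-width, the flag for any article sits at a
computable offset, so the matcher could read the status along with the
posting instead of touching each article on disk.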

Hope that answers the question!  Sorry about the confusion -- the design for
the indexer was written for the old search-results page, which was way
slower and lumped all the results (sometimes hundreds or thousands of them)
on one big page.  The new search-results page is much faster and (hopefully)
easier to use, but needs a smarter indexer as a result.

--Todd



Message is in Reply To:
  canceled posts are still indexed ?
 
Todd, Why do canceled posts still participate in a search match ? Ray (25 years ago, 31-Dec-99, to lugnet.admin.general)


©2005 LUGNET. All rights reserved. - hosted by steinbruch.info GbR