Subject: 
Re: News search function reactivated (was: News search function temporarily disabled)
Newsgroups: 
lugnet.admin.general
Date: 
Wed, 3 Jan 2001 01:51:23 GMT
  
In lugnet.admin.general, Frank Filz writes:
Todd Lehman wrote:
I know what you mean, though, about being able to restrict a search to
_specifically_ some exact subject or author.  I'll think about how I might
be able to handle this in the future -- it would be a separate index
database for each of the two fields.

Do you index the name in "X-real-life-name"?

Ya, let's see...as it assembles the text to index, first it grabs
X-Real-Life-Name:, then it grabs either Original-From: or From:, then
Subject:, then Keywords:, then Summary:, and then finally the non-quoted
and non-sig parts of the body.
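
The assembly order described above can be sketched roughly as follows. This is only an illustration, not the actual LUGNET indexer: the function names are hypothetical, and it assumes quoted lines start with ">" and that the signature begins at a line containing only "--" (with optional trailing space).

```python
# Hypothetical sketch; field order follows the description in the post.
FIELD_ORDER = ["X-Real-Life-Name", "From", "Subject", "Keywords", "Summary"]

def unquoted_body(body):
    """Keep only non-quoted, non-signature lines (assumed conventions)."""
    kept = []
    for line in body.splitlines():
        if line.rstrip() == "--":          # assumed signature delimiter
            break
        if line.lstrip().startswith(">"):  # assumed quote prefix
            continue
        kept.append(line)
    return "\n".join(kept)

def text_to_index(headers, body):
    """Concatenate header fields in order, then the cleaned body."""
    parts = []
    for name in FIELD_ORDER:
        if name == "From":
            # Prefer Original-From: when present, else fall back to From:.
            value = headers.get("Original-From") or headers.get("From")
        else:
            value = headers.get(name)
        if value:
            parts.append(value)
    parts.append(unquoted_body(body))
    return "\n".join(parts)
```

Here `headers` is assumed to be a plain dict keyed by the exact header names; a real news article would need case-insensitive header parsing first.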

So, for example, on your post that I'm replying to, it would generate

   frank filz frank filz re news search function reactivated was news search
   function temporarily disabled do you index the name in x real life name
   one thought index the special strings from and subject the the search
   etc., etc.

And then it would remove a few stopwords and then feed that to the actual
indexer.
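
A minimal sketch of that normalization step, under stated assumptions: the stopword list here is made up (the real one isn't given), and treating every run of non-alphanumeric characters as a single space is a guess based on the sample output above.

```python
import re

# Hypothetical stopword list; the actual list is not given in the post.
STOPWORDS = {"a", "an", "and", "of", "the", "to"}

def normalize(text):
    """Lowercase, then turn each run of non-alphanumerics into one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

def index_terms(text):
    """Split the normalized text into words and drop stopwords."""
    return [w for w in normalize(text).split() if w not in STOPWORDS]

print(normalize('Do you index the name in "X-real-life-name"?'))
# prints: do you index the name in x real life name
```

Note that this simple pattern also discards non-ASCII letters, which a real indexer would probably want to handle differently.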


One thought, index the special strings "from:" and "subject:". The the
search:

   from: ffilz

Should rank my posts highly due to proximity. Of course it would be
better to index the real life name as if it was preceded by from: also
so that you could search:

   from: filz

and find my posts.

Ah.  That's a neat trick!  It's a little English-centric, but it's still a
very simple and elegant partial solution...one of those "80% of the benefits
for only 20% of the work" types of things.  Unfortunately, it would mean
reindexing the entire news corpus from scratch, because the marker words
would be insertions in the numeric word-order lists -- so it's probably
something to do along with other additions the next time the index is
rebuilt from scratch.
The last time I rebuilt it, I think it took a whole day, and that was with
about 1/4 as many articles in the system.  (The indexer is optimized for fast
incremental indexing rather than fast one-time building.)
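
To illustrate why inserting marker words forces a rebuild, here is a toy positional index. It assumes the "numeric word-order lists" are per-word lists of word positions, which is only a guess at the real structure, and all names are hypothetical.

```python
def positions(tokens):
    """Map each word to the list of positions where it occurs."""
    index = {}
    for pos, word in enumerate(tokens):
        index.setdefault(word, []).append(pos)
    return index

def with_markers(author, subject, body):
    """Prefix the author and subject words with marker words."""
    return ["from"] + author + ["subject"] + subject + body

def min_distance(index, a, b):
    """Smallest gap between any occurrence of a and any occurrence of b."""
    return min(abs(i - j) for i in index[a] for j in index[b])

old = positions(["frank", "filz", "news", "search", "index"])
new = positions(with_markers(["frank", "filz"], ["news", "search"], ["index"]))

print(old["filz"], new["filz"])           # prints: [1] [2]
print(min_distance(new, "from", "filz"))  # prints: 2
```

Every word after an inserted marker moves to a later position, so position lists stored for already-indexed articles no longer match; in exchange, "from" now sits close to the author's name, which is what a proximity-ranked search for "from: filz" would reward.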

--Todd


