Subject: Re: Text::Query
Newsgroups: lugnet.off-topic.geek
Date: Tue, 24 Aug 1999 16:54:09 GMT
Reply-To: jsproat@ANTISPAMio.com
  
Todd Lehman wrote:
> In lugnet.general, Todd Lehman writes:
> > In lugnet.general, Jeremy H. Sproat writes:
> > > Whoa!  Todd dude!  You're using Text::Query, aren't you?  Come on,
> > > fess up.
> > No, it's using a homebrew.  I probably should look into Text::Query though,
> > if it's loaded with callbacks and/or easy member-function overloading.
> OK, I just took a look at Text::Query.  From what I can tell from reading
> the docs and the source, it looks as though it's a brute force text scanner
> rather than an inverted-index generator.

Yah.  I actually was getting excited over the similarity in syntax; I hadn't
even thought of the need for indexing such a huge database as LUGNET.  BTW,
what kind of scanner are you using for the index builder?  Does it just
break apart words separated by whitespace / non-alphanumeric / etc. and keep
a running word/URI count, or is it something more magical?  How long does it
take to build the index for LUGNET?  Does this length of time have an impact
on when new articles can be indexed?  Will Danny Elfman and John Williams
ever put out a compilation album?  Inquiring minds want to know.
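
For anyone following along, the simple kind of index builder asked about above (split on non-alphanumerics, keep a running word/URI count) might look roughly like the sketch below. It is written in Python purely for illustration and is not LUGNET's actual code, which is a homebrew and not shown in this thread. The names build_index and lookup, the ASCII-only word pattern, and the sample URIs are all assumptions.

```python
import re
from collections import defaultdict

# Assumption: a "word" is any run of ASCII letters and digits.
WORD_RE = re.compile(r"[A-Za-z0-9]+")

def build_index(articles):
    """Build an inverted index: lowercased word -> {uri: occurrence count}."""
    index = defaultdict(lambda: defaultdict(int))
    for uri, text in articles.items():
        for word in WORD_RE.findall(text):
            index[word.lower()][uri] += 1
    return index

def lookup(index, word):
    """Return {uri: count} for a word without rescanning any article text."""
    return dict(index.get(word.lower(), {}))

# Hypothetical URIs and article bodies, for illustration only.
articles = {
    "news/1": "Text::Query is a brute force text scanner",
    "news/2": "An inverted index maps each word to the articles using that word",
}
index = build_index(articles)
```

The point of the structure is that a query such as lookup(index, "word") touches only the index entry for that word, whereas a brute force scanner has to read every article on every query.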

Cheers,
- jsproat

--
Jeremy H. Sproat <jsproat@io.com>
http://www.io.com/~jsproat
Darth Maul Lives



Message has 1 Reply:
  Re: Text::Query
 
(...) Not very magical, no. It breaks text apart by anything non-alphanumeric, where the "alpha" part includes ISO-8859-1 international letters like ã, ñ, ß, and ø, etc. It converts everything to lowercase for indexing and collapses apostrophes. (...) (25 years ago, 25-Aug-99, to lugnet.off-topic.geek)
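
The tokenizing rules summarized in this reply can be sketched as follows. This is a guess at the behavior in Python, not the actual LUGNET scanner. In particular, "collapses apostrophes" is assumed here to mean that apostrophes are dropped so that "don't" indexes as "dont", and the function name tokenize is hypothetical.

```python
import re

# Letters and digits, where "letters" includes the ISO-8859-1 range
# U+00C0..U+00FF minus the multiplication and division signs
# (U+00D7 and U+00F7), which are not letters.
TOKEN_RE = re.compile(r"[0-9A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF]+")

def tokenize(text):
    """Split text on anything non-alphanumeric and lowercase each token."""
    # Assumption: collapsing apostrophes means removing them before splitting.
    collapsed = text.replace("'", "")
    return [token.lower() for token in TOKEN_RE.findall(collapsed)]
```

With these rules a module name such as Text::Query becomes two separate index terms, text and query, because the colons act as separators.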

Message is in Reply To:
  Re: Text::Query
 
(...) OK, I just took a look at Text::Query. From what I can tell from reading the docs and the source, it looks as though it's a brute force text scanner rather than an inverted-index generator. So that means it's about 3 to 4 orders of magnitude (...) (25 years ago, 22-Aug-99, to lugnet.off-topic.geek)


©2005 LUGNET. All rights reserved. - hosted by steinbruch.info GbR