Subject: 
Re: News search function reactivated (was: News search function temporarily disabled)
Newsgroups: 
lugnet.admin.general, lugnet.off-topic.geek
Date: 
Wed, 3 Jan 2001 05:16:07 GMT
Viewed: 
9465 times
  

In lugnet.admin.general, Dan Jezek writes:
[...]
example, to limit posts to the last 10 days, use

   &qs=864000

It works great! ... but the &qs doesn't carry over to the next page of
results.  So if I want to see more pages, I have to edit the querystring on
each page.

oops, doy!  I didn't put in the propagation of that URL term.  I don't consider
it 100% "documented" yet (it's still subject to change without notice), but I
still shouldn't have missed that.  Thanks.  I'll fix that.
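
A minimal sketch of what carrying the term over could look like, in Python. This is not LUGNET's code: the helper name `next_page_url` and the `start` paging parameter are hypothetical, and only `q` and `qs` come from this thread. The idea is to rebuild the next-page link from the full current query string rather than from a fixed list of terms.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def next_page_url(current_url, step):
    """Return the URL for the next page, keeping every existing query term."""
    parts = urlsplit(current_url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    # 'start' is a hypothetical paging offset, not a documented LUGNET term.
    params["start"] = str(int(params.get("start", "0")) + step)
    return urlunsplit(parts._replace(query=urlencode(params)))
```

Because every term already in the URL is copied, `qs` (or any later addition) survives without further changes to the paging links.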

The reason it's subject to change is partially because the letter 's' in 'qs'
is named after the word (or Greek letter, rather) 'sigma' -- sigma being 1
standard deviation in the bell curve function f(x) = exp(-x^2/2) -- and that
formula isn't being used anymore in the searches, and partially because 'qs'
might better someday be used for "query subject."  Anyway, it's still not 100%
in stone.  But it'll work until it breaks.
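
For reference, the bell curve mentioned above can be turned into a recency weight roughly as follows. This is only a sketch under stated assumptions: treating `qs` as the sigma in seconds and dividing a post's age by it is a guess at how the term was used, and the post itself says the searches no longer use this formula.

```python
import math

def bell_weight(age_seconds, sigma_seconds):
    """Weight between 0 and 1: 1.0 for a brand-new post, about 0.61 at one sigma."""
    # Assumption: age is scaled by sigma before applying f(x) = exp(-x^2/2).
    x = age_seconds / sigma_seconds
    return math.exp(-x * x / 2)
```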


Since you already have the inner workings of this in place, it
would be really easy to just add a textbox named "qs" and add the &qs= to
the bottom "5 more, 10 more"... links.  With a little more effort, you could
include radio buttons to have the user select how many days, months or years
they want to go back and have your search engine convert it to seconds
depending on what the user selects.
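
The conversion suggested here can be sketched in Python. The example earlier in the thread gives 864000 for 10 days, which is 10 x 86400, so the unit appears to be seconds. The function name `qs_value` and the fixed 30-day month and 365-day year are assumptions made for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400

# Approximate lengths; real months and years vary, so these are assumptions.
UNIT_DAYS = {"days": 1, "months": 30, "years": 365}

def qs_value(amount, unit):
    """Convert a user-selected time span into a value for the qs URL term."""
    return amount * UNIT_DAYS[unit] * SECONDS_PER_DAY
```

With these definitions, `qs_value(10, "days")` gives 864000, matching the value used for 10 days above.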

Yup, that's the idea!!!  Say, where is that old article about sigma and
advanced options...ah! so easy to find now!  :-)

   http://news.lugnet.com/?q=url+query+qs+qt+sigma+%3C//1.5

(See topmost result and related thread for more background.)


It's actually in the nature of search engines to generate thousands of
results.

If given thousands of results, most search engines have some advanced
options like sorting.

Well, they -are- sorted.  They're always sorted -- always highest probability
of relevance first, lowest last.  Usually, the metric for relevance is a
combination of non-temporal factors such as word frequencies, word proximities,
and word orderings.  I don't know of any search engine that doesn't sort (on
some criteria) the matches it finds.  But anyway, I think you meant sorting
by time?

I wonder if a little link at the top to re-deploy the search taking recentness
into account (or conversely, turning it off if it's on) would be useful?


What's more important is the first page returned -- i.e., the ranking.
Typically one doesn't dig down past the first few, so you rarely
actually go visit all the thousands.

I'd be interested in seeing some statistics on how far the average user goes
when given back let's say 10, 100 and 1,000 pages of results.  It would help
in the design of an effective search engine.

Me too.  I'd expect a f(x)=1/x type of curve, but it would be fun to see actual
numbers.  :-)

--Todd

   
         
     
Subject: 
Re: News search function reactivated (was: News search function temporarily disabled)
Newsgroups: 
lugnet.admin.general, lugnet.off-topic.geek
Date: 
Wed, 3 Jan 2001 06:16:20 GMT
Viewed: 
4368 times
  

In lugnet.admin.general, Todd Lehman writes:
oops, doy!  I didn't put in the propagation of that URL term.  I don't consider
it 100% "documented" yet (it's still subject to change without notice), but I
still shouldn't have missed that.  Thanks.  I'll fix that.

The reason it's subject to change is partially because the letter 's' in 'qs'
is named after the word (or Greek letter, rather) 'sigma' -- sigma being 1
standard deviation in the bell curve function f(x) = exp(-x^2/2) -- and that
formula isn't being used anymore in the searches, and partially because 'qs'
might better someday be used for "query subject."  Anyway, it's still not 100%
in stone.  But it'll work until it breaks.

Wow!  So you have terms for the ampersand options in a URL?  My standpoint
on this would be to put everything in a form and kill 2 birds with 1 stone -
not having to think of how to name URL terms (unless you enjoy doing that)
and having the search more user-friendly (not everyone will remember the
options or find it easy to edit the URL).

If given thousands of results, most search engines have some advanced
options like sorting.

Well, they -are- sorted.  They're always sorted -- always highest probability
of relevance first, lowest last.  Usually, the metric for relevance is a
combination of non-temporal factors such as word frequencies, word proximities,
and word orderings.  I don't know of any search engine that doesn't sort (on
some criteria) the matches it finds.  But anyway, I think you meant sorting
by time?

No, I meant having the option to pick what I want the results to be
sorted on.  Dejanews has a great power search:

http://www.deja.com/home_ps.shtml

which includes the option to sort by relevance, subject, forum, author and
date.  That's how I would like to see the sort options here.  But knowing
that you most likely don't have the resources that dejanews has and how
flawlessly Lugnet runs on the current setup, I'm satisfied with editing the
URL for now :-)
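
A user-selectable sort of this kind could be sketched as below. The record fields (`score`, `subject`, `group`, `author`, `date`) and the function name are hypothetical; nothing here reflects how Deja.com or LUGNET store results.

```python
# Map each user-facing sort option to a (hypothetical) field of a result record.
SORT_FIELDS = {
    "relevance": "score",
    "subject": "subject",
    "forum": "group",
    "author": "author",
    "date": "date",
}
# Options where the largest value should come first.
DESCENDING = {"relevance", "date"}

def sort_results(results, sort_by="relevance"):
    """Return a new list of result dicts ordered by the chosen option."""
    field = SORT_FIELDS[sort_by]
    return sorted(results, key=lambda r: r[field], reverse=sort_by in DESCENDING)
```

Relevance stays the default, so the current behavior is unchanged unless a different option is picked.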

I'd be interested in seeing some statistics on how far the average user goes
when given back let's say 10, 100 and 1,000 pages of results.  It would help
in the design of an effective search engine.

Me too.  I'd expect a f(x)=1/x type of curve, but it would be fun to see actual
numbers.  :-)

It could be done.  Include another version of jump.cgi into the 5 more, 10
more... on the search results page and log the number of results returned,
the IP address and the query subject.  Then run an average, min, max query
grouped by all 3 fields.  Sounds complicated, depends on how badly you want
to see the results.  I wouldn't want to go through the process of
implementing that but would really like to see the results :-)
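
The analysis step might look something like this in Python. It is a sketch only: the record layout (IP address, query, result offset requested) is an assumption, and it summarizes the deepest offset each visitor reached per query rather than grouping by every field.

```python
from collections import defaultdict
from statistics import mean

def paging_depth_stats(records):
    """Summarize how deep users page into search results.

    records: non-empty iterable of (ip, query, offset) tuples, one per
    results-page request. Returns (min, max, mean) of the deepest offset
    reached for each distinct (ip, query) pair.
    """
    deepest = defaultdict(int)
    for ip, query, offset in records:
        key = (ip, query)
        deepest[key] = max(deepest[key], offset)
    depths = list(deepest.values())
    return min(depths), max(depths), mean(depths)
```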

    
          
     
Subject: 
Re: News search function reactivated (was: News search function temporarily disabled)
Newsgroups: 
lugnet.admin.general, lugnet.off-topic.geek
Date: 
Wed, 3 Jan 2001 14:11:53 GMT
Viewed: 
7074 times
  

In lugnet.admin.general, Dan Jezek writes:
Wow!  So you have terms for the ampersand options in a URL?  My standpoint
on this would be to put everything in a form and kill 2 birds with 1 stone -
not having to think of how to name URL terms (unless you enjoy doing that)
and having the search more user-friendly (not everyone will remember the
options or find it easy to edit the URL).

Ya, exactly -- first name the URL components carefully and then put a user-
friendly level on top of it.  Best of both worlds.


No, I meant having the option to pick what I want the results to be
sorted on.  Dejanews has a great power search:

http://www.deja.com/home_ps.shtml

which includes the option to sort by relevance, subject, forum, author and
date.  That's how I would like to see the sort options here.

Ah, I see.  Yeah, that could be helpful in certain cases, if you're scouring
tons of results!  I've needed to look things up on Deja.com, so I know what
you mean.


But knowing
that you most likely don't have the resources that dejanews has and how
flawlessly Lugnet runs on the current setup, I'm satisfied with editing the
URL for now :-)

There's an alternate form that avoids the &qs= thingie, so you don't have to
edit the URLs:

http://news.lugnet.com/admin/general/?n=8613


It could be done.  Include another version of jump.cgi into the 5 more, 10
more... on the search results page and log the number of results returned,
the IP address and the query subject.

These don't actually run through jump.cgi.  But they're already logged by
httpd anyway.  (That's how the jump.cgi logging is implemented as well.)


Then run an average, min, max query
grouped by all 3 fields.  Sounds complicated, depends on how badly you want
to see the results.  I wouldn't want to go through the process of
implementing that but would really like to see the results :-)

Hmm, it's all there now, except for logging the number of results produced.
I guess it could be as simple as open for append, flock, print, and close on
a filehandle inside of the search page...lemme think about it.  Analyzing the
results and making a graph would be a snap with gnuplot.
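
The open-for-append, flock, print, close sequence described here translates fairly directly to Python on a Unix system. This is a sketch, not the site's code: the function name, log path, and tab-separated line format are all assumptions.

```python
import fcntl

def log_result_count(path, query, count):
    """Append one line to a shared log, holding an exclusive lock while writing."""
    with open(path, "a") as fh:             # open for append
        fcntl.flock(fh, fcntl.LOCK_EX)      # flock: wait until we hold the lock
        fh.write(f"{query}\t{count}\n")     # print
        fh.flush()                          # push the line out before unlocking
        fcntl.flock(fh, fcntl.LOCK_UN)
    # leaving the with-block closes the filehandle
```

The lock keeps concurrent search requests from interleaving partial lines in the same file.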

I think it would be especially fun to compare the graph now to the way it was
(would have been) before the change...but alas, that data was never captured
for the old query engine and it's too late now.

--Todd

   
         
   
Subject: 
Re: News search function reactivated (was: News search function temporarily disabled)
Newsgroups: 
lugnet.admin.general
Date: 
Mon, 5 Feb 2001 01:41:53 GMT
Viewed: 
1302 times
  

In lugnet.admin.general, Todd Lehman writes:
It's actually in the nature of search engines to generate thousands of
results.

If given thousands of results, most search engines have some advanced
options like sorting.

Well, they -are- sorted.  They're always sorted -- always highest probability
of relevance first, lowest last.  Usually, the metric for relevance is a
combination of non-temporal factors such as word frequencies, word
proximities, and word orderings.  I don't know of any search engine that
doesn't sort (on some criteria) the matches it finds.  But anyway, I think
you meant sorting by time?

I wonder if a little link at the top to re-deploy the search taking recentness
into account (or conversely, turning it off if it's on) would be useful?

Todd, this would be really useful. I'll often search for a recent post, only
remembering the poster's name and maybe one or two keywords, and that the
post was in the past few days. I don't need two-year-old messages nearly as
frequently. Could you change the display so that when results have the same
score more recent posts are displayed first? I think a score weighting for
recentness would be even more useful as part of the default setting --
obviously that would be up to you.

--DaveL
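
The tie-breaking request above (equal scores, newest first) can be sketched in Python with a two-part sort key. The field names `score` and `timestamp` are hypothetical, and this does not cover the separate idea of weighting the score itself by recentness.

```python
def rank(results):
    """Sort by score, highest first; among equal scores, newest first."""
    # Tuples compare element by element, so timestamp only matters on ties.
    return sorted(results, key=lambda r: (r["score"], r["timestamp"]), reverse=True)
```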

 
