Subject: 
Re: Idea for automatic Lugnet Set Database linking
Newsgroups: 
lugnet.off-topic.geek, lugnet.publish
Date: 
Sun, 11 Jul 1999 18:26:10 GMT
  
In lugnet.publish, Kevin Loch <kloch@NOSPMkl.net> writes:
That's a great idea! I would love to see this!  How about detecting any
three or four digit numbers and linking them to the search page.

just run this every 5-10 min.:

#!/bin/sh
filetmp1=/tmp/hypertmp1
filetmp2=/tmp/hypertmp2
filelist=/tmp/hyperlist
newsroot=/var/news/
#
cd ${newsroot}
du -a > ${filelist}
for f in ${filelist}
do
  sed 's/[0-9][0-9][0-9][0-9]/<a href=http:\/\/www.lugnet.com\/pause\/search\/?=&><\/a>/g' ${f} > ${filetmp1}
  sed 's/[0-9][0-9][0-9]/<a href=http:\/\/www.lugnet.com\/pause\/search\/?=&><\/a>/g' ${filetmp1} > ${filetmp2}
  mv ${filetmp} ${f}
done

That would be really REALLY cool!


Yikes! -- Kevin, surely you jest!  :)


Problems with the general approach above:

*  Iterates over tens of thousands of files every few minutes, rewriting
   each one whether it needs to or not.  Doesn't track which articles
   have and have not yet been modified.  Wastes 1-2 minutes of CPU time
   on each run and does thousands of unnecessary disk I/O operations.

*  Destroys the original article content as originally posted by the
   user.  Makes it impossible to tell whether the original article had
   just a number or a full hyperlink.  Makes it impossible to revert to
   the original content if a bug is discovered.  Also destroys the
   original file timestamp.

*  Turns most lines of text into unreadable >80 column messes.

*  Rewrites the original raw NNTP article with embedded HTML, which
   won't display correctly in properly behaving newsreaders because the
   article content-type is still text/plain.

*  Rewrites the text with hard-coded URLs that are subject to change.

*  Only updates articles periodically rather than instantaneously.  Someone
   views the article on the homepage 2 minutes after it's posted and they
   don't see the hyperlinks.  Someone else views it 10 minutes later and
   they -do- see the hyperlinks.  (This is the problem that the threading
   display currently has -- it's on a 1-minute cron job.)
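
One way to sidestep the first two problems, sketched under assumptions: leave the news spool untouched and write a derived copy of each article into a separate directory, skipping any article that already has one. The function name render_new_articles, the directory layout, and the .html suffix are all hypothetical, and the transform here only escapes HTML characters as a stand-in for real link insertion. It still walks the whole tree on every run, so it avoids the rewriting but not the iteration.

```shell
#!/bin/sh
# Hypothetical sketch: function name, paths, and suffix are assumptions.

# render_new_articles SRC_DIR OUT_DIR
# Writes a derived copy of every article under SRC_DIR into OUT_DIR.
# Never modifies anything under SRC_DIR, so the original content and
# timestamps survive and the derived copies can be thrown away and rebuilt.
render_new_articles() {
  src=${1%/}                     # tolerate a trailing slash
  out=${2%/}
  find "$src" -type f | while IFS= read -r f; do
    rel=${f#"$src"/}             # path relative to the spool root
    dest=$out/$rel.html
    [ -e "$dest" ] && continue   # already rendered on an earlier run
    mkdir -p "$(dirname "$dest")"
    # Placeholder transform: escape HTML only.  Link insertion would go here.
    sed 's/&/\&amp;/g; s/</\&lt;/g; s/>/\&gt;/g' "$f" > "$dest"
  done
}
```

Called as render_new_articles /var/news /some/output/dir, where the output directory is whatever the web front end reads from (also an assumption). Deleting a derived file forces that one article to be rendered again on the next run.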


Bugs in the code above:

*  Doesn't check whether the previous invocation from cron is still
   running or has completed.  If any run were to take longer than the
   cron interval, two copies would then run simultaneously, and then
   three, and then four, and within a few hours the system could crash.

*  Matches 3- and 4-digit numbers within words, for example "12345678"
   or "foo6991bar".  Worse, matches same within URLs that it itself
   wrote earlier, causing unrestrained and infinite expansion.
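
Both bugs can be illustrated with a rough sketch. The names run_exclusive and link_set_numbers are hypothetical, as is passing the search URL prefix as a parameter. The first function uses mkdir as a crude lock so overlapping cron runs skip instead of piling up. The second links only standalone 3- or 4-digit runs, in a single pass. It assumes its output is shown to the reader and never written back over the article; fed its own output it would still re-link the numbers inside the tags.

```shell
#!/bin/sh
# Hypothetical sketch: function names and the URL parameter are assumptions.

# run_exclusive LOCK_DIR COMMAND [ARGS...]
# mkdir is atomic, so only one caller at a time gets past it.  A run that
# is killed leaves a stale lock directory that must be removed by hand.
run_exclusive() {
  lock=$1
  shift
  if ! mkdir "$lock" 2>/dev/null; then
    echo "previous run still active; skipping" >&2
    return 1
  fi
  status=0
  "$@" || status=$?
  rmdir "$lock"
  return "$status"
}

# link_set_numbers URL_PREFIX   (filter: stdin to stdout)
# Links a run of digits only if it is exactly 3 or 4 digits long and is
# not glued to a letter or digit on either side.
link_set_numbers() {
  awk -v base="$1" '
  {
    out = ""; rest = $0
    while (match(rest, /[0-9]+/)) {
      pre  = substr(rest, 1, RSTART - 1)
      num  = substr(rest, RSTART, RLENGTH)
      post = substr(rest, RSTART + RLENGTH)
      before = (pre == "") ? "" : substr(pre, length(pre), 1)
      after  = substr(post, 1, 1)
      if ((RLENGTH == 3 || RLENGTH == 4) &&
          before !~ /[A-Za-z0-9]/ && after !~ /[A-Za-z0-9]/)
        out = out pre "<a href=\"" base num "\">" num "</a>"
      else
        out = out pre num          # leave the digits exactly as posted
      rest = post
    }
    print out rest
  }'
}
```

With the URL from the quoted script as the prefix, this would be used as link_set_numbers 'http://www.lugnet.com/pause/search/?=' < article. Text such as "foo6991bar" and "12345678" passes through unchanged, while a bare "6991" gets linked.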


Most of these problems could be avoided by processing the article right as
it arrives rather than during a periodic cron sweep later, but altering the
body-content of the actual article is still playing with fire.

--Todd

p.s.  (Don't get me wrong -- I agree in theory that having links somehow
would be a nice thing...I've thought this through carefully many times
before and don't have a good solution yet.)




