Geeking : 221



Subject:	Re: Idea for automatic Lugnet Set Database linking
Newsgroups:	lugnet.off-topic.geek, lugnet.publish
Date:	Mon, 12 Jul 1999 06:25:54 GMT

Todd,

I've read several messages from the thread, but not all of them, so please
excuse me if this has already been mentioned.

I think the way web browsers automatically handle URLs in a message body
would also suit this job. If you type the whole URL starting with
"http://", the browser turns it into a hypertext link, but if you only type
"lugnet.com", it does nothing.

So, in the same way, if the poster wants the set numbers mentioned in a
message to be shown automatically somewhere on the page, he/she should use a
format that your scripts can identify, like (I'm just making it up):

"I found a 6331 for 10 USD last week in blah blah.."
---do nothing

"I found a //6331// for 10 USD last week in blah blah.."
---show its picture thumbnail

Selçuk

Todd Lehman wrote in message <3788db9e.1624282@lugnet.com>...
> In lugnet.publish, Kevin Loch <kloch@NOSPMkl.net> writes:
> > That's a great idea! I would love to see this! How about detecting any
> > three or four digit numbers and linking them to the search page.
> >
> > just run this every 5-10 min.:
> >
> > #!/bin/sh
> > filetmp1=/tmp/hypertmp1
> > filetmp2=/tmp/hypertmp2
> > filelist=/tmp/hyperlist
> > newsroot=/var/news/
> > #
> > cd ${newsroot}
> > du -a > ${filelist}
> > for f in ${filelist}
> > do
> > sed 's/[0-9][0-9][0-9][0-9]/<a
> > href=http:\/\/www.lugnet.com\/pause\/search\/?=&><\/a>/g' ${f}
> > > ${filetmp1}
> > sed 's/[0-9][0-9][0-9]/<a
> > href=http:\/\/www.lugnet.com\/pause\/search\/?=&><\/a>/g' ${filetmp1}
> > > ${filetmp2}
> > mv ${filetmp} ${f}
> > done
> >
> > That would be really REALLY cool!
>
>
> Yikes! -- Kevin, surely you jest! :)
>
>
> Problems with the general approach above:
>
> * Iterates over tens of thousands of files every few minutes, rewriting
>   each one whether it needs to or not. Doesn't track which articles
>   have and have not yet been modified. Wastes 1-2 minutes of CPU time
>   on each run and does thousands of unnecessary disk I/O operations.
>
> * Destroys the original article content as originally posted by the
>   user. Makes it impossible to tell whether the original article had
>   just a number or a full hyperlink. Makes it impossible to revert to
>   the original content if a bug is discovered. Also destroys the
>   original file timestamp.
>
> * Turns most lines of text into unreadable >80 column messes.
>
> * Rewrites the original raw NNTP article with embedded HTML, which
>   won't display correctly on correctly working newsreaders because the
>   article content-type is still text/plain.
>
> * Rewrites the text with hard-coded URLs that are subject to change.
>
> * Only updates articles periodically rather than instantaneously. Someone
>   views the article on the homepage 2 minutes after it's posted and they
>   don't see the hyperlinks. Someone else views it 10 minutes later and
>   they -do- see the hyperlinks. (This is the problem that the threading
>   display currently has -- it's on a 1-minute cron job.)
>
>
> Bugs in the code above:
>
> * Doesn't check whether the previous invocation from cron is still
>   running or has completed. If any run were to take longer than the
>   cron interval, two copies would then run simultaneously, and then
>   three, and then four, and within a few hours the system could crash.
>
> * Matches 3- and 4-digit numbers within words, for example "12345678"
>   or "foo6991bar". Worse, matches same within URLs that it itself
>   wrote earlier, causing unrestrained and infinite expansion.
>
>
> Most of these problems could be avoided by processing the article right as
> it arrives rather than during a periodic cron sweep later, but altering the
> body-content of the actual article is still playing with fire.
>
> --Todd
>
> p.s. (Don't get me wrong -- I agree in theory that having links somehow
> would be a nice thing...I've thought this through carefully in the past many
> times before and don't have a good solution yet.)
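One way to read the //6331// suggestion, sketched in the same sh/sed style as the quoted script: a filter that links only explicitly marked set numbers at the moment an article is displayed, and only reads the stored article. The function name render_set_links, the SET_URL_PREFIX variable, and the example.com address are placeholders made up for this sketch, not anything LUGNET is known to use. Limiting markers to three or four digits simply follows the quoted script.

```shell
#!/bin/sh
# Placeholder URL prefix (an assumption, not LUGNET's real URL scheme).
# Keeping it in one variable means a URL change never requires touching
# stored articles. It must not contain '#', '&' or '\', which are special
# in the sed expression below.
SET_URL_PREFIX="${SET_URL_PREFIX:-http://example.com/sets/}"

# render_set_links (hypothetical name): reads plain article text on stdin
# and writes HTML to stdout. Nothing is written back to the article file.
#   1. Escape &, < and > so the plain text is safe to embed in HTML.
#   2. Turn //NNN// or //NNNN// into a link; bare numbers are left alone.
# Note: a marker directly after ':' (as in a URL) is not treated specially.
render_set_links() {
    sed -e 's/&/\&amp;/g' \
        -e 's/</\&lt;/g' \
        -e 's/>/\&gt;/g' \
        -e "s#//\([0-9]\{3,4\}\)//#<a href=\"${SET_URL_PREFIX}\1\">\1</a>#g"
}
```

For example, piping 'I found a //6331// for 10 USD' through render_set_links wraps 6331 in a link, while the same sentence with a bare 6331 passes through unchanged, as do "12345678" and "foo6991bar". Since the filter runs at display time, the article on disk stays text/plain, keeps its timestamp, and can still be read normally in a newsreader.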

Message is in Reply To:

		Re: Idea for automatic Lugnet Set Database linking
(...) Yikes! -- Kevin, surely you jest! :) Problems with the general approach above: * Iterates over tens of thousands of files every few minutes, rewriting each one whether it needs to or not. Doesn't track which articles have and have not yet been (...) (26 years ago, 11-Jul-99, to lugnet.off-topic.geek, lugnet.publish)
