Subject: Re: Colon character ":" getting changed to ">"
Newsgroups: lugnet.admin.general, lugnet.publish
Date: Mon, 26 Jul 1999 20:24:37 GMT
In lugnet.admin.general, "Robert Munafo" <munafo@gcctech.com> writes:

> I don't understand your answer at all. You say it's trying to "fix miscreant
> the quote character ':' back into a '>'". I don't know what "fix miscreant"
> means, but I do know that ":" and ">" aren't quote characters so I'm having
> trouble figuring out why you would refer to ":" as a quote character.
>
> Would you be willing to give your answer in a totally different way, like
> explain what bad thing would happen if it didn't turn ":" into ">"?

By "quote character" here I mean news-quoting rather than English text
quoting.  The two paragraphs above this one that I'm writing are "quoted"
paragraphs from your message.  The word "quoted" is in English quotes.
Sorry about the ambiguity.  I'm not sure what an unambiguous term is for
news/mail-content quoting.

When I wrote "fix miscreant the quote character..." I meant "fix the
miscreant quote character..." (accidental word floppage there, sorry).

That is, although ">" is the One True Quote Character, some miscreant
newsreaders insist on using the Evil and Wrong ":" when they present a reply
to the user.  (Either that or some people have overridden ">" to ":"
manually, but I've seen ":" so much that I have to think it's a default for
some broken package out there.)

Anyway, I'm thinking at this point that it's probably better to disable
conversion of ":" to ">" altogether because it ruins good text some of the
time.  It needs a much smarter algorithm for detecting whether or not ":" is
actually used as a quote character before a conversion like that could be
put back in place.


> Another thing you didn't address at all is why it gets confused by text that it
> has artificially line-wrapped. The original text was typed in with one newline
> per paragraph, and had no ":" characters at the beginning of a line. The
> "intra-paragraph newlines" were put in by some script or program within LUGNET.
> The only reason the ":" ended up at the beginning of the line is because of
> these artificially induced newlines. If the script did the ":"/">" thing before
> adding the extra newlines, the bug would be fixed.

There are actually two totally separate parts going on there...

The first part, the input part, is forced line wrapping at 79 characters,
which happens when you submit an article via the web interface.  The server
sometimes has to add line wrapping "by hand" because some browsers don't
honor the WRAP=HARD attribute of the <TEXTAREA> tag (it's not a 100%
standardized attribute, but that's no reason not to use it).  Forced line
wrapping is the only way to ensure that things don't go too wide.  (Things
were terrible last fall when it -didn't- do that.  :-)  After line-wrapping,
the article is then stored just like that, and that's how the NNTP
newsserver serves the never-changing raw article.  The line-wrapping happens
just prior to injecting the article via a temporary NNTP connection.
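
A rough sketch of the input-side wrapping described above, in Python.  This
is not LUGNET's actual code; the function name `hard_wrap` and the use of the
standard `textwrap` module are assumptions made purely for illustration.
Only the 79-column limit comes from the description above.

```python
import textwrap

MAX_WIDTH = 79  # wrap column mentioned in the post


def hard_wrap(body: str, width: int = MAX_WIDTH) -> str:
    """Force-wrap any line longer than `width`; leave other lines untouched."""
    out = []
    for line in body.split("\n"):
        if len(line) <= width:
            out.append(line)
            continue
        wrapped = textwrap.wrap(
            line,
            width=width,
            break_long_words=False,   # never split a word or a URL
            break_on_hyphens=False,   # keep tokens like ":-)" in one piece
        )
        out.extend(wrapped or [""])   # an all-whitespace line becomes empty
    return "\n".join(out)


if __name__ == "__main__":
    # A paragraph typed as one long line: the wrap can leave a token that
    # starts with ":" at the beginning of a new line.
    print(hard_wrap("x" * 78 + " :-) tail"))
```

Note that the wrap step has no idea what the first character of each new line
will be, which is how a ":" that was typed mid-paragraph can end up in
column one.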

The second part, the output part, happens on-the-fly every time an article
is summoned for display via a web page.  This could be days or weeks or
months later.  This part doesn't know where an article came from or how the
lines got broken up originally (it could take an educated guess, but it
would probably not always be right).  When the display routine sees ":" at
the beginning of a line (or in a sequence of nested quote characters like
">:>:>"), it thinks it's a quote character just like ">".  The indenting
renderer, which has just split apart a paragraph into an indentation count
(a number) and a string (the rest of the line), then displays all quote
characters (what it THINKS are quote characters) as ">", followed by the
rest of the line.  This is fine and good most of the time, but it fails in
certain cases, like the case you and Ed discovered.

Alas, it's worse (a) for the quote-detection algorithm to be fooled so
easily than (b) for genuine ":" quote characters to go unrecognized...  So
I've disabled the portion which recognizes ":".  Now only ">" is recognized as a
content-quote character.  This is a lot more restrictive, but at least the
worst thing that can happen now is that something -doesn't- get colored and
italicized differently when it should be, rather than vice-versa plus
occasionally incorrect ":"-to-">" conversion.
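
A minimal sketch of the display-side split into "indentation count plus rest
of line", and of what changes when ":" is dropped from the recognized set.
The names `split_quote_prefix` and `render` and the `quote_chars` parameter
are hypothetical; the real routine is not shown in this thread.

```python
from typing import Tuple


def split_quote_prefix(line: str, quote_chars: str = ">:") -> Tuple[int, str]:
    """Return (quote depth, rest of line) for one line of an article."""
    depth = 0
    i = 0
    while i < len(line):
        ch = line[i]
        if ch in quote_chars:
            depth += 1
            i += 1
        elif ch == " " and i + 1 < len(line) and line[i + 1] in quote_chars:
            i += 1  # a single space between nested markers, as in "> > text"
        else:
            break
    return depth, line[i:]


def render(line: str, quote_chars: str = ">:") -> str:
    """Show every recognized quote marker as ">", then the rest of the line."""
    depth, rest = split_quote_prefix(line, quote_chars)
    return ">" * depth + rest


if __name__ == "__main__":
    print(render(">:>:> hello"))                # nested markers normalized
    print(render(":-) tail"))                   # old behavior: smiley mangled
    print(render(":-) tail", quote_chars=">"))  # ">" only: text left alone
```

With `quote_chars=">"` a line quoted with ":" simply comes out at depth 0,
which is the milder failure described above.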

Looking at Forte Free Agent, it recognizes ">", ":", and "|" as "Quoted Text
Markers" by default, and I think Microplanet Gravity has something similar
-- maybe not "|", but at least ">" and ":".  I wouldn't hazard a guess at
what MSOE does.  :)

Anyway, it's really too bad that not every configured newsreader uses ">",
but I guess we're all fortunate that *most* (95%+?) use ">".

BTW, the reason that the ":"-to-">" conversion started in the first place
was because it happened to be (a) very easy to do, since the indentation
count needs to be noted on each line anyway in order to repair mis-wrapped
lines, and (b) very unpleasant to see ":" alternated with ">" when the
earlier font display (at the end of June) used progressively smaller and
lighter fonts for quoted text rather than what it does now.  So it's less
important now to convert ":" to ">".  (And of course, doing it wrong is
right out.)

An example of "repairing mis-wrapped lines" is this:  Someone responds to
some article that's been posted with margins at, say, 79 -- just wide enough
to fit, but not wide enough to be quotable with "> ".  So then when they
reply, they're a lazy-ass and they neglect to fix up the broken wrapped
lines that they've created, and they end up posting a mess like this:

--------------begin--------------
We the people of the United States, in order to form a more perfect • union,
establish justice, insure domestic tranquility, provide for the common • defense,
promote the general welfare, and secure the blessings of liberty to • ourselves
and our posterity, do ordain and establish this Constitution for the • United
States of America.
---------------end---------------

That looks absolutely terrible on a web page if it's not adjusted for.  In
fact, it looks like a sloppy bug in the web software rather than a lazy-ass
news user.  So the display routine analyzes the indentation and makes
educated guesses about how to repair the wraps.  For good karma, it displays
a small bullet character (&middot; = &#183; = ·) wherever it has joined two
lines.
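
One way the repair could work, sketched in Python.  The heuristic here is an
assumption, not the actual display routine: a short unquoted fragment is
treated as a broken wrap when it directly follows a quoted line and its first
word would not have fit on that line.  The names `quote_depth`,
`repair_wraps`, `width`, and `max_orphan` are all made up for this example.

```python
from typing import List

JOIN_MARK = " \u00b7 "  # middle dot shown where two lines were joined


def quote_depth(line: str) -> int:
    """Count leading ">" markers, ignoring spaces between them."""
    depth = 0
    for ch in line:
        if ch == ">":
            depth += 1
        elif ch != " ":
            break
    return depth


def repair_wraps(lines: List[str], width: int = 79,
                 max_orphan: int = 20) -> List[str]:
    """Join orphaned fragments back onto the quoted line they fell off of."""
    out: List[str] = []
    prev_raw = ""  # previous input line, before any joining
    for line in lines:
        frag = line.strip()
        is_orphan = (
            quote_depth(prev_raw) > 0       # previous line was quoted
            and quote_depth(line) == 0      # this line is not
            and 0 < len(frag) <= max_orphan
            # the fragment's first word would have overflowed the margin
            and len(prev_raw) + 1 + len(frag.split()[0]) > width
        )
        if is_orphan:
            out[-1] += JOIN_MARK + frag
        else:
            out.append(line)
        prev_raw = line
    return out
```

A short genuine reply such as "Yes." under a short quoted line fails the
overflow test and is left alone, which is the kind of educated guess the
display has to make.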

Ironically, most of the cases of bad line wrapping come from messages posted
via the web interface.  Why is this?  Well, because the text editors in web
browsers aren't specialized and don't know that lines beginning with ">"
are special.  Good dedicated newsreaders, of course, allow lines beginning
with ">" to extend past the right margin, or, better yet, re-wrap the text
on those lines automagically.
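
For comparison, here is roughly what "re-wrap the text on those lines" means,
as a small Python sketch.  It is not taken from any particular newsreader;
`rewrap_quoted` is a hypothetical helper that handles a single quoted line
and leaves unquoted lines unchanged.

```python
import textwrap
from typing import List


def rewrap_quoted(line: str, width: int = 79) -> List[str]:
    """Re-wrap one quoted line so every piece keeps the same quote prefix."""
    body = line.lstrip("> ")
    prefix = line[:len(line) - len(body)]
    if ">" not in prefix or not body:
        return [line]  # not a quoted line, or nothing to wrap
    return textwrap.wrap(
        body,
        width=width,
        initial_indent=prefix,      # prefix counts toward the width
        subsequent_indent=prefix,   # continuation lines stay quoted
        break_long_words=False,
        break_on_hyphens=False,
    )
```

Because the continuation lines carry the prefix, no orphaned fragments are
left behind for the reader (or the web display) to clean up.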

--Todd


