Do we need an admin.geek group?
In lugnet.admin.general, Todd Lehman writes:
> In lugnet.publish, Matthew Miller writes:
> > Urg. It may not be obvious to those of you viewing this message with MS
> > Windows, but the above message isn't ASCII text (or ISO 8859-1 Latin-1,
> > either -- even though the header claims it is!). It's Microsoft's
> > non-standard [1] character set. This makes the message look pretty weird
> > when viewed on a non-MS system -- all of the apostrophes show up as question
> > marks (or don't show up at all).
> >
> > Since asking everyone to not use Microsoft products to read LUGnet is
> > probably a bit harsh [2], Todd, how about automatically scanning for this
> > and correcting it when messages are posted?
>
> Hmmmm. I agree that it's pretty horrendous for plaintext, but I think that
> so-called "smart quotes" are a pretty great thing for HTML (as long as the
> correct standard character entities are output, of course! :) when done
> properly.
>
> What currently happens in the web interface when someone views a message with
> these is that they get mapped into HTML entities like this:
>
> 145 --> &#145;
> 146 --> &#146;
> 147 --> &#147;
> 148 --> &#148;
>
> Unfortunately, those positions don't seem to be defined in HTML 3.2, so
> they'll only show up "correctly" (meaning, as intended by the author of
> the message) on non-MS systems when someone uses MS fonts or fonts with
> equivalent character mappings.
>
> I'm happy to see that HTML 4.0 defines[1] these...
>
> &lsquo; <==> &#8216; (equivalent to 145)
> &rsquo; <==> &#8217; (equivalent to 146)
> &ldquo; <==> &#8220; (equivalent to 147)
> &rdquo; <==> &#8221; (equivalent to 148)
>
> ...but I haven't tested these in popular browsers to see if they're worth
> using yet. I switched from &#153; to &trade; for the TM symbol a while back
> and that has worked well.
>
>
> > There's an already existing tool:
> > <http://www.fourmilab.ch/webtools/demoroniser/>
> > (That page also has more good info on the problem.)
>
> I looked quickly at the source (admittedly, not a thorough scouring); it
> looks like the mapping it applies is non-invertible, especially in the case
> of 147 and 148. :-(
>
> It may be better simply to reject MS-moronised messages altogether than to
> attempt to convert them at the receiving end, because at least that way the
> original meaning isn't destroyed. (Actually, I'm not in favor of either of
> those options anywhere near as much as leaving the conversion up to each
> individual client on-the-fly at display-time.)
>
> --Todd
>
> [1] http://www.w3.org/TR/1998/REC-html40-19980424/sgml/entities.html
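
For illustration only, here is a rough Python sketch of the display-time approach Todd describes: the stored message bytes are left untouched, and the four code positions (145-148) are mapped to the HTML 4.0 named entities only when the message is rendered. The function names and the idea of treating every other byte as Latin-1 are assumptions made for this example, not a description of how LUGNET actually does it.

```python
import html

# Byte values are the four positions discussed in the thread; the entity
# names are the HTML 4.0 ones. The dictionary and function names below
# are hypothetical, chosen for this sketch.
CP1252_QUOTES = {
    0x91: "&lsquo;",  # 145, left single quote
    0x92: "&rsquo;",  # 146, right single quote / apostrophe
    0x93: "&ldquo;",  # 147, left double quote
    0x94: "&rdquo;",  # 148, right double quote
}


def has_cp1252_quotes(raw: bytes) -> bool:
    """Return True if the message contains any of the four quote bytes."""
    return any(b in CP1252_QUOTES for b in raw)


def to_display_html(raw: bytes) -> str:
    """Render message bytes as HTML without altering the stored message.

    Each of the four quote bytes maps to its own distinct entity, so the
    left/right distinction of 147 and 148 is preserved. All other bytes
    are treated as Latin-1 (an assumption) and HTML-escaped.
    """
    parts = []
    for b in raw:
        if b in CP1252_QUOTES:
            parts.append(CP1252_QUOTES[b])
        else:
            parts.append(html.escape(chr(b), quote=False))
    return "".join(parts)


if __name__ == "__main__":
    message = b"It\x92s a \x93smart\x94 quote"
    print(has_cp1252_quotes(message))
    print(to_display_html(message))
```

Because the conversion happens only at display time and each byte gets a different entity, nothing is lost from the original message, which is the concern raised about converting at the receiving end.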
Message is in Reply To:
Re: ???Question??? (19-May-00, to lugnet.publish, lugnet.admin.general)