In lugnet.admin.general, Fredrik Glöckner wrote:
> > That way, the article would be stored on the server using native
> > ISO-8859-1 encoding rather than the unreadable ASCIIfications.
>
> This sounds like a nice idea.
(Just came across this old thread and realized this hadn't yet been fixed.)
OK, I've just installed a filter which does this conversion based on the
RFC 2047 specification.
This eliminates nasties like =?ISO-8859-1?Q?f=F2=F3=20b=E2r?= in Subject
lines and other message headers, replacing them with readable text.
All character sets (ISO-8859-1, US-ASCII, X-UNKNOWN, WINDOWS-1252, etc.)
are decoded from their encoded form (=XX or base64) into octets. The
'charset' parameter of the 'Content-Type' header tells the client which
character set is in use.
Both the Q (quoted-printable-like) and the B (base64) encodings defined in
RFC 2047 are handled.
Examples:
OLD: From: Fredrik =?iso-8859-1?q?Gl=F6ckner?= <fredrigl@math.uio.no>
NEW: From: Fredrik Glöckner <fredrigl@math.uio.no>
OLD: Organization: =?iso-8859-1?Q?H=F6gskolan?= i =?iso-8859-1?Q?Sk=F6vde?=
NEW: Organization: Högskolan i Skövde
OLD: Subject: Re: Oh gente, vamos =?ISO-8859-1?Q?l=E1_a_dar_uma_ajuda!?=
=?ISO-8859-1?Q?_=28rob=F3tica_=3A-=29?=
NEW: Subject: Re: Oh gente, vamos lá a dar uma ajuda! (robótica :-)
OLD: Subject: Fwd: Re: Treffen in S
=?ISO-8859-1?B?/GRkZXV0c2NobGFuZCBhbSBu5A==?=chsten Wochenende?
NEW: Subject: Fwd: Re: Treffen in Süddeutschland am nächsten Wochenende?
OLD: Subject: 928 @ =?ISO-8859-1?Q?=A330?=
NEW: Subject: 928 @ £30
OLD: Subject: Re: Attn: Ralph D=?ISO-8859-1?B?9g==?=ring
NEW: Subject: Re: Attn: Ralph Döring
OLD: Subject: =?ISO-8859-1?B?bGliZXJ04CBkaSBzdGFtcGEgZSBjZW5zdXJh?=
NEW: Subject: libertà di stampa e censura
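
For anyone curious about the mechanics, the conversions shown above can be approximated with the short Python sketch below. It is not the filter that was installed on the server. The names (decode_header_text, ENCODED_WORD, and so on) are made up for illustration, and it assumes the goal is a Unicode string rather than raw octets. Encoded-words that cannot be decoded, for example because the charset name is unknown, are left untouched.

```python
import base64
import re

# RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([QqBb])\?([^?\s]*)\?=")
# Whitespace that only separates two adjacent encoded-words is dropped.
BETWEEN_WORDS = re.compile(r"(?<=\?=)\s+(?==\?)")
# A Q-encoding escape: "=" followed by two hex digits.
HEX_ESCAPE = re.compile(rb"=([0-9A-Fa-f]{2})")


def _decode_word(match):
    """Decode one encoded-word, or return it unchanged if that fails."""
    charset, encoding, text = match.groups()
    try:
        if encoding.upper() == "B":
            octets = base64.b64decode(text)
        else:
            # Q encoding: "_" means a space, "=XX" means the octet 0xXX.
            data = text.replace("_", " ").encode("ascii")
            octets = HEX_ESCAPE.sub(
                lambda m: bytes([int(m.group(1), 16)]), data
            )
        return octets.decode(charset)
    except (ValueError, LookupError):
        # Bad base64, undecodable octets, or an unknown charset name.
        return match.group(0)


def decode_header_text(value):
    """Replace each RFC 2047 encoded-word in a header value with text."""
    value = BETWEEN_WORDS.sub("", value)
    return ENCODED_WORD.sub(_decode_word, value)
```

With this sketch, decode_header_text("928 @ =?ISO-8859-1?Q?=A330?=") returns "928 @ £30". Leaving undecodable words as they are is a design choice made here so that nothing is lost when a charset such as X-UNKNOWN turns up.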
Applying the filter retroactively resulted in approximately 2,802 changes.
For the curious, a comprehensive (minus cancelled articles) summary of
differences between old and new appears here:
http://www.lugnet.com/temp/fixrfc2047headers.html.gz (47 KB)
(It's gzipped because the HTML file is 1 MB.)
--Todd
Message is in Reply To:
Re: Posting with MIME encoded FROM header
(...) This was my first reaction as well. It has always been considered bad and ugly to post anything MIME encoded on news. But as you say yourself, this practice is becoming increasingly common, so you are probably going to be confronted with it (...) (26 years ago, 19-Jan-99, to lugnet.admin.general)