Frequently Asked Questions : 88



Subject:	Re: Raw FAQ data format (Was: Format of FAQ items)
Newsgroups:	lugnet.faq
Date:	Sun, 25 Apr 1999 04:15:25 GMT

In lugnet.faq, jsproat@geocities.com (Sproaticus) writes:
> > - The content of the entries should be marked up as a
> >   subset of HTML ("lynx -dump" is a possible tool for
> >   translation to plain text).
>
> Or some other tool; but I agree, a well-defined subset of HTML can
> and should be used.

Oh man, I'm HOT on "lynx -dump -force_html"!!  It doesn't do an absolutely
perfect job, but it comes *so* close, and I'll bet it can get even closer by
specifying a custom config file on the command line.

> > - ASCII + HTML entities are allowed in the headers.
>
> At least the &reg;-style chars.  I don't see much need for more HTML in the
> headers.

Agreed -- only &xxx; entities ought to be allowed in the headers, IMO...
And if the content charset is Latin-1 instead of pure 7-bit ASCII, then this
can be further reduced to &lt; &gt; &quot; &amp;.

> > Now I have some questions and ideas:
> > - Should we use ASCII or Latin-1 for the content character
> >   set?
> > - The content should of course not be a full HTML document.
>
> Both of these are starting to get over my head.  My knee-jerk reaction to the
> ASCII question is to just use the lower 128 (not counting the very lowest 32
> of course :-), and use some form of encoding for any other characters -- at
> least for the raw FAQ format.

Hmm.  Well, either way, the following three characters will have to be
written as entities:

   & => &amp;
   < => &lt;
   > => &gt;

and *perhaps* the double-quote character should be forced to be written as
an entity as well:

   " => &quot;

But apart from those, wouldn't it simplify editing a ton (and make it much
much safer) if characters above 128 were just written directly in their
Latin-1 encoding, i.e.:

   ® instead of &reg;
   å instead of &aring;
   ü instead of &uuml;
   ñ instead of &ntilde;

I can convert HTML <=> Latin-1 extremely easily on the fly.

--Todd
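
The post doesn't say how the HTML <=> Latin-1 conversion is actually done, so here is only a rough sketch of the idea, assuming Python and its standard html.entities tables. The function names (escape_reserved, entities_to_latin1, latin1_to_entities) are made up for illustration and are not part of any FAQ tool discussed in this thread.

```python
import re
from html.entities import codepoint2name, name2codepoint

# Entities that stay escaped even when the raw format is Latin-1.
RESERVED_NAMES = {"amp", "lt", "gt", "quot"}


def escape_reserved(text: str, quote: bool = True) -> str:
    """Escape &, < and > (and optionally ") as entities; '&' goes first."""
    text = text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
    if quote:
        text = text.replace('"', "&quot;")
    return text


def entities_to_latin1(text: str) -> str:
    """Turn named entities like &reg; into the Latin-1 character itself.

    The four reserved entities, unknown names, and entities outside
    Latin-1 (code point > 255) are left untouched.
    """
    def repl(match):
        name = match.group(1)
        cp = name2codepoint.get(name)
        if cp is None or name in RESERVED_NAMES or cp > 0xFF:
            return match.group(0)
        return chr(cp)

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", repl, text)


def latin1_to_entities(text: str) -> str:
    """Turn characters 160..255 back into named entities (7-bit output)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0xA0 <= cp <= 0xFF and cp in codepoint2name:
            out.append("&" + codepoint2name[cp] + ";")
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(entities_to_latin1("LEGO&reg; bricks &amp; S&aring;"))
    print(latin1_to_entities("ü and ñ"))
    print(escape_reserved('a < b & "c"'))
```

Because the reserved four are never decoded, a header can be run through entities_to_latin1 for editing and through latin1_to_entities for 7-bit output without disturbing &amp;, &lt;, &gt; or &quot;.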

Message has 1 Reply:

		Re: Raw FAQ data format (Was: Format of FAQ items)
(...) If we ban HTML _elements_ from the headers, then we don't need to escape '<' and '>'. There has never been a need to escape '"'. If we want to allow numeric character references outside Latin-1 (like '&#805;') we still have to escape (...) (27 years ago, 26-Apr-99, to lugnet.faq)

Message is in Reply To:

		Re: Raw FAQ data format (Was: Format of FAQ items)
(...) Sounds mostly good. Catch my exceptions down below. (...) Or some other tool; but I agree, a well-defined subset of HTML can and should be used. (...) (Please keep in mind Jacob, that these are nits I'm picking. :-) "Newsgroups" would be more (...) (27 years ago, 24-Apr-99, to lugnet.faq)
