Subject: FAQ data format possibilities
Newsgroups: lugnet.faq
Date: Fri, 23 Apr 1999 21:22:09 GMT
  
In lugnet.faq, jsproat@geocities.com (Sproaticus) writes:
Jacob Sparre Andersen wrote:
Todd Lehman (lehman@javanet.com) wrote:
For example, maybe one file for each question, formatted like so:

   Newsgroups: [list of newsgroups (wildcards permitted) that this question
                and its answer would appear in]
   Subject: [this is the question]
   Category: [named section; further modifies section implied by newsgroup]

Also:
       Subcategory: [subcategory depth 1]
       Subcategory: [subcategory depth 2]
etc.

Plus, some kind of tags at the beginning and ending of each FAQ entry would
make parsing it much easier IMO.
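
A rough sketch of how header lines like the ones quoted above might be read,
written in Python purely for illustration.  The function name
parse_faq_headers is made up here, and the "headers, blank line, body" layout
is an assumption borrowed from news/mail messages; nothing in this thread
fixes it yet.

```python
def parse_faq_headers(text):
    """Split one FAQ entry into (headers, body).

    Headers are 'Name: value' lines up to the first blank line.  Repeated
    headers such as Subcategory are kept in order, so the list position
    doubles as the subcategory depth.  Indented lines continue the
    previous header value.
    """
    headers = {}
    lines = text.splitlines()
    body_start = len(lines)
    last = None
    for i, line in enumerate(lines):
        if not line.strip():
            # First blank line ends the header block.
            body_start = i + 1
            break
        if line[0].isspace() and last is not None:
            # Continuation line: append to the most recent header value.
            headers[last][-1] += " " + line.strip()
            continue
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed header line: {line!r}")
        last = name.strip()
        headers.setdefault(last, []).append(value.strip())
    body = "\n".join(lines[body_start:])
    return headers, body


# Placeholder entry built only from values that appear in this thread.
entry = """Newsgroups: lugnet.market.*
Subject: [this is the question]
Category: Auction & Shipping
Subcategory: Awareness

[answer text]
"""
headers, body = parse_faq_headers(entry)
```

Every header value comes back as a list, so a single Category and a stack of
Subcategory lines are handled the same way.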

Separate files would make it even easier to parse.  Also easier for editing,
maintenance, and division of labor.  Plenty simple to organize the
individual translations using ISO 639 suffixes too, just like webpages,
because then you can 'grep' and 'ls' and 'wc' and all sorts of other fun
things on them.

BTW, something like this would be nice for a file structure:

   .../faq/lugnet/market/auction/awareness/1.en.faq
   .../faq/lugnet/market/auction/awareness/2.en.faq
      :
   .../faq/lugnet/market/auction/preparation/1.en.faq
   .../faq/lugnet/market/auction/preparation/2.en.faq
      :
   .../faq/lugnet/market/shipping/packing/1.en.faq
   .../faq/lugnet/market/shipping/packing/2.en.faq
      :
   .../faq/lugnet/market/shipping/general/1.en.faq
   .../faq/lugnet/market/shipping/general/2.en.faq
      :
   .../faq/lugnet/market/shipping/carriers/1.en.faq
   .../faq/lugnet/market/shipping/carriers/2.en.faq
      :

Advantages of this file structure:

- Clean, simple, and extremely portable.

- Very easy to get a quick overview via 'du'.

- Very easy to iterate over arbitrary subsets of the files using:
     du -a | grep ... | perl ...

- Very easy to make and exchange tarballs of the whole thing or subsections.

- Webserver could even be configured to display .faq files directly (with
  slight on-the-fly filtering), thereby making each and every Q&A pair
  bookmarkable and pass-around-able as a URL.
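
To make the "iterate over arbitrary subsets" point concrete, here is a small
Python sketch that splits a path from the tree above into section, question
number, and language.  The names parse_faq_path and group_by_section are
invented for this example, and it assumes every file is named
<number>.<lang>.faq under a directory called faq.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def parse_faq_path(path, root="faq"):
    """Return (section, number, lang) for a path like
    '.../faq/lugnet/market/auction/awareness/1.en.faq'."""
    parts = PurePosixPath(path).parts
    start = parts.index(root) + 1          # everything after the faq/ directory
    *dirs, filename = parts[start:]
    number, lang, ext = filename.split(".")
    if ext != "faq":
        raise ValueError(f"not a .faq file: {path!r}")
    return "/".join(dirs), int(number), lang


def group_by_section(paths):
    """Map each section to a sorted list of (number, lang) pairs."""
    sections = defaultdict(list)
    for path in paths:
        section, number, lang = parse_faq_path(path)
        sections[section].append((number, lang))
    return {name: sorted(items) for name, items in sections.items()}


paths = [
    ".../faq/lugnet/market/auction/awareness/1.en.faq",
    ".../faq/lugnet/market/auction/awareness/2.en.faq",
    ".../faq/lugnet/market/shipping/packing/1.en.faq",
]
by_section = group_by_section(paths)
```

Because the language code sits in the filename, picking out one translation
is just a filter on the third field.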


e.g. (1):

<FAQENTRY>
Newsgroups: lugnet.market.*
Category: Auction & Shipping

I'd list this particular Q&A pair in lugnet.market.auction rather than
lugnet.market.* -- because (1) the Auction & Shipping FAQ is really two FAQs
(an Auction FAQ and a Shipping FAQ) and (2) by virtue of this Q&A pair being
listed in lugnet.market.auction, it would also automatically be listed in
the virtual group lugnet.market (and so forth on up the tree).

Automatically propagating all Q&A pairs all the way up the tree to the top
might result in excess noise in some cases -- for example, stuff pertaining
to a particular city or a particular store in a particular city.  It may be
useful to be able to limit the hierarchical range of Q&A pairs.

So instead of automatically propagating something up the tree, the list of
newsgroups should probably be explicit -- wildcards still allowed, but
giving group names separated by commas explicitly, i.e.:

   Newsgroups: lugnet.market.auction,lugnet.market,lugnet
      [specifies levels 3, 2, and 1]

   Newsgroups: lugnet.loc.us.ma.bos,lugnet.loc.us.ma
      [specifies levels 5 and 4 only]

   Newsgroups: lugnet.off-topic.clone-brands
      [specifies level 3 only]

Note:  The newsgroup names don't necessarily have to be actual ng names --
they can be virtual ng names corresponding to categories on the webpages.
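
As a sketch of the explicit-list idea, the check "does this Q&A pair show up
in group X?" could look like this in Python.  The function name appears_in is
made up, and using shell-style matching from the standard fnmatch module for
the wildcards is an assumption; note that with fnmatch a '*' also matches
across dots.

```python
from fnmatch import fnmatchcase


def appears_in(newsgroups_header, group):
    """True if `group` matches any comma-separated pattern in the header."""
    patterns = [p.strip() for p in newsgroups_header.split(",") if p.strip()]
    return any(fnmatchcase(group, pattern) for pattern in patterns)


# Levels 3, 2, and 1 listed explicitly: no sibling groups are pulled in.
header = "lugnet.market.auction,lugnet.market,lugnet"
in_market = appears_in(header, "lugnet.market")             # listed
in_shipping = appears_in(header, "lugnet.market.shipping")  # not listed

# Levels 5 and 4 only: nothing leaks up to lugnet.loc.us or higher.
local = "lugnet.loc.us.ma.bos,lugnet.loc.us.ma"
in_us = appears_in(local, "lugnet.loc.us")
```

With this reading, "lugnet.market.*" covers the groups below lugnet.market
but not lugnet.market itself, so the parent has to be listed separately if it
is wanted.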

Alternatively, the URI of the ng/resource could be used instead of an ng
name, i.e.:

   Locations: /market/auction/, /market/, /

   Locations: /loc/us/ma/bos/, /loc/us/ma/

   Locations: /off-topic/clone-brands/

That's probably closer to something more extensible to other areas like sets
and models and things, and it's fully backward compatible with newsgroup
names.
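
The backward compatibility can be shown with a pair of small conversions.
This is only a Python sketch: the function names are invented, and it assumes
the top-level "lugnet" component corresponds to "/" as in the examples above.

```python
def group_to_location(group, root="lugnet"):
    """'lugnet.market.auction' -> '/market/auction/'; 'lugnet' -> '/'."""
    parts = group.split(".")
    if parts[0] != root:
        raise ValueError(f"{group!r} is not under {root!r}")
    return "/" + "".join(part + "/" for part in parts[1:])


def location_to_group(location, root="lugnet"):
    """'/market/auction/' -> 'lugnet.market.auction'; '/' -> 'lugnet'."""
    parts = [part for part in location.split("/") if part]
    return ".".join([root] + parts)


def newsgroups_to_locations(header):
    """Rewrite a Newsgroups: value as the equivalent Locations: value."""
    groups = [g.strip() for g in header.split(",") if g.strip()]
    return ", ".join(group_to_location(g) for g in groups)


locations = newsgroups_to_locations("lugnet.market.auction,lugnet.market,lugnet")
```

Hyphenated names such as off-topic and clone-brands pass through unchanged in
both directions.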


Subcategory: Awareness

A bonus with the URI approach is that subcategories can be encoded directly
into the location!  :)

   Locations: /market/auction/awareness/

   Locations: /market/shipping/packing/

If the subcategory actually exists as a defined webpage subsection, then the
Q&A pair would appear there on that page as well as higher pages.  If it
doesn't actually exist as a defined webpage subsection (the two examples
shown directly above wouldn't) then the Q&A pair would only appear on higher
pages for which actual virtual or nonvirtual categories do exist.

The other neat thing is that it's forward-extensible -- in that if a
subcategory ever does become defined as its own webpage subsection, then it
automatically goes there with no changes needed to the underlying FAQ data.
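
One way to read the rule above, sketched in Python: walk from the stated
location up to the root and keep only the pages that are actually defined.
The names ancestors, pages_for, and defined_pages are hypothetical, and
whether a pair should reach every defined ancestor or only explicitly listed
ones is still an open question in this thread.

```python
def ancestors(location):
    """'/market/auction/awareness/' -> that location, then each parent up to '/'."""
    parts = [part for part in location.split("/") if part]
    return [
        "/" + "".join(part + "/" for part in parts[:depth])
        for depth in range(len(parts), -1, -1)
    ]


def pages_for(location, defined_pages):
    """Pages (deepest first) on which a Q&A pair at `location` would appear."""
    return [page for page in ancestors(location) if page in defined_pages]


# Assumed set of existing webpage sections; awareness/ is not one of them yet.
defined_pages = {"/", "/market/", "/market/auction/"}
before = pages_for("/market/auction/awareness/", defined_pages)

# Later the subcategory gets its own page; the FAQ data itself is untouched.
defined_pages.add("/market/auction/awareness/")
after = pages_for("/market/auction/awareness/", defined_pages)
```

The only thing that changes between the two calls is the set of defined
pages, which is the forward-extensible part.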

The FAQ Q&A pairs then also help plan and organize substructures of areas.
Wherever lots of information goes, substructures arise.  Wherever
substructures arise, subcategories can be considered and planned.

The giant link-farm will probably work this way -- lots of virtual
categories and virtual sub-categories, but grouped together on up the tree
so that you can view things at different granularities.  No fun having too
many subcategories, or too few, but this way, as a user/browser of the
information, you can have your cake and eat it too....


I'm kind of split on whether to use HTML within the text.  The plus = better
formatting on a Web page.  The minus = worse formatting when converted into
plain text.  Will this be an HTML-only FAQ?

It's gotta be postable to the newsgroups (and readable there as plain ASCII),
so it can't be HTML-only.

I'd always thought of it as being ASCII-only (as in extended ASCII
such as ISO-8859-1) but it's nice not to have to read a FAQ on a web page in
Courier, and more importantly, it'd be great if the Q&A pairs weren't pre-
formatted to 80 column text, because then they can be presented in sidebars
or at any width (more on this later).

If HTML is allowed in the core content, then I think it ought to be highly
restrictive HTML, with strict usage rules.  For example, no frames, no
tables, no JavaScript, no embedded images, no centering, no forms, etc.:
basically nothing that can't be easily converted to plain ASCII.  Just the
basic necessities:  paragraphs, italics (for names/citations), boldface (so
long as it's not abused), preformatted text, and of course hyperlinks.

I have routines to convert HTML entities like &uuml; and &reg; to ISO-8859-1
ASCII and back, and a routine to line-wrap arbitrary paragraphs of text, so
handling <P>, <I>, <B>, <PRE>, and <A> would be pretty straightforward.
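
For illustration only (this is not the routines mentioned above), here is a
minimal Python sketch of the same idea using the standard html.parser and
textwrap modules.  The class name, the _underscore_ and *asterisk* markers
for italics and bold, and the "text <url>" rendering of links are all
assumptions.

```python
import textwrap
from html.parser import HTMLParser


class FaqTextRenderer(HTMLParser):
    """Render the restricted tag set <P>, <I>, <B>, <PRE>, <A> as plain text."""

    def __init__(self, width=72):
        # convert_charrefs=True decodes entities like &uuml; and &reg; for us.
        super().__init__(convert_charrefs=True)
        self.width = width
        self.blocks = []      # finished paragraphs / preformatted blocks
        self.buf = []         # text pieces of the block being built
        self.in_pre = False
        self.href = None

    def flush(self):
        text = "".join(self.buf)
        self.buf = []
        if self.in_pre:
            if text.strip():
                self.blocks.append(text.strip("\n"))   # keep layout verbatim
        else:
            text = " ".join(text.split())              # collapse whitespace
            if text:
                self.blocks.append(textwrap.fill(
                    text, self.width,
                    break_long_words=False, break_on_hyphens=False))

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.flush()
        elif tag == "pre":
            self.flush()
            self.in_pre = True
        elif tag == "i":
            self.buf.append("_")
        elif tag == "b":
            self.buf.append("*")
        elif tag == "a":
            self.href = dict(attrs).get("href")

    def handle_endtag(self, tag):
        if tag == "p":
            self.flush()
        elif tag == "pre":
            self.flush()
            self.in_pre = False
        elif tag == "i":
            self.buf.append("_")
        elif tag == "b":
            self.buf.append("*")
        elif tag == "a" and self.href:
            self.buf.append(f" <{self.href}>")
            self.href = None

    def handle_data(self, data):
        self.buf.append(data)


def html_to_text(html, width=72):
    renderer = FaqTextRenderer(width)
    renderer.feed(html)
    renderer.close()
    renderer.flush()          # emit any trailing text outside <P>...</P>
    return "\n\n".join(renderer.blocks)
```

Since the stored text is never pre-wrapped, the same entry can be rendered at
80 columns for a newsgroup post or at a narrow width for a sidebar just by
changing the width argument.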


We also need:
     Content-Language: [ISO 639 language code]

Very good idea, Jacob.  I suspect that we could possibly siphon some helpers
from the ldraw.org project for translation.

Translations would rock!  Especially French and Japanese -- those are the
two biggest gaps right now of people not using the system who could be if
their language was represented.

I suppose that each ng or each Q&A pair will need its own "default" or
"home" language -- that is, something stating in which language the "master
copy" of the Q&A pair is originally written, and from which all other-
language translations are derived (for that Q&A pair).  We wouldn't want to
see circular results like what happened to Mark Twain's
_The_Jumping_Frog_[1].
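
A tiny Python sketch of the "master copy" rule, tied to the <number>.<lang>.faq
naming suggested earlier.  The function name pick_file and the default master
language "en" are assumptions for the example only.

```python
def pick_file(filenames, number, wanted, master="en"):
    """Choose the file to show for question `number` in language `wanted`.

    Falls back to the master-language copy when no translation exists.
    A question without a master copy is treated as an error, since every
    translation is supposed to be derived from it.
    """
    by_lang = {}
    for name in filenames:
        num, lang, ext = name.split(".")
        if ext == "faq" and int(num) == number:
            by_lang[lang] = name
    if master not in by_lang:
        raise LookupError(f"question {number} has no master copy in {master!r}")
    return by_lang.get(wanted, by_lang[master])


files = ["1.en.faq", "1.fr.faq", "2.en.faq"]
french = pick_file(files, 1, "fr")      # a translation exists
japanese = pick_file(files, 1, "ja")    # none yet, so the master copy is used
```

A translator working on a new language would likewise always start from the
file that by_lang[master] points at, never from another translation.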

--Todd



