Subject: Re: "Well-Formed" LDraw (Was: regular expressions for DAT file subpart lines)
Newsgroups: lugnet.off-topic.geek, lugnet.cad.dev
Date: Fri, 25 Jun 1999 07:40:13 GMT

Hey now,
that's beginning to look like the yacc grammar for LDraw in
the LDLite source code.
If you want to get precise in the language spec, grab the LDLite lex and
yacc files and either take them as the de facto standard or modify them
to handle the cases better.
-gyug
Steve Bliss wrote:
>
> On Wed, 23 Jun 1999 22:09:53 GMT, Sproaticus <jsproat@geocities.com> wrote:
>
> > Steve Bliss wrote:
> > > Oops, I forgot a few other odd things. I *think* your expression allows
> > > these, but I'm not sure:
> > > .1
> > Nope, won't take a fraction w/o the integer part; gotta fix that
> >
> > > -.1
> > Nope, same reason as above
>
> Both of the above are common output from LDAO code.
>
> > > 1E
> > Nope, I can't even figure out what this is supposed to be without the
> > mantissa! Is it just 1?
>
> Well, I was just throwing stuff out. I don't know if LDraw would take that
> one, or not.
>
> > The hard part about using a single regular expression to detect scientific
> > notation is that it might allow things like "+.E." to represent a number,
> > which any respectable atoi() would choke on. I gotta think about this some
> > more, see if it's even worth my time at the moment...
>
> Darned context-sensitive constructions.
>
> > > BTW, the color-code can be in scientific notation. The line-type can't, at
> > > least, it can't in LDLite. I'm betting LDraw would recognize 1E0 as a
> > > valid line-type.
> >
> > Hmmm... But LEdit doesn't output scientific notation in these areas. I'm
> > inclined to be more liberal than restrictive, but some of these cases don't
> > make any sense; i.e. fractional colors.
>
> I didn't say they made sense. Just that they were valid input to LDraw.
>
> > > What about line-breaks? LDraw allows those in the middle of the line. And
> > > anything extra after the filename is ignored, I think. At least, on
> > > line-types 2 through 5, anything after the last parameter and before a line
> > > break is ignored.
> >
> > Arg. Broken lines would obviate a line-based parser. Pascal parsers tend
> > to be character-based, while Perl makes it much easier to write line-based
> > parsers.
>
> Yep. LDraw's parser consists of the Pascal read() and readln() functions.
>
> > I wonder if we should start to differentiate between "valid" and
> > "well-formed" LDraw files, a la XML...?
>
> That might be a useful distinction. I assume "valid" means LDraw will
> render it correctly, and "well-formed" means it follows the expected forms?
> "Well-formed" would not allow the following:
> - Mid-command line-breaks
> - Scientific notation for line-types and color codes
> - Leading zeroes on values (except when the value is between 0 and 1)
>
> Hmmm. Going outside of regular expressions, using some personal flavor of
> grammar notation (and not claiming this a well-written grammar):
>
> Start        -> <command> <br> <Start>
>               | null
> command      -> 1 <ws> <subfile>
>               | 2 <ws> <line>
>               | 3 <ws> <triangle>
>               | 4 <ws> <quad>
>               | 5 <ws> <cond_line>
>               | 0 <ws> <meta_command>
>               | <ws>                  ; not sure this one is necessary
> br           -> [\cr\lf]+
> subfile      -> <int> <ws> (<real> <ws>){12} <filename> <ws>*
> line         -> <int> <ws> (<real> <ws>){5} <real> <ws>*
> triangle     -> <int> <ws> (<real> <ws>){8} <real> <ws>*
> quad         -> <int> <ws> (<real> <ws>){11} <real> <ws>*
> cond_line    -> <int> <ws> (<real> <ws>){11} <real> <ws>*
> meta_command -> <keyword> <string>
>               | <string>
> keyword      -> PRINT
>               | WRITE
>               | STEP
>               | CLEAR
>               | SAVE
>               | PAUSE
>               | CLS
> string       -> {any string of text, excluding \cr and \lf}
> real         -> <sign> <float>
>               | <sign> <float> E <sign> <float>
> int          -> [1-9][0-9]*
>               | 0
> float        -> <int>
>               | <int> .
>               | <int> . <int>
>               | . <int>
> sign         -> +
>               | null
> ws           -> [\32\9]+              ; sorry if my escape-notation is weird
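
For what it's worth, here is a rough Python sketch of how the "well-formed" rules in the quoted grammar might be checked one line at a time. It is not taken from LDraw or LDLite; every name in it (INT, REAL, PATTERNS, is_well_formed) is made up for illustration. It also leans on a few assumptions that go beyond the grammar as posted: a leading '-' is accepted as a sign (negative coordinates are everywhere in real DAT files), digits after the decimal point may start with zero (so 1.05 passes), the exponent is a signed integer, and a filename is any run of non-whitespace characters.

```python
import re

# Hypothetical helper names; none of this comes from LDraw or LDLite.
INT = r"(?:0|[1-9][0-9]*)"                    # no leading zeroes
# Assumption: fraction digits may start with zero (e.g. 1.05).
FLOAT = rf"(?:{INT}(?:\.[0-9]*)?|\.[0-9]+)"   # 1  1.  1.5  .5
# Assumption: '-' is a valid sign and the exponent is a signed integer.
REAL = rf"[+-]?{FLOAT}(?:[Ee][+-]?[0-9]+)?"
WS = r"[ \t]+"

# Number of <real> parameters after the color code, per line type.
REAL_COUNTS = {"2": 6, "3": 9, "4": 12, "5": 12}


def _reals(n):
    """Regex for n whitespace-separated reals."""
    return WS.join([REAL] * n)


PATTERNS = {
    t: re.compile(rf"{t}{WS}{INT}{WS}{_reals(n)}[ \t]*")
    for t, n in REAL_COUNTS.items()
}
# Line type 1: color, 12 reals, then a filename (assumed to hold no spaces).
PATTERNS["1"] = re.compile(rf"1{WS}{INT}{WS}{_reals(12)}{WS}\S+[ \t]*")
# Line type 0: meta-command or comment; any text up to the line break.
PATTERNS["0"] = re.compile(r"0(?:[ \t]+[^\r\n]*)?")


def is_well_formed(line):
    """Return True if a single line (no line break) fits the sketch above."""
    if not line.strip(" \t"):
        return True  # empty or whitespace-only line
    pattern = PATTERNS.get(line[0])
    return bool(pattern and pattern.fullmatch(line))
```

Because each line is matched on its own, a command broken across two lines fails the check, which fits the "no mid-command line-breaks" rule. Inputs such as ".1" and "-.1" pass as reals, while "1E", "+.E." and a color code of "1E0" are rejected.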