In lugnet.admin.database, Matthew Miller writes:
> There exists (in CPAN) a Text-CSV perl module. From its documentation:
> [...]
Aha! Excellent. Those docs sound well thought out.
Hmm, that reminds me... Duh, this is prolly a common Perl thing. I bet
Friedl[1] has some stuff on CSV? Lessee... <dig dig dig> Ahh, here we go:
pp. 204-208, 227, 231, 290. And (haha!) it even handles the special case of
adding a final empty field if the line ends in a trailing comma. :)
OK, so now I have a good definition, thanks to Matt, and some trustworthy code,
thanks to Friedl[1]. Now I wonder how (or if!) you can reliably and exactly
detect whether a given CSV input stream escapes " as \" or as "" -- I saw
some Perl examples earlier which output \" instead of "" .
Are there any ambiguous input lines? This evil case comes close, but not
quite...
"foo",98.6,"bletch \"",3.14,"",
...as it's well-formed for \" but not for "", and this one...
"foo",98.6,"bletch \"",3.14,"","
...is well-formed for "" but not for \" .
(Not that either of those is actually very likely to occur, but feh anyway.)
If there aren't any ambiguous cases (meaning well-formed for both \" and ""
but yielding different parses), then a smart parser could just try one
convention and, if the input turns out not to be well-formed, try the other.
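Here's a rough sketch of that try-one-then-the-other idea, in Python rather
than Perl. It is not Friedl's code and not what Text-CSV does; parse_line and
detect_escape_style are made-up names, and the strictness rules (no bare quote
inside an unquoted field, nothing but a comma after a closing quote, a
backslash escapes whatever character follows it) are my own assumptions about
what "well-formed" should mean.

```python
def parse_line(line, style):
    """Strictly parse one CSV line; return its fields, or None if malformed.

    style is "backslash" (\\" escapes a quote) or "doubled" ("" escapes a quote).
    Both names are hypothetical labels chosen for this sketch.
    """
    fields = []
    i, n = 0, len(line)
    while True:
        if i < n and line[i] == '"':
            # Quoted field: scan to the closing quote, honoring the escape style.
            i += 1
            buf = []
            while True:
                if i >= n:
                    return None  # input ended inside an open quoted field
                c = line[i]
                if style == "backslash" and c == "\\" and i + 1 < n:
                    buf.append(line[i + 1])  # assumption: \x stands for x
                    i += 2
                elif c == '"':
                    if style == "doubled" and line[i + 1:i + 2] == '"':
                        buf.append('"')  # "" stands for one literal quote
                        i += 2
                    else:
                        i += 1  # closing quote
                        break
                else:
                    buf.append(c)
                    i += 1
            if i < n and line[i] != ",":
                return None  # junk between the closing quote and the comma
            fields.append("".join(buf))
        else:
            # Unquoted field: runs up to the next comma or the end of the line.
            j = line.find(",", i)
            if j == -1:
                j = n
            raw = line[i:j]
            if '"' in raw:
                return None  # bare quote inside an unquoted field
            fields.append(raw)
            i = j
        if i >= n:
            return fields
        i += 1  # skip the comma; a trailing comma yields a final empty field


def detect_escape_style(lines):
    """Return the set of styles under which every line is well-formed."""
    candidates = {"backslash", "doubled"}
    for line in lines:
        candidates = {s for s in candidates if parse_line(line, s) is not None}
    return candidates


evil_backslash = r'"foo",98.6,"bletch \"",3.14,"",'
evil_doubled = r'"foo",98.6,"bletch \"",3.14,"","'
print(detect_escape_style([evil_backslash]))
print(detect_escape_style([evil_doubled]))
```

If both styles survive, the input doesn't settle the question by
well-formedness alone, and you'd still have to compare the two parses to see
whether they actually differ. An empty set means neither convention fits.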
--Todd
[1] O'Reilly: http://www.oreilly.com/catalog/regex/ [2]
Amazon: http://www.amazon.com/exec/obidos/ASIN/1565922573
[2] I love that URL :-)