Subject: Re: CSV delimiters
Newsgroups: lugnet.admin.database, lugnet.off-topic.geek, lugnet.publish
Date: Sat, 23 Sep 2000 07:20:41 GMT
Reply-To: MATTDM@MATTDMantispam.ORG
  
Todd Lehman <lehman@javanet.com> wrote:
> I guess it's pretty obvious, mostly, but how are the delimiter characters
> typically encoded (specifically: comma, double-quote, and newline)?

I remember looking for this about a year ago, and I came across something
that looked very standardish. If I remember right, it's really silly.

You don't actually escape things. Fields are comma-separated and optionally
surrounded by double quotation marks, whether they hold strings or numbers.
If a field contains a comma or a quotation mark, the enclosing marks are
required. Commas within quotation marks are just typed normally; to
represent a quotation mark in a field, you double it.

So:

  blah,"blah","1",1,"blah,blah,blah","he said ""blah"""
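
As a rough check of those rules, here is a minimal sketch that feeds the example row to Python's standard csv module. Python is my choice for illustration only (nothing in the thread uses it), and the module's default dialect happens to follow the same convention: comma delimiters, optional double-quote enclosures, and a doubled quote for a literal quote.

```python
import csv
import io

# The example row from the post, exactly as written.
line = 'blah,"blah","1",1,"blah,blah,blah","he said ""blah"""\n'

# The default dialect splits on commas, strips optional enclosing quotes,
# and turns a doubled quote ("") inside a quoted field into one quote.
fields = next(csv.reader(io.StringIO(line)))
print(fields)

# Writing the fields back out with the default (minimal) quoting:
# only fields containing a comma or a quote get enclosed.
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerow(fields)
print(out.getvalue(), end="")
```

Note that the quoted "1" and the bare 1 come back as the same string, and the round trip drops the quotes that were not strictly needed.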

I don't think that there is any way to encode a newline. Perhaps by just
doing:

  blah,"blah
  blah"

but I've got some sort of intuition that there's a problem with that. Maybe
not. It's certainly ugly. :)
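
For what it's worth, here is a small sketch of how one real parser treats that case. It assumes Python's standard csv module, which is only one implementation and not a standard; the sample data is the two-line example above plus a made-up second record. The likely "problem" is that anything splitting the file on line breaks first will miscount records.

```python
import csv
import io

# Three physical lines, but the first line break sits inside a quoted field.
data = 'blah,"blah\nblah"\nnext,row\n'

# Naive line splitting sees three "records".
print(len(data.splitlines()))

# newline="" stops StringIO from translating line endings, so the csv
# module sees them untouched and can keep the quoted break in the field.
rows = list(csv.reader(io.StringIO(data, newline="")))
print(rows)
```

The reader returns two records, with the line break preserved inside the second field of the first one.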

Sorry I can't find any references -- too tired. :)




> What's a string?  Anything matching /[^0-9]/ or not matching /^[0-9]+$/ ?
> (uh, plus any gunk for handling decimals and e+12 and all that funstuff).

There's no distinction made between a string and anything else....
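
Since the format carries no type information, any typing has to happen after parsing. A minimal sketch, assuming Python's csv module and a hypothetical helper named coerce that applies the all-digits test from the question; decimals and exponents are deliberately left as strings here.

```python
import csv
import io
import re

# Same idea as /^[0-9]+$/; fullmatch anchors both ends of the field.
INTEGER = re.compile(r"[0-9]+")

def coerce(field):
    """Hypothetical helper: all-digit fields become int, the rest stay str."""
    return int(field) if INTEGER.fullmatch(field) else field

# Made-up row: a quoted number, a bare number, a word, and a decimal.
row = next(csv.reader(io.StringIO('"1",1,blah,"1.5"\n')))
print(row)                        # every field arrives as a string
print([coerce(f) for f in row])   # typing is the caller's decision
```

Quoted or not, the two 1s are indistinguishable once parsed, so the quotes cannot be used to tell strings from numbers.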

> I just downloaded a "CSV" version of my PayPal history and I thought this
> was weird:  it put _everything_ (all fields, that is) in double-quotes --
> even numerical fields.  But then it didn't put the header fields in quotes.

Yup.

> And it did also put a trailing comma on each line (hmm).

I don't think that's right. Or helpful, even. :)
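
To show why the trailing comma is unhelpful, here is a sketch using Python's csv module on made-up data in the shape described (bare header names, quoted values, trailing comma). The column names and values are invented for illustration; the actual PayPal export is not shown in the thread.

```python
import csv
import io

# Invented sample: unquoted header, quoted values, trailing comma per line.
data = 'Date,Amount,\n"2000-09-22","12.50",\n'

rows = list(csv.reader(io.StringIO(data, newline="")))
print(rows)

# Hypothetical cleanup: drop one empty last field if a row ends with it.
cleaned = [r[:-1] if r and r[-1] == "" else r for r in rows]
print(cleaned)
```

Under these rules the trailing comma reads as an extra, empty last field on every row, so a consumer has to either tolerate it or strip it as above.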



--
Matthew Miller                     --->                 mattdm@mattdm.org
Quotes 'R' Us                    --->              http://quotes-r-us.org/
Boston University Linux            --->               http://linux.bu.edu/



Message has 2 Replies:
  Re: CSV delimiters
 
There exists (in CPAN) a Text-CSV perl module. From its documentation: This module is based upon a working definition of CSV format which may not be the most general. 1 Allowable characters within a CSV field include 0x09 (tab) and the inclusive (...) (24 years ago, 23-Sep-00, to lugnet.admin.database, lugnet.off-topic.geek, lugnet.publish)
  Re: CSV delimiters
 
Maybe this will also be helpful. It's not exactly a standard, but it is IEEE. :) (URL) (24 years ago, 23-Sep-00, to lugnet.admin.database, lugnet.off-topic.geek, lugnet.publish)

Message is in Reply To:
  CSV delimiters
 
Is there any "official" or "reasonably standardized format" for CSV (_C_omma _S_eparated _V_alue) data? I guess it's pretty obvious, mostly, but how are the delimiter characters typically encoded (specifically: comma, double-quote, and newline)? In (...) (24 years ago, 23-Sep-00, to lugnet.admin.database, lugnet.off-topic.geek, lugnet.publish)
