|
What Is CSV? / Converting Spreadsheet Data To CSV Format
CSV stands for Comma Separated Values, sometimes also called Comma Delimited. A CSV file is a specially formatted plain text file which stores spreadsheet or basic database-style information in a very simple format, with one record on each line, and each field within that record separated by a comma. It's important that the individual "records" within a CSV file NOT contain commas, as this may break the simple formatting when using the file in another application. CSV files are often used as a simple way to transfer a large volume of spreadsheet or database information between programs, without worring about special file types. For example, transferring a home-made address book from Excel into a database program such as Filemaker Pro could be done by exporting the file as a CSV from Excel, then importing that CSV into Filemaker. Creating a CSV file from spreadsheet data is very simple to do using Microsoft Excel. Open your spreadsheet document and go to the File pull-down menu, then choose Save As.... Change the "Save as type" or "Format" field to read: "CSV (Comma delimited)". Enter a name for the document and click Save. To check the validity of your CSV file, open the new file from a plain-text reading program such as Notepad or TextEdit. Check to make sure there are no extra commas, other than between fields. Depending on what program you're going to use the CSV file with, you may need to also remove any extra quotation marks or other formatting which Excel may have inserted.
And aslo can be difine as- CSV, comma separated values, files are commonly used to transport large amounts of tabular data between either companies or applications that are not directly connected. The files are easily editable using common spreadsheet applications like Microsoft Excel. Fields are separated by commas. Records are separated with system end of line characters, CRLF (ASCII 13 Dec or 0D Hex and ASCII 10 Dec or 0A Hex respectively) for Windows, LF for Unix, and CR for Mac. Fields containing a comma must be escaped. Excel escapes these values by embedding the field inside a set of double quotes, generally referred to as text qualifiers, i.e. a single cell with the text apples, carrots, and oranges becomes "apples, carrots, and oranges". Unix style programs escape these values by inserting a single backslash character before each comma, i.e. a single cell with the text apples, carrots, and oranges becomes apples, carrots, and oranges. If a field contains a backslash character, then an additional backslash character is inserted before it. End of line characters within a field must also be escaped. A lot of programs do not support this feature. Excel and all the Microsoft text drivers do support it. Excel escapes these values the same as it escapes commas, and just embeds the field value inside a set of double quotes. Unix style programs have two distinct ways of escaping end of line characters within a field. Some Unix style programs use the same escape method as with commas, and just insert a single backslash before the end of line character. Other Unix style programs replace the end of line character using c style character escaping where CR becomes r and LF becomes n. In an Excel escaped CSV file, in fields containing a double quote, the double quote must be escaped by replacing the single double quote with two double quotes. Some files use an escaping format that is a mixture of the Excel escaping and Unix escaping where fields with commas are embedded in a set of double quotes like the Excel escaping, but fields containing double quotes are escaped by inserting a single backslash character before each double quote like the Unix style comma escaping. The first record in a CSV file might represent the names of the following columns of data, and are generally referred to as column headers. Each record in a file with column headers may be have fewer fields in it than the number of column headers. If so, the left out fields' values should be considered empty. Leading and trailing whitespace characters, commas and tab characters, adjacent to commas or record delimiters are trimmed. To guarantee preservation of leading and trailing whitespace characters, a field must be text qualified by embedding the field inside a set of double quotes. The column delimiter is often changed from a comma to some other character. The tab character is the most common replacement and the resulting format is sometimes referred to as TSV. The same escaping rules generally apply. If a field contains the column delimiter, then it must be escaped. Some Unix CSV files are known to use CR for the column delimiter and LF for the record delimiter. Records with seemingly no data in them, with no occurrences of the column delimiter and no value for the first field, should be skipped. A record with only a single field with no value must be text qualified to attempt to prevent the record from being skipped. The last record in a file may or may not be ended with an end of line character. The double quote character for the Excel style escaping text qualifier is sometimes replaced with a single quote or apostrophe, ASCII 39 Dec or 27 Hex. The end of line characters used for record delimiters are sometimes changed to other characters like a semicolon. Some users find it useful for a parser to be able to ignore an empty field in data and instead return the next non empty field. Non-printable characters in a field are sometimes escaped using one of several c style character escape sequences, ### and o### Octal, 0; Hex, d### Decimal, and u#### Unicode. Some delimited file formats have a comment character, generally #, that can be used as the first character on a line to represent that the following text up to the next end of line character is to be ignored. An IIS web log file is a good example. Some CSV parsing products require all fields be enclosed in double quotes as if there were characters within the field that need to be escaped but there aren't. Some CSV parsers that require strongly typed data from a CSV file consider non-textqualified fields to be numeric, or null if the field contains no data, and text data if a field is textqualified.
The abbreviation CSV can have the following meanings. |