Using the CSV package to perform CSV to XML transformations

The interlok-csv optional package allows you to render CSV documents as XML. It also contains a jdbc-csv-output so that you can output a CSV document directly from jdbc-data-query-service. This page describes only the transformation functionality. Under the covers, the package uses a mixture of commons-csv and super-csv.

Simple CSV to XML

simple-csv-to-xml-transform takes a CSV file and renders it as XML. It expects that the CSV file is well formed, i.e. each line in the CSV file has the same number of fields. If this is not true then you should use raw CSV to XML instead. If the CSV has a header line then the XML element names can be derived from the header line, or auto-generated.
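As a rough illustration of the behaviour described above (not Interlok's implementation), the following Python sketch uses only the standard library. The function name simple_csv_to_xml, the wrapper elements csv-xml and record, and the generated field-N names are assumptions made for this example; it also assumes that header values are valid XML element names.

```python
import csv
import io
import xml.etree.ElementTree as ET


def simple_csv_to_xml(text, use_header=True):
    """Render well-formed CSV as XML (illustrative element names only)."""
    rows = list(csv.reader(io.StringIO(text)))
    root = ET.Element("csv-xml")  # hypothetical root element name
    if not rows:
        return ET.tostring(root, encoding="unicode")

    # Well formed here means: every line has the same number of fields.
    width = len(rows[0])
    for number, row in enumerate(rows, start=1):
        if len(row) != width:
            raise ValueError(
                f"line {number} has {len(row)} fields, expected {width}"
            )

    if use_header:
        # Derive element names from the header line.
        names, records = rows[0], rows[1:]
    else:
        # Auto-generate element names instead.
        names = [f"field-{i}" for i in range(1, width + 1)]
        records = rows

    for row in records:
        record = ET.SubElement(root, "record")
        for name, value in zip(names, row):
            ET.SubElement(record, name).text = value
    return ET.tostring(root, encoding="unicode")
```

Calling simple_csv_to_xml("name,age\nalice,30\n") yields one record element containing name and age children, while a line with a different field count raises an error, which is the case where the raw transform is the better fit.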

Raw CSV to XML

Sometimes you get a file that is ostensibly a CSV, but the number of fields on each line differs; raw-csv-to-xml-transform allows you to render that as XML regardless and may be of use in situations where you have a flat file that would be too difficult to model using flat-file-transform-service. It will always generate element names, and emit empty fields. Because there is no verification that the CSV columns match up, a message that isn't CSV at all (e.g. JSON or XML) will still be marked up as XML.
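The contrast with the simple transform can be sketched in Python as follows. This is an assumption-laden illustration rather than Interlok's code: raw_csv_to_xml, csv-xml, record and field-N are hypothetical names chosen for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET


def raw_csv_to_xml(text):
    """Mark up every line as a record, whatever its field count."""
    root = ET.Element("csv-xml")  # hypothetical root element name
    for row in csv.reader(io.StringIO(text)):
        record = ET.SubElement(root, "record")
        # Element names are always generated; empty fields are still emitted.
        for index, value in enumerate(row, start=1):
            ET.SubElement(record, f"field-{index}").text = value
    return ET.tostring(root, encoding="unicode")
```

Because nothing checks that the columns line up, a line of JSON passed to this function simply becomes a record with however many comma-separated pieces it happens to contain.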

CSV styles

Using csv-basic-format is probably the best way to get started; you can choose between five different flavours of CSV (DEFAULT, EXCEL, MYSQL, RFC4180, TAB_DELIMITED), which should match most requirements. If you have custom separators, you can use csv-custom-format, which allows you to define your own format through explicit configuration.

Style          Description
DEFAULT        Effectively the same as RFC4180, but ignores empty lines
EXCEL          Delimiter: comma (,); QuoteChar: "; RecordSeparator: \r\n. Note that the actual delimiter generated by Excel is locale dependent.
MYSQL          Delimiter: tab (\t); EscapeChar: \; RecordSeparator: \n
RFC4180        Delimiter: comma (,); QuoteChar: "; RecordSeparator: \r\n
TAB_DELIMITED  Delimiter: tab (\t); IgnoreSurroundingSpaces: true
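To make the differences between the styles more concrete, here is a hedged Python approximation of the table using the standard csv module. It is not commons-csv and will not match it in every edge case; the STYLES mapping, the parse function, and the skip_empty_lines and trim keys are all assumptions introduced for this sketch.

```python
import csv
import io

# Approximate each style with csv.reader options plus two extra flags
# (skip_empty_lines, trim) that are handled by hand below.
STYLES = {
    "DEFAULT": {"delimiter": ",", "quotechar": '"', "skip_empty_lines": True},
    "EXCEL": {"delimiter": ",", "quotechar": '"'},
    "MYSQL": {"delimiter": "\t", "escapechar": "\\", "quoting": csv.QUOTE_NONE},
    "RFC4180": {"delimiter": ",", "quotechar": '"'},
    "TAB_DELIMITED": {"delimiter": "\t", "trim": True},
}


def parse(text, style):
    """Parse text into a list of rows using an approximation of the style."""
    options = dict(STYLES[style])
    skip_empty = options.pop("skip_empty_lines", False)
    trim = options.pop("trim", False)
    # newline="" leaves record separators untouched for the csv module.
    rows = csv.reader(io.StringIO(text, newline=""), **options)
    if skip_empty:
        rows = (row for row in rows if row)
    if trim:
        rows = ([field.strip() for field in row] for row in rows)
    return list(rows)
```

For example, parse("a,b\n\nc,d\n", "DEFAULT") drops the blank line, whereas the same input under "RFC4180" keeps it as an empty row.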

CSV to JSON

Since 3.6.6 you can convert between CSV and JSON via the interlok-csv-json optional package, which adds services for converting JSON to CSV and vice versa. It depends on both interlok-csv and interlok-json.
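The general shape of such a conversion can be sketched in Python with the standard library. This does not use or reproduce the interlok-csv-json services; csv_to_json and json_to_csv are hypothetical helpers, and the sketch assumes the CSV has a header line and the JSON is a flat array of objects sharing the same keys.

```python
import csv
import io
import json


def csv_to_json(text):
    """Turn CSV with a header line into a JSON array of objects."""
    records = [dict(row) for row in csv.DictReader(io.StringIO(text))]
    return json.dumps(records)


def json_to_csv(text):
    """Turn a JSON array of flat objects into CSV with a header line."""
    records = json.loads(text)
    if not records:
        return ""
    out = io.StringIO()
    # Column order follows the keys of the first object.
    writer = csv.DictWriter(out, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()
```

Note that every CSV value arrives as a string, so "30" stays a JSON string unless you add your own type handling.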

Tags: cookbook