XML Convert 2.2

Examples

This page presents a variety of examples in which XML Convert is used to convert flat files into XML, XML documents into flat files, flat files into HTML, and flat files from one format to another.

The examples are as follows:

For additional examples of XFlat schemas, please refer to the Sample XFlat Schemas section of the XFlat Language page.

NOTE: If you view a sample XML document (e.g., an XFlat schema) using Microsoft's Internet Explorer (IE), then you will not see any of the leading and trailing whitespace characters in the attribute values. For example, IE will display LeadingFillerChars=" " as LeadingFillerChars="". You can use IE's View/Source command to see these whitespace characters.

Generic XFlat Schema for CSV Files

This example presents a generic XFlat schema that can be used to convert any CSV file into XML. If you are not familiar with the CSV file format, then please see the description of the CSV format in the Overview of XML Convert and XFlat.

The csv.xfl file contains the generic XFlat schema for CSV files. In this example, we use the csv.xfl schema to convert the csv_data.txt file, which is in CSV format, into the csv_data.xml file, which contains an XML document. Note that the element names in the csv_data.xml file are generic and are not based on the column headings in the csv_data.txt file.

The csv.xfl schema can handle CSV files in which the records may contain any number of fields. A blank line in the CSV file is treated as a record that contains one field whose value is null.

The csv.xfl schema can be used to convert a CSV file into XML; however, it can't be used to convert an XML document into a CSV file.

For this example, the flat2xml application (i.e., the flat2xml.exe executable) would be invoked from the Windows/MS-DOS Command Prompt as follows:

flat2xml csv.xfl csv_data.txt csv_data.xml
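The behavior described above (generic element names, any number of fields per record, a blank line treated as a record with one null field) can be approximated in a few lines of Python. This is only an illustrative sketch, not the csv.xfl schema or XML Convert's implementation; the element names csv_file, record and field are assumptions made for the sketch and are not taken from csv_data.xml.

```python
import csv
import xml.etree.ElementTree as ET


def csv_to_generic_xml(text: str) -> str:
    """Convert CSV text to XML that uses generic element names.

    The names csv_file, record and field are placeholders chosen for
    this sketch; they are not taken from csv.xfl.
    """
    root = ET.Element("csv_file")
    for line in text.splitlines():
        record = ET.SubElement(root, "record")
        if line == "":
            # A blank line becomes a record with one empty (null) field.
            ET.SubElement(record, "field")
            continue
        # Parse one line at a time; quoted fields that contain line
        # breaks are deliberately out of scope for this sketch.
        for value in next(csv.reader([line])):
            ET.SubElement(record, "field").text = value
    return ET.tostring(root, encoding="unicode")
```

Because the element names do not depend on the column headings, the same function works for any CSV input, which is the property that makes the schema generic.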

Generic XFlat Schema for Tab-Delimited Files

This example presents a generic XFlat schema that can be used to convert a tab-delimited file into XML. The schema can handle any tab-delimited file that meets the following constraints:

Note that each record in the tab-delimited file may have any number of fields. A blank line in the tab-delimited file is treated as a record that contains one field whose value is null.

The tab_delimited.xfl file contains the generic XFlat schema for tab-delimited files. In this example, we use the tab_delimited.xfl schema to convert the tab_delimited_data.txt file, which is in tab-delimited format, into the tab_delimited_data.xml file, which contains an XML document. Note that the element names in the tab_delimited_data.xml file are generic and are not based on the column headings in the tab_delimited_data.txt file.

The tab_delimited.xfl schema can be used to convert a tab-delimited file into XML; however, it can't be used to convert an XML document into a tab-delimited file.

For this example, the flat2xml application (i.e., the flat2xml.exe executable) would be invoked from the Windows/MS-DOS Command Prompt as follows:

flat2xml tab_delimited.xfl tab_delimited_data.txt tab_delimited_data.xml

Generic XFlat Schema for Windows Configuration Settings Files

This example presents a generic XFlat schema that can be used to convert a Windows Configuration Settings file (e.g., an INI file, such as win.ini) into XML.

The ini.xfl file contains the generic XFlat schema for Windows Configuration Settings files. In this example, we use the ini.xfl schema to convert the ini_data.txt file, which is a Windows Configuration Settings file, into the ini_data.xml file, which contains an XML document. Note that the element names in the ini_data.xml file are generic and are not based on the section names or the setting names in the ini_data.txt file.

The ini.xfl schema can be used to convert the ini_data.txt file into the ini_data.xml file. The XFlat schema can also be used to convert the ini_data.xml file into a flat file that contains the same data as the ini_data.txt file, except that the comment will not contain any whitespace after the semicolon and there will be no blank lines.

For this example, the flat2xml application (i.e., the flat2xml.exe executable) would be invoked from the Windows/MS-DOS Command Prompt as follows:

flat2xml ini.xfl ini_data.txt ini_data.xml

XFlat Schema Contains a DTD that Defines Default Attribute Values

The XFlat schema in this example uses a DTD to define global defaults for the RecSep, FieldSep and QuotedValue attributes. See employees_dtd.xfl, employees_dtd.txt and employees_dtd.xml. The schema can be used to convert the employees_dtd.txt file into the employees_dtd.xml file, and vice versa.

For this example, the flat2xml application (i.e., the flat2xml.exe executable) would be invoked from the Windows/MS-DOS Command Prompt as follows:

flat2xml employees_dtd.xfl employees_dtd.txt employees_dtd.xml

The xml2flat application (i.e., the xml2flat.exe executable) would be invoked from the Windows/MS-DOS Command Prompt as follows:

xml2flat employees_dtd.xfl employees_dtd.xml employees_dtd.txt

XFlat Schema Uses the XFlat DTD Attribute

The XFlat schema in this example uses the XFlat DTD attribute to specify a document type declaration, which XML Convert will write to the XML output when converting from flat file to XML. The DTD in the document type declaration defines a fixed attribute named "Type" for the product element. See catalog_dtd_attr.xfl, catalog_dtd_attr.txt and catalog_dtd_attr.xml. The XFlat schema can be used to convert the catalog_dtd_attr.txt file into the catalog_dtd_attr.xml file, but it can't be used to convert the catalog_dtd_attr.xml file into a flat file, since the XFlat schema does not contain a FieldDef element for the fixed attribute.

For this example, the flat2xml application (i.e., the flat2xml.exe executable) would be invoked from the Windows/MS-DOS Command Prompt as follows:

flat2xml catalog_dtd_attr.xfl catalog_dtd_attr.txt catalog_dtd_attr.xml

Optional Line Separator at End of Flat File

The flat file in this example may contain an optional line separator at the end of the file. See contacts_crlf.xfl, contacts_crlf.txt and contacts_crlf.xml. The XFlat schema can be used to convert the contacts_crlf.txt file into the contacts_crlf.xml file, and to convert the contacts_crlf.xml file into a flat file that contains the same data as the contacts_crlf.txt file, except that the output will not contain an extra line separator at the end of the file.

For this example, the flat2xml application (i.e., the flat2xml.exe executable) would be invoked from the Windows/MS-DOS Command Prompt as follows:

flat2xml contacts_crlf.xfl contacts_crlf.txt contacts_crlf.xml

Optional Control-Z Character at the End of File

The flat file in this example may contain an optional control-Z character (i.e., Unicode character U+001A) at the end of the file. See contacts_control_z.xfl, contacts_control_z.txt and contacts_control_z.xml.

Note that the control-Z character is not a legal XML character; thus, the control-Z character cannot appear within an XML document, such as the contacts_control_z.xfl file. Since we want the value of the ValidValue attribute in the FieldDef element in the XFlat schema to be equal to the control-Z character, we must encode the control-Z character as "\#x1A;". When XML Convert reads the XFlat schema, it will convert the "\#x1A;" string into a control-Z character.
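The following Python sketch illustrates why the escape is needed. It shows that a standard XML parser rejects a raw control-Z character in an attribute value but accepts the escaped form, which can then be decoded. The decode_escapes function is a hypothetical stand-in written for this sketch; it is not XML Convert's code, and it handles only the hexadecimal form shown above.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical decoder for the "\#xHH;" notation described above.
_ESCAPE = re.compile(r"\\#x([0-9A-Fa-f]+);")


def decode_escapes(value: str) -> str:
    """Replace each \\#xHH; escape with the character it names."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), value)


def is_well_formed(document: str) -> bool:
    """Return True if the XML parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# A raw control-Z inside an attribute value is rejected by the parser,
# whereas the escaped form is ordinary text that can be decoded later.
raw = '<FieldDef ValidValue="\x1a"/>'
escaped = '<FieldDef ValidValue="\\#x1A;"/>'
```

Here is_well_formed(raw) is false and is_well_formed(escaped) is true; decoding the escaped attribute value yields the single control-Z character.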

The XFlat schema can be used to convert the contacts_control_z.txt file into the contacts_control_z.xml file; the XFlat schema can also be used to convert the contacts_control_z.xml file into a flat file that contains the same data as the contacts_control_z.txt file, except that the output will not contain a control-Z character at the end of the file.

For this example, the flat2xml application (i.e., the flat2xml.exe executable) would be invoked from the Windows/MS-DOS Command Prompt as follows:

flat2xml contacts_control_z.xfl contacts_control_z.txt contacts_control_z.xml

Last Record May or May Not Be Terminated With a Record Separator

The last record in the flat file in this example may or may not be terminated by the record separator. See last_recsep_is_optional.xfl, last_recsep_is_optional.txt and last_recsep_is_optional.xml.

Note that the last record in the last_recsep_is_optional.txt file is not terminated with the record separator.

The XFlat schema can be used to convert the last_recsep_is_optional.txt file into the last_recsep_is_optional.xml file; the XFlat schema can also be used to convert the last_recsep_is_optional.xml file into a flat file that contains the same data as the last_recsep_is_optional.txt file, except that the last record in the output will be terminated with the record separator.
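The asymmetry between reading and writing can be sketched in Python as follows. This is an illustration of the behavior only, not XML Convert's implementation; the function names are invented for the sketch, and the default record separator (carriage return plus line feed) is an assumption, since the separator used by last_recsep_is_optional.xfl is not stated here.

```python
def read_records(text: str, recsep: str = "\r\n") -> list[str]:
    """Split text into records; the final separator may be absent."""
    if text.endswith(recsep):
        text = text[: -len(recsep)]
    return text.split(recsep) if text else []


def write_records(records: list[str], recsep: str = "\r\n") -> str:
    """Write records, terminating every one with the separator."""
    return "".join(record + recsep for record in records)
```

Reading accepts both forms of input, while writing always produces the terminated form, so a round trip adds the missing separator to the last record.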

For this example, the flat2xml application (i.e., the flat2xml.exe executable) would be invoked from the Windows/MS-DOS Command Prompt as follows:

flat2xml last_recsep_is_optional.xfl last_recsep_is_optional.txt last_recsep_is_optional.xml

Human-Readable Report

The XFlat schema in this example describes a human-readable report that contains a checking account register. The XFlat schema uses a DTD that defines an entity that expands into a RecordDef element that appears repeatedly throughout the schema. See register.xfl, register.txt and register.xml. The register.xfl schema can be used to convert the register.txt file into the register.xml file; however, it can't be used to convert the register.xml file into a valid register file.

For this example, the flat2xml application (i.e., the flat2xml.exe executable) would be invoked from the Windows/MS-DOS Command Prompt as follows:

flat2xml register.xfl register.txt register.xml

Batch of Purchase Orders

The XFlat schema in this example describes a flat file that contains a batch of purchase orders. The flat file contains a batch header record followed by one or more purchase orders. Each purchase order consists of several header records followed by one or more item detail records. See purchase_order_batch.xfl, purchase_order_batch.txt and purchase_order_batch.xml. The purchase_order_batch.xfl schema can be used to convert the purchase_order_batch.txt file into the purchase_order_batch.xml file and vice versa.

For this example, the flat2xml application (i.e., the flat2xml.exe executable) would be invoked from the Windows/MS-DOS Command Prompt as follows:

flat2xml purchase_order_batch.xfl purchase_order_batch.txt purchase_order_batch.xml

ChoiceDef Example

This example illustrates the use of the ChoiceDef element. Suppose we have a flat file that must contain one record. This record must be a positive acknowledgment record or a negative acknowledgment record. The format of a positive acknowledgment record is different from the format of a negative acknowledgment record.

The positive acknowledgment record consists of the string success followed by a comma followed by a status code, which is a non-negative integer, followed by a line separator. The positive_ack.txt file contains a positive acknowledgment record.

The negative acknowledgment record consists of a status code, which is a negative integer, followed by a comma followed by an error message followed by a line separator. The negative_ack.txt file contains a negative acknowledgment record.

We can model the choice between the positive acknowledgment record and the negative acknowledgment record using a ChoiceDef element that contains the following subelements:

The choice_of_acks.xfl file contains the XFlat schema for the acknowledgment files. Both the positive_ack.txt file and the negative_ack.txt file conform to the choice_of_acks.xfl schema.

We can use the flat2xml application and the choice_of_acks.xfl schema to convert the positive_ack.txt file into the positive_ack.xml file. In this case, the flat2xml application (i.e., the flat2xml.exe executable) would be invoked from the Windows/MS-DOS Command Prompt as follows:

flat2xml choice_of_acks.xfl positive_ack.txt positive_ack.xml

When the flat2xml application reads the record from the positive_ack.txt file, the application will try to match the record against the first RecordDef element. The record will match the first RecordDef element, and the flat2xml application will not try to match this record against the second RecordDef element.

We can use the flat2xml application and the choice_of_acks.xfl schema to convert the negative_ack.txt file into the negative_ack.xml file. In this case, the flat2xml application (i.e., the flat2xml.exe executable) would be invoked from the Windows/MS-DOS Command Prompt as follows:

flat2xml choice_of_acks.xfl negative_ack.txt negative_ack.xml

When the flat2xml application reads the record from the negative_ack.txt file, the application will try to match the record against the first RecordDef element. The record will not match the first RecordDef element, and the flat2xml application will then try to match this record against the second RecordDef element. The record will match the second RecordDef element.

Note that we can use the xml2flat application and the choice_of_acks.xfl schema to convert the positive_ack.xml file into the positive_ack.txt file. Similarly, we can use the xml2flat application and the choice_of_acks.xfl schema to convert the negative_ack.xml file into the negative_ack.txt file.
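The first-match behavior described above can be sketched in Python. This is a simplified illustration, not the choice_of_acks.xfl schema itself: the two regular expressions stand in for the two RecordDef elements, the labels positive_ack and negative_ack are names chosen for the sketch, and each record is assumed to have already had its line separator removed.

```python
import re

# Tried in order, like the two RecordDef elements inside the ChoiceDef.
# The labels are names chosen for this sketch.
RECORD_DEFS = [
    ("positive_ack", re.compile(r"success,(\d+)")),
    ("negative_ack", re.compile(r"(-\d+),(.*)")),
]


def match_record(record: str):
    """Return (label, fields) for the first definition that matches."""
    for label, pattern in RECORD_DEFS:
        m = pattern.fullmatch(record)
        if m:
            # Later definitions are not tried once one has matched.
            return label, m.groups()
    raise ValueError(f"record matches no definition: {record!r}")
```

A record such as success,0 is accepted by the first definition, so the second is never consulted; a record that begins with a negative status code fails the first definition and is accepted by the second.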

Record May Contain Two or Three Fields

The XFlat schema in this example describes a record that may contain two or three fields (i.e., the last field and the field separator that precedes it may be omitted). See 2_or_3_fields.xfl, 2_or_3_fields.txt and 2_or_3_fields.xml. The XFlat schema can be used to convert the 2_or_3_fields.txt file into the 2_or_3_fields.xml file, and to convert the 2_or_3_fields.xml file into a flat file that contains the same data as the 2_or_3_fields.txt file, except that the second record in the output would contain a null salary field (i.e., the name field in the second record would be followed by the field separator and the record separator).

The XFlat schema in this example uses a DTD to define the following entities:

The XFlat schema also uses a ChoiceDef element that contains two RecordDef elements. The first RecordDef element defines the record that contains three fields. The second RecordDef element defines the record that contains two fields. When the flat2xml application reads a record from the 2_or_3_fields.txt file, it will try to match it against the first RecordDef element. If this record has three valid fields, then it will match the first RecordDef element, and flat2xml will not try to match the record against the second RecordDef element. If the record has only two fields, then it will not match the first RecordDef element, and flat2xml will try to match the record against the second RecordDef element.
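As a rough Python sketch of this fallback, and of why the round trip adds a null last field, consider the functions below. They are illustrative only: the comma field separator and the function names are assumptions made for the sketch, and field validation is reduced to counting fields.

```python
def parse_record(record: str, fieldsep: str = ",") -> list[str]:
    """Try the three-field form first, then the two-field form."""
    fields = record.split(fieldsep)
    if len(fields) == 3:
        # First alternative: all three fields are present.
        return fields
    if len(fields) == 2:
        # Second alternative: the last field and its separator were
        # omitted, so the last field is treated as null (empty).
        return fields + [""]
    raise ValueError(f"record matches neither definition: {record!r}")


def format_record(fields: list[str], fieldsep: str = ",") -> str:
    """Write all three fields, including an empty last field."""
    return fieldsep.join(fields)
```

Writing a record that was read in its two-field form therefore produces the two fields followed by a trailing field separator.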

For this example, the flat2xml application (i.e., the flat2xml.exe executable) would be invoked from the Windows/MS-DOS Command Prompt as follows:

flat2xml 2_or_3_fields.xfl 2_or_3_fields.txt 2_or_3_fields.xml

Comment Lines in the Flat File

The XFlat schema in this example describes a flat file in which one or more comment lines may appear before or after any record. See comments.xfl, comments.txt and comments.xml. The XFlat schema can be used to convert the comments.txt file into the comments.xml file, and to convert the comments.xml file into a flat file that contains the same data as the comments.txt file, except that the output will not contain any comments.

The XFlat schema in this example uses a DTD to define the comment entity, which expands into the RecordDef element for the comment line. This entity allows us to define the RecordDef element once, and then use it several times throughout the XFlat schema.

For this example, the flat2xml application (i.e., the flat2xml.exe executable) would be invoked from the Windows/MS-DOS Command Prompt as follows:

flat2xml comments.xfl comments.txt comments.xml

Fields are Separated By One or More Spaces and/or Tabs

The XFlat schema in this example describes a flat file in which the fields are separated by one or more spaces and/or tabs. See fieldsep_is_whitespace.xfl, fieldsep_is_whitespace.txt and fieldsep_is_whitespace.xml.

Please note that in the XFlat schema, the time, systolic_blood_pressure and dialstolic_blood_pressure fields are defined as variable-length, since the default value of the MinFieldLength attribute is zero and the default value of the MaxFieldLength attribute is 80. Also, these fields are defined within the XFlat schema as non-delimited, since the FieldDef elements for these fields and the RecordDef element for the snapshot record do not include the FieldSep attribute. The FieldDef elements for these fields include the InvalidChars=" \t" attribute, so that XML Convert can determine the end of each of these fields. For example, when XML Convert reads in the time field from the flat file, it will keep reading characters until one of the following conditions is true:

Similarly, the FieldDef elements for the field_separator fields include the ValidChars=" \t" attribute, so that XML Convert can determine the end of each of these fields. For example, when XML Convert reads in a field_separator field from the flat file, it will keep reading characters until one of the following conditions is true:

The XFlat schema can be used to convert the fieldsep_is_whitespace.txt file into the fieldsep_is_whitespace.xml file. The XFlat schema can also be used to convert the fieldsep_is_whitespace.xml file into a flat file that contains the same data as the fieldsep_is_whitespace.txt file, except that adjacent fields within each record will be separated by only one space.
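The effect of this round trip on a single record can be sketched in Python. The sketch does not reproduce how XML Convert scans fields character by character; it only shows the net result, with a regular expression standing in for the field_separator fields. The function names are invented for the sketch.

```python
import re

# One or more spaces and/or tabs, standing in for a field_separator field.
_SEPARATOR = re.compile(r"[ \t]+")


def parse_fields(record: str) -> list[str]:
    """Split a record on runs of spaces and/or tabs."""
    return _SEPARATOR.split(record)


def format_fields(fields: list[str]) -> str:
    """Write the fields back, separated by exactly one space."""
    return " ".join(fields)
```

The data fields survive unchanged, but any run of spaces and tabs between two fields collapses to a single space in the output.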

For this example, the flat2xml application (i.e., the flat2xml.exe executable) would be invoked from the Windows/MS-DOS Command Prompt as follows:

flat2xml fieldsep_is_whitespace.xfl fieldsep_is_whitespace.txt fieldsep_is_whitespace.xml

Using SAXON to Convert an XML Document into a Flat File

In this example, we use SAXON, which is Michael Kay's XSLT processor, to convert an XML document (products.xml) into a flat file (catalog.txt). The XML document does not conform to the flat file's XFlat schema (catalog.xfl). In fact, this XML document does not conform to any XFlat schema, since each of the Product elements contains an attribute and text. (For more information about the XML structures that cannot be handled by XML Convert, please see the description of the MapToXml attribute in the XFlat Language page.)

The XML document is the products.xml file, which contains a catalog of products. The XSLT stylesheet is prod2cat_saxon.xsl. The SAXON application uses the XSLT stylesheet (prod2cat_saxon.xsl) to transform the products.xml file into an in-memory representation of the target XFlat instance. SAXON then passes the XFlat instance to XML Convert, which validates the XFlat instance against the target XFlat schema (catalog.xfl) and builds the target flat file (catalog.txt). Note that the XSLT stylesheet contains an xsl:output instruction, which specifies XML Convert (i.e., the com.unidex.xflat.Sax2FlatFileBuilder class) as the output method.

For this example, SAXON is invoked from the Windows/MS-DOS Command Prompt as follows:

java -Dcom.unidex.xflat.schema-file-name-2=catalog.xfl
     -Dcom.unidex.xflat.suppress-startup-message=true
     -Dcom.unidex.xflat.flat-file-name=catalog.txt
     com.icl.saxon.StyleSheet products.xml prod2cat_saxon.xsl

Note that the name of the XFlat schema file (catalog.xfl) and the name of the output flat file (catalog.txt) are specified via system properties on the command line. Also, the command, which is quite long, is displayed as multiple lines for readability. However, when invoking this command at the Command Prompt, you must enter it as a single line.
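As a small illustration of the single-line form, the Python sketch below joins the pieces of the command shown above into one line. The arguments are copied from that command; the helper name single_line is invented for the sketch, and simple joining with spaces is adequate only because none of the arguments contains a space.

```python
# Arguments copied from the command shown above, one piece per entry.
ARGS = [
    "java",
    "-Dcom.unidex.xflat.schema-file-name-2=catalog.xfl",
    "-Dcom.unidex.xflat.suppress-startup-message=true",
    "-Dcom.unidex.xflat.flat-file-name=catalog.txt",
    "com.icl.saxon.StyleSheet",
    "products.xml",
    "prod2cat_saxon.xsl",
]


def single_line(args: list[str]) -> str:
    """Join the arguments with single spaces into one command line."""
    return " ".join(args)


if __name__ == "__main__":
    print(single_line(ARGS))
```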

The command above is equivalent to the following two commands, in which SAXON writes the XML output to a temporary file (i.e., tmp.xml), which the xml2flat application then converts into a flat file (catalog.txt).

java com.icl.saxon.StyleSheet -o tmp.xml products.xml prod2cat.xsl
java xml2flat -s catalog.xfl tmp.xml catalog.txt

The prod2cat.xsl stylesheet is equivalent to the prod2cat_saxon.xsl stylesheet, with the exception that the method attribute of the xsl:output instruction in the prod2cat.xsl stylesheet is set to "xml".

Using XT to Convert an XML Document into a Flat File

This example is similar to the previous one, except that we use XT, which is James Clark's XSLT processor, to convert an XML document (products.xml) into a flat file (catalog.txt).

The XSLT stylesheet is prod2cat_xt.xsl. The XT application uses this XSLT stylesheet to transform the products.xml file into an in-memory representation of the target XFlat instance. XT then passes the XFlat instance to XML Convert, which validates the XFlat instance against the target XFlat schema (catalog.xfl) and builds the target flat file (catalog.txt). Note that the XSLT stylesheet contains an xsl:output instruction, which specifies XML Convert (i.e., the com.unidex.xflat.XtFlatFileBuilder class) as the output method and the catalog.xfl file as the target XFlat schema.

For this example, the XT executable would be invoked from the Windows/MS-DOS Command Prompt as follows:

xt products.xml prod2cat_xt.xsl catalog.txt

Using SAXON to Convert a Flat File into XML

In this example, we use SAXON, which is Michael Kay's XSLT processor, to convert a flat file (catalog.txt) into an XML document (products.xml), the format of which does not conform to the flat file's XFlat schema (catalog.xfl). In fact, the flat2xml application is not able to create this XML document, since each of the Product elements contains an attribute and text. (For more information about the XML structures that cannot be handled by XML Convert, please see the description of the MapToXml attribute in the XFlat Language page.)

The flat file is the catalog.txt file, which contains a catalog of products. The XFlat schema is catalog.xfl. The XSLT stylesheet is catalog2products.xsl. The resulting XML document is products.xml.

SAXON uses the com.unidex.xflat.Sax2FlatFileParser class to convert the flat file into an in-memory representation of the corresponding XFlat instance. SAXON then uses the XSLT stylesheet (catalog2products.xsl) to transform the XFlat instance into an XML document named products.xml.

For this example, SAXON would be invoked from the Windows/MS-DOS Command Prompt as follows:

java -Dcom.unidex.xflat.schema-file-name=catalog.xfl
     -Dcom.unidex.xflat.suppress-startup-message=true
     com.icl.saxon.StyleSheet -o products.xml
          -x com.unidex.xflat.Sax2FlatFileParser 
          catalog.txt catalog2products.xsl

Note that the name of the XFlat schema file (catalog.xfl) and the value for the suppress-startup-message flag (true) are specified via system properties on the command line. Also, the command, which is quite long, is displayed as multiple lines for readability. However, when invoking this command at the Command Prompt, you must enter it as a single line.

The command above is equivalent to the following two commands, in which the flat2xml application writes the XML output to a temporary file (i.e., tmp.xml), which SAXON then transforms into the products.xml file.

java flat2xml -s catalog.xfl catalog.txt tmp.xml
java com.icl.saxon.StyleSheet -o products.xml tmp.xml catalog2products.xsl

Using flat2xt to Convert a Flat File into XML

This example is similar to the previous one, except that we use the flat2xt application to convert a flat file (catalog.txt) into an XML document (products.xml), the format of which does not conform to the flat file's XFlat schema.

The flat file is the catalog.txt file, which contains a catalog of products. The XFlat schema is catalog.xfl. The XSLT stylesheet is catalog2products.xsl. The flat2xt application converts the flat file into an in-memory representation of the corresponding XFlat instance. The flat2xt application then passes the XFlat instance to XT, which uses the XSLT stylesheet (catalog2products.xsl) to transform the XFlat instance into an XML document named products.xml.

For this example, the flat2xt application (i.e., the flat2xt.exe executable) would be invoked from the Windows/MS-DOS Command Prompt as follows:

flat2xt catalog.xfl catalog.txt catalog2products.xsl products.xml

Using SAXON to Convert a Flat File into HTML

In this example, we use SAXON, which is Michael Kay's XSLT processor, to convert a flat file (catalog.txt) into an HTML document (catalog.html). The XFlat schema for the flat file is catalog.xfl. The XSLT stylesheet (catalog.xsl) transforms the XFlat instance into an HTML document (catalog.html) that contains a list of products that are sorted by description.

SAXON uses the com.unidex.xflat.Sax2FlatFileParser class to convert the flat file (catalog.txt) into an in-memory representation of the corresponding XFlat instance. SAXON then uses the XSLT stylesheet (catalog.xsl) to transform the XFlat instance into an HTML document named catalog.html.

For this example, SAXON would be invoked from the Windows/MS-DOS Command Prompt as follows:

java -Dcom.unidex.xflat.schema-file-name=catalog.xfl
     -Dcom.unidex.xflat.suppress-startup-message=true
     com.icl.saxon.StyleSheet -o catalog.html
          -x com.unidex.xflat.Sax2FlatFileParser 
          catalog.txt catalog.xsl

Note that the name of the XFlat schema file (catalog.xfl) and the value for the suppress-startup-message flag (true) are specified via system properties on the command line. Also, the command, which is quite long, is displayed as multiple lines for readability. However, when invoking this command at the Command Prompt, you must enter it as a single line.

The command above is equivalent to the following two commands, in which the flat2xml application writes the XML output to a temporary file (i.e., tmp.xml), which SAXON then transforms into the catalog.html file.

java flat2xml -s catalog.xfl catalog.txt tmp.xml
java com.icl.saxon.StyleSheet -o catalog.html tmp.xml catalog.xsl

As an alternative to SAXON, we can use the following flat2xt command to convert the flat file (catalog.txt) into the HTML document (catalog.html):

flat2xt catalog.xfl catalog.txt catalog.xsl catalog.html

Using SAXON to Convert a Flat File from one Format to Another

In this example, we use SAXON, which is Michael Kay's XSLT processor, to convert a flat file from one format to another. The source flat file is the contacts.txt file, which contains information about contacts in the Windows INI format. The target flat file is the addr-book.txt file, which contains an address book in CSV format. The XFlat schema for the source flat file is contacts.xfl. The XFlat schema for the target flat file is addr-book.xfl.

Note that an XFlat instance that conforms to the source XFlat schema would not conform to the target XFlat schema; similarly, an XFlat instance that conforms to the target XFlat schema would not conform to the source XFlat schema. So, we use SAXON and the cont2addr_saxon.xsl stylesheet to transform an XFlat instance that conforms to the source XFlat schema into a result tree that conforms to the target XFlat schema. The XSLT stylesheet specifies the com.unidex.xflat.Sax2FlatFileBuilder class as the output method. SAXON invokes the Sax2FlatFileParser class to convert the source flat file into an in-memory representation of the corresponding XFlat instance. SAXON then uses the XSLT stylesheet (cont2addr_saxon.xsl) to transform the XFlat instance into a result tree that conforms to the target XFlat schema. The result tree is passed to com.unidex.xflat.Sax2FlatFileBuilder, which converts the result tree into the target flat file (addr-book.txt).

For this example, SAXON would be invoked from the Windows/MS-DOS Command Prompt as follows:

java -Dcom.unidex.xflat.schema-file-name=contacts.xfl
     -Dcom.unidex.xflat.suppress-startup-message=true
     -Dcom.unidex.xflat.schema-file-name-2=addr-book.xfl
     -Dcom.unidex.xflat.flat-file-name=addr-book.txt
     com.icl.saxon.StyleSheet
          -x com.unidex.xflat.Sax2FlatFileParser 
          contacts.txt cont2addr_saxon.xsl

Note that the names of the XFlat schema files (contacts.xfl and addr-book.xfl) and the name of the output flat file (addr-book.txt) are specified via system properties on the command line. Also, the command, which is quite long, is displayed as multiple lines for readability. However, when invoking this command at the Command Prompt, you must enter it as a single line.

The command above is equivalent to the following three commands:

java flat2xml -s contacts.xfl contacts.txt tmp.xml
java com.icl.saxon.StyleSheet -o tmp2.xml tmp.xml cont2addr.xsl
java xml2flat -s addr-book.xfl tmp2.xml addr-book.txt

The cont2addr.xsl stylesheet is equivalent to the cont2addr_saxon.xsl stylesheet, with the exception that the method attribute of the xsl:output instruction in the cont2addr.xsl stylesheet is set to "xml".

Using flat2xt to Convert a Flat File from one Format to Another

This example is similar to the previous one, except that we use the flat2xt application.

We can use the following flat2xt command to convert the contacts.txt file into the addr-book.txt file using the cont2addr_xt.xsl stylesheet:

flat2xt contacts.xfl contacts.txt cont2addr_xt.xsl addr-book.txt

The cont2addr_xt.xsl stylesheet is equivalent to the cont2addr.xsl stylesheet, with the exception that the xsl:output instruction in the cont2addr_xt.xsl stylesheet specifies the com.unidex.xflat.XtFlatFileBuilder class as the output method and the addr-book.xfl file as the target XFlat schema.

Using flat2xt to Convert Good Records into XML and to Create a List of Bad Records

In this example, the flat file (employees_goodbad.txt) contains good employee records and bad employee records. The good records are syntactically valid and the bad ones are syntactically invalid. We use the flat2xt application to convert the flat file into the following two files:

If there are no bad employee records in the flat file, then flat2xt will not create the employees_bad.html file.

The XFlat schema is employees_goodbad.xfl. This schema contains a ChoiceDef element that contains two RecordDef elements. The first RecordDef element matches only valid employee records. The second RecordDef element matches any employee record (i.e., both good and bad employee records). For each employee record in the flat file, the flat2xt application will try to match the record against the first RecordDef element in the ChoiceDef element. If the record is valid, then flat2xt will not try to match the record against the second RecordDef element. If the record is invalid (i.e., it does not comply with the first RecordDef element), then flat2xt will try to match the record against the second RecordDef element and it will succeed.

The XSLT stylesheet is employees_goodbad.xsl. The flat2xt application converts the flat file into an in-memory representation of the corresponding XFlat instance, which contains an employee element for each good record in the flat file and an invalid_employee_record element for each bad record in the flat file. The flat2xt application then passes the XFlat instance to XT, which uses the XSLT stylesheet to transform the XFlat instance into an XML document (employees_good.xml) and an HTML document (employees_bad.html). The stylesheet uses an XT extension instruction named xt:document to create a second output file (employees_bad.html) when there is at least one invalid_employee_record element in the source document.
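The overall flow (strict definition first, catch-all second, and a second output file only when bad records exist) can be sketched in Python. This is an illustration, not the employees_goodbad.xfl schema or the stylesheet: the validity rule is hypothetical and invented for the sketch, and the outputs are kept in memory as lists of records rather than written as XML and HTML.

```python
import re

# Hypothetical validity rule, invented for this sketch: a numeric id,
# a non-empty name and a numeric salary, separated by commas.
VALID_EMPLOYEE = re.compile(r"\d+,[^,]+,\d+(\.\d+)?")


def classify(records: list[str]) -> list[tuple[str, str]]:
    """Label each record with the first alternative that accepts it."""
    labelled = []
    for record in records:
        if VALID_EMPLOYEE.fullmatch(record):
            labelled.append(("employee", record))
        else:
            # The second alternative accepts any record at all.
            labelled.append(("invalid_employee_record", record))
    return labelled


def outputs(labelled: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group records by output file; the bad list only if needed."""
    good = [r for label, r in labelled if label == "employee"]
    bad = [r for label, r in labelled if label == "invalid_employee_record"]
    files = {"employees_good.xml": good}
    if bad:
        files["employees_bad.html"] = bad
    return files
```

The order of the two alternatives matters: if the catch-all were tried first, every record would be labelled invalid.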

For this example, the flat2xt application (i.e., the flat2xt.exe executable) would be invoked from the Windows/MS-DOS Command Prompt as follows:

flat2xt employees_goodbad.xfl employees_goodbad.txt employees_goodbad.xsl employees_good.xml

By invoking the flat2xml application as follows, you can create the intermediary XML document (i.e., the XFlat instance), which XT transforms into the employees_good.xml and employees_bad.html files:

flat2xml employees_goodbad.xfl employees_goodbad.txt employees_goodbad.xml

Note that the employees_goodbad.xml file contains an employee element for each good record in the flat file and an invalid_employee_record element for each bad record in the flat file.

Using flat2flat to Convert a Flat File from one Format to Another

In this example, we use the flat2flat application to convert a flat file from one format to another. The source flat file is the enrollment_csv.txt file, which contains information about employees who are enrolled in a benefits program. The format of the source flat file is Comma Separated Value (CSV). The target flat file is the enrollment_fixed.txt file, which also contains information about employees who are enrolled in a benefits program; however, the enrollment_fixed.txt file contains fixed length records and fields. The XFlat schema for the source flat file is enrollment_csv.xfl. The XFlat schema for the target flat file is enrollment_fixed.xfl.

Note that the structure of the source flat file is basically the same as the structure of the target flat file. Both files contain one record per employee, and the records in the source flat file contain the same three fields as the records in the target flat file (even though the fields in the source flat file are not in the same order as the fields in the target flat file). In other words, the source XFlat schema is compatible with the target XFlat schema, in that an XFlat instance that conforms to the source XFlat schema will also conform to the target XFlat schema. Thus, we do not need an XSLT processor, such as SAXON, to transform an XFlat instance that conforms to the source XFlat schema into an XFlat instance that conforms to the target XFlat schema.

In order to ensure that the source XFlat schema is compatible with the target XFlat schema, we designed the two XFlat schemas as follows:

To convert the enrollment_csv.txt file into the enrollment_fixed.txt file, you would invoke the flat2flat application (i.e., the flat2flat.exe executable) from the Windows/MS-DOS Command Prompt as follows:

flat2flat enrollment_csv.xfl enrollment_csv.txt enrollment_fixed.xfl enrollment_fixed.txt

We can also convert the enrollment_fixed.txt file back into the enrollment_csv.txt file as follows:

flat2flat enrollment_fixed.xfl enrollment_fixed.txt enrollment_csv.xfl enrollment_csv.txt
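
To show what a conversion of this kind involves, here is a minimal Java sketch that turns one CSV record into one fixed-length record with the fields reordered. It is not what flat2flat does internally. The field names (ssn, name, plan), their order, and the widths (20, 9, and 4 characters) are hypothetical, since the actual layouts are defined in the enrollment_csv.xfl and enrollment_fixed.xfl schemas.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; field names, order, and widths are assumptions.
class CsvToFixedSketch {

    // Splits one CSV record. Commas inside double quotes are kept as data,
    // and a doubled quote ("") inside a quoted value is an escaped quote.
    static List<String> parseCsv(String record) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (inQuotes) {
                if (c == '"' && i + 1 < record.length() && record.charAt(i + 1) == '"') {
                    current.append('"');
                    i++; // skip the second quote of the escaped pair
                } else if (c == '"') {
                    inQuotes = false;
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    // Pads a value with trailing spaces; rejects values that do not fit.
    static String pad(String value, int width) {
        if (value.length() > width) {
            throw new IllegalArgumentException("value longer than " + width + " characters");
        }
        return String.format("%-" + width + "s", value);
    }

    // Hypothetical layouts: the CSV order is ssn,name,plan and the
    // fixed-length order is name(20) ssn(9) plan(4).
    static String toFixed(String csvRecord) {
        List<String> f = parseCsv(csvRecord);
        if (f.size() != 3) {
            throw new IllegalArgumentException("expected 3 fields, found " + f.size());
        }
        return pad(f.get(1), 20) + pad(f.get(0), 9) + pad(f.get(2), 4);
    }

    public static void main(String[] args) {
        System.out.println("[" + toFixed("111223333,\"Smith, Ann\",HMO1") + "]");
    }
}
```

With flat2flat, none of this logic has to be written by hand: the two XFlat schemas describe the layouts, and the field names shared by both schemas determine where each value goes.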

An XSLT Stylesheet that Transforms an XFlat Schema into a DTD

In this example, we use XT, which is James Clark's XSLT processor, and an XSLT stylesheet to convert an XFlat schema into a DTD. The XSLT stylesheet (xflat2dtd.xsl) converts an XFlat schema into the equivalent DTD. Please see the comments in the xflat2dtd.xsl file for more information about this stylesheet.

The contacts_dtd.xfl file contains the XFlat schema. Using XT and the xflat2dtd.xsl stylesheet, we convert this XFlat schema into a DTD that is written to the contacts_dtd.dtd file as follows:

xt contacts_dtd.xfl xflat2dtd.xsl contacts_dtd.dtd

Note that the XFlat schema includes the XFlat DTD attribute, the value of which is a document type declaration that points to the contacts_dtd.dtd file. When XML Convert transforms the flat file (contacts_dtd.txt) into an XML file (contacts_dtd.xml), it will place this document type declaration in the XML output. In other words, the XML document contains a document type declaration that points to the DTD (i.e., the contacts_dtd.dtd file).

An XSLT Stylesheet that Generates XFlat Schemas for Delimited Flat Files

In this example, we use Instant SAXON, which is Michael Kay's XSLT processor, and an XSLT stylesheet to generate XFlat schemas for delimited flat files. We'll generate XFlat schemas for a CSV flat file and a tab-delimited flat file.

First, we'll create an XFlat schema for the csv_data.txt flat file, which contains data in CSV format. The steps to do this are as follows:

  1. Use the flat2xml application (i.e., the flat2xml.exe executable) and the csv.xfl schema to convert the csv_data.txt file into a generic XML document (csv_data.xml) that has generic element names:

     
    flat2xml -s csv.xfl csv_data.txt csv_data.xml
    
  2. Use Instant SAXON and the flat2xflat.xsl stylesheet to transform the csv_data.xml document into an XFlat schema for the csv_data.txt flat file:

    saxon csv_data.xml flat2xflat.xsl > csv_data.xfl
    

    The field names in the resulting XFlat schema (csv_data.xfl) are based on the column headings in the csv_data.txt flat file.

Now we can use the flat2xml application (i.e., the flat2xml.exe executable) and the csv_data.xfl schema to convert the csv_data.txt file into an XML document (csv_data2.xml) that contains meaningful element names for the fields:

flat2xml -s csv_data.xfl csv_data.txt csv_data2.xml

Please note that the column headings in the csv_data.txt file do not appear in the XML output (csv_data2.xml).

Next, we'll create an XFlat schema for the tab_delimited_data.txt flat file, whose records contain tab-delimited fields. The steps to do this are as follows:

  1. Use the flat2xml application (i.e., the flat2xml.exe executable) and the tab_delimited.xfl schema to convert the tab_delimited_data.txt file into a generic XML document (tab_delimited_data.xml) that has generic element names:

     
    flat2xml -s tab_delimited.xfl tab_delimited_data.txt tab_delimited_data.xml
    
  2. Use Instant SAXON and the flat2xflat.xsl stylesheet to transform the tab_delimited_data.xml document into an XFlat schema for the tab_delimited_data.txt flat file:

    saxon tab_delimited_data.xml flat2xflat.xsl QuotedValue=No FieldSep=\t > tab_delimited_data.xfl
    

    The flat2xflat.xsl stylesheet uses global parameters, two of which are named QuotedValue and FieldSep. For more information about these global parameters, please see the comments in the flat2xflat.xsl stylesheet.

    The field names in the resulting XFlat schema (tab_delimited_data.xfl) are based on the column headings in the tab_delimited_data.txt flat file.

Now we can use the flat2xml application (i.e., the flat2xml.exe executable) and the tab_delimited_data.xfl schema to convert the tab_delimited_data.txt file into an XML document (tab_delimited_data2.xml) that contains meaningful element names for the fields:

flat2xml -s tab_delimited_data.xfl tab_delimited_data.txt tab_delimited_data2.xml

The column headings in the tab_delimited_data.txt file do not appear in the XML output (tab_delimited_data2.xml).

Java Application that uses the XmlConvert Class to Convert a Flat File into XML

The convert_file.java file contains the source code of a Java application that converts flat files into XML using the com.unidex.xflat.XmlConvert and com.unidex.xflat.XflatException classes. The convert_file program performs the same function as the flat2xml application; however, the flat2xml application includes more error handling than the simple convert_file application.

You can use the following MS-DOS command to compile the convert_file.java program using Sun's Java compiler:

javac convert_file.java

Once you've compiled the convert_file.java program, you can use it to convert a flat file into XML. In the following example, the convert_file application is run from the xmlconvert\samples folder at the MS-DOS command line, in order to convert the catalog.txt file into the catalog.xml file using the catalog.xfl schema:

java -cp .;..\jars\xflat.jar convert_file catalog.xfl catalog.txt catalog.xml

Java Application that uses the XmlConvert Class to Convert a String into XML

The convert_string.java file contains the source code of a Java application that converts flat file data into XML using the com.unidex.xflat.XmlConvert and com.unidex.xflat.XflatException classes. The XFlat schema and the flat file data are passed to the XmlConvert object as strings (i.e., via java.io.StringReader objects). The XmlConvert object returns the XML data as a string (i.e., via a java.io.StringWriter object); the XML document is displayed on standard output.

You can use the following MS-DOS command to compile the convert_string.java program using Sun's Java compiler:

javac convert_string.java

Once you've compiled the convert_string.java program, you can use it to convert the flat file data into XML. In the following example, the convert_string application is run from the xmlconvert\samples folder at the MS-DOS command line:

java -cp .;..\jars\xflat.jar convert_string

Error in Flat File: Field Value Too Long

The XFlat schema in this example contains a common error. The FieldDef element for the name field has a maximum length of 80 characters (the default value). However, one of the name values in the flat file is longer than 80 characters. See field_too_long.xfl, field_too_long.txt and field_too_long.xml. When XML Convert tries to convert the field_too_long.txt file into XML using the field_too_long.xfl schema, it aborts the conversion and displays the following error message:

Error parsing the flat file.
Description: Field value too long. The value of the name field in the employee record is
longer than the maximum length for this field (80 characters).
The error occurred in file:/C:/xmlconvert/samples/field_too_long.txt at record number 2.
The character offset of this record (from the beginning of the file) is 35.
The error occurred at column 11 of the record.
The bad record begins with:
444556666,"Barr, Clark [Extra text that
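
The kind of check behind this error can be sketched in Java as follows. This is an illustration under stated assumptions, not XML Convert's code: the class and method names are hypothetical, and the message is only modeled loosely on the one shown above. The 80-character default is the one stated in this example.

```java
// Illustrative sketch only; names and message wording are assumptions.
class FieldLengthCheckSketch {

    // Default maximum field length, as stated in this example.
    static final int DEFAULT_MAX_LENGTH = 80;

    // Returns null when the value fits, otherwise a description of the problem.
    static String check(String fieldName, String value, int maxLength, int recordNumber) {
        if (value.length() <= maxLength) {
            return null;
        }
        return "Field value too long. The value of the " + fieldName
                + " field is longer than the maximum length for this field ("
                + maxLength + " characters). The error occurred at record number "
                + recordNumber + ".";
    }

    public static void main(String[] args) {
        // Build an 81-character value to trigger the check.
        String longName = new String(new char[81]).replace('\0', 'x');
        System.out.println(check("name", "Barr, Clark", DEFAULT_MAX_LENGTH, 1));
        System.out.println(check("name", longName, DEFAULT_MAX_LENGTH, 2));
    }
}
```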