Types of transformations
XML pipelining
TODO: Tools
XML Transformations
An XML transformation language is a programming language designed specifically to transform an input XML document into an output document that meets a specific goal.
There are two special cases of transformation:
- XML to XML
the output document is an XML document.
- XML to Data
the output document is a byte stream.
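As a rough illustration of the two cases, the sketch below uses Python's standard `xml.etree.ElementTree` module. The sample document and the function names `to_xml` and `to_data` are assumptions made for this example only; they are not part of any transformation language.

```python
import xml.etree.ElementTree as ET

# Hypothetical input document used only for illustration.
SOURCE = "<people><person name='Ada'/><person name='Alan'/></people>"

def to_xml(source: str) -> str:
    """XML to XML: the output is itself an XML document."""
    root = ET.fromstring(source)
    out = ET.Element("names")
    for person in root.iter("person"):
        ET.SubElement(out, "name").text = person.get("name")
    return ET.tostring(out, encoding="unicode")

def to_data(source: str) -> bytes:
    """XML to Data: the output is a byte stream (here, one name per line)."""
    root = ET.fromstring(source)
    names = [person.get("name") for person in root.iter("person")]
    return "\n".join(names).encode("utf-8")

print(to_xml(SOURCE))
print(to_data(SOURCE))
```

Both functions read the same input; only the form of the output differs.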
XML Pipeline
In software, an XML pipeline is formed when XML (Extensible Markup Language) processes, especially XML transformations and XML validations, are connected so that the output of one becomes the input of the next. For instance, given two transformations T1 and T2, an input XML document is transformed by T1, and the output of T1 is then fed as the input document to T2. Simple pipelines like this one are called linear: a single input document always goes through the same sequence of transformations to produce a single output document.
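A linear pipeline of this kind can be sketched as plain function composition. The code below is a minimal sketch, assuming each transformation is a Python function from one `ElementTree` element to another; `t1`, `t2`, and `pipeline` are hypothetical names chosen for the example.

```python
import xml.etree.ElementTree as ET
from functools import reduce

def t1(doc):
    """Hypothetical T1: rename every <item> element to <entry>."""
    for el in list(doc.iter("item")):
        el.tag = "entry"
    return doc

def t2(doc):
    """Hypothetical T2: wrap the whole document in an <envelope> element."""
    envelope = ET.Element("envelope")
    envelope.append(doc)
    return envelope

def pipeline(*steps):
    """Connect steps so that the output of each one feeds the next."""
    return lambda doc: reduce(lambda current, step: step(current), steps, doc)

run = pipeline(t1, t2)
result = run(ET.fromstring("<list><item>a</item><item>b</item></list>"))
print(ET.tostring(result, encoding="unicode"))
# prints <envelope><list><entry>a</entry><entry>b</entry></list></envelope>
```

Because the steps are fixed when the pipeline is built, every input document passes through the same sequence, which is what makes the pipeline linear.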
XML Pipeline operations
Linear operations
Non-linear operations
Linear operations
Micro operations
Document operations
Sequence operations
Linear: Micro-operations
These operate at the inner document level, on individual elements and attributes:
- Rename
renames elements or attributes without modifying the content
- Replace
replaces elements or attributes
- Insert
adds a new data element to the output stream at a specified point
- Delete
removes an element or attribute (also known as pruning the input tree)
- Wrap
wraps elements with additional elements
- Reorder
changes the order of elements
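The six micro-operations above can each be sketched in a few lines with Python's standard `xml.etree.ElementTree`. This is an illustrative sketch only: the helper names, the sample `<book>` document, and the choice to operate on the direct children of one parent element are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

def rename(parent, old, new):
    """Rename: change the tag, leave the content untouched."""
    for el in parent.findall(old):
        el.tag = new

def replace(parent, tag, new_el):
    """Replace: put new_el where the first <tag> child was."""
    index = list(parent).index(parent.find(tag))
    parent[index] = new_el

def insert(parent, index, new_el):
    """Insert: add a new element at a specified position."""
    parent.insert(index, new_el)

def delete(parent, tag):
    """Delete: prune every <tag> child from the tree."""
    for el in parent.findall(tag):
        parent.remove(el)

def wrap(parent, tag, wrapper_tag):
    """Wrap: enclose every <tag> child in a new wrapper element."""
    for index, el in enumerate(list(parent)):
        if el.tag == tag:
            wrapper = ET.Element(wrapper_tag)
            parent[index] = wrapper
            wrapper.append(el)

def reorder(parent, key):
    """Reorder: sort the children by the given key function."""
    parent[:] = sorted(parent, key=key)

# Hypothetical sample document.
doc = ET.fromstring(
    "<book><title>XML</title><draft/><author>B</author><author>A</author></book>"
)
year = ET.Element("year")
year.text = "2024"

rename(doc, "title", "name")
replace(doc, "draft", ET.Element("final"))
insert(doc, 1, year)
delete(doc, "final")
reorder(doc, key=lambda el: (el.tag, el.text or ""))
wrap(doc, "author", "contributor")
print(ET.tostring(doc, encoding="unicode"))
```

Each helper touches only part of the tree, which is what distinguishes these micro-operations from operations that take the document as a whole.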
Linear: Document operations
They take the input document as a whole:
- Identity transform
makes a verbatim copy of its input to the output
- Compare
takes two documents and compares them
- Transform
executes a transformation on the input document using a specified XSLT stylesheet
- Split
takes a single XML document and splits it into distinct documents
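Three of the document operations above (identity transform, compare, and split) can be sketched with the Python standard library. The XSLT-based transform is left out because the standard library has no XSLT processor. The function names and the choice to compare canonical (C14N) forms are assumptions made for this sketch; other definitions of document equality are possible.

```python
import copy
import xml.etree.ElementTree as ET

def identity(doc):
    """Identity transform: return a verbatim copy of the input."""
    return copy.deepcopy(doc)

def compare(a, b):
    """Compare: True when both documents share the same canonical form."""
    canonical_a = ET.canonicalize(ET.tostring(a, encoding="unicode"))
    canonical_b = ET.canonicalize(ET.tostring(b, encoding="unicode"))
    return canonical_a == canonical_b

def split(doc, tag):
    """Split: turn each <tag> child into a document of its own."""
    return [copy.deepcopy(el) for el in doc.findall(tag)]

# Hypothetical sample document.
orders = ET.fromstring("<orders><order id='1'/><order id='2'/></orders>")

clone = identity(orders)
print(compare(orders, clone))                   # the copy matches the original
parts = split(orders, "order")
print([part.get("id") for part in parts])       # one document per <order>
```

`ET.canonicalize` requires Python 3.8 or later.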
Linear: Sequence operations
These operations were introduced mainly by XProc and handle a sequence of documents as a whole:
- Count
takes a sequence of documents and counts them
- Identity transform
makes a verbatim copy of its input sequence of documents to the output
- Split-sequence
takes a sequence of documents as input and routes them to different outputs depending on matching rules
- Wrap-sequence
takes a sequence of documents as input and wraps them into one or more documents
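The sequence operations above can be sketched by modelling a sequence of documents as a Python list of elements. This is a simplified illustration and not XProc itself: the function names mirror the list above, the matching rule is an ordinary Python predicate, and `wrap_sequence` here always produces a single wrapper document, which is only one of the cases described.

```python
import xml.etree.ElementTree as ET

def count(docs):
    """Count: return the number of documents in the sequence."""
    return len(docs)

def identity(docs):
    """Identity transform: pass the sequence through unchanged."""
    return list(docs)

def split_sequence(docs, test):
    """Split-sequence: route documents to two outputs according to a rule."""
    matched = [doc for doc in docs if test(doc)]
    not_matched = [doc for doc in docs if not test(doc)]
    return matched, not_matched

def wrap_sequence(docs, wrapper_tag):
    """Wrap-sequence: place all documents inside one wrapper document."""
    wrapper = ET.Element(wrapper_tag)
    wrapper.extend(docs)
    return wrapper

# Hypothetical sample sequence.
docs = [ET.fromstring(text) for text in ("<a/>", "<b/>", "<a/>")]

matched, not_matched = split_sequence(docs, lambda doc: doc.tag == "a")
group = wrap_sequence(matched, "group")
print(count(docs), count(matched), count(not_matched))
print([child.tag for child in group])
```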
Non-linear operations
- Conditionals
where a given transformation is executed if a condition is met while another transformation is executed otherwise
- Loops
where a transformation is executed on each node of a node set selected from a document, or is repeated until a condition evaluates to false
- Tees
where a document is fed to multiple transformations potentially happening in parallel
- Aggregations
where multiple documents are aggregated into a single document
- Exception Handling
where a failure in processing can result in an alternate pipeline being processed
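The non-linear operations above can be sketched as higher-order functions that combine ordinary transformation steps. This is a minimal sketch under stated assumptions: every step is a Python function over `ElementTree` elements, the combinator names (`choose`, `for_each`, `tee`, `aggregate`, `try_catch`) are hypothetical, the loop covers only the "each selected node" case, and the tee runs its branches one after another instead of in parallel.

```python
import copy
import xml.etree.ElementTree as ET

def choose(test, when_true, otherwise):
    """Conditional: pick one of two transformations per document."""
    return lambda doc: when_true(doc) if test(doc) else otherwise(doc)

def for_each(tag, step):
    """Loop: run step on every <tag> child of the document."""
    def run(doc):
        doc[:] = [step(child) if child.tag == tag else child for child in doc]
        return doc
    return run

def tee(*branches):
    """Tee: feed an independent copy of the document to every branch."""
    return lambda doc: [branch(copy.deepcopy(doc)) for branch in branches]

def aggregate(docs, wrapper_tag):
    """Aggregation: combine several documents into a single one."""
    wrapper = ET.Element(wrapper_tag)
    wrapper.extend(docs)
    return wrapper

def try_catch(step, recovery):
    """Exception handling: run an alternate step when the main one fails."""
    def run(doc):
        try:
            return step(copy.deepcopy(doc))
        except Exception:
            return recovery(doc)
    return run

def upper(el):
    """Hypothetical step: upper-case the element text."""
    el.text = (el.text or "").upper()
    return el

def failing(el):
    """Hypothetical step that always fails."""
    raise ValueError("processing failed")

doc = ET.fromstring("<list kind='names'><item>a</item><item>b</item></list>")

conditional = choose(lambda d: d.get("kind") == "names",
                     for_each("item", upper),
                     lambda d: d)
out = conditional(doc)
merged = aggregate(tee(lambda d: d, lambda d: d.find("item"))(out), "results")
recovered = try_catch(failing, lambda d: ET.Element("error"))(out)
print(ET.tostring(out, encoding="unicode"))
print(len(merged), recovered.tag)
```

Copying the document inside `tee` and `try_catch` keeps one branch, or a failed step, from modifying the input seen by the others.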