XSL Transformations
T. Pitner, L. Bártek, A. Rambousek, L. Grolig. FI MU Brno, 2020
XSLT - XSL Transformations
● Motivation: we need a tool to define and run transformations of XML into XML, or possibly into plain text, LaTeX source, or any other markup (YAML, JSON).
● We could do it programmatically by manipulating the DOM in any general-purpose language, but that leads to unreadable and cumbersome code.
● There is a solution: eXtensible Stylesheet Language Transformations (XSLT).
● https://www.w3.org/TR/xslt/all/ (again a W3C Recommendation)
XSLT - XSL Transformations
● XSLT is a language for specifying transformations of XML documents into (usually) XML output, or into text, HTML or other output formats.
● The original application area was the transformation of XML data to XSL-FO (XSL Formatting Objects), i.e. rendering XML.
● The XSLT specification was therefore originally part of XSL (eXtensible Stylesheet Language).
● Later, XSL was set aside and XSLT began to be seen as a general-purpose language for describing XML → XML (text, HTML) transformations.
● XSLT is a Turing-complete language, i.e. one can do “anything” in XSLT.
XSLT - XSL Transformations
● Transformation process of XML using XSLT
XSLT - Language principles
● XSLT is a functional language in which the reduction rules take the form of templates that specify how nodes of the source document are converted into the output document.
● The transformation is specified in a style file (stylesheet), which is an XML document written in XSLT syntax. The root element is either xsl:stylesheet or xsl:transform (the two are synonyms), where xsl: is the prefix bound to the XSLT namespace.
● The stylesheet is processed by an XSLT processor, and XML file(s) can then be transformed using that stylesheet.
XSLT - Workflow
XSLT Example: XML source file
JohnSmithMorkaIsmincius
XSLT Example: XSLT style
XSLT Example: Output
JohnMorka
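The markup of this example did not survive; only the text content (John Smith and Morka Ismincius in, John and Morka out) remains. The following sketch is one stylesheet consistent with that text. The element names persons, person, name, family-name and root, and the username attribute, are assumptions made for illustration, not necessarily the names used on the original slide.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumed input (element and attribute names are hypothetical):
     <persons>
       <person username="JS1"><name>John</name><family-name>Smith</family-name></person>
       <person username="MI1"><name>Morka</name><family-name>Ismincius</family-name></person>
     </persons>
-->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Wrap the whole result in a single (hypothetical) root element. -->
  <xsl:template match="/persons">
    <root>
      <xsl:apply-templates select="person"/>
    </root>
  </xsl:template>

  <!-- For each person keep only the first name and carry the username over. -->
  <xsl:template match="person">
    <name username="{@username}">
      <xsl:value-of select="name"/>
    </name>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the assumed input, this produces a root element with two name elements whose text is John and Morka; the family names are dropped because no template copies them.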
XSLT - Versions
● The current versions are defined by the XSLT 1.0, 2.0 and 3.0 specifications:
  ○ XSL Transformations (XSLT) Version 1.0: W3C Recommendation, 16 November 1999
  ○ XSL Transformations (XSLT) Version 2.0: W3C Recommendation, 23 January 2007
  ○ XSL Transformations (XSLT) Version 3.0: W3C Recommendation, 8 June 2017
● Version 1.0 is still widely used, mainly because (free) implementations of the newer specifications are scarce.
● Also, for many purposes the features of 1.0 are sufficient.
XSLT - Processing
1) The XSLT processor (interpreter) takes the stylesheet (XSLT code).
2) It usually compiles it into an internal form.
3) It then takes a node from the input document, looks for an appropriate template and selects it.
4) It produces a result fragment corresponding to the construction part of the selected template.
5) It recursively takes the next nodes from the input document and applies the same procedure to them.
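Steps 3 to 5 can be illustrated with two templates. This is only a sketch: the source element names doc and title are hypothetical.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Step 3: this template is selected for the (hypothetical) doc element.
       Step 4: the literal html and body elements are its construction part.
       Step 5: xsl:apply-templates hands the child nodes back to the processor. -->
  <xsl:template match="doc">
    <html>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <!-- Selected when the recursion reaches a (hypothetical) title element. -->
  <xsl:template match="title">
    <h1><xsl:value-of select="."/></h1>
  </xsl:template>
</xsl:stylesheet>
```

The stylesheet never says in which order the nodes are visited; the processor drives the traversal and only asks the templates what to emit for each node.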
Open-source XSLT Processors
● Xalan − an open-source XSLT 1.0 processor from the Apache Software Foundation, available stand-alone and for Java and C++. Integrated into Java SE.
● Web browsers − Safari, Chrome, Firefox, Opera and Internet Explorer all support XSLT 1.0. None supports XSLT 2.0 natively, although third-party products such as Saxon-CE (Saxon Client Edition) and Frameless can provide this functionality.
  − Browsers can perform on-the-fly transformations of XML files and display the transformation output in the browser window. This is done either by embedding the XSL in the XML document or by referencing a file containing XSL instructions from the XML document.
  − The latter may not work with Chrome because of its security model.
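Referencing a stylesheet from the XML document is done with the xml-stylesheet processing instruction. A minimal sketch, in which the file name style.xsl and the element names are hypothetical:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- style.xsl is a hypothetical file name, resolved relative to this document. -->
<?xml-stylesheet type="text/xsl" href="style.xsl"?>
<persons>
  <person username="JS1">
    <name>John</name>
  </person>
</persons>
```

When a browser opens such a file, it loads the referenced stylesheet and displays the transformation result instead of the raw XML.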
Open-source XSLT Processors
● libxslt − a free library released under the MIT License that can be reused in commercial applications. It is based on libxml2 and implemented in C. It can be used from the command line via xsltproc, which is included in OS X and many Linux distributions.
● WebKit, Blink − used for example in the Safari and Chrome web browsers respectively; both use the libxslt library to do XSL transformations.
● Saxon − an XSLT (2.0 and partial 3.0) and XQuery 3.0 processor with open-source and proprietary commercial versions for stand-alone operation and for Java, JavaScript and .NET.
Commercial XSLT Processors
● MSXML and .NET − include an XSLT 1.0 processor. Since MSXML 4.0, the command-line utility msxsl.exe is available.
● QuiXSLT − an XSLT 3.0 processor with streaming support, implemented in Java by Innovimax and INRIA.
● Saxon − commercial versions support the newest standards such as XSLT 3.0.
Information Resources
● W3C XSLT 1.0 Recommendation
  ○ XSLT 1.0 is still the most used version.
● What is XSLT? on XML.COM
  ○ http://www.xml.com/pub/a/2000/08/holman/index.html
● Mulberrytech.com XSLT Quick Reference (2xA4, PDF)
  ○ http://www.mulberrytech.com/quickref/XSLTquickref.pdf
● Dr. Pawson XSLT FAQ
  ○ http://www.dpawson.co.uk/xsl/xslfaq.html
● Zvon XSLT Tutorial
  ○ http://zvon.org/xxl/XSLTutorial/Books/Book1/index.html
● Safari online XSLT Reference
XSLT Syntax
Basic XSLT Elements
● See http://en.wikipedia.org/wiki/XSLT_elements
● xsl:stylesheet
  ○ (or xsl:transform) is the top-level element. It occurs only once in a stylesheet document.
  ○ The attribute version specifies which XSLT version is being used.
  ○ The namespace declaration xmlns:xsl specifies the namespace URI, which is always http://www.w3.org/1999/XSL/Transform regardless of the XSLT version.
● xsl:output
  ○ Child element of xsl:stylesheet.
  ○ It describes how the data will be returned.
  ○ The attribute method designates what kind of data is returned (such as xml, text, html).
  ○ The attribute omit-xml-declaration indicates whether the initial XML declaration is omitted from the output.
Default tree (do-nothing) traversal
● Selects any element and the root.
● Produces nothing for it but traverses all its child elements.
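The rule described in the last two bullets (select any element and the root, output nothing, descend into the children) corresponds to the built-in template rule of XSLT 1.0. Written out explicitly inside a stylesheet, it looks like this:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Matches every element (*) and the root node (/);
       emits nothing itself and continues with the child nodes. -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>
</xsl:stylesheet>
```

A stylesheet does not need to contain this rule; the processor behaves as if it were present whenever no more specific template matches.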
Default tree (do-nothing) traversal for specified mode
● Does the same but only for the specified mode.
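For a given mode, the built-in rule has the same shape and stays within that mode while descending. In this sketch the mode name toc is hypothetical:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Same do-nothing traversal, restricted to the (hypothetical) mode "toc". -->
  <xsl:template match="*|/" mode="toc">
    <xsl:apply-templates mode="toc"/>
  </xsl:template>
</xsl:stylesheet>
```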
Copy text nodes and attributes
● Copies text nodes and attributes to the result
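Written out explicitly, the built-in rule for text and attribute nodes in XSLT 1.0 is:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Copies the string value of each text node and attribute node to the result. -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```

Attributes are only copied this way when they are explicitly selected, for example with a select="@*" on xsl:apply-templates; the default traversal visits child nodes, and attributes are not children.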
Ignore PIs and comments
● Ignores processing instructions (PIs) and comments, i.e. does not include them in the result.
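Written out explicitly, the built-in rule for processing instructions and comments in XSLT 1.0 is an empty template:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Empty template: matched nodes contribute nothing to the result. -->
  <xsl:template match="processing-instruction()|comment()"/>
</xsl:stylesheet>
```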
Generating values programmatically
● The output is not limited to elements, attributes and text copied from the source.
● All of them can also be generated programmatically during the transformation.
Generating Element with Calculated Attribute Value
● Task:
  ○ Generate an output element whose name is known in advance, but whose attribute values are calculated during the transformation.
● Solution:
  ○ Use the normal procedure, a literal result element, with the attribute values specified as attribute value templates (AVT).
Generating Element with Calculated Attribute Value (Example)
● Input: ... ● Template: ... Alternative: # …
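The original input and template are elided above, so the following is only a sketch of the technique under assumed names: a source element link with a url attribute, turned into an HTML a element.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumed input: <link url="http://example.org/">Example</link> (hypothetical names) -->

  <!-- Literal result element; the braces make href an attribute value template. -->
  <xsl:template match="link">
    <a href="{@url}">
      <xsl:value-of select="."/>
    </a>
  </xsl:template>

  <!-- Alternative (placed in a separate, hypothetical mode so both can coexist):
       the same attribute built with xsl:attribute. -->
  <xsl:template match="link" mode="alt">
    <a>
      <xsl:attribute name="href">
        <xsl:value-of select="@url"/>
      </xsl:attribute>
      <xsl:value-of select="."/>
    </a>
  </xsl:template>
</xsl:stylesheet>
```

Both templates produce the same element. The AVT form is shorter, while xsl:attribute is useful when the value needs more logic than a single XPath expression.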
Generating Element/Attribute with Calculated Name
● Objective:
  ○ Generate an output element whose name, attributes and content are NOT known in advance when writing the stylesheet, so they must be determined (calculated) at runtime (during the transformation).
● Solution:
  ○ Use the xsl:element instruction in the template.
● Input: ...
● Template: ID1
Generating Element/Attribute with Calculated Name
● Result:
  ○ Creates an element with the name elt_name and gives it the attribute id="ID1". The attribute name could also be generated if desired.
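A sketch of how such a result can be produced with xsl:element. The input is elided in the source, so the element generate and its attribute element-name are hypothetical; only the output name elt_name and the attribute id="ID1" come from the text.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumed input: <generate element-name="elt_name"/> (hypothetical names) -->
  <xsl:template match="generate">
    <!-- name is an attribute value template, so it is evaluated at run time. -->
    <xsl:element name="{@element-name}">
      <xsl:attribute name="id">ID1</xsl:attribute>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

The name attribute of xsl:attribute accepts an attribute value template as well, which is how the attribute name could also be calculated.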
XSLT Conditional processing
● Objective
  ○ To influence the output based on a condition.
● Solution
  ○ Use branching in the template, either
    ■ xsl:if for a single conditional branch (it has no else part), or
    ■ multiway xsl:choose / xsl:when / xsl:otherwise
Example xsl:if
● Input: ...
● Template: Expensive bread - price CZK
● Result: Creates an element p with a record about the bread. If the bread was expensive, also
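The template itself is lost apart from its text. A sketch that fits the description, assuming a source element bread with a price attribute and a hypothetical threshold of 30:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumed input: <bread price="45"/>; the threshold 30 is hypothetical. -->
  <xsl:template match="bread">
    <p>
      <!-- The body of xsl:if is emitted only when the test is true. -->
      <xsl:if test="@price &gt; 30">
        <xsl:text>Expensive </xsl:text>
      </xsl:if>
      <xsl:text>bread - price </xsl:text>
      <xsl:value-of select="@price"/>
      <xsl:text> CZK</xsl:text>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

Note that the greater-than sign is written as &gt; here for readability and symmetry with &lt;, which must always be escaped inside an XML attribute.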
Example xsl:choose
● Input: ... ... ...
● Template: Expensive / Suspiciously cheap / Normal bread - price CZK
● Result: Filters out the two extreme price levels; normal prices remain for xsl:otherwise.
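A sketch of the three-way branch, again assuming a source element bread with a price attribute; both thresholds are hypothetical.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumed input: <bread price="12"/>; both thresholds are hypothetical. -->
  <xsl:template match="bread">
    <p>
      <!-- The first xsl:when whose test is true wins; xsl:otherwise is the fallback. -->
      <xsl:choose>
        <xsl:when test="@price &gt; 30">Expensive</xsl:when>
        <xsl:when test="@price &lt; 10">Suspiciously cheap</xsl:when>
        <xsl:otherwise>Normal</xsl:otherwise>
      </xsl:choose>
      <xsl:text> bread - price </xsl:text>
      <xsl:value-of select="@price"/>
      <xsl:text> CZK</xsl:text>
    </p>
  </xsl:template>
</xsl:stylesheet>
```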
Loops
● Input: ... ... ...
● Template: bread - price CZK
● Result: Creates a series of p elements with bread prices.
● Caution: xsl:for-each is procedural in nature, which is generally not recommended in XSLT because it gives little flexibility when iterating over a set of nodes: we must know its exact structure beforehand. The stylesheet is also harder to modify if the structure changes (e.g. new or renamed elements).
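The contrast drawn in the caution can be sketched as follows. The element names breads and bread and the price attribute are assumptions, and the mode rules exists only so that both variants fit into one stylesheet.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumed input (hypothetical names):
       <breads><bread price="12"/><bread price="45"/></breads> -->

  <!-- Procedural style: the loop hard-codes where the bread elements are. -->
  <xsl:template match="breads">
    <xsl:for-each select="bread">
      <p>bread - price <xsl:value-of select="@price"/> CZK</p>
    </xsl:for-each>
  </xsl:template>

  <!-- Template style (hypothetical mode "rules" so both variants can coexist):
       each kind of node gets its own rule. -->
  <xsl:template match="breads" mode="rules">
    <xsl:apply-templates select="bread" mode="rules"/>
  </xsl:template>
  <xsl:template match="bread" mode="rules">
    <p>bread - price <xsl:value-of select="@price"/> CZK</p>
  </xsl:template>
</xsl:stylesheet>
```

In the template style, supporting a new kind of child element means adding one more rule, while the loop body would have to be edited in place.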
Template calls and parameters
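● Named template declaration: xsl:template with a name attribute. The template may contain declarations of parameters using xsl:param (the parameter type is not specified, i.e. dynamic typing).
● Template call: using xsl:call-template with the name attribute.
● The call can also specify the parameters (if they were declared in the template definition) using xsl:with-param, with the value given either in the select attribute or as the element content (_parametervalue_).
● A default parameter value can also be specified, using the select attribute (or the content) of xsl:param.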
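A sketch of a named template with one parameter, called once with its default value and once with an explicit value. The names price-line and currency and the source element bread are hypothetical.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Named template with one parameter; the select attribute supplies the default. -->
  <xsl:template name="price-line">
    <xsl:param name="currency" select="'CZK'"/>
    <p>
      <xsl:value-of select="@price"/>
      <xsl:text> </xsl:text>
      <xsl:value-of select="$currency"/>
    </p>
  </xsl:template>

  <xsl:template match="bread">
    <!-- Call relying on the default parameter value. -->
    <xsl:call-template name="price-line"/>
    <!-- Call passing the parameter explicitly. -->
    <xsl:call-template name="price-line">
      <xsl:with-param name="currency" select="'EUR'"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

xsl:call-template does not change the current node, so @price inside the named template still refers to the bread element being processed. The inner quotes in select="'CZK'" matter: without them the value would be read as an XPath expression selecting child elements named CZK.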
Automatic (generated) numbering
● Achieved by using the xsl:number element.
● It serves either (or both) of two purposes:
  ○ counting elements in the input to allow automatic numbering, for example to number book chapters sequentially, or
  ○ formatting numbers, e.g. writing them in Arabic or Roman numerals. This resembles part of the internationalization support seen in java.text.
● The autonumbering can also be multilevel, e.g. (sub)chapter numbers like 1.1 etc.
Example
● Template:
Group
Example
● Source data
Example
● Result:
Group 1
1.1 Al Zehtooney
1.2 Brad York
Group 2
2.1 Greg Sutter
2.2 Harry Rogers
Group 2.1
2.1.1 John Quincy
2.1.2 Kent Peterson
2.3 John Frank
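One way to obtain numbering of this shape is to combine a multilevel count of the enclosing groups with a single-level count of the persons. The following is a sketch rather than the original template: the element names group and person and the name attribute are assumptions, since the source data was not preserved.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- Drop whitespace-only text nodes so only the generated lines are output. -->
  <xsl:strip-space elements="*"/>

  <!-- Assumed input: nested <group> elements containing <person name="..."/>
       elements (hypothetical names). -->
  <xsl:template match="group">
    <xsl:text>Group </xsl:text>
    <!-- One number per enclosing group level, e.g. 2.1 for a nested group. -->
    <xsl:number count="group" level="multiple" format="1.1"/>
    <xsl:text>&#10;</xsl:text>
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="person">
    <!-- Group part of the number, followed by a separating dot. -->
    <xsl:number count="group" level="multiple" format="1.1."/>
    <!-- Position of this person among its person siblings. -->
    <xsl:number count="person" level="single" format="1 "/>
    <xsl:value-of select="@name"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Counting persons separately from groups is what lets a person that follows a nested group continue the person sequence of its parent group. Changing a format token, for example to format="I", switches that level to Roman numerals.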
Namespace Handling
● XSLT can select and produce nodes (elements, attributes) in namespaces.
● However, it has some pitfalls, see:
  ○ Namespaces in XSLT issues
  ○ Multiple namespaces in XSLT
  ○ Avoid Namespace mistakes in XSLT at Developerworks
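A frequent pitfall is a source document that uses a default namespace. A sketch, in which the namespace URI http://example.org/bakery, the prefix b and the element bread are all hypothetical:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:b="http://example.org/bakery"
    exclude-result-prefixes="b">
  <!-- Assumed input: <bread xmlns="http://example.org/bakery" price="12"/>
       The namespace URI and the prefix b are hypothetical. -->

  <!-- match="bread" would NOT match here: an unprefixed name in an XSLT 1.0
       pattern means "no namespace", even if the source uses a default namespace. -->
  <xsl:template match="b:bread">
    <p>bread - price <xsl:value-of select="@price"/> CZK</p>
  </xsl:template>
</xsl:stylesheet>
```

The prefix used in the stylesheet does not have to agree with the source document; only the namespace URI is compared. exclude-result-prefixes keeps the unused b declaration out of the output.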
Where to do XSLT?
● Online (just for fun)
● In all professional XML editors and many programmers' IDEs such as NetBeans
● Command-line tools, such as xsltproc or xmlstarlet
● From within Java programs using the Java Core API (javax.xml.transform package)
● Using specialized tools programmatically (via API) or from the command line, such as Saxon
● Similarly for other languages; almost all now have XML/XSLT support
Online Tools
● Good for simple try-and-see:
  ○ Online XSLT engine @Freeformatter.com plus a demo XML input/XSLT/XML output
  ○ W3Schools online XSLT engine, maybe even better :-)
● With XSLT 2.0 support:
  ○ http://xslttest.appspot.com/
Using XSLT in Java (Core API)
● See Using the XSLT Processor for Java from Oracle.