Querying

  • Storing XML data

  • Querying XML data

  • XQuery

Characteristics

  • A query language for searching XML documents and extracting nodes (elements, attributes) from them, and for constructing an output XML document.

Characteristics (2nd)

  • XQuery is currently the primary XML query language, and it is likely to remain so.

  • A W3C specification since March 2011; see http://www.w3.org/XML/Query.

  • Based on the XPath 2.0 data model, operators and functions.

  • Supported by the major database engine vendors (IBM, Microsoft, Oracle, etc.).

Where to use XQuery (and where not)

XQuery is useful for:

  • queries where the extraction (selection) part is more complicated than the construction part.

In other cases:

  • Use XSLT if more complex output is required (e.g. adding new markup and content).

  • Use a more general API (using DOM manipulation) if more complex operations are required.

Source code example

An example source document, XQuery queries against it, and their results.

<?xml version="1.0" encoding="Windows-1250"?>
  <addressbook>
    <person category="friends">
      <firstname>Petr</firstname>
      <lastname>Novák</lastname>
      <date-of-birth>1969-05-14</date-of-birth>
      <email>novak@myfriends.com</email>
      <characteristics lang="en">Very good friend</characteristics>
    </person>
    <person category="friends">
      <firstname>Jaroslav</firstname>
      <lastname>Nováček</lastname>
      <date-of-birth>1968-06-14</date-of-birth>
      <email>novacek@myfriends.com</email>
      <characteristics lang="en">Another good friend</characteristics>
    </person>
    <person category="staff">
      <firstname>Jan</firstname>
      <lastname>Horák</lastname>
      <date-of-birth>1970-02-0</date-of-birth>
      <email>horak@mycompany.com</email>
      <characteristics lang="en">Just colleague</characteristics>
    </person>
    <person category="friends">
      <firstname>Erich</firstname>
      <lastname>Polák</lastname>
      <date-of-birth>1980-02-28</date-of-birth>
      <email>erich@myfriends.com</email>
      <characteristics lang="en">Good friend</characteristics>
    </person>
 </addressbook>

Example - Simple Query (XPath)

  • Task: "extract all surnames in the addressbook".

  • The query is little more than an XPath expression that selects all lastname elements:

    doc('myaddresses.xml')/addressbook/person/lastname

Running XQuery using Saxon 9.0j

The Saxon XSLT processor has also included an XQuery processor since version 8.x. To run an XQuery query you need to:

  • Install Saxon, for example 9.0.0.4J ("J" denotes the Java implementation; there is a .NET implementation as well), by unpacking it into a folder such as c:/devel/saxon9-0-0-4j.

  • Change the working directory to that folder: cd c:/devel/saxon9-0-0-4j

  • Put the query shown above into a file (lastnames.xq).

  • Store the XML document shown above (the one containing "addressbook") in the file myaddresses.xml in the same directory.

  • Run:

    java -classpath saxon9he.jar net.sf.saxon.Query -o:result.xml lastnames.xq

Result

Running the query against the document above creates the file result.xml:

<lastname>Novák</lastname> <lastname>Nováček</lastname>
<lastname>Horák</lastname> <lastname>Polák</lastname>
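
The result above is a sequence of elements without a common root, so result.xml is not a well-formed XML document on its own. One way to get a single root is to wrap the path expression in an element constructor. This is only a minimal sketch; the element name lastnames is an assumption chosen for illustration:

```xquery
(: Wrap all selected lastname elements in one root element.
   The name "lastnames" is hypothetical, not part of the original example. :)
<lastnames>
{
  doc('myaddresses.xml')/addressbook/person/lastname
}
</lastnames>
```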

XQuery structure

FLWOR is an acronym for the clauses of an XQuery expression. It roughly corresponds to the structure of an SQL query:

(F)or

The initial part of the query; it specifies the iteration, including its control variable. The items returned by the XPath expression after the keyword "in" are bound to the variable one at a time.

(L)et

Binds further variables that can be used later in the expression.

(W)here

Specifies the selection condition, i.e. which nodes (values) selected by the "for" clause will be used. The condition can use the variables defined in the "let" clause.

(O)rder

Defines how the nodes should be ordered (the "order by" clause).

(R)eturn

Defines what is returned; the result is constructed from the extracted nodes (values).
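
To show how the five clauses fit together, here is a small sketch that uses all of them on the address book from the earlier example. The output element name friend and the helper variable $name are assumptions made for illustration:

```xquery
(: Sketch: list friends by full name, sorted by last name.
   The element name "friend" and the variable $name are hypothetical. :)
for $person in doc('myaddresses.xml')/addressbook/person        (: F: iterate over persons :)
let $name := concat($person/firstname, ' ', $person/lastname)   (: L: bind a helper variable :)
where $person/@category = 'friends'                             (: W: keep only friends :)
order by $person/lastname                                       (: O: sort by last name :)
return <friend>{$name}</friend>                                 (: R: construct the output :)
```

The query should yield one friend element per person in the category "friends", ordered by last name according to the default collation.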

FLWOR - simple example

The condition that selects the requested nodes can be specified either as a predicate in the XPath expression of the "for" clause or in the "where" clause. Task: "Return Mr. Polák's date of birth."

for $person in doc('myaddresses.xml')/addressbook/person
where $person/lastname='Polák'
return $person/date-of-birth

XQuery returns:

<?xml version="1.0" encoding="UTF-8"?>
<date-of-birth>1980-02-28</date-of-birth>
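
The same selection can be written with the condition moved into the XPath expression of the "for" clause, as a predicate. This sketch is meant to be equivalent to the "where" variant above:

```xquery
(: Same task as above, but the condition is a predicate
   in the path expression instead of a "where" clause. :)
for $person in doc('myaddresses.xml')/addressbook/person[lastname = 'Polák']
return $person/date-of-birth
```

Which form to use is largely a matter of readability: predicates keep simple filters close to the path, while "where" is handy when the condition refers to variables bound by "let".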

FLWOR - Nested Queries

Sometimes you need to iterate over the result of another query. You can do this in the return clause, which then contains a nested query.

Nested Queries - Example Data

Example data

<warehouse>
  <categories>
    <category name="food">
      <item id="cheaproll"/>
      <item id="expensivebread"/>
    </category>
    <category name="hw">
      <item id="mb"/>
    </category>
  </categories>
  <items>
    <item id="cheaproll">
      <producer>1st bakeries</producer>
      <price unit="CZK">1</price>
    </item>
    <item id="expensivebread">
      <producer>2nd bakeries</producer>
      <price unit="USD">3</price>
    </item>
    <item id="mb">
      <producer>Homemade Electronics Inc.</producer>
      <price unit="EUR">100</price>
    </item>
  </items>
</warehouse>

Nested Queries - Example

The example gets detailed information on all products in category food:

<foods>
{
  for $foodId in //category[@name="food"]/item/@id
  return
    for $item in //items/item[@id=$foodId]
    return
      <food>
        {$item/@id}
        {$item/producer}
        {$item/price}
      </food>
}
</foods>
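
As an alternative to nesting one FLWOR expression inside the return clause of another, a single "for" clause can bind several variables, which reads much like a join. The following sketch should produce the same output as the nested version, assuming the warehouse document is supplied as the context item:

```xquery
(: Sketch: one FLWOR with two variables bound in the same "for" clause.
   For each food id, $item ranges over the matching item elements. :)
<foods>
{
  for $foodId in //category[@name = "food"]/item/@id,
      $item in //items/item[@id = $foodId]
  return
    <food>
      {$item/@id}
      {$item/producer}
      {$item/price}
    </food>
}
</foods>
```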

XQuery - Query Parameters

Query parameters are declared as external variables in the following way:

declare variable $id as xs:string external;

Example:

declare variable $cat as xs:string external;
<wares>
{
  for $foodId in //category[@name=$cat]/item/@id
  return
    for $item in //items/item[@id=$foodId]
    return
      <ware>
        {$item/@id}
        {$item/producer}
        {$item/price}
      </ware>
}
</wares>
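
With a declaration like the one above, the caller has to supply a value for $cat; how that is done depends on the processor. XQuery 3.0 (not XQuery 1.0) additionally allows an external variable to carry a default value. The sketch below assumes a processor that supports XQuery 3.0; the default "food" and the category attribute on the output are arbitrary choices made for illustration:

```xquery
xquery version "3.0";

(: External parameter with a default value (XQuery 3.0 syntax).
   The default "food" is a hypothetical choice. :)
declare variable $cat as xs:string external := "food";

<wares category="{$cat}">
{
  (: Select every item whose id is listed in the requested category. :)
  for $item in //items/item[@id = //category[@name = $cat]/item/@id]
  return <ware>{$item/@id}{$item/producer}{$item/price}</ware>
}
</wares>
```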

XQuery - Transformation attributes ↔ elements

  • The following construct in the return clause can be used to transform an attribute into an element:

<some_new_element>
{
  data(xpath_to_attribute)
}
</some_new_element>

  • To transform an element into an attribute you can use:

<food price="{XPath_to_the_element}"/>
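
Applied to the warehouse data from the nested-query example, both transformations might look as follows. This is a sketch only; the output element names food and id are assumptions, and the warehouse document is assumed to be the context item:

```xquery
(: Sketch: the price element becomes an attribute,
   and the id attribute becomes an element. :)
for $item in //items/item
return
  <food price="{$item/price}">
    <id>{data($item/@id)}</id>
  </food>
```

Without data(), the expression {$item/@id} would copy the attribute itself onto the new element rather than insert its value as text.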

XQuery and Namespaces

  • XQuery supports namespaces.

  • A namespace can be declared in two ways:

    • The standard way, in the root element of the resulting document:

      <rich xmlns:pets="nsURI" xmlns:devs="nsURI">

    • Using a namespace declaration in the query prolog:

      declare namespace pets="nsURI";
      declare namespace devs="nsURI";

  • In both cases the query then uses the namespace prefix.

XQuery Namespaces Demo

XQuery using namespaces

<rich xmlns:pets="http://www.fi.muni.cz/~bar/pets"
      xmlns:devs="http://www.fi.muni.cz/~bar/devices">
{
  for $rich_person in //person
  where sum($rich_person//pets:price)>100000 or
        sum($rich_person//devs:price)>150000
  return
    <rich-person name="{$rich_person/@name}">
      <pets-value>
      {
        sum($rich_person//pets:price)
      }
      </pets-value>
      <devs-value>
      {
        sum($rich_person//devs:price)
      }
      </devs-value>
    </rich-person>
}
</rich>
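
The demo above declares the namespaces on the root element of the result. For comparison, here is a sketch of the same query using the second way, namespace declarations in the prolog. The "let" variables $pets and $devs are introduced here only to avoid repeating the sums:

```xquery
(: Sketch: same query, with the prefixes bound in the prolog
   instead of on the constructed root element. :)
declare namespace pets = "http://www.fi.muni.cz/~bar/pets";
declare namespace devs = "http://www.fi.muni.cz/~bar/devices";

<rich>
{
  for $rich_person in //person
  let $pets := sum($rich_person//pets:price),
      $devs := sum($rich_person//devs:price)
  where $pets > 100000 or $devs > 150000
  return
    <rich-person name="{$rich_person/@name}">
      <pets-value>{$pets}</pets-value>
      <devs-value>{$devs}</devs-value>
    </rich-person>
}
</rich>
```

One difference to be aware of: because no output element uses the prefixes, the constructed rich element in this variant need not carry the xmlns:pets and xmlns:devs declarations in the serialized result.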

XQuery Implementation

Saxon, since version 7.x:

  • Install (extract) Saxon version 7.0 or later (8.x, 9.x) into a directory.

  • Change the working directory to the Saxon directory.

  • Saxon 7: run java -classpath saxon.jar net.sf.saxon.Query -o result.xml query-file.xq from the command line.

  • Saxon 8/9: run java -classpath saxon[8/9].jar net.sf.saxon.Query -o:result.xml query-file.xq from the command line.

  • There is also a .NET implementation of Saxon (distributed as .DLL and .EXE files).

Native XML databases

Native XML database systems mostly support XQuery as a query language: