(C) 2019 Masaryk University -- Tomáš Pitner, Luděk Bártek, Adam Rambousek
non relational databases, flexible schema
often used for big data applications, clusters
different storage structure than SQL databases
give up constraints/transactions to improve performance
low-level interface
key-value
Redis, Memcached, Amazon SimpleDB…
document (JSON, XML…)
CouchDB, Elasticsearch, MongoDB…
graph / RDF triple
Virtuoso, Neo4j…
object
Caché, GemStone…
standard data model (RDF)
standardized interchange format (N-Triples, N-Quads, XML,…)
query language (SPARQL), Linked Data
native
Apache Jena, Sesame/RDF4J…
RDF layer to relational database
Virtuoso, IBM DB2…
SPARQL Protocol and RDF Query Language
W3C Recommendation SPARQL 1.1, March 2013
SELECT - values as table
CONSTRUCT - extract RDF
ASK - true/false
DESCRIBE - extract RDF graph
inferencing
ex1:FullProfessor rdf:subClassOf ex1:Professor.
ex1:AssistantProfessor rdf:subClassOf ex1:Professor.
ex1:Professor owl:equivalentClass ex2:Teacher
ex1:Bob rdf:type ex1:FullProfessor .
ex1:Alice rdf:type ex1:AssistantProfessor .
ex2:Mary rdf:type ex2:Teacher
ex1:Bob rdf:type ex1:FullProfessor .
ex1:Alice rdf:type ex1:AssistantProfessor .
ex2:Mary rdf:type ex2:Teacher
SELECT ?x
WHERE {
?x rdf:type ex1:Professor
}
noone is Professor, but inferencing will find Bob, Alice, Mary
working with documents or metadata in XML format
data format/schema changes over time
complex and variable schema
structure queries
basic element = document
documents gathered in collections ("tables")
query on document structure
output is document, document fragment, or constructed XML
XML-enabled databases
mapping XML data to own data model (relational, object…)
character large object
fragmented to series of tables/objects
stored in XML Type
ISO SQL/XML - element construction, data mapping, enhanced SQL with XQuery
Oracle, IBM DB2, MS SQL, PostgreSQL
native XML databases
using XML data model directly
open-source
eXist, Sedna, BaseX, MonetDB, Oracle/Berkeley DB XML
commercial
MarkLogic, Virtuoso, Qizx
<titles>{
for $book in collection("books")/book where $book/year="1990"
return $book/title
}</titles>
update delete collection("books")/book/isbn
update insert <title>XQuery Guide</title>
into collection("books")/book[isbn="42"]
SELECT
id, vol,
xmlquery('$j/name', passing journal as "j") as name
FROM
journals
WHERE
xmlexists('$j[licence="CreativeCommons"]',
passing journal as "j")
SELECT XMLELEMENT (NAME "saleProducts",
XMLNAMESPACES (DEFAULT 'http://posample.org'),
XMLAGG (XMLELEMENT (NAME "prod",
XMLATTRIBUTES (p.Pid AS "id"),
XMLFOREST (p.name AS "name",
i.quantity AS "numInStock"))))
FROM PRODUCT p, INVENTORY i
WHERE p.Pid = i.Pid
XSLT - output transformation
XML Schema - input validation
XQJ - XQuery API for Java
unified query layer between application and XML Datasource
prepared statements
binding variables
XQDataSource xqs = new ExistXQDataSource();
XQDataSource xqs = new SednaXQDataSource();
xqs.setProperty("serverName", "localhost");
XQConnection conn = xqs.getConnection();
XQExpression xqe = conn.createExpression();
String xqueryString = "for $x in doc('books.xml')//book
return $x/title/text()";
XQResultSequence rs = xqe.executeQuery(xqueryString);
while(rs.next())
System.out.println(rs.getItemAsString(null));
conn.close();
XML:DB
similar concept to JDBC, abstract interface to XML database
Driver - access to given database
Collection - document collection in database
Services - support database features, e.g. XPathQueryService, XUpdateQueryService
Resource - data stored in database
ResourceSet - data as result of query
intact document storage
unique identifier for document
preferably parse and index on storage
on query: fast, if application need access to whole document and index can select the right document
slow, if additional parsing needed
parsing documents
document parsed on save and stored in own data model (eg. DOM)
addressable numbered nodes - some operations faster based on numbers
no need for parsing on query, more efficient
retrieved document not 100% same
fine granularity for addressing
partial modifications
index scope - collection, database, document
index target - document, node
value index
store all combinations of element/attribute values
substring index
for contains() etc., store n-grams
structural index
track existing paths, enriched tree, trie, DataGuide etc.
compare database performance
mostly XQuery speed, less often Update
data generator (up to GBs) and a set of XQueries
XMark, XBench, XMach-1
TPoX - complex database testing, XQuery and SQL/XML, indexing, XML Schema, XQuery Update