Canonical Form - principles /1
Main principles for constructing the canonical form of an XML
document:
- encoding in UTF-8
- line breaks (CR, LF) normalized to LF according to the
  algorithm given in the XML 1.0 specification
- attribute values normalized
- references to characters and parsed entities replaced by
  their content
- CDATA sections likewise replaced by their content
- XML declaration ("<?xml ...?>") and DTD removed
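
Several of the principles above can be observed with a short sketch. It assumes Python 3.8 or later, whose standard library function xml.etree.ElementTree.canonicalize implements C14N 2.0; that version may differ in detail from the canonical form this slide describes (for example, it also sorts attributes). The sample document is made up for illustration.

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical input: XML declaration, CRLF line breaks,
# a CDATA section and a character reference (&#65; is "A").
source = (
    '<?xml version="1.0"?>\r\n'
    '<doc b="2" a="1">one\r\ntwo <![CDATA[x < y]]> &#65;</doc>'
)

# With no output file given, canonicalize() returns the result as a string.
canonical = canonicalize(source)

print(repr(canonical))
# → '<doc a="1" b="2">one\ntwo x &lt; y A</doc>'
```

The XML declaration is gone, the CRLF inside the element has become a single LF, the CDATA section has been replaced by its (escaped) content, and the character reference has been replaced by the character itself.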