Canonical Form - principles /1

Main principles for constructing the canonical form of an XML document:

  • encoding in UTF-8

  • line breaks (CR LF pairs and lone CR) normalized to LF, following the end-of-line handling rules of the XML 1.0 Spec. (section 2.11)

  • attribute values normalized, as if by a validating processor

  • character references and parsed entity references replaced by their content

  • CDATA sections likewise replaced by their character content

  • XML declaration (<?xml ...?>) and DTD removed
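
The principles above can be observed with a small sketch. It assumes Python 3.8 or later, whose standard library offers xml.etree.ElementTree.canonicalize; that function implements Canonical XML 2.0, which may differ in details from the version these slides describe, but it shares the principles listed here. The sample document and the entity name "who" are invented for illustration.

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical input: XML declaration, DTD with one internal entity,
# CR LF line breaks, a single-quoted attribute holding a character
# reference, an entity reference, and a CDATA section.
source = (
    '<?xml version="1.0" encoding="UTF-8"?>\r\n'
    '<!DOCTYPE doc [<!ENTITY who "world">]>\r\n'
    "<doc note='a&#x41;b'>Hello, &who;!\r\n"
    "<![CDATA[x < y]]> &#169;</doc>"
)

# canonicalize() returns the canonical form as a str.
canonical = canonicalize(source)
print(canonical)

# The canonical form is defined as UTF-8 when serialized to bytes.
canonical_bytes = canonical.encode("utf-8")
```

In the printed result the XML declaration and the DTD should be gone, the CR LF inside the element should appear as a single LF, the references should be replaced by the characters they stand for, and the CDATA content should appear as ordinary escaped text.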