Canonicalization loses certain information (mostly information that comes from the DTD):
unparsed entities (e.g. binary ones) are no longer accessible after canonicalization
notations
attribute types (including default values)
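
The losses listed above can be observed with a small sketch. It assumes Python 3.8+ and its standard-library `xml.etree.ElementTree.canonicalize` (a C14N 2.0 implementation, used here only for illustration; the notes do not name a tool). The sample document, the `logo` entity, the `png` notation and the variable names are all made up for this example.

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical input: an internal DTD subset declaring a notation,
# an unparsed (NDATA) entity, and an attribute of type ENTITY.
SOURCE = """<!DOCTYPE doc [
  <!NOTATION png SYSTEM "image/png">
  <!ENTITY logo SYSTEM "logo.png" NDATA png>
  <!ELEMENT doc EMPTY>
  <!ATTLIST doc img ENTITY #IMPLIED>
]>
<doc img="logo"/>"""

# Canonicalize the document; the result is returned as a string.
canonical = canonicalize(SOURCE)
print(canonical)

# The attribute value survives as plain text, but the declarations that
# gave it meaning (entity, notation, attribute type) are no longer there.
print("DOCTYPE kept: ", "DOCTYPE" in canonical)
print("NOTATION kept:", "NOTATION" in canonical)
print("NDATA kept:   ", "NDATA" in canonical)
```

After canonicalization the value `logo` is still present on `img`, but a consumer of the canonical form can no longer tell that it named an unparsed entity, which notation that entity used, or that the attribute had type ENTITY.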