Posted on August 24, 2004 by Scott Leberknight
Several years ago I heard about a cool new technology that was supposed to replace XML DTDs and let you define types in your XML documents, which would then be validated automatically according to the rules you specified. You could restrict values in your XML documents to integers, floats, dates, times, etc. It was called XML Schema. I started doing some research and found that it did let you define your own types and use any of the 46 (yes, I did say forty-six) "primitive" data types in any imaginable combination. Then I learned about serialization space, parse space, and lexical and value spaces. Then came whitespace processing, xs:string and xs:normalizedString, simple types and complex types. Then namespaces and derivation by restriction, qualified names, target namespaces, and a zillion more.
By this time I had had enough, wrote a DTD for the piece of the project I was working on, and was done with it. XML Schema is horribly complicated and tries to provide way too much, in my opinion. Think for a second. If schema is so wonderful, then why do so many open source projects, like Struts, Hibernate, and Spring, use DTDs? In addition, since the early days of the Servlet API, web application deployment descriptors have used DTDs. Unfortunately, the new J2EE API versions will use schema to define deployment descriptors. DTDs are simple and work very well for configuration files. You can write a DTD very quickly. Some people consider the fact that a DTD is not actually written in XML to be a negative. Perhaps, but it is very simple to learn to write a DTD, and you can probably do it in less than an hour! The same definitely cannot be said of XML Schema.
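To give a feel for how little a DTD needs for a configuration file, here is a minimal sketch in Java. The config format (the config, datasource, url, and user elements and the name attribute) is made up for illustration and is not taken from any of the projects mentioned above. The DTD is embedded as an internal subset so the example is self-contained, and validation uses the standard JAXP parser with validation switched on.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class DtdConfigExample {

    // Hypothetical configuration format, declared as an internal DTD subset.
    static final String DTD =
        "<!DOCTYPE config [\n" +
        "  <!ELEMENT config (datasource+)>\n" +
        "  <!ELEMENT datasource (url, user)>\n" +
        "  <!ATTLIST datasource name CDATA #REQUIRED>\n" +
        "  <!ELEMENT url (#PCDATA)>\n" +
        "  <!ELEMENT user (#PCDATA)>\n" +
        "]>\n";

    // Returns true if the given document body is valid against the DTD above.
    public static boolean isValid(String body) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // turn on DTD validation

        DocumentBuilder builder = factory.newDocumentBuilder();
        final boolean[] valid = { true };
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { }
            // Validity violations are reported as recoverable errors.
            public void error(SAXParseException e) { valid[0] = false; }
            // Well-formedness problems are fatal; let them propagate.
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });

        builder.parse(new InputSource(new StringReader(DTD + body)));
        return valid[0];
    }

    public static void main(String[] args) throws Exception {
        String good = "<config><datasource name=\"main\">"
                    + "<url>jdbc:example</url><user>app</user>"
                    + "</datasource></config>";
        // Missing the required name attribute and the user element.
        String bad = "<config><datasource><url>jdbc:example</url></datasource></config>";

        System.out.println("good document valid? " + isValid(good));
        System.out.println("bad document valid?  " + isValid(bad));
    }
}
```

The whole grammar is five declarations. Note what it cannot express: the DTD has no way to say that url must be a well-formed URL or that a value must be an integer, which is exactly the kind of typing XML Schema was designed to add.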
The current project I'm working on uses off-the-shelf XML Schemas, so this time I have to use them to validate and create instance XML documents. I have managed to avoid really learning or using XML Schema since about 2001, but this time there appears to be no escape.