Question:  D.15 I've already got SGML DTDs: how do I convert them for use with XML?

There are numerous projects to convert common or popular SGML DTDs to XML format (for example the TEI DTD, both Lite and full versions).

The following checklist comes courtesy of Seán McGrath (author of XML By Example, Prentice Hall, 1998):

  1. No equivalent of the SGML Declaration. So keywords, character set etc are essentially fixed;
  2. Tag minimization is not allowed, so <!ELEMENT x - O (A,B)> becomes <!ELEMENT x (A,B)> and <!ELEMENT x - O EMPTY> becomes <!ELEMENT x EMPTY>;
  3. #PCDATA must only occur at the extreme left (ie first) in an OR model, eg <!ELEMENT x - - (A|B|#PCDATA|C)> (in SGML) becomes <!ELEMENT x (#PCDATA|A|B|C)*>, and <!ELEMENT x (A,#PCDATA)> is illegal;
  4. No CDATA, RCDATA elements [declared content];
  5. Some SGML attribute types are not allowed in XML eg NUTOKEN;
  6. Some SGML attribute defaults are not allowed in XML eg CONREF;
  7. Comments cannot be inline to declarations like <!ELEMENT x (A,B) -- this is an SGML comment in a declaration -->;
  8. A whole bunch of SGML optional features are not present in XML: all forms of tag minimization (OMITTAG, DATATAG, SHORTREF, etc); Link Process Definitions; Multiple DTDs per document; and many more: see http://www.w3.org/TR/NOTE-sgml-xml-971215 for the list of bits of SGML that were removed for XML;
  9. And [nearly] last but not least, no CONCUR!
  10. There are some important differences between the internal and external subset portion of a DTD in XML: Marked Sections can only occur in the external subset; and Parameter Entities must be used to replace entire declarations in the internal subset portion of a DTD, eg the following is invalid XML:
      <!DOCTYPE x [
        <!ENTITY % modelx "(A|B)*">
        <!ELEMENT x %modelx;>
      ]>
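
Items 2 and 7 of the checklist are mechanical enough to script. The following is a minimal Python sketch, not a complete converter: the function name sgml_element_to_xml and both regular expressions are illustrative assumptions, it expects one <!ELEMENT ...> declaration at a time, and it does not attempt the other items (reordering #PCDATA, replacing CDATA/RCDATA content, unsupported attribute types and defaults), which still need review by hand.

```python
import re

# Item 7: SGML allows "-- ... --" comments inside a declaration; XML does not.
INLINE_COMMENT = re.compile(r"--.*?--", re.DOTALL)

# Item 2: omitted-tag minimization flags after the element name,
# e.g. "- O" or "- -". Keywords are matched case-insensitively.
MINIMIZATION = re.compile(r"(<!ELEMENT\s+\S+)\s+[-O]\s+[-O](?=\s)", re.IGNORECASE)


def sgml_element_to_xml(decl: str) -> str:
    """Rewrite a single SGML element declaration into XML syntax.

    Hypothetical helper: only strips inline comments and minimization
    flags. Anything that is not an <!ELEMENT ...> declaration is
    returned unchanged.
    """
    if not decl.lstrip().upper().startswith("<!ELEMENT"):
        return decl
    decl = INLINE_COMMENT.sub("", decl)
    decl = MINIMIZATION.sub(r"\1", decl)
    # Remove whitespace left behind in front of the closing ">".
    return re.sub(r"\s+>", ">", decl)


if __name__ == "__main__":
    print(sgml_element_to_xml("<!ELEMENT x - O (A,B)>"))
    # prints: <!ELEMENT x (A,B)>
    print(sgml_element_to_xml("<!ELEMENT x - - (A,B) -- an SGML comment -->"))
    # prints: <!ELEMENT x (A,B)>
```

Because the comment pattern would also match the body of an ordinary <!-- ... --> comment, the helper deliberately refuses to touch anything other than element declarations; run it over declarations that have already been split out of the DTD, not over the whole file.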

This FAQ is from The XML FAQ, maintained by Peter Flynn