parseDTD                 package:XML                 R Documentation

_R_e_a_d _a _D_o_c_u_m_e_n_t _T_y_p_e _D_e_f_i_n_i_t_i_o_n (_D_T_D)

_D_e_s_c_r_i_p_t_i_o_n:

     Represents the contents of a DTD as a user-level object containing
     the element and entity definitions.

_U_s_a_g_e:

     parseDTD(extId, asText=FALSE, name="", isURL=FALSE)

_A_r_g_u_m_e_n_t_s:

   extId: The name of the file containing the DTD to be processed or,
          when `asText' is TRUE, the DTD content itself.

  asText: A logical value indicating whether `extId' holds the DTD
          content itself (TRUE) or the name of a file (FALSE). Set
          this to TRUE when the DTD has already been read into a
          character string and is passed to the parser as content.

    name: Optional name to provide to the parsing mechanism.

   isURL: A logical value indicating whether the input source is to be
          considered a URL rather than a regular file or a string
          containing the DTD.

_D_e_t_a_i_l_s:

     Parses and converts the contents of the DTD in the specified file 
     into a user-level object containing all the information about the
     DTD.

_V_a_l_u_e:

     A list with two entries, one for the entities and the other for
     the elements defined within the DTD. 

entities: a named list of the entities defined in the DTD. Each entry
          is indexed by the name of the corresponding entity and is an
          object of class `XMLEntity', or `XMLExternalEntity' if the
          entity refers to an external definition. These objects have
          the following fields:

          name the name of the entity by which users refer to it.

          content the expanded value or definition of the entity.

          original the value of the entity with references to other
                 entities left unexpanded, in symbolic form.

elements: a named list of the elements defined in the DTD, with the
          name of each element being the identifier of the element
          being defined. Each entry is an object of class
          `XMLElementDef' which has 4 fields.

          _n_a_m_e the name of the element.

          _t_y_p_e a named integer indicating the type of entry in the DTD,
                 usually either `element' or `mixed'. The name of the
                 value is a user-level type. The value is used for
                 programming, both internally and externally.

          _c_o_n_t_e_n_t_s a description of the elements that can be nested
                 within this element. This is an object of class
                 `XMLElementContent' or one of its specializations -
                 `XMLSequenceContent', `XMLOrContent'. Each of these
                 encodes the number of such elements permitted  (one,
                 one or more, zero or one, or zero or more); the type
                 indicating whether the contents consist of a single
                 element type, an ordered sequence of elements, or one
                 of a set of elements. Finally, the actual contents
                 description is described in the `elements' field. This
                 is a list of one or more `XMLElementContent',
                 `XMLSequenceContent' and `XMLOrContent'  objects.

          _a_t_t_r_i_b_u_t_e_s a named list of the attributes defined for this
                 element in the DTD. Each element is of class
                 `XMLAttributeDef' which has 4 fields.

          _n_a_m_e name of the attribute, i.e. the left hand side

          _t_y_p_e the type of the value, e.g. an CDATA, Id, Idref(s),
                 Entity(s), NMToken(s),  Enumeration, Notation

          _d_e_f_a_u_l_t_T_y_p_e the defined type, one of  None, Implied, Fixed or
                 Required.

          _d_e_f_a_u_l_t_V_a_l_u_e the default value if it is specified, or the
                 enumerated values as a character vector, if the type
                 is Enumeration.
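
     As a rough illustration of the structure described above, the
     following sketch parses a small DTD supplied as text and looks at
     the returned list. The DTD (the `note', `to' and `body' elements,
     the `priority' attribute and the `company' entity) is made up for
     this example, and the field access assumes the layout documented
     in this section.

```r
library(XML)

# A small, made-up DTD supplied as text rather than as a file.
dtdText <- paste(
  '<!ELEMENT note (to, body)>',
  '<!ELEMENT to (#PCDATA)>',
  '<!ELEMENT body (#PCDATA)>',
  '<!ATTLIST note priority (low|high) "low">',
  '<!ENTITY company "Acme">',
  sep = "\n")

d <- parseDTD(dtdText, asText = TRUE)

# Top-level entries: the element and entity definitions.
names(d)

# Names of the defined elements and entities (order is not guaranteed).
elementNames <- names(d$elements)
entityNames <- names(d$entities)

# One element definition and the attributes declared for it.
noteDef <- d$elements[["note"]]
class(noteDef)
noteAttrNames <- names(noteDef$attributes)
```

     Looking entries up by name, as with `d$elements[["note"]]', avoids
     relying on the order in which the definitions are returned.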

_W_A_R_N_I_N_G:

     Errors in the DTD are stored as warnings for programmatic access.
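
     If problems in the DTD are reported through R's warning
     mechanism, as stated above, they can be collected with a calling
     handler. The sketch below is only an assumption about how one
     might do this: the helper name `collectWarnings' is hypothetical
     and not part of the package, and the expression it evaluates
     simulates a warning instead of calling `parseDTD'.

```r
# Hypothetical helper: evaluate an expression, collect the messages of
# any warnings it raises, and return them alongside the result.
collectWarnings <- function(expr) {
  msgs <- character()
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")  # continue instead of printing
    })
  list(value = value, warnings = msgs)
}

# Stand-in for a call such as parseDTD(dtdFile); it raises one warning.
res <- collectWarnings({
  warning("simulated DTD problem")
  "parsed"
})
res$warnings
```

     The same helper could wrap a real call, for example
     `collectWarnings(parseDTD(dtdFile))', so that the result and any
     messages are available together.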

_N_o_t_e:

     Needs libxml (currently version 1.8.7) from <URL: >

_A_u_t_h_o_r(_s):

     Duncan Temple Lang, duncan@research.bell-labs.com

_R_e_f_e_r_e_n_c_e_s:

     <URL: http://www.w3.org>

_S_e_e _A_l_s_o:

     `xmlTreeParse',  WritingXML.html in the distribution.

_E_x_a_m_p_l_e_s:

      # Parse a DTD stored in a file shipped with the package
      dtdFile <- system.file("data", "foo.dtd", package="XML")
      parseDTD(dtdFile)

      # Read the DTD into a single string and parse it as text
      txt <- scan(dtdFile, what="", sep="\n")
      txt <- paste(txt, collapse="\n")
      d <- parseDTD(txt, asText=TRUE)

      # Parse a DTD located at a URL
      url <- "http://www.omegahat.org/XML/DTDs/DatasetByRecord.dtd"
      d <- parseDTD(url, isURL=TRUE)

