SCDJWS Study Guide: XML Schema



XML Schema Data Types


A schema describes the structure of an XML document in terms of complex types and simple types. Complex types describe how elements are organized and nested. Simple types are the primitive data types contained by elements and attributes.

One of the benefits of using a schema is the ability to control the data types of elements and attributes. Because developers can dictate the type of data an element or attribute may contain, schemas are well suited to XML application development.


Simple Types

The XML Schema specification defines many standard simple types, called built-in types. These built-in types are the standard building blocks of an XML schema document. Simple types cannot be broken down into constituent parts. In other words, an element of a simple type contains only data: it cannot contain other elements or carry attributes. Attributes must always be of simple types.

The following declarations use simple types:

<xsd:element name="state" type="xsd:string" />
<xsd:element name="zip" type="xsd:decimal" />
<xsd:attribute name="country" type="xsd:NMTOKEN"
    fixed="US"/>


You can derive new simple types from existing types by restricting the type to a subset of its normal values. An xsd:simpleType element defines the restricted type. The name attribute of xsd:simpleType assigns a name to the new type. An xsd:restriction child element specifies, via its base attribute, which type is being restricted. Facet children of xsd:restriction specify the constraints on the type. For example, this xsd:simpleType element defines an independentYear as any year from 1776 on:

<xsd:simpleType name="independentYear">
  <xsd:restriction base="xsd:gYear">
    <xsd:minInclusive value="1776"/>
  </xsd:restriction>
</xsd:simpleType>

Then you declare the year element like this:

<xsd:element name="year" type="independentYear" />
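
As an illustration, and assuming an element named year of type independentYear in a schema with no target namespace, a validating parser would be expected to treat the following instance fragments as commented. The fragments are separate examples, not one document:

```xml
<!-- Valid: 1776 satisfies minInclusive="1776" -->
<year>1776</year>

<!-- Valid: any later year is also in range -->
<year>1999</year>

<!-- Invalid: 1775 is below the minimum -->
<year>1775</year>

<!-- Invalid: the content is not a gYear at all -->
<year>seventeen seventy-six</year>
```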

A more detailed example appears later under "Create Your Own Data Types".


Complex Types

A schema may declare complex types, which define how elements that contain other elements are organized. Complex types allow elements in their content and may carry attributes. For example, the USAddress type defines a US postal address, which contains a name, street, city, state and zip:

<xsd:complexType name="USAddress">
  <xsd:sequence>
    <xsd:element name="name" type="xsd:string" />
    <xsd:element name="street" type="xsd:string" />
    <xsd:element name="city" type="xsd:string" />
    <xsd:element name="state" type="xsd:string" />
    <xsd:element name="zip" type="xsd:decimal" />
  </xsd:sequence>
  <xsd:attribute name="country" type="xsd:NMTOKEN"
      fixed="US" />
</xsd:complexType>


As in the USAddress example above, most complexType declarations in schemas contain a sequence element that lists one or more element declarations. The sequence element describes which elements are nested in the complexType, the type of each nested element, and the order of the nested elements.
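
To make the ordering rule concrete, here is a sketch of an instance that the USAddress type would accept. The address element name is an assumption (the text does not show an element declared with this type), and the values are invented for illustration:

```xml
<!-- Assumes a hypothetical declaration:
     <xsd:element name="address" type="USAddress"/> -->
<address country="US">
  <name>Alice Smith</name>
  <street>123 Maple Street</street>
  <city>Mill Valley</city>
  <state>CA</state>
  <zip>90952</zip>
</address>
```

Swapping any two children, for example placing zip before city, would make the instance invalid, because xsd:sequence fixes the order.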

In a complexType, the number of times an element may occur in an XML document is controlled by the minOccurs and maxOccurs attributes. For example, in the following version of the USAddress definition, the street element must occur at least once and no more than twice:

<xsd:complexType name="USAddress">
  <xsd:sequence>
    <xsd:element name="name" type="xsd:string" />
    <xsd:element name="street" type="xsd:string"
        minOccurs="1" maxOccurs="2" />
    <xsd:element name="city" type="xsd:string" />
    <xsd:element name="state" type="xsd:string" />
    <xsd:element name="zip" type="xsd:decimal" />
  </xsd:sequence>
  <xsd:attribute name="country" type="xsd:NMTOKEN"
      fixed="US" />
</xsd:complexType>


Note that the default value for both minOccurs and maxOccurs is "1", so if these attributes are omitted, the element must be present exactly once. If an element is optional, set minOccurs to "0". If an element can occur any number of times, set minOccurs="0" and maxOccurs="unbounded". In other words:

Must occur - minOccurs="1" maxOccurs="1"

Optional - minOccurs="0" maxOccurs="1"

Kleene closure - minOccurs="0" maxOccurs="unbounded"
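
The three cases can be sketched side by side in a single type. The ContactInfo type and its element names are hypothetical and serve only to illustrate the attribute combinations listed above:

```xml
<xsd:complexType name="ContactInfo">
  <xsd:sequence>
    <!-- Must occur: both attributes default to "1" -->
    <xsd:element name="name" type="xsd:string" />

    <!-- Optional: zero or one occurrence -->
    <xsd:element name="nickname" type="xsd:string"
        minOccurs="0" />

    <!-- Any number of times, including none -->
    <xsd:element name="phone" type="xsd:string"
        minOccurs="0" maxOccurs="unbounded" />
  </xsd:sequence>
</xsd:complexType>
```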


Besides the sequence element, the all element can also be used in a complexType definition. Unlike the sequence element, which defines the exact order of child elements, the all element allows the elements in it to appear in any order. Each element in an all group may occur once or not at all; no other multiplicity is allowed. In other words, in an all group minOccurs is either "0" or "1" (the default is "1") and maxOccurs is always "1".

Note: Only single elements may be used in an all group; it cannot include other groupings such as sequence or all. The following is the USAddress example defined using the all element:

<?xml version="1.0" encoding="UTF-8" ?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
    xmlns:address="http://www.postaddress.com/address"
    targetNamespace="http://www.postaddress.com/address">
...
<complexType name="USAddress">
  <all>
    <element name="name" type="string" />
    <element name="street1" type="string" />
    <element name="street2" type="string" minOccurs="0" />
    <element name="apt" type="string" minOccurs="0" />
    <element name="city" type="string" />
    <element name="state" type="string" />
    <element name="zip" type="decimal" />
  </all>
</complexType>
...
</schema>
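
As a sketch of what the all group permits, the instance below lists the children in an arbitrary order and omits the optional street2 and apt elements. The address element and the namespace handling are assumptions, since the relevant parts of the schema are elided above:

```xml
<!-- Assumes a hypothetical global declaration in the same schema:
     <element name="address" type="address:USAddress"/>
     and that elementFormDefault is left at its default
     (unqualified), so the child elements carry no prefix. -->
<addr:address xmlns:addr="http://www.postaddress.com/address">
  <zip>90952</zip>
  <city>Mill Valley</city>
  <name>Alice Smith</name>
  <state>CA</state>
  <street1>123 Maple Street</street1>
</addr:address>
```

The same children in any other order would also be valid, but repeating one of them, for example two city elements, would not.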


Note that the names of XML schema types are case-sensitive. When an element declares that it is of a particular type, it must specify both the namespace and the name of that type exactly as the type declares them.
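
The following sketch illustrates the rule. The shipTo, billTo and mailTo element names are hypothetical, and the two incorrect declarations are commented out so that the schema as a whole remains valid:

```xml
<schema xmlns="http://www.w3.org/2001/XMLSchema"
    xmlns:address="http://www.postaddress.com/address"
    targetNamespace="http://www.postaddress.com/address">

  <complexType name="USAddress">
    <sequence>
      <element name="city" type="string" />
    </sequence>
  </complexType>

  <!-- Correct: prefix and local name match the definition exactly -->
  <element name="shipTo" type="address:USAddress" />

  <!-- Wrong: "usaddress" differs in case, so no such type is found
  <element name="billTo" type="address:usaddress" />
  -->

  <!-- Wrong: without the prefix, the name resolves to the default
       namespace (XML Schema itself), which has no USAddress type
  <element name="mailTo" type="USAddress" />
  -->
</schema>
```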


When do you use the complexType element and the simpleType element?

  • Use the complexType element when you want to define child elements and/or attributes of an element

  • Use the simpleType element when you want to create a new type that is a refinement of a built-in type (string, date, gYear, etc.)


Create Your Own Data Types

A new datatype can be defined from an existing datatype (called the "base" type) by specifying values for one or more of the optional facets of the base type. For example, the string primitive datatype has six optional facets: length, minLength, maxLength, pattern, enumeration, and whiteSpace (legal values: preserve, replace, collapse).

The following is an example of creating a new datatype by specifying facet values:

<!-- This creates a new data type called "TelephoneNumber" -->
<xsd:simpleType name="TelephoneNumber">
  <!-- Elements of this type can hold string values -->
  <xsd:restriction base="xsd:string">
    <!-- But the string must be exactly 8 characters long, and -->
    <xsd:length value="8"/>
    <!-- the string must follow the pattern ddd-dddd,
         where 'd' represents a digit -->
    <xsd:pattern value="\d{3}-\d{4}"/>
  </xsd:restriction>
</xsd:simpleType>

Obviously, in this example the regular expression makes the length facet redundant.
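
To see the facets at work, consider a few instance fragments. The phone element is a hypothetical name introduced only for this illustration:

```xml
<!-- Assumes a hypothetical declaration:
     <xsd:element name="phone" type="TelephoneNumber"/> -->

<!-- Valid: three digits, a hyphen, four digits (8 characters) -->
<phone>555-1212</phone>

<!-- Invalid: no hyphen, and only 7 characters -->
<phone>5551212</phone>

<!-- Invalid: extra digit; the pattern must match the whole value -->
<phone>555-12345</phone>
```

XML Schema patterns are matched against the entire value, so no explicit start or end anchors are needed in the regular expression.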


The following example creates a new type called US-Flag-Colors. An element declared to be of this type must have the value red, white, or blue:

<xsd:simpleType name="US-Flag-Colors">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="red"/>
<xsd:enumeration value="white"/>
<xsd:enumeration value="blue"/>
</xsd:restriction>
</xsd:simpleType>
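
As a brief sketch, and assuming a hypothetical color element of this type, enumeration values are compared exactly, including case:

```xml
<!-- Assumes a hypothetical declaration:
     <xsd:element name="color" type="US-Flag-Colors"/> -->

<!-- Valid: matches an enumerated value -->
<color>white</color>

<!-- Invalid: "Red" differs in case from "red" -->
<color>Red</color>

<!-- Invalid: not one of the enumerated values -->
<color>green</color>
```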



General Form of Creating a New Datatype by Specifying Facet Values
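
As a hedged sketch of the general form, a facet-based definition names the new type, names the base type, and lists one or more facets. The Percentage type below is a hypothetical example, not taken from the original guide; the comments mark the three slots:

```xml
<!-- 1. name: the name of the new type -->
<xsd:simpleType name="Percentage">
  <!-- 2. base: the existing type being restricted -->
  <xsd:restriction base="xsd:integer">
    <!-- 3. one or more facets, each with a value attribute -->
    <xsd:minInclusive value="0"/>
    <xsd:maxInclusive value="100"/>
  </xsd:restriction>
</xsd:simpleType>
```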





Creating a simpleType from another simpleType


So far we have created each simpleType using one of the built-in datatypes as the base type. However, we can also create a simpleType that uses another simpleType as its base. For example, EarthSurfaceElevation restricts the built-in integer type:

<xsd:simpleType name= "EarthSurfaceElevation">
<xsd:restriction base="xsd:integer">
<xsd:minInclusive value="-1290"/>
<xsd:maxInclusive value="29035"/>
</xsd:restriction>
</xsd:simpleType>


The BostonAreaSurfaceElevation simpleType uses EarthSurfaceElevation as its base type.

<xsd:simpleType name= "BostonAreaSurfaceElevation">
<xsd:restriction base="EarthSurfaceElevation">
<xsd:minInclusive value="0"/>
<xsd:maxInclusive value="120"/>
</xsd:restriction>
</xsd:simpleType>
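
Because the derived type inherits the constraints of its base, a value must satisfy both. The fragments below sketch this, assuming a hypothetical elevation element declared with type BostonAreaSurfaceElevation:

```xml
<!-- Assumes a hypothetical declaration:
     <xsd:element name="elevation"
         type="BostonAreaSurfaceElevation"/> -->

<!-- Valid: an integer between 0 and 120 -->
<elevation>15</elevation>

<!-- Invalid here, although it would be a valid
     EarthSurfaceElevation: 500 exceeds maxInclusive="120" -->
<elevation>500</elevation>

<!-- Invalid: the underlying base type is xsd:integer -->
<elevation>12.5</elevation>
```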



Local References to Globals

Local elements can reference global elements by name, as shown in the following example:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="person" type="PersonType"/>
<xsd:element name="comment" type="xsd:string"/>
<xsd:complexType name="PersonType">
<xsd:sequence>
<xsd:element name="name" type="xsd:string"/>
<xsd:element ref="comment" minOccurs="0"/>
</xsd:sequence>
</xsd:complexType>
</xsd:schema>

References to global attributes are also possible, using the ref attribute of xsd:attribute.
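
A minimal sketch of an attribute reference follows. The id attribute is a hypothetical addition for illustration; the schema has no target namespace, so the unprefixed name in ref resolves to the global declaration:

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Global attribute declaration -->
  <xsd:attribute name="id" type="xsd:ID"/>

  <xsd:complexType name="PersonType">
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
    </xsd:sequence>
    <!-- Local reference to the global attribute -->
    <xsd:attribute ref="id" use="required"/>
  </xsd:complexType>
</xsd:schema>
```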


Inheritance and Extensions

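Base types can be inherited and extended. In the following example, USAddress extends Address by adding state and zip elements (the ipo prefix is assumed to be bound to the schema's target namespace):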

<complexType name="Address">
<sequence>
<element name="name" type="string"/>
<element name="street" type="string"/>
<element name="city" type="string"/>
</sequence>
</complexType>
<complexType name="USAddress">
<complexContent>
<extension base="ipo:Address">
<sequence>
<element name="state" type="ipo:USState"/>
<element name="zip" type="positiveInteger"/>
</sequence>
</extension>
</complexContent>
</complexType>
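
In an instance, the elements inherited from the base type come first, followed by the elements added by the extension. The fragment below is a sketch under stated assumptions, with invented values:

```xml
<!-- Assumes a hypothetical element declared with type ipo:USAddress.
     Namespace declarations are omitted, and "CA" is assumed to be
     a value permitted by ipo:USState. -->
<shipTo>
  <!-- Inherited from Address -->
  <name>Alice Smith</name>
  <street>123 Maple Street</street>
  <city>Mill Valley</city>
  <!-- Added by the USAddress extension -->
  <state>CA</state>
  <zip>90952</zip>
</shipTo>
```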

Extension can only add to the base type. Overriding what the base type allows is possible via restriction only.
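
As a sketch of derivation by restriction, the hypothetical OccupantAddress type below repeats the content model of Address and narrows it by fixing the value of name. The type name and the fixed value are assumptions made for illustration, and the ipo prefix is again assumed to be bound to the target namespace:

```xml
<complexType name="OccupantAddress">
  <complexContent>
    <restriction base="ipo:Address">
      <!-- A restriction restates the base content model,
           tightening it where needed -->
      <sequence>
        <element name="name" type="string" fixed="Occupant"/>
        <element name="street" type="string"/>
        <element name="city" type="string"/>
      </sequence>
    </restriction>
  </complexContent>
</complexType>
```

Every instance that is valid against OccupantAddress is also valid against Address, which is the defining property of a restriction.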


