Common XML Schemas
SAX2 Features and Properties
Making JAXP recognize RELAX-NG schema
Although JDK 1.5 and 1.6 are aware of RELAX NG, in that they define the constant XMLConstants.RELAXNG_NS_URI, neither of them includes a RELAX NG implementation. So, if you use a RELAX NG schema with the validation API of JAXP, you have to put a RELAX NG implementation on your classpath and set a system property that names its schema factory before the validation code runs.
...
System.setProperty("javax.xml.validation.SchemaFactory:" + XMLConstants.RELAXNG_NS_URI,
"com.thaiopensource.relaxng.jaxp.CompactSyntaxSchemaFactory");
SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.RELAXNG_NS_URI);
...
The most well-known RELAX NG implementation in Java seems to be jing.
Define empty element using XML Schema
Built-in Datatypes of XML Schema
- A value of '100.0' is invalid for xsd:integer and its subtypes, because xsd:integer is derived from xsd:decimal with fractionDigits fixed to 0 and with no decimal point allowed in the lexical form.
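The rule about '100.0' can be checked with the JAXP validation API. The following is only a minimal sketch: the class name, the `count` element, and the inline schema are made up for illustration, and it assumes the XML Schema validator bundled with the JDK.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class IntegerLexicalCheck {
    // Hypothetical schema: a single <count> element typed as xs:integer.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='count' type='xs:integer'/>"
      + "</xs:schema>";

    // Returns true if <count>text</count> validates against the schema above.
    static boolean isValidCount(String text) throws Exception {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = sf.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        try {
            String xml = "<count>" + text + "</count>";
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // With no ErrorHandler set, a validation error is thrown as a SAXException.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("100   valid? " + isValidCount("100"));
        System.out.println("100.0 valid? " + isValidCount("100.0"));
    }
}
```

Switching the element type to xs:decimal in the sketch should make both values acceptable, which is a quick way to see that the restriction comes from xs:integer itself.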
Meaning of Fundamental Element of XML Schema
simpleType, complexType, simpleContent, complexContent, ...: all of these are very confusing. So, you need to understand the exact meaning and usage of each element and be able to tell the differences between them.
simple types
- Elements that contain numbers (and strings, and dates, etc.) but do not contain any subelements are said to have simple types.
complex types
- Elements that contain subelements or carry attributes are said to have complex types.
simpleContent
- The simpleContent element lets a complex type keep a simple-type value as its content while also carrying attributes.
- Within simpleContent, extension adds attributes to a simple type or to a complex type with simple content, and restriction narrows the value or the attributes of a complex type with simple content.
complexContent
- The complexContent element can specify nested element types. This includes the special case of zero elements, also known as 'empty content'. complexContent also permits text interspersed with elements, also known as 'mixed content'.
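To make the difference between simpleContent and complexContent more concrete, here is a small sketch that validates a few instances against an inline schema. The element names (`price`, `br`, `note`, `b`), the class name, and the schema itself are hypothetical and chosen only for illustration.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class ContentModelDemo {
    // Hypothetical schema with three global elements, one per content model.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        // simpleContent: a decimal value that also carries an attribute
      + "<xs:element name='price'><xs:complexType><xs:simpleContent>"
      + "<xs:extension base='xs:decimal'>"
      + "<xs:attribute name='currency' type='xs:string'/>"
      + "</xs:extension></xs:simpleContent></xs:complexType></xs:element>"
        // complexContent without any particle: empty content
      + "<xs:element name='br'><xs:complexType><xs:complexContent>"
      + "<xs:restriction base='xs:anyType'/>"
      + "</xs:complexContent></xs:complexType></xs:element>"
        // complexContent with mixed='true': text interspersed with elements
      + "<xs:element name='note'><xs:complexType><xs:complexContent mixed='true'>"
      + "<xs:restriction base='xs:anyType'><xs:sequence>"
      + "<xs:element name='b' type='xs:string' minOccurs='0' maxOccurs='unbounded'/>"
      + "</xs:sequence></xs:restriction>"
      + "</xs:complexContent></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Returns true if the given instance document validates against XSD.
    static boolean isValid(String xml) throws Exception {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = sf.newSchema(new StreamSource(new StringReader(XSD)));
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String[] samples = {
            "<price currency='USD'>9.99</price>", // simple value plus attribute
            "<price>abc</price>",                 // not a decimal
            "<br/>",                              // empty content
            "<br>text</br>",                      // text in an empty element
            "<note>Hello <b>world</b>!</note>"    // mixed content
        };
        for (String xml : samples) {
            System.out.println(xml + " valid? " + isValid(xml));
        }
    }
}
```

The shorter forms `<xs:complexType/>` and `<xs:complexType mixed='true'>` are shorthand for the same complexContent restriction of xs:anyType that the sketch spells out.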
Thread-safeness of Factories in JAXP
Factory classes in JAXP such as SAXParserFactory, DocumentBuilderFactory, and SchemaFactory are not thread-safe.
DocumentBuilder
The core DOM-related classes in JAXP, namely DocumentBuilderFactory, DocumentBuilder, and Document, are not thread-safe. Mutator methods such as DocumentBuilderFactory.setSchema(), DocumentBuilderFactory.setFeature(), DocumentBuilderFactory.setIgnoringComments(), DocumentBuilderFactory.setNamespaceAware(), DocumentBuilderFactory.setValidating(), DocumentBuilder.setEntityResolver(), and DocumentBuilder.setErrorHandler() show that these classes hold mutable state, which is why they are not thread-safe.
Creating a DocumentBuilderFactory or DocumentBuilder object may be relatively resource demanding. So, they should not be instantiated every time you build a document. But they are not thread-safe, so they must be confined properly.
In the usual case where your application needs to parse a few specific kinds of documents, you had better provide public parse methods that create the Document objects and reuse the document builder factory and document builders internally.
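import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class ApplicationDocumentParsers {
    protected final DocumentBuilder orderDocBuilder;
    protected final DocumentBuilder paymentDocBuilder;
    protected final DocumentBuilder deliveryDocBuilder;
    public ApplicationDocumentParsers() throws ParserConfigurationException, SAXException {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        dbf.setValidating(false); // DTD validation off; schema validation comes from setSchema()
        dbf.setXIncludeAware(false);
        // create one document builder per schema; setSchema() takes a Schema object, not a file name
        dbf.setSchema(sf.newSchema(new File("order.xsd")));
        this.orderDocBuilder = dbf.newDocumentBuilder();
        dbf.setSchema(sf.newSchema(new File("payment.xsd")));
        this.paymentDocBuilder = dbf.newDocumentBuilder();
        dbf.setSchema(sf.newSchema(new File("delivery.xsd")));
        this.deliveryDocBuilder = dbf.newDocumentBuilder();
    }
    // DocumentBuilder is not thread-safe, so the parse methods are synchronized
    public synchronized Document parseOrderDoc(InputStream is) throws SAXException, IOException {
        return this.orderDocBuilder.parse(is);
    }
    public synchronized Document parsePaymentDoc(InputStream is) throws SAXException, IOException {
        return this.paymentDocBuilder.parse(is);
    }
    public synchronized Document parseDeliveryDoc(InputStream is) throws SAXException, IOException {
        return this.deliveryDocBuilder.parse(is);
    }
}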
I think there may be a thread-safe or immutable document builder or document builder factory somewhere, but I still haven't found a well-known one.
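One common way to confine a DocumentBuilder is to keep one instance per thread. The following is only a sketch of that idea, not an established library: the class and method names are hypothetical, and it assumes Java 8 or later for ThreadLocal.withInitial().

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class ThreadConfinedParser {
    // Each thread lazily creates its own DocumentBuilder and never shares it.
    private static final ThreadLocal<DocumentBuilder> BUILDER =
        ThreadLocal.withInitial(ThreadConfinedParser::newBuilder);

    private static DocumentBuilder newBuilder() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    static Document parse(String xml) throws SAXException, IOException {
        DocumentBuilder db = BUILDER.get();
        db.reset(); // restore the configuration the factory gave this builder
        return db.parse(new InputSource(new StringReader(xml)));
    }

    static String rootName(String xml) throws SAXException, IOException {
        return parse(xml).getDocumentElement().getNodeName();
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            try {
                String name = rootName("<order id='1'/>");
                System.out.println(Thread.currentThread().getName() + ": " + name);
            } catch (SAXException | IOException e) {
                throw new IllegalStateException(e);
            }
        };
        Thread a = new Thread(task, "worker-a");
        Thread b = new Thread(task, "worker-b");
        a.start();
        b.start();
        a.join();
        b.join();
    }
}
```

Compared with synchronized parse methods, this trades some memory (one builder per thread) for parsing without lock contention. Note that the returned Document objects are still not thread-safe.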
javax.xml.validation
The SchemaFactory class is not thread-safe, but the Schema class is immutable and thread-safe.
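In practice this means a Schema can be compiled once and shared, while each thread (or each call) gets its own Validator, since Validator is not thread-safe either. The sketch below illustrates that pattern. The class name, the `order` element, and the inline schema are assumptions made for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class SharedSchemaDemo {
    // Hypothetical schema that declares a single <order> element.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='order' type='xs:string'/>"
      + "</xs:schema>";

    // SchemaFactory is not thread-safe, so the schema is compiled by one thread only.
    static Schema compile() throws SAXException {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        return sf.newSchema(new StreamSource(new StringReader(XSD)));
    }

    // The shared Schema hands out a fresh Validator for every call.
    static boolean isValid(Schema schema, String xml) throws IOException {
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    // Validates all documents on a small thread pool against one shared Schema.
    static List<Boolean> validateConcurrently(List<String> docs) throws Exception {
        Schema schema = compile();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Boolean>> futures = new ArrayList<>();
            for (String doc : docs) {
                futures.add(pool.submit(() -> isValid(schema, doc)));
            }
            List<Boolean> results = new ArrayList<>();
            for (Future<Boolean> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> docs = new ArrayList<>();
        docs.add("<order>book</order>");
        docs.add("<invoice/>");
        System.out.println(validateConcurrently(docs));
    }
}
```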
Meaning and Pronunciation of Xerces
As a non-native English speaker, I find words starting with 'x' very unfamiliar, and I can't even guess how to pronounce them. Although I have used Apache Xerces for more than 5 years, I only recently learned the exact pronunciation of 'Xerces'. This may sound silly to those whose mother tongue is English, but most of the application developers around me are in the same situation. Anyway, you can hear the pronunciation and read the meaning of 'Xerces' on the following pages.
- http://www.merriam-webster.com/cgi-bin/audio.pl?bixxer01.wav=Xerxes+I
- http://education.yahoo.com/reference/dictionary/entry/Xerxes I
Resources on JAXB
There aren't as many books, tutorials, or articles about JAXB as there are about JAXP.
In particular, it's quite difficult to find in-depth material on JAXB 2.0.
The following are the ones I have found useful.
Embedding Schematron to XML Schema
Schematron can define complex rules, such as relations between the values of elements, which can't be expressed with XML Schema. But defining the whole schema of an XML document using Schematron is too expensive and inappropriate. So, defining the basic structure and rules using XML Schema and the more complex rules using Schematron seems to be a good strategy.
Still, maintaining a pair of schema files for one XML document is somewhat bothersome. Is it possible to merge the two schema files into one?
The following article explains how to embed constraints expressed in Schematron syntax into the XML Schema file. The basic idea is using
Resources on XML Catalog
- XML Catalogs Committee Specification 06 Aug 2001
- Managing XML data: XML catalogs, 13 May 2005
- XML Catalog on Wikipedia
Document type declaration, public identifier, system identifier
Syntax
document-type-declaration = (external-subset, internal-subset) | external-subset | internal-subset
document-type-declaration = '<!DOCTYPE' root-element-name external-subset? ('[' internal-subset ']')? '>'
external-subset = ('PUBLIC' public-identifier system-identifier) | ('SYSTEM' system-identifier)
- public-identifier: an identifier that is meant to be universally unique within its application scope.
- system-identifier: typically a fragmentless URI reference that is intended to identify a document type used exclusively in one application.
Sample
- XHTML 1.0
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
- Web application deployment descriptor for Servlet 2.3
<!DOCTYPE web-app PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN" "http://java.sun.com/dtd/web-app_2_3.dtd">
- IoC configuration of Spring framework 2.0
<!DOCTYPE beans PUBLIC "-//SPRING//DTD BEAN 2.0//EN" "http://www.springframework.org/dtd/spring-beans-2.0.dtd">
- SQL map of iBATIS 2.0
<!DOCTYPE sqlMap PUBLIC "-//ibatis.apache.org//DTD SQL Map 2.0//EN" "http://ibatis.apache.org/dtd/sql-map-2.dtd">
- DocBook 5.0
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V5.0/EN" "http://www.oasis-open.org/docbook/xml/5.0/docbook.dtd" [ <!ENTITY chap1 SYSTEM "chap1.xml"> <!ENTITY chap2 SYSTEM "chap2.xml"> ]>
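The root element name, public identifier, and system identifier of a declaration like the samples above can be read back through the DOM API. This is a minimal sketch with a hypothetical class name. It assumes that resolving every external entity to an empty stream is acceptable, so that the DTD is not downloaded while parsing.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.xml.sax.InputSource;

class DoctypeIdentifiers {
    // Returns { root element name, public identifier, system identifier }.
    static String[] identifiers(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        // Resolve every external entity to an empty stream instead of fetching it.
        db.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
        Document doc = db.parse(new InputSource(new StringReader(xml)));
        DocumentType doctype = doc.getDoctype();
        return new String[] { doctype.getName(), doctype.getPublicId(), doctype.getSystemId() };
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" "
          + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">"
          + "<html/>";
        String[] ids = identifiers(xml);
        System.out.println("root element      : " + ids[0]);
        System.out.println("public identifier : " + ids[1]);
        System.out.println("system identifier : " + ids[2]);
    }
}
```

The same EntityResolver hook is where a resolver based on an XML catalog would map a public identifier to a local copy of the DTD.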
XML Schema Documentation Tools
- xsddoc: The xsddoc subproject is an XML Schema documentation generator for W3C XML Schemas.
- xs3p: The XS3P schema documentation generator is simply an XSLT stylesheet, which generates HTML documentation from an XSD schema file.
Small Tips
- Type extension using the <xs:extension> element can't override the type of an element declared in the base type.