Validating XML against multiple XSD schemas
I'm writing both the XSD and the validation code, so I have a lot of control here.
I would like to have an upload facility that adds content to my application based on an XML file. One part of the XML file should be validated against different schemas depending on a value in another part of the file. Here's an example to illustrate:
<foo>
<name>Harold</name>
<bar>Alpha</bar>
<baz>Mercury</baz>
<!-- ... more general info that applies to all foos ... -->
<bar-config>
<!-- the content here is specific to the bar named "Alpha" -->
</bar-config>
<baz-config>
<!-- the content here is specific to the baz named "Mercury" -->
</baz-config>
</foo>
In this case, there is a controlled vocabulary for the content of <bar>, and I can handle that part just fine. Then, based on the bar value, the appropriate XML schema should be used to validate the content of bar-config. The same applies to baz and baz-config.
The code doing the parsing/validation is written in Java. I'm not sure how language-dependent the solution will be.
Ideally, the solution would let the XML author declare the appropriate schema locations and so on, so that they could get the XML validated on the fly in a sufficiently smart editor.
Also, the possible values for <bar> and <baz> are orthogonal, so I don't want to do this by extension for every possible bar/baz combination. What I mean is, if there are 24 possible bar values/schemas and 8 possible baz values/schemas, I want to be able to write 1 + 24 + 8 = 33 total schemas, instead of 1 * 24 * 8 = 192 total schemas.
Also, I'd prefer NOT to break out bar-config and baz-config into separate XML files if possible. I realize that might make all the problems much easier, as each XML file would have a single schema, but I'm trying to see if there is a good single-XML-file solution.
Recommended answer
I finally figured it out.
First of all, in the foo schema, the bar-config and baz-config elements have a type that includes an any element, like this:
<sequence>
<any minOccurs="0" maxOccurs="1"
processContents="lax" namespace="##any" />
</sequence>
Then, in the XML, you must specify the proper namespace using the xmlns attribute on the child element of bar-config or baz-config, like this:
<bar-config>
<config xmlns="http://www.example.org/bar/Alpha">
... config xml here ...
</config>
</bar-config>
Then, your XML schema file for bar Alpha will have a target namespace of http://www.example.org/bar/Alpha and will define the root element config.
If your XML file has namespace declarations and schema locations for both of the schema files, this is sufficient for the editor to do all of the validating (at least, it is good enough for Eclipse).
So far, we have satisfied the requirement that the XML author can write the XML in such a way that it is validated in the editor.
Now, we need the consumer to be able to validate. In my case, I'm using Java.
If you happen to know ahead of time which schema files you will need for validation, then you simply create a single Schema object and validate as usual, like this:
Schema schema = factory().newSchema(new Source[] {
new StreamSource(stream("foo.xsd")),
new StreamSource(stream("Alpha.xsd")),
new StreamSource(stream("Mercury.xsd")),
});
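As a self-contained illustration of this known-in-advance case, here is a minimal sketch. The schema bodies, the `level` element, and the class and method names are assumptions made up for the example (they are not the original foo.xsd or Alpha.xsd), and the schemas are held in strings instead of files so the example can run on its own.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class CombinedSchemaDemo {
    // Hypothetical main schema: bar-config accepts at most one element from any namespace.
    static final String FOO_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='foo'><xs:complexType><xs:sequence>"
      + "<xs:element name='bar' type='xs:string'/>"
      + "<xs:element name='bar-config'><xs:complexType><xs:sequence>"
      + "<xs:any minOccurs='0' maxOccurs='1' processContents='lax' namespace='##any'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Hypothetical schema for bar "Alpha": a root element config holding an integer level.
    static final String ALPHA_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
      + " targetNamespace='http://www.example.org/bar/Alpha' elementFormDefault='qualified'>"
      + "<xs:element name='config'><xs:complexType><xs:sequence>"
      + "<xs:element name='level' type='xs:int'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Builds a sample document whose Alpha config carries the given level text.
    static String doc(String level) {
        return "<foo><bar>Alpha</bar><bar-config>"
             + "<config xmlns='http://www.example.org/bar/Alpha'><level>" + level + "</level></config>"
             + "</bar-config></foo>";
    }

    // Validates the document against one Schema compiled from both sources.
    static boolean isValid(String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new Source[] {
            new StreamSource(new StringReader(FOO_XSD)),
            new StreamSource(new StringReader(ALPHA_XSD))
        });
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The default error handler reports validation errors as SAXExceptions.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid(doc("3")));
        System.out.println(isValid(doc("high")));
    }
}
```

Because the wildcard uses processContents="lax", the config element is checked only when a declaration for its namespace is available, which is why the Alpha schema has to be part of the combined Schema for the second document to be rejected.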
In this case, however, we don't know which XSD files to use until we have parsed the main document. So, the general procedure is to:
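- Validate the XML using only the main (foo) schema
- Determine the schema to use to validate the relevant portion of the document
- Find the node that is the root of the portion to validate with the separate schema
- Import that node into a brand-new document
- Validate the brand-new document using the other schema file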
Caveat: it appears that the document must be built with a namespace-aware builder for this to work.
Here's some code (this was pulled from various places in my code, so the copy-and-paste might have introduced some errors):
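// Imports used by this snippet
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Contains the filename of the xml file
String filename;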
// Load the xml data using a namespace-aware builder (the method
// 'stream' simply opens an input stream on a file)
Document document;
DocumentBuilderFactory docBuilderFactory =
DocumentBuilderFactory.newInstance();
docBuilderFactory.setNamespaceAware(true);
document = docBuilderFactory.newDocumentBuilder().parse(stream(filename));
// Create the schema factory
SchemaFactory sFactory = SchemaFactory.newInstance(
XMLConstants.W3C_XML_SCHEMA_NS_URI);
// Load the main schema
Schema schema = sFactory.newSchema(
new StreamSource(stream("foo.xsd")));
// Validate using main schema
schema.newValidator().validate(new DOMSource(document));
// Get the node that is the root for the portion you want to validate
// using another schema (getSpecialNode is a helper not shown here)
Node node = getSpecialNode(document);
// Build a Document from that node
Document subDocument = docBuilderFactory.newDocumentBuilder().newDocument();
subDocument.appendChild(subDocument.importNode(node, true));
// Determine the schema to use using your own logic
Schema subSchema = parseAndDetermineSchema(document);
// Validate using other schema
subSchema.newValidator().validate(new DOMSource(subDocument));
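For readers who want something they can run directly, the two-stage procedure can be sketched end to end as follows. This is only an illustration under stated assumptions: the schema bodies, the `level` element, the namespace-to-schema map, and all class and method names are hypothetical stand-ins for the helpers (`getSpecialNode`, `parseAndDetermineSchema`) that are not shown in the snippet above, and the schemas are kept in strings instead of files.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class TwoStageValidationDemo {
    static final String ALPHA_NS = "http://www.example.org/bar/Alpha";

    // Hypothetical main schema with a lax wildcard inside bar-config.
    static final String FOO_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='foo'><xs:complexType><xs:sequence>"
      + "<xs:element name='bar' type='xs:string'/>"
      + "<xs:element name='bar-config'><xs:complexType><xs:sequence>"
      + "<xs:any minOccurs='0' maxOccurs='1' processContents='lax' namespace='##any'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Hypothetical schema for the Alpha namespace.
    static final String ALPHA_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
      + " targetNamespace='" + ALPHA_NS + "' elementFormDefault='qualified'>"
      + "<xs:element name='config'><xs:complexType><xs:sequence>"
      + "<xs:element name='level' type='xs:int'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    static String doc(String level) {
        return "<foo><bar>Alpha</bar><bar-config>"
             + "<config xmlns='" + ALPHA_NS + "'><level>" + level + "</level></config>"
             + "</bar-config></foo>";
    }

    // Returns the first element child of the given node, or null if there is none.
    static Element firstElementChild(Node parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                return (Element) n;
            }
        }
        return null;
    }

    static boolean validateTwoStage(String xml) throws Exception {
        // Assumed lookup table: namespace of the config element -> schema text.
        Map<String, String> schemaByNamespace = new HashMap<>();
        schemaByNamespace.put(ALPHA_NS, ALPHA_XSD);

        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // required for schema validation of DOM trees
        DocumentBuilder builder = dbf.newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));

        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            // Stage 1: validate the whole document against the main schema only.
            sf.newSchema(new StreamSource(new StringReader(FOO_XSD)))
              .newValidator().validate(new DOMSource(document));

            // Stage 2: pick the sub-schema from the namespace of bar-config's child.
            Element barConfig = (Element) document.getElementsByTagName("bar-config").item(0);
            Element config = firstElementChild(barConfig);
            if (config == null) {
                return true; // the wildcard is optional, so nothing more to check
            }
            String xsd = schemaByNamespace.get(config.getNamespaceURI());
            if (xsd == null) {
                return false; // no schema registered for this namespace
            }
            Document sub = builder.newDocument();
            sub.appendChild(sub.importNode(config, true));
            sf.newSchema(new StreamSource(new StringReader(xsd)))
              .newValidator().validate(new DOMSource(sub));
            return true;
        } catch (SAXException e) {
            return false; // a validation error in either stage
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(validateTwoStage(doc("3")));
        System.out.println(validateTwoStage(doc("high")));
    }
}
```

Selecting the sub-schema by the namespace URI of the config element is one possible design; the original code may instead use the value of the bar element, which the snippet above leaves unspecified.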