Parsing a list of XML fragments with no root element from a stream input
Is it feasible in Java, using the SAX API, to parse a list of XML fragments with no root element from a stream input?
I tried parsing such input but got a
org.xml.sax.SAXParseException: The markup in the document following the root element must be well-formed.
before the endDocument event was even fired.
I would rather not settle for obvious but clumsy solutions such as "prepend a custom root element" or "use buffered fragment parsing".
I am using the standard SAX API of Java 1.6. The SAX factory had setValidating(false), in case anyone wondered.
Recommended answer
First, and most important of all, the content you are parsing is not an XML document. From the XML specification:
[Definition: There is exactly one element, called the root, or document element, no part of which appears in the content of any other element.]
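A minimal sketch of what that definition means in practice, assuming the default JAXP parser that ships with the JDK: a string with a single top-level element parses cleanly, while two sibling top-level elements are rejected with a fatal error. The class and method names (RootCheck, isWellFormedDocument) are hypothetical and exist only for this illustration.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class RootCheck {
    // Hypothetical helper: returns true if the SAX parser accepts the text as a document.
    static boolean isWellFormedDocument(String xml) throws Exception {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // Fatal well-formedness errors surface as SAXParseException, a SAXException subclass.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isWellFormedDocument("<a><b/></a>")); // one root element: true
        System.out.println(isWellFormedDocument("<a/><b/>"));    // two top-level elements: false
    }
}
```

The parser may also log the fatal error to standard error before the exception is caught; the exact message text depends on the parser implementation.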
Now, as to parsing this with SAX: in spite of what you said about clumsiness, I'd suggest the following approach:
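import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;

// Sandwich the fragment stream between a synthetic opening and closing tag.
Enumeration<InputStream> streams = Collections.enumeration(
    Arrays.<InputStream>asList(
        new ByteArrayInputStream("<root>".getBytes()),
        yourXmlLikeStream,
        new ByteArrayInputStream("</root>".getBytes())));
SequenceInputStream seqStream = new SequenceInputStream(streams);
// Now pass the `seqStream` into the SAX parser.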
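Using SequenceInputStream is a convenient way to concatenate multiple input streams into a single stream. The streams are read in the order in which they are passed to the constructor (or, in this case, returned by the Enumeration), and nothing is buffered: the fragments are still consumed as a stream.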
Pass it to your SAX parser, and you are done.
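For completeness, here is one way the whole approach might look end to end. This is a sketch, not a definitive implementation: the class name FragmentParser, the method parseFragments, and the wrapper tag name "root" are all hypothetical choices for the example. The handler tracks nesting depth so that elements directly under the synthetic wrapper can be recognised as the top-level fragments.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class FragmentParser {
    // Assumed wrapper name; pick one that cannot clash with real fragment names.
    static final String WRAPPER = "root";

    // Returns the names of the top-level fragment elements, in document order.
    static List<String> parseFragments(InputStream fragments) throws Exception {
        InputStream wrapped = new SequenceInputStream(Collections.enumeration(
                Arrays.<InputStream>asList(
                        new ByteArrayInputStream(("<" + WRAPPER + ">").getBytes("UTF-8")),
                        fragments,
                        new ByteArrayInputStream(("</" + WRAPPER + ">").getBytes("UTF-8")))));

        final List<String> fragmentRoots = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0; // 0 = outside the wrapper, 1 = directly inside it

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (depth == 1) {
                    fragmentRoots.add(qName); // a new top-level fragment begins here
                }
                depth++;
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                depth--;
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(wrapped, handler);
        return fragmentRoots;
    }

    public static void main(String[] args) throws Exception {
        InputStream in = new ByteArrayInputStream("<a>1</a><b><c/></b><a/>".getBytes("UTF-8"));
        System.out.println(parseFragments(in)); // prints [a, b, a]
    }
}
```

The sketch assumes the fragment stream carries no XML declaration or byte order mark of its own, since either would be illegal after the injected opening tag.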