Validating XML against an XSD in Java / getting hold of the schemaLocation

2022-01-09 00:00:00 xml-parsing xsd java xml-validation

How can one validate an XML file against an XSD in Java when the schema is not known in advance? I would like to get the schemaLocation, download the XSD, cache it, and then perform the actual validation.

The problem is that with the javax.xml.parsers.DocumentBuilder/DocumentBuilderFactory classes I can't seem to get hold of the schemaLocation in advance. What's the trick for this? Which classes should I look into?

Perhaps there's a more suitable API I can use? The whole problem is that we need to validate dynamically, without (necessarily) having the XSDs locally.

How could one get hold of the URL of the schemaLocation defined in the XSD file?

I know you can set features/attributes, but that's a different thing. I need to get the schemaLocation from the XSD first.

Please advise!

Recommended answer

Given that you are using Xerces (or the JDK default), have you tried setting this feature to true on the factory: http://apache.org/xml/features/validation/schema? There are other schema-related features you can experiment with: http://xerces.apache.org/xerces2-j/features.html
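As a rough illustration of the feature mentioned above, the following sketch turns it on for a DocumentBuilderFactory and lets the parser follow the location hint in the instance document by itself. The class name SchemaHintValidation, the order/qty vocabulary, and the temporary file that stands in for a remote XSD are all assumptions made for this example; it also assumes the JDK's built-in (Xerces-derived) parser with default external-access settings.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical demo class, not part of the original answer.
class SchemaHintValidation {

    /** Parses xml with XSD validation enabled and returns the reported problems. */
    static List<String> validate(String xml) {
        List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // XSD validation needs namespace awareness
            factory.setValidating(true);
            // Xerces feature: validate against the schema named by the document's hints.
            factory.setFeature("http://apache.org/xml/features/validation/schema", true);

            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
                @Override public void error(SAXParseException e) { problems.add(e.getMessage()); }
                @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            problems.add(e.toString());
        }
        return problems;
    }

    /** Writes a tiny schema to a temp file and returns its URI (stand-in for a remote XSD). */
    static String writeDemoSchema() {
        String xsd = "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
                + "<xs:element name=\"order\"><xs:complexType><xs:sequence>"
                + "<xs:element name=\"qty\" type=\"xs:int\"/>"
                + "</xs:sequence></xs:complexType></xs:element></xs:schema>";
        try {
            Path file = Files.createTempFile("order", ".xsd");
            Files.write(file, xsd.getBytes(StandardCharsets.UTF_8));
            file.toFile().deleteOnExit();
            return file.toUri().toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String location = writeDemoSchema();
        String xml = "<order xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\""
                + " xsi:noNamespaceSchemaLocation=\"" + location + "\"><qty>3</qty></order>";
        System.out.println("Problems: " + validate(xml));
    }
}
```

Note that in this setup the parser fetches the schema on its own, so there is no hook for caching; that is what a resource resolver is for.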

UPDATE 2 (for caching):

Implement an org.w3c.dom.ls.LSResourceResolver and set it on the SchemaFactory using the setResourceResolver method. This resolver would either get the schema from the cache or fetch it from wherever the location refers to.
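One possible shape for such a caching resolver is sketched below. It is only an illustration: the class CachingSchemaResolver, its fetcher function (which maps an absolute URI to schema text and would wrap an HTTP download in real use), the fetchCount and isValid helpers, and the example.invalid URLs are all hypothetical. The resolver is consulted while SchemaFactory compiles a schema, for example for xs:include and xs:import references.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;
import org.xml.sax.SAXException;

// Hypothetical resolver: each absolute schema URI is fetched at most once.
class CachingSchemaResolver implements LSResourceResolver {

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> fetcher; // absolute URI -> schema text, or null
    private final AtomicInteger fetches = new AtomicInteger();
    private final DOMImplementationLS ls;

    CachingSchemaResolver(Function<String, String> fetcher) {
        this.fetcher = fetcher;
        try {
            this.ls = (DOMImplementationLS) DOMImplementationRegistry.newInstance()
                    .getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No DOM Load/Save implementation available", e);
        }
    }

    @Override
    public LSInput resolveResource(String type, String namespaceURI,
            String publicId, String systemId, String baseURI) {
        if (systemId == null) {
            return null; // nothing to look up; fall back to default resolution
        }
        // Make the location absolute so the same schema always maps to one cache key.
        String key = baseURI == null ? systemId : URI.create(baseURI).resolve(systemId).toString();
        String text = cache.computeIfAbsent(key, k -> {
            fetches.incrementAndGet();
            return fetcher.apply(k);
        });
        if (text == null) {
            return null; // unknown to the fetcher; fall back to default resolution
        }
        LSInput input = ls.createLSInput();
        input.setStringData(text);
        input.setSystemId(key);
        return input;
    }

    int fetchCount() {
        return fetches.get();
    }

    /** Compiles the schema through this resolver and validates xml; false on any schema or document error. */
    boolean isValid(String schemaText, String schemaUri, String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            factory.setResourceResolver(this);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(schemaText), schemaUri));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String typesXsd = "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
                + "<xs:complexType name=\"OrderType\"><xs:sequence>"
                + "<xs:element name=\"qty\" type=\"xs:int\"/>"
                + "</xs:sequence></xs:complexType></xs:schema>";
        String mainXsd = "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
                + "<xs:include schemaLocation=\"types.xsd\"/>"
                + "<xs:element name=\"order\" type=\"OrderType\"/></xs:schema>";
        // In-memory stand-in for the remote server.
        Map<String, String> remote = Map.of("http://example.invalid/schemas/types.xsd", typesXsd);
        CachingSchemaResolver resolver = new CachingSchemaResolver(remote::get);
        String mainUri = "http://example.invalid/schemas/main.xsd";
        System.out.println(resolver.isValid(mainXsd, mainUri, "<order><qty>3</qty></order>"));
        System.out.println(resolver.isValid(mainXsd, mainUri, "<order><qty>many</qty></order>"));
        System.out.println("fetches: " + resolver.fetchCount());
    }
}
```

Keying the cache on the resolved absolute URI rather than the raw schemaLocation value is a design choice made here so that relative references from different base documents do not collide.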

UPDATE 3:

LSResourceResolver example (which I think will be a good starting point for you):

import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.Map;

import javax.xml.XMLConstants;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;

/**
 * Resolves schema resources against a base URL, using a map from
 * namespace URI to a location relative to that base.
 */
public class URLBasedResourceResolver implements LSResourceResolver {

    private static final Logger log = LoggerFactory
            .getLogger(URLBasedResourceResolver.class);

    private final URI base;

    private final Map<URI, String> nsmap;

    public URLBasedResourceResolver(URL base, Map<URI, String> nsmap)
            throws URISyntaxException {
        this.base = base.toURI();
        this.nsmap = nsmap;
    }

    @Override
    public LSInput resolveResource(String type, String namespaceURI,
            String publicId, String systemId, String baseURI) {
        log.debug("Resolve: type={}, ns={}, publicId={}, systemId={}, baseUri={}.",
                type, namespaceURI, publicId, systemId, baseURI);
        // Only XML Schema lookups that carry a namespace are handled here.
        if (XMLConstants.W3C_XML_SCHEMA_NS_URI.equals(type) && namespaceURI != null) {
            try {
                URI ns = new URI(namespaceURI);
                if (nsmap.containsKey(ns)) {
                    return new MyLSInput(base.resolve(nsmap.get(ns)));
                }
            } catch (URISyntaxException e) {
                // The namespace is not a valid URI: fall through to default resolution.
            }
        }
        // Returning null leaves the resource to the parser's default resolution.
        return null;
    }

}

The implementation of MyLSInput is really boring:

import java.io.InputStream;
import java.io.Reader;
import java.net.URI;

import org.w3c.dom.ls.LSInput;

/**
 * Minimal LSInput that exposes only a system ID. It supplies no stream or
 * string data, so the parser is expected to open the resource at that URI itself.
 */
class MyLSInput implements LSInput {

    private final URI url;

    public MyLSInput(URI url) {
        this.url = url;
    }

    @Override
    public Reader getCharacterStream() {
        return null;
    }

    @Override
    public void setCharacterStream(Reader characterStream) {
    }

    @Override
    public InputStream getByteStream() {
        return null;
    }

    @Override
    public void setByteStream(InputStream byteStream) {
    }

    @Override
    public String getStringData() {
        return null;
    }

    @Override
    public void setStringData(String stringData) {
    }

    // The only value this input actually carries.
    @Override
    public String getSystemId() {
        return url.toASCIIString();
    }

    @Override
    public void setSystemId(String systemId) {
    }

    @Override
    public String getPublicId() {
        return null;
    }

    @Override
    public void setPublicId(String publicId) {
    }

    @Override
    public String getBaseURI() {
        return null;
    }

    @Override
    public void setBaseURI(String baseURI) {
    }

    @Override
    public String getEncoding() {
        return null;
    }

    @Override
    public void setEncoding(String encoding) {
    }

    @Override
    public boolean getCertifiedText() {
        return false;
    }

    @Override
    public void setCertifiedText(boolean certifiedText) {
    }

}
