Java, XML DocumentBuilder - setting the encoding when parsing

2022-01-10 00:00:00 xml xml-parsing encoding domdocument java

I'm trying to save a tree (a class that extends JTree) which holds an XML document to a DOM object, after having changed its structure.

I have created a new document object, traversed the tree to retrieve its contents successfully (including the original encoding of the XML document), and now have a ByteArrayInputStream that holds the tree contents (the XML document) in the correct encoding.

The problem is that when I parse the ByteArrayInputStream, the encoding is automatically changed to UTF-8 (in the XML document).

Is there a way to prevent this and use the correct encoding as provided in the ByteArrayInputStream?

It's also worth adding that I have already used the
transformer.setOutputProperty(OutputKeys.ENCODING, encoding) method to get the right encoding.

Any help would be greatly appreciated.

Recommended answer

Here's an updated answer, since OutputFormat is deprecated:

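import java.io.StringWriter;
import javax.xml.transform.*;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

// First part: configure the transformer with the desired output encoding
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");

// Second part: serialize the DOM document into a String and strip line breaks
StringWriter writer = new StringWriter();
transformer.transform(new DOMSource(document), new StreamResult(writer));
String output = writer.getBuffer().toString().replaceAll("\n|\r", "");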

The second part (the StringWriter block) returns the XML document as a String.
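
For the byte-level case in the question, a round trip might look roughly like the sketch below. It assumes the input carries an XML declaration naming its encoding, and it writes to a ByteArrayOutputStream rather than a StringWriter so that the transformer performs the character encoding itself. The class name EncodingRoundTrip, the helper names parse and serialize, and the sample document are made up for illustration; they are not from the original post.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper class, for illustration only.
public class EncodingRoundTrip {

    /** Parses raw XML bytes; the parser picks up the encoding from the XML declaration. */
    static Document parse(byte[] xmlBytes) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Serializes the document to bytes, asking the transformer for the given encoding. */
    static byte[] serialize(Document document, String encoding) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, encoding);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            // A byte stream (not a Writer) lets the transformer encode the characters itself.
            transformer.transform(new DOMSource(document), new StreamResult(out));
            return out.toByteArray();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not serialize XML", e);
        }
    }

    public static void main(String[] args) {
        // Sample input (an assumption): a small ISO-8859-1 document with one non-ASCII character.
        byte[] input = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><root>caf\u00e9</root>"
                .getBytes(StandardCharsets.ISO_8859_1);

        Document document = parse(input);

        // getXmlEncoding() returns the encoding from the XML declaration, or null if none was given.
        String declared = document.getXmlEncoding();
        String encoding = (declared != null) ? declared : "UTF-8";

        byte[] output = serialize(document, encoding);
        System.out.println(new String(output, Charset.forName(encoding)));
    }
}
```

Note that a StringWriter holds characters, not bytes, so with the String-based version the ENCODING property only affects what the XML declaration says; the actual byte encoding is decided later, wherever the String is written out.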
