Is Java 8 java.util.Base64 a drop-in replacement for sun.misc.BASE64?

2022-01-21 00:00:00 encoding mime base64 java-8 java

Question

Are the Java 8 java.util.Base64 MIME Encoder and Decoder a drop-in replacement for the unsupported, internal Java API sun.misc.BASE64Encoder and sun.misc.BASE64Decoder?

EDIT (Clarification): By drop-in replacement I mean that I can switch legacy code that uses sun.misc.BASE64Encoder and sun.misc.BASE64Decoder to the Java 8 MIME Base64 Encoder/Decoder, and the change stays transparent to any other existing client code.

My thoughts and reasoning so far

Based on my investigation and quick tests (see code below), it should be a drop-in replacement because:

  • sun.misc.BASE64Encoder, according to its JavaDoc, is a BASE64 character encoder as specified in RFC 1521. This RFC is part of the MIME specification...
  • java.util.Base64, according to its JavaDoc, uses "The Base64 Alphabet" as specified in Table 1 of RFC 2045 for encoding and decoding operations... under MIME

Assuming there are no significant changes between RFC 1521 and RFC 2045 (I could not find any), and based on my quick test, using the Java 8 Base64 MIME Encoder/Decoder should be fine.

What I am looking for

  • an authoritative source confirming or disproving the "drop-in replacement" point, OR
  • a counterexample showing a case where java.util.Base64 behaves differently from the sun.misc.BASE64Encoder (or BASE64Decoder) implementation in OpenJDK Java 8 (8u40-b25), OR
  • anything else you think answers the question above definitively

For reference

My test code

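import java.io.IOException;
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.Objects;

public class Base64EncodingDecodingRoundTripTest {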

    public static void main(String[] args) throws IOException {
        // The backslash and the double quote inside the literal must be escaped.
        String test1 = " ~!@#$%^& *()_+=`| }{[]\\;: \"?><,./ ";
        String test2 = test1 + test1;

        encodeDecode(test1);
        encodeDecode(test2);
    }

    static void encodeDecode(final String testInputString) throws IOException {
        sun.misc.BASE64Encoder unsupportedEncoder = new sun.misc.BASE64Encoder();
        sun.misc.BASE64Decoder unsupportedDecoder = new sun.misc.BASE64Decoder();

        Base64.Encoder mimeEncoder = java.util.Base64.getMimeEncoder();
        Base64.Decoder mimeDecoder = java.util.Base64.getMimeDecoder();

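        // Encode with an explicit charset so it matches the UTF-8 used when decoding below.
        String sunEncoded = unsupportedEncoder.encode(testInputString.getBytes(Charset.forName("UTF-8")));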
        System.out.println("sun.misc encoded: " + sunEncoded);

        String mimeEncoded = mimeEncoder.encodeToString(testInputString.getBytes(Charset.forName("UTF-8")));
        System.out.println("Java 8 Base64 MIME encoded: " + mimeEncoded);

        byte[] mimeDecoded = mimeDecoder.decode(sunEncoded);
        String mimeDecodedString = new String(mimeDecoded, Charset.forName("UTF-8"));

        byte[] sunDecoded = unsupportedDecoder.decodeBuffer(mimeEncoded); // throws IOException
        String sunDecodedString = new String(sunDecoded, Charset.forName("UTF-8"));

        System.out.println(String.format("sun.misc decoded: %s | Java 8 Base64 decoded:  %s", sunDecodedString, mimeDecodedString));

        System.out.println("Decoded results are both equal: " + Objects.equals(sunDecodedString, mimeDecodedString));
        System.out.println("Mime decoded result is equal to test input string: " + Objects.equals(testInputString, mimeDecodedString));
        System.out.println("\n");
    }
}

Recommended answer

Here's a small test program that illustrates a difference in the encoded strings:

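import java.nio.charset.StandardCharsets;

public class EncoderDifference {
    public static void main(String[] args) {
        // 57 input bytes encode to exactly one full 76-character line.
        byte[] bytes = new byte[57];
        String enc1 = new sun.misc.BASE64Encoder().encode(bytes);
        String enc2 = new String(java.util.Base64.getMimeEncoder().encode(bytes),
                                 StandardCharsets.UTF_8);

        System.out.println("enc1 = <" + enc1 + ">");
        System.out.println("enc2 = <" + enc2 + ">");
        System.out.println(enc1.equals(enc2));
    }
}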

Its output is:

enc1 = <AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
>
enc2 = <AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA>
false

Note that the encoded output of sun.misc.BASE64Encoder has a newline at the end. It doesn't always append a newline, but it happens to do so if the encoded string has exactly 76 characters on its last line. (The author of java.util.Base64 considered this to be a small bug in the sun.misc.BASE64Encoder implementation – see the review thread).
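The Java 8 side of this boundary can be checked without sun.misc at all. The following is a minimal sketch that uses only java.util.Base64, so it also compiles on JDKs where the internal classes are not accessible. The class and method names are made up for illustration. It shows that the MIME encoder emits a 76-character line for 57 input bytes without a trailing line separator, that it inserts "\r\n" only between lines, and that the MIME decoder accepts text carrying an extra trailing newline (here simulated by appending "\n").

```java
import java.util.Arrays;
import java.util.Base64;

public class MimeBoundaryDemo {

    // Encodes n zero bytes with the Java 8 MIME encoder and returns the text.
    static String mimeEncodeZeros(int n) {
        return Base64.getMimeEncoder().encodeToString(new byte[n]);
    }

    // Decodes with the MIME decoder, which skips characters outside the Base64 alphabet.
    static byte[] mimeDecode(String text) {
        return Base64.getMimeDecoder().decode(text);
    }

    public static void main(String[] args) {
        String exact = mimeEncodeZeros(57);   // fills exactly one 76-character line
        String longer = mimeEncodeZeros(58);  // spills onto a second line

        System.out.println("57 bytes -> length " + exact.length()
                + ", ends with newline: " + exact.endsWith("\n"));
        System.out.println("58 bytes -> contains CRLF: " + longer.contains("\r\n"));

        // Simulated legacy-style text with a trailing newline decodes to the same bytes.
        byte[] decoded = mimeDecode(exact + "\n");
        System.out.println("round trip ok: " + Arrays.equals(decoded, new byte[57]));
    }
}
```

So the difference matters for code that compares or stores the encoded text itself, not for code that only decodes it with a MIME decoder.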

This might seem like a triviality, but if you had a program that relied on this specific behavior, switching encoders might result in malformed output. Therefore, I conclude that java.util.Base64 is not a drop-in replacement for sun.misc.BASE64Encoder.

Of course, the intent of java.util.Base64 is that it's a functionally equivalent, RFC-conformant, high-performance, fully supported and specified replacement that's intended to support migration of code away from sun.misc.BASE64Encoder. You need to be aware of some edge cases like this when migrating, though.
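One way to guard against this edge case during a migration, sketched here as an assumption rather than an official recommendation, is to compare Base64 text by its decoded payload instead of by string equality. The helper name samePayload is hypothetical, and the "legacy-style" string is just the Java 8 output with a "\n" appended to stand in for the trailing newline described above.

```java
import java.util.Arrays;
import java.util.Base64;

public class Base64PayloadCompare {

    // Hypothetical helper: true when both texts decode to the same bytes.
    // The MIME decoder ignores line separators and other non-alphabet characters.
    static boolean samePayload(String a, String b) {
        Base64.Decoder decoder = Base64.getMimeDecoder();
        return Arrays.equals(decoder.decode(a), decoder.decode(b));
    }

    public static void main(String[] args) {
        String modern = Base64.getMimeEncoder().encodeToString(new byte[57]);
        String legacyStyle = modern + "\n"; // stand-in for the legacy trailing newline

        System.out.println("strings equal:  " + modern.equals(legacyStyle));
        System.out.println("payloads equal: " + samePayload(modern, legacyStyle));
    }
}
```

Because the MIME decoder is lenient about characters outside the Base64 alphabet, this comparison is deliberately loose. Code that must reproduce the legacy text byte for byte needs a stricter check.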
