Out of memory when encoding a file to Base64

2022-01-21 00:00:00 base64 java

I am using Base64 from Apache Commons Codec:

public byte[] encode(File file) throws FileNotFoundException, IOException {
        byte[] encoded;
        try (FileInputStream fin = new FileInputStream(file)) {
            byte fileContent[] = new byte[(int) file.length()];
            fin.read(fileContent);
            encoded = Base64.encodeBase64(fileContent);
        }
        return encoded;   
}


Exception in thread "AWT-EventQueue-0" java.lang.OutOfMemoryError: Java heap space
    at org.apache.commons.codec.binary.BaseNCodec.encode(BaseNCodec.java:342)
    at org.apache.commons.codec.binary.Base64.encodeBase64(Base64.java:657)
    at org.apache.commons.codec.binary.Base64.encodeBase64(Base64.java:622)
    at org.apache.commons.codec.binary.Base64.encodeBase64(Base64.java:604)

I'm making a small app for a mobile device.

Recommended answer

You cannot just load the whole file into memory, as is done here:

byte fileContent[] = new byte[(int) file.length()];
fin.read(fileContent);

Instead, load the file chunk by chunk and encode it in parts. Base64 is a simple encoding: every 3 input bytes become 4 output characters, so it is enough to load and encode 3 bytes at a time. For performance reasons, load a larger multiple of 3 bytes per chunk, e.g. 3000 bytes, which should be just fine. The chunk size has to be a multiple of 3 so that padding appears only at the very end of the output. Also consider buffering the input file.
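The multiple-of-3 rule can be checked with a small sketch. To stay self-contained it uses the JDK's own java.util.Base64 (Java 8+) rather than Commons Codec, and the class and method names (ChunkedBase64Demo, encodeInChunks) are made up for illustration. It encodes the same bytes chunk by chunk and compares the concatenated result with a single whole-array encoding.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

final class ChunkedBase64Demo {

    // Encodes data in pieces of chunkSize bytes and concatenates the encoded pieces.
    static String encodeInChunks(byte[] data, int chunkSize) {
        Base64.Encoder encoder = Base64.getEncoder();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int offset = 0; offset < data.length; offset += chunkSize) {
            int end = Math.min(offset + chunkSize, data.length);
            byte[] encoded = encoder.encode(Arrays.copyOfRange(data, offset, end));
            out.write(encoded, 0, encoded.length);
        }
        return new String(out.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] data = "Hello, Base64!".getBytes(StandardCharsets.US_ASCII); // 14 bytes
        String whole = Base64.getEncoder().encodeToString(data);

        // Chunk size 3 (a multiple of 3): identical to encoding everything at once.
        System.out.println(encodeInChunks(data, 3).equals(whole)); // prints "true"
        // Chunk size 4 (not a multiple of 3): padding lands in the middle of the output.
        System.out.println(encodeInChunks(data, 4).equals(whole)); // prints "false"
    }
}
```

With a chunk size that is not a multiple of 3, every chunk ends in its own = padding, so the concatenation is no longer a valid encoding of the whole file.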

An example:

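byte[] chunk = new byte[3000]; // a multiple of 3, so only the last chunk is padded
try (InputStream in = new BufferedInputStream(new FileInputStream(file))) {
    int n;
    // readNBytes (Java 9+) fills the whole chunk unless the end of the file is reached
    while ((n = in.readNBytes(chunk, 0, chunk.length)) > 0) {
        byte[] encoded = Base64.encodeBase64(Arrays.copyOf(chunk, n)); // encode only the bytes read
        // write 'encoded' to its destination here instead of accumulating it
    }
}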

Note that you cannot simply append the results of Base64.encodeBase64() to the encoded byte array. Actually, it is not loading the file but encoding it to Base64 that causes the out-of-memory problem. This is understandable, because the Base64 version is about a third larger than the input (and the file itself already occupies a lot of memory).
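To put numbers on this, here is a rough sketch of the arithmetic. Padded Base64 without line breaks produces 4 output bytes for every started group of 3 input bytes. The class name Base64Size and the 30 MiB file size are hypothetical, chosen only for illustration.

```java
final class Base64Size {

    // Length of padded Base64 output (no line breaks) for n input bytes: 4 * ceil(n / 3).
    static long encodedLength(long n) {
        return 4 * ((n + 2) / 3);
    }

    public static void main(String[] args) {
        long fileSize = 30L * 1024 * 1024; // hypothetical 30 MiB file
        long encoded = encodedLength(fileSize);
        System.out.println("raw:            " + fileSize + " bytes");
        System.out.println("encoded:        " + encoded + " bytes");
        System.out.println("both in memory: " + (fileSize + encoded) + " bytes");
    }
}
```

For the 30 MiB input this gives 40 MiB of encoded data, so holding both arrays at once needs at least 70 MiB of heap, before counting any working buffers the encoder itself may allocate.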

Consider changing your method to:

public void encode(File file, OutputStream base64OutputStream)

and sending the Base64-encoded data directly to base64OutputStream rather than returning it.

UPDATE: Thanks to @StephenC I developed a much simpler version:

public void encode(File file, OutputStream base64OutputStream) throws IOException {
  // try-with-resources closes both streams, even if copying fails
  try (InputStream is = new FileInputStream(file);
       OutputStream out = new Base64OutputStream(base64OutputStream)) {
    IOUtils.copy(is, out);
  }
}

It uses Base64OutputStream, which converts its input to Base64 on the fly, and the IOUtils class from Apache Commons IO.

Note: both the FileInputStream and the Base64OutputStream must be closed. Closing the Base64OutputStream is what writes the final = padding, if any is required. Buffering is handled by IOUtils.copy().
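If adding Commons IO is not an option, the same streaming idea can be sketched with the JDK alone, assuming Java 8 or later is available on the target platform. This is a swapped-in alternative, not the Commons Codec approach above: it relies on Base64.getEncoder().wrap() and Files.copy(), and the class name StreamingBase64 and the temporary sample file are made up for the demonstration.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Base64;

final class StreamingBase64 {

    // Streams the file through a Base64-encoding wrapper instead of building arrays.
    static void encode(File file, OutputStream destination) throws IOException {
        // Closing the wrapper writes the final padding and also closes 'destination'.
        try (OutputStream out = Base64.getEncoder().wrap(destination)) {
            Files.copy(file.toPath(), out);
        }
    }

    public static void main(String[] args) throws IOException {
        File input = File.createTempFile("base64-demo", ".bin"); // hypothetical sample file
        input.deleteOnExit();
        Files.write(input.toPath(), "hello".getBytes(StandardCharsets.US_ASCII));

        // An in-memory sink is used only to show the result of this tiny demo.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        encode(input, sink);
        System.out.println(sink.toString("US-ASCII")); // prints "aGVsbG8="
    }
}
```

In a real application the destination would be a FileOutputStream or a network stream rather than a ByteArrayOutputStream, otherwise the encoded data ends up in memory again.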
