S3 Java client fails a lot with "Premature end of Content-Length delimited message body" or "java.net.SocketException: Socket closed"

I have an application that does a lot of work on S3, mostly downloading files from it. I am seeing a lot of these kinds of errors, and I'd like to know whether this is a problem in my code or whether the service really is this unreliable.

The code I'm using to read from the S3 object stream is as follows:

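import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public static void write(InputStream stream, OutputStream output) {
  byte[] buffer = new byte[1024];
  int read;
  // try-with-resources closes both streams even if the copy fails part-way
  try (InputStream in = stream; OutputStream out = output) {
    while ((read = in.read(buffer)) != -1) {
      out.write(buffer, 0, read);
    }
    out.flush();
  } catch (IOException e) {
    throw new RuntimeException(e);
  }
}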

This OutputStream is a new BufferedOutputStream(new FileOutputStream(file)). I am using the latest version of the Amazon S3 Java client, and this call is retried four times before giving up. Even after four attempts, it still fails.

Any hints or tips on how I could possibly improve this are appreciated.

Recommended answer

I just managed to overcome a very similar problem. In my case the exception I was getting was identical; it happened for larger files but not for small files, and it never happened at all while stepping through the debugger.

The root cause of the problem was that the AmazonS3Client object was getting garbage collected in the middle of the download, which caused the network connection to break. This happened because I was constructing a new AmazonS3Client object with every call to load a file, while the preferred use case is to create a long-lasting client object that survives across calls - or at least is guaranteed to be around during the entirety of the download. So, the simple remedy is to make sure a reference to the AmazonS3Client is kept around so that it doesn't get GC'd.
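
To illustrate the idea of keeping one long-lived client, here is a minimal sketch. It deliberately does not use the AWS SDK: `ObjectClient` and `Downloader` are hypothetical names invented for this example, with `ObjectClient` standing in for the AmazonS3Client. The point is only that the client is stored in a field of an object that outlives each download, rather than being constructed inside the download call.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the S3 client (an assumption for this sketch,
// not an AWS SDK type). In real code this role is played by AmazonS3Client.
interface ObjectClient {
    InputStream open(String key);
}

// Holds a single client for as long as the Downloader itself is alive,
// instead of creating a new client on every download call.
final class Downloader {
    // Strong reference: the client stays reachable while this Downloader is reachable.
    private final ObjectClient client;

    Downloader(ObjectClient client) {
        this.client = client;
    }

    // Copies the object identified by key into output, closing both streams afterwards.
    void download(String key, OutputStream output) {
        byte[] buffer = new byte[8192];
        try (InputStream in = client.open(key); OutputStream out = output) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In an application, a single `Downloader` (or the client itself) would typically be created once at startup and shared, so that a reference to the client is held for the whole duration of every download.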

A link on the AWS forums that helped me is here: https://forums.aws.amazon.com/thread.jspa?threadID=83326
