What does "end of stream" mean when working with sockets?

2022-01-19 00:00:00 sockets network-programming java

When working with sockets in Java, how can you tell whether the client has finished sending all of its (binary) data before you start processing it? Consider, for example:

    istream = new BufferedInputStream(socket.getInputStream());
    ostream = new BufferedOutputStream(socket.getOutputStream());
    byte[] buffer = new byte[BUFFER_SIZE];
    int count;
    while (istream.available() > 0 && (count = istream.read(buffer)) != -1) {
        // do something..
    }
    // assuming all input has been read
    ostream.write(getResponse());
    ostream.flush();

I've read similar posts on SO such as this, but couldn't find a conclusive answer. While my solution above works, my understanding is that you can never really tell whether the client has finished sending all of its data. If, for instance, the client sends a few chunks of data and then blocks waiting on another data source before it can send more, the code above may well assume that the client has finished, since istream.available() will return 0 at that moment.

Recommended answer

Yes, you're right - using available() like this is unreliable. Personally I very rarely use available(). If you want to read until you reach the end of the stream (as per the question title), keep calling read() until it returns -1. That's the easy bit. The hard bit is if you don't want the end of the stream, but the end of "what the server wants to send you at the moment."
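For the easy case, a minimal sketch of "read until read() returns -1" might look like the following. The class and method names are made up for illustration, and a ByteArrayInputStream stands in for socket.getInputStream(). On a real socket the loop only ends once the peer closes the connection or shuts down its output side (for example with Socket.shutdownOutput()), so it is no help if the client expects a reply while keeping the stream open.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadToEnd {
    // Collects everything until read() reports end of stream (-1).
    // No call to available(): read() simply blocks until data arrives
    // or the stream ends.
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int count;
        while ((count = in.read(buffer)) != -1) {
            collected.write(buffer, 0, count);
        }
        return collected.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for socket.getInputStream() (an assumption for this demo).
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5});
        System.out.println(readFully(in).length); // prints 5
    }
}
```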

As the others have said, if you need to have a conversation over a socket, you must make the protocol explain where the data finishes. Personally I prefer the "length prefix" solution to the "end of message token" solution where it's possible - it generally makes the reading code a lot simpler. However, it can make the writing code harder, as you need to work out the length before you send anything. This is a pain if you could be sending a lot of data.
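To make the length-prefix idea concrete, here is a rough sketch using DataOutputStream and DataInputStream. The 4-byte big-endian prefix, the MAX_MESSAGE_BYTES cap, and the class and method names are all assumptions chosen for the example, not part of any particular protocol. In-memory streams stand in for the socket's streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class LengthPrefixedFraming {
    // Hypothetical cap so a corrupt length cannot force a huge allocation.
    static final int MAX_MESSAGE_BYTES = 1 << 20;

    // Writer side: the length must be known before anything is sent.
    static void writeMessage(DataOutputStream out, byte[] payload) throws IOException {
        out.writeInt(payload.length); // 4-byte big-endian length prefix
        out.write(payload);
        out.flush();
    }

    // Reader side: read the prefix, then exactly that many bytes.
    static byte[] readMessage(DataInputStream in) throws IOException {
        int length = in.readInt(); // EOFException if the stream ends here
        if (length < 0 || length > MAX_MESSAGE_BYTES) {
            throw new IOException("Invalid message length: " + length);
        }
        byte[] payload = new byte[length];
        in.readFully(payload); // blocks until all 'length' bytes have arrived
        return payload;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);
        writeMessage(out, "first".getBytes(StandardCharsets.UTF_8));
        writeMessage(out, new byte[] {9, 8, 7});

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(new String(readMessage(in), StandardCharsets.UTF_8)); // prints first
        System.out.println(readMessage(in).length); // prints 3
    }
}
```

The reading code never has to scan for a terminator, which is where the simplicity comes from. The cost sits on the writing side, which has to hold the whole payload (or at least know its size) before sending the prefix.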

Of course, you can mix and match solutions - in particular, if your protocol deals with both text and binary data, I would strongly recommend length-prefixing strings rather than null-terminating them (or anything similar). Decoding string data tends to be a lot easier if you can pass the decoder a complete array of bytes and just get a string back - you don't need to worry about reading to half way through a character, for example. You could use this as part of your protocol but still have overall "records" (or whatever you're transmitting) with an "end of data" record to let the reader process the data and respond.
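One way the mixed approach could look, purely as an illustration: each record carries a type byte, text records hold a length-prefixed UTF-8 string, and a dedicated end-of-data record tells the reader it can process the batch and respond. The record types, the size cap, and every name below are invented for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class RecordStream {
    // Hypothetical record types and size cap for this example protocol.
    static final int END_OF_DATA = 0;
    static final int TEXT_RECORD = 1;
    static final int MAX_TEXT_BYTES = 1 << 16;

    static void writeText(DataOutputStream out, String text) throws IOException {
        byte[] encoded = text.getBytes(StandardCharsets.UTF_8);
        out.writeByte(TEXT_RECORD);
        out.writeInt(encoded.length); // length in bytes, not characters
        out.write(encoded);
    }

    static void writeEndOfData(DataOutputStream out) throws IOException {
        out.writeByte(END_OF_DATA);
        out.flush();
    }

    // Reads records until the end-of-data record, then returns the batch.
    static List<String> readBatch(DataInputStream in) throws IOException {
        List<String> texts = new ArrayList<>();
        while (true) {
            int type = in.readUnsignedByte();
            if (type == END_OF_DATA) {
                return texts;
            }
            if (type != TEXT_RECORD) {
                throw new IOException("Unknown record type: " + type);
            }
            int length = in.readInt();
            if (length < 0 || length > MAX_TEXT_BYTES) {
                throw new IOException("Invalid text length: " + length);
            }
            byte[] encoded = new byte[length];
            in.readFully(encoded);
            // The decoder sees the complete byte array, so a multi-byte
            // character can never be split across two decode calls.
            texts.add(new String(encoded, StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);
        writeText(out, "h\u00e9llo");
        writeText(out, "world");
        writeEndOfData(out);

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(readBatch(in).size()); // prints 2
    }
}
```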

Of course, all of this protocol design stuff is moot if you're not in control of the protocol :(
