Receiving only necessary data with C++ Socket
I'm just trying to get the contents of a page along with its headers, but my 1024-byte buffer seems to be either too large or too small for the last packet of data coming through. I don't want to read too much or too little, if that makes sense. Here's my code. It prints the page just fine, with all the information, but I want to make sure it's correct.
// Build HTTP GET request
std::stringstream ss;
ss << "GET " << url << " HTTP/1.0\r\nHost: " << strHostName << "\r\n\r\n";
std::string req = ss.str();
// Send request
send(hSocket, req.c_str(), strlen(req.c_str()), 0);
// Read from socket into buffer.
do
{
    nReadAmount = read(hSocket, pBuffer, sizeof pBuffer);
    printf("%s", pBuffer);
}
while(nReadAmount != 0);
Recommended answer
The correct way to read an HTTP reply is to read until you have received a full LF-delimited line (some servers use a bare LF even though the official spec says to use CRLF). That first line contains the response code and version. Then keep reading LF-delimited lines, which are the headers, until you encounter a 0-length line, which indicates the end of the headers. After that, you have to analyze the headers to figure out how the remaining data is encoded, so that you know the proper way to read it and how it is terminated. There are several different possibilities; refer to RFC 2616 Section 4.4 for the actual rules.
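As a rough illustration of the line-reading step described above, here is a minimal sketch assuming a blocking POSIX descriptor and plain read(). The names ReadLine and the body of ExtractResponseCode are illustrative assumptions, not code from the question. It reads one byte at a time for simplicity (a real client would buffer) and does not retry on EINTR.

```cpp
#include <sstream>
#include <string>
#include <unistd.h>   // read(), ssize_t

// Hypothetical helper: reads one LF-terminated line from fd into `line`,
// dropping a trailing CR so both CRLF and bare LF are accepted.
// Returns false only if the connection ended (or failed) before any
// byte of a new line arrived.
bool ReadLine(int fd, std::string& line)
{
    line.clear();
    char c;
    while (true) {
        ssize_t n = read(fd, &c, 1);
        if (n <= 0)                    // EOF or error
            return !line.empty();      // report a partial last line, if any
        if (c == '\n')
            break;
        line.push_back(c);
    }
    if (!line.empty() && line.back() == '\r')
        line.pop_back();
    return true;
}

// Hypothetical helper: extracts the numeric code from a status line such
// as "HTTP/1.0 200 OK". Returns -1 if the line does not look like one.
int ExtractResponseCode(const std::string& statusLine)
{
    std::istringstream iss(statusLine);
    std::string version;
    int code = 0;
    if (!(iss >> version >> code))
        return -1;
    return code;
}
```

Calling ReadLine repeatedly until it yields an empty line consumes the status line and all headers, leaving the descriptor positioned at the first byte of the body.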
In other words, your code needs to use this kind of structure instead (pseudo code):
// Send Request
send(hSocket, req.c_str(), req.length(), 0);

// Read Response
std::string line = ReadALineFromSocket(hSocket);
int rescode = ExtractResponseCode(line);
std::vector<std::string> headers;
do
{
    line = ReadALineFromSocket(hSocket);
    if (line.length() == 0) break;
    headers.push_back(line);
}
while (true);

if (
    ((rescode / 100) != 1) &&
    (rescode != 204) &&
    (rescode != 304) &&
    (request is not "HEAD")
   )
{
    if ((headers has "Transfer-Encoding") && (Transfer-Encoding != "identity"))
    {
        // read chunks until a 0-length chunk is encountered.
        // refer to RFC 2616 Section 3.6 for the format of the chunks...
    }
    else if (headers has "Content-Length")
    {
        // read how many bytes the Content-Length header says...
    }
    else if ((headers has "Content-Type") && (Content-Type == "multipart/byteranges"))
    {
        // read until the terminating MIME boundary specified by Content-Type is encountered...
    }
    else
    {
        // read until the socket is disconnected...
    }
}
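The body-reading branches above could be sketched roughly as follows, again assuming a blocking POSIX descriptor and read(). The function names (ReadExact, ReadUntilClose, ReadChunked, ReadLine) are hypothetical, the multipart/byteranges case is left out, and error handling is minimal. Note that read() does not null-terminate the buffer, so each helper appends only the number of bytes that read() reported rather than treating the buffer as a C string.

```cpp
#include <algorithm>  // std::min
#include <cstddef>    // std::size_t
#include <cstdlib>    // std::strtoul
#include <string>
#include <unistd.h>   // read(), ssize_t

// Content-Length case: reads exactly `count` bytes and appends them to `out`.
// Returns false if the connection ends or fails first.
bool ReadExact(int fd, std::size_t count, std::string& out)
{
    char buf[1024];
    while (count > 0) {
        ssize_t n = read(fd, buf, std::min(sizeof buf, count));
        if (n <= 0) return false;
        out.append(buf, static_cast<std::size_t>(n));  // only the n bytes read
        count -= static_cast<std::size_t>(n);
    }
    return true;
}

// Final else branch: reads until the peer closes the connection.
bool ReadUntilClose(int fd, std::string& out)
{
    char buf[1024];
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n == 0) return true;     // orderly close marks the end of the body
        if (n < 0) return false;     // read error
        out.append(buf, static_cast<std::size_t>(n));
    }
}

// Reads one LF-terminated line, dropping a trailing CR.
bool ReadLine(int fd, std::string& line)
{
    line.clear();
    char c;
    while (true) {
        ssize_t n = read(fd, &c, 1);
        if (n <= 0) return !line.empty();
        if (c == '\n') break;
        line.push_back(c);
    }
    if (!line.empty() && line.back() == '\r') line.pop_back();
    return true;
}

// Chunked case: each chunk is a hex size line, the data, then a line break.
// A size of 0 ends the body. Simplification: an unparsable size line is
// also treated as 0 here.
bool ReadChunked(int fd, std::string& out)
{
    std::string line;
    for (;;) {
        if (!ReadLine(fd, line)) return false;
        // strtoul stops at ';', so chunk extensions are ignored.
        std::size_t size = std::strtoul(line.c_str(), nullptr, 16);
        if (size == 0) break;
        if (!ReadExact(fd, size, out)) return false;
        if (!ReadLine(fd, line)) return false;  // line break after chunk data
    }
    // Skip optional trailer headers up to the terminating blank line.
    while (ReadLine(fd, line) && !line.empty()) {}
    return true;
}
```

With helpers like these, the size of the 1024-byte buffer no longer matters for correctness: the last read simply returns fewer bytes, and only those bytes are kept.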