Parsing raw HTTP requests
I am working on an HTTP traffic data set composed of complete POST and GET requests, like the one shown below. I have written Java code that separates the requests and stores each one as a String element in an ArrayList. Now I am unsure how to parse these raw HTTP requests in Java. Is there a better method than manual parsing?
GET http://localhost:8080/tienda1/imagenes/3.gif/ HTTP/1.1
User-Agent: Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.8 (like Gecko)
Pragma: no-cache
Cache-control: no-cache
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Encoding: x-gzip, x-deflate, gzip, deflate
Accept-Charset: utf-8, utf-8;q=0.5, *;q=0.5
Accept-Language: en
Host: localhost:8080
Cookie: JSESSIONID=FB018FFB06011CFABD60D8E8AD58CA21
Connection: close
Recommended answer
I [am] working on [an] HTTP Traffic Data set which is composed of complete POST and GET request[s]
So you want to parse a file or list that contains multiple HTTP requests. What data do you want to extract? Anyway, here is a Java HTTP parsing class that reads the method, version, and URI from the request line and reads all headers into a Hashtable.
You can use that one, or write one yourself if you feel like reinventing the wheel. Take a look at the RFC (RFC 2616, Section 5) to see what a request looks like in order to parse it correctly:
Request = Request-Line              ; Section 5.1
          *(( general-header        ; Section 4.5
           | request-header         ; Section 5.3
           | entity-header ) CRLF)  ; Section 7.1
          CRLF
          [ message-body ]          ; Section 4.3
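If you do end up writing the parser yourself, the grammar above maps fairly directly onto code: one request line, then zero or more "name: value" header lines, then a blank line, then an optional body. The following is only a minimal sketch under some assumptions, not the parsing class mentioned earlier. The class name RawHttpRequest and its fields are hypothetical, each input string is assumed to hold exactly one request, header names are lowercased because they are case-insensitive, and obsolete folded (multi-line) headers and chunked bodies are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration; not part of any library.
final class RawHttpRequest {
    public final String method;
    public final String uri;
    public final String version;
    // Header names are stored lowercased; insertion order is preserved.
    public final Map<String, String> headers = new LinkedHashMap<>();
    public final String body;

    private RawHttpRequest(String method, String uri, String version, String body) {
        this.method = method;
        this.uri = uri;
        this.version = version;
        this.body = body;
    }

    static RawHttpRequest parse(String raw) {
        // Accept both CRLF and bare LF line endings, since data sets are
        // often saved with LF only. Note this also normalizes the body.
        String text = raw.replace("\r\n", "\n");

        // The first blank line separates the head from the message body.
        int blank = text.indexOf("\n\n");
        String head = blank >= 0 ? text.substring(0, blank) : text;
        String body = blank >= 0 ? text.substring(blank + 2) : "";

        String[] lines = head.split("\n");

        // Request-Line = Method SP Request-URI SP HTTP-Version
        String[] parts = lines[0].trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Malformed request line: " + lines[0]);
        }
        RawHttpRequest request = new RawHttpRequest(parts[0], parts[1], parts[2], body);

        // Every remaining head line is expected to be "name: value".
        for (int i = 1; i < lines.length; i++) {
            String line = lines[i];
            if (line.isEmpty()) {
                continue;
            }
            int colon = line.indexOf(':');
            if (colon <= 0) {
                throw new IllegalArgumentException("Malformed header line: " + line);
            }
            String name = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            // Repeated headers are joined with a comma.
            request.headers.merge(name, value, (a, b) -> a + ", " + b);
        }
        return request;
    }
}
```

With the sample request from the question, RawHttpRequest.parse(request).headers.get("host") would give localhost:8080, and the method, uri, and version fields would hold the three parts of the request line. For anything beyond analysing a fixed data set, an established HTTP library is probably a safer choice than a hand-rolled parser like this one.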