How to make Scrapy present itself as an HTTP/1.1 client when crawling
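Add the following line to settings.py:

```python
DOWNLOADER_HTTPCLIENTFACTORY = 'myproject.downloader.HTTPClientFactory'
```

Save the following code to a separate .py file (myproject/downloader.py, so that it matches the module path used in the setting above):

```python
from scrapy.core.downloader.webclient import ScrapyHTTPClientFactory, ScrapyHTTPPageGetter


class PageGetter(ScrapyHTTPPageGetter):

    def sendCommand(self, command, path):
        # Write the request line with HTTP/1.1 instead of the default HTTP/1.0.
        # A bytes literal is used because command and path are bytes under Python 3.
        self.transport.write(b'%s %s HTTP/1.1\r\n' % (command, path))


class HTTPClientFactory(ScrapyHTTPClientFactory):
    # Use the customized protocol for every connection this factory creates.
    protocol = PageGetter
```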
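To see what the `sendCommand` override puts on the wire, here is a minimal standalone sketch that does not need Scrapy or Twisted. `FakeTransport` and `send_command` are hypothetical names introduced only for this illustration, and the sketch assumes Python 3.5+, where the method and path are passed as bytes.

```python
class FakeTransport:
    """Hypothetical stand-in for a Twisted transport; it only records writes."""

    def __init__(self):
        self.written = []

    def write(self, data):
        self.written.append(data)


def send_command(transport, command, path):
    # Same formatting as the sendCommand override, written as a free function
    # so it can be exercised without a real connection.
    transport.write(b"%s %s HTTP/1.1\r\n" % (command, path))


transport = FakeTransport()
send_command(transport, b"GET", b"/index.html")
print(transport.written[0])  # b'GET /index.html HTTP/1.1\r\n'
```

Only the request line changes; the headers are still sent by the parent class.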