Googlebot and empty CORS responses
We have a React app that loads some data asynchronously from another domain. The requests are made using isomorphic-fetch in cors mode, and the requests and responses all look fine and work correctly when tested in my own browser.
We monitor the responses and log failures back to our application for analysis.
While most of the time all is well (everything seems to be indexed correctly and shows up fine in Google), we still see a lot of failures, only for Googlebot, where it fails to fetch the data correctly. Debugging the response object, I see that status is 200 but statusText is empty. The response has no body (and so no .json or .text methods) and no headers (which shouldn't be the case), and the mode is correctly set to cors (not opaque, which might explain some of the other oddities).
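To make the failing responses easier to compare against healthy ones, the fields described above can be captured in one loggable object. This is only a sketch: summarizeResponse is a hypothetical helper name, not part of isomorphic-fetch, and it assumes the object it receives follows the standard Fetch Response shape (status, statusText, type, url, headers.forEach).

```javascript
// Hypothetical helper (not part of isomorphic-fetch): collects the fields
// discussed above from a fetch Response so they can be logged together.
function summarizeResponse(response) {
  const headerNames = [];
  // Headers.forEach calls back with (value, name); guard in case headers is missing.
  if (response.headers && typeof response.headers.forEach === "function") {
    response.headers.forEach((value, name) => headerNames.push(name));
  }
  return {
    status: response.status,
    statusText: response.statusText,
    type: response.type, // "cors", "opaque", "basic", ...
    url: response.url,
    headerNames,
    // The failing responses reportedly lack these body-reading methods.
    canReadJson: typeof response.json === "function",
    canReadText: typeof response.text === "function",
  };
}
```

The result could be attached to whatever failure report is already sent back to the application, alongside the user agent string and the caller's IP address as recorded on the server.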
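From my understanding of CORS, this all looks above board in terms of the headers being sent and received, so why does Googlebot have so many intermittent issues? Googlebot says it has an HTTP 200 response (success, the Promise is not rejected), but it is missing everything that should come with an HTTP 200 response: it has no body and no exposed headers. Why is Googlebot unable to return a response with headers and a body (as outlined below)?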
A normal preflight request looks like this (from Chrome devtools) (extra slash in */* added to stop SO thinking that it's a comment opener)
Accept:*/*
Accept-Encoding:gzip, deflate, sdch, br
Accept-Language:en-GB,en-US;q=0.8,en;q=0.6
Access-Control-Request-Headers:content-type, x-apikey
Access-Control-Request-Method:POST
Cache-Control:no-cache
Connection:keep-alive
DNT:1
Host:my.host.net
Origin:http://my.origin.net
Pragma:no-cache
Referer:http://my.origin.net/
User-Agent:Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.100 Safari/537.36
The preflight response looks like this
Access-Control-Allow-Headers:content-type,x-apikey
Access-Control-Allow-Origin:*
Cache-Control:no-cache
Connection:keep-alive
Content-Length:0
Date:Mon, 05 Dec 2016 00:55:05 GMT
Expires:-1
Pragma:no-cache
Server:Microsoft-IIS/8.5
X-AspNet-Version:4.0.30319
X-Powered-By:ASP.NET
This is followed by the actual request, which looks like this (sent as a POST with a JSON body)
accept:application/json
Accept-Encoding:gzip, deflate, br
Accept-Language:en-GB,en-US;q=0.8,en;q=0.6
Cache-Control:no-cache
Connection:keep-alive
Content-Length:62
content-type:application/json
DNT:1
Host:someapi.net
Origin:http://my.origin.net
Pragma:no-cache
Referer:http://my.origin.net/
User-Agent:Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.100 Safari/537.36
x-apikey:someapikey
This returns a response like this (with a JSON body)
Access-Control-Allow-Origin:*
Cache-Control:no-cache
Connection:keep-alive
Content-Length:33576
Content-Type:application/json; charset=utf-8
Date:Mon, 05 Dec 2016 00:55:05 GMT
Expires:-1
Pragma:no-cache
Server:Microsoft-IIS/8.5
X-AspNet-Version:4.0.30319
X-Powered-By:ASP.NET
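As a sanity check on the claim that the headers look correct, the preflight exchange shown earlier can be compared mechanically. The sketch below is a deliberately simplified version of the browser's preflight check: preflightAllows and splitList are hypothetical names, and the code ignores credentials, Access-Control-Max-Age, and the special handling some headers receive.

```javascript
// Methods that never need to be listed in Access-Control-Allow-Methods.
const SAFELISTED_METHODS = ["GET", "HEAD", "POST"];

// Splits a comma-separated header value into trimmed, lowercased tokens.
function splitList(value) {
  return (value || "")
    .split(",")
    .map((item) => item.trim().toLowerCase())
    .filter(Boolean);
}

// Hypothetical, simplified check: does the preflight response permit the
// request that the preflight request announced?
function preflightAllows(request, response) {
  const method = (request["Access-Control-Request-Method"] || "").toUpperCase();
  const allowedMethods = splitList(response["Access-Control-Allow-Methods"])
    .map((m) => m.toUpperCase());
  const methodOk =
    SAFELISTED_METHODS.includes(method) ||
    allowedMethods.includes(method) ||
    allowedMethods.includes("*");

  const requestedHeaders = splitList(request["Access-Control-Request-Headers"]);
  const allowedHeaders = splitList(response["Access-Control-Allow-Headers"]);
  const headersOk =
    allowedHeaders.includes("*") ||
    requestedHeaders.every((name) => allowedHeaders.includes(name));

  const allowedOrigin = response["Access-Control-Allow-Origin"];
  const originOk = allowedOrigin === "*" || allowedOrigin === request["Origin"];

  return { methodOk, headersOk, originOk };
}

// Values copied from the preflight request and response in the question.
const result = preflightAllows(
  {
    "Access-Control-Request-Method": "POST",
    "Access-Control-Request-Headers": "content-type, x-apikey",
    "Origin": "http://my.origin.net",
  },
  {
    "Access-Control-Allow-Headers": "content-type,x-apikey",
    "Access-Control-Allow-Origin": "*",
  }
);
console.log(result);
```

Under these simplifications all three checks pass for the exchange in the question, which is consistent with the requests working in an ordinary browser.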
Recommended answer
Check the IP address of the failing Googlebot calls.
It may be a nefarious actor pretending to be Google.
Check the IP addresses as described here:
https://support.google.com/webmasters/answer/80553?hl=zh-CN
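The check can be automated with a reverse DNS lookup followed by a forward lookup. The following is a minimal Node.js sketch rather than a definitive implementation: isGoogleHostname and isGenuineGooglebot are hypothetical names, and the domain suffixes are an assumption that should be confirmed against the page linked above.

```javascript
const dns = require("dns").promises;

// Assumption: domains a genuine Googlebot host is expected to resolve to.
// Confirm the current list against Google's documentation.
const GOOGLE_SUFFIXES = [".googlebot.com", ".google.com"];

// True when the hostname ends in one of the expected suffixes.
function isGoogleHostname(hostname) {
  const name = hostname.toLowerCase().replace(/\.$/, "");
  return GOOGLE_SUFFIXES.some((suffix) => name.endsWith(suffix));
}

// Reverse-resolve the IP, keep only Google-owned hostnames, then resolve each
// hostname forward and require that it maps back to the same IP.
// The resolver parameter exists so the function can be exercised offline.
async function isGenuineGooglebot(ip, resolver = dns) {
  let hostnames;
  try {
    hostnames = await resolver.reverse(ip);
  } catch (err) {
    return false; // no PTR record, or the lookup failed
  }
  for (const hostname of hostnames) {
    if (!isGoogleHostname(hostname)) continue;
    try {
      const addresses = await resolver.lookup(hostname, { all: true });
      if (addresses.some((entry) => entry.address === ip)) return true;
    } catch (err) {
      // forward lookup failed; try the next hostname
    }
  }
  return false;
}
```

The forward lookup matters because anyone who controls the reverse DNS for their own address range can make it claim a googlebot.com name. The comparison here is a plain string match, so IPv6 addresses written in different but equivalent forms would need normalizing first.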