Are robots.txt and meta tags enough to stop search engines from indexing dynamic pages that depend on $_GET variables?

I created a PHP page that is only accessible with a token/pass received through $_GET.

So if you go to the following URL, you get a generic or blank page:

http://fakepage11.com/secret_page.php

However, if you use the link with the token, it shows you special content:

http://fakepage11.com/secret_page.php?token=344ee833bde0d8fa008de206606769e4

Of course, this is not as safe as a login page, but my only concern is to create a dynamic page that is not indexable and can only be accessed through the provided link.

Are dynamic pages that depend on $_GET variables indexed by Google and other search engines?

If so, would including the following be enough to hide it?

  • robots.txt:
      User-agent: *
      Disallow: /

  • Meta tag:

Even if I type this into Google:

site:fakepage11.com/

Thanks!

Recommended answer

If a search engine bot finds the link with the token somehow¹, it may crawl and index it.

If you use robots.txt to disallow crawling the page, conforming search engine bots won’t crawl the page, but they may still index its URL (which then might appear in a site: search).
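
As a rough illustration of the crawling side, here is a minimal sketch in Python (not the PHP of the page itself) that uses the standard library's `urllib.robotparser` to evaluate the rules from the question the way a conforming bot would. The user agent name `ExampleBot` and the helper `may_crawl` are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# The rules from the question: block every path for every bot.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

# The token link from the question.
TOKEN_URL = (
    "http://fakepage11.com/secret_page.php"
    "?token=344ee833bde0d8fa008de206606769e4"
)


def may_crawl(url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if the rules above let this (hypothetical) bot fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


print(may_crawl(TOKEN_URL))  # prints False
```

The check only says the bot will not download the page. It says nothing about whether the bare URL ends up in the index, which is the gap described above.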

If you use meta-robots to disallow indexing the page, conforming search engine bots won’t index the page, but they may still crawl it.
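
For the indexing side, the sketch below shows roughly how a bot could read a meta-robots element once it has downloaded the page. It is written in Python with the standard `html.parser` module. The sample markup in `PAGE` is an assumption (the common `<meta name="robots" content="noindex">` form), and `RobotsMetaParser` and `forbids_indexing` are hypothetical names.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> element."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None for valueless attributes.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def forbids_indexing(html: str) -> bool:
    """Return True if the page carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Assumed sample page, not the asker's real markup.
PAGE = (
    '<html><head><meta name="robots" content="noindex"></head>'
    "<body>special content</body></html>"
)

print(forbids_indexing(PAGE))  # prints True
```

Note that the bot has to fetch and parse the page before it can see the directive, so the page is still crawled.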

You can’t have both: If you disallow crawling, conforming bots can never learn that you also disallow indexing, because they are not allowed to visit the page to see your meta-robots element.
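
The conflict can be simulated in a few lines. This is a simplified Python model of a conforming bot, under the assumption that it always consults robots.txt before downloading anything. `bot_sees_noindex` and `ExampleBot` are hypothetical, and the substring check for the tag is deliberately crude.

```python
from urllib.robotparser import RobotFileParser


def bot_sees_noindex(robots_txt: str, url: str, page_html: str) -> bool:
    """Return True only if a conforming bot gets to read a noindex tag."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch("ExampleBot", url):
        # Crawling is forbidden, so page_html is never downloaded
        # and the meta tag inside it is never seen.
        return False
    # Crude check, good enough for this illustration.
    return 'content="noindex"' in page_html.lower()


URL = "http://fakepage11.com/secret_page.php?token=344ee833bde0d8fa008de206606769e4"
PAGE = '<html><head><meta name="robots" content="noindex"></head></html>'

BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL = "User-agent: *\nDisallow:\n"  # an empty Disallow allows everything

print(bot_sees_noindex(BLOCK_ALL, URL, PAGE))  # prints False
print(bot_sees_noindex(ALLOW_ALL, URL, PAGE))  # prints True
```

With crawling blocked, the noindex tag is never read. With crawling allowed, the bot reads it and can honor it.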

¹ There are countless ways a search engine might find a link. For example, a user who visits the page might use a browser toolbar that automatically sends all visited URLs to a search engine.
