Sunday, May 1, 2011

Robots.txt: block access to all https:// pages

What would the syntax be to block all bots from accessing any https:// pages? I have an old site that no longer has an SSL certificate, and I want to block access to all https:// pages.

From Stack Overflow:
  • Don't accept connections on port 443 (see the binding sketch after this list).

  • I don’t know whether this works (that is, whether robots request a separate robots.txt for each protocol), but you could deliver a different robots.txt for requests made over HTTPS.

    So when http://example.com/robots.txt is requested, you deliver the normal robots.txt. And when https://example.com/robots.txt is requested, you deliver the robots.txt that disallows everything.

    leen3o: Any idea how I can easily do this? I use ISAPI_Rewrite; can I use that to serve a different robots.txt?
    Gumbo: I don’t know the ISAPI_Rewrite syntax very well, but try something like this: RewriteCond %SERVER_PORT ^443$ RewriteRule ^/robots\.txt$ /robots.txt.https [L] (see the fuller sketch below)
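On the first suggestion: with IIS (which the mention of ISAPI_Rewrite suggests), not accepting connections on port 443 amounts to removing the site's HTTPS binding. A minimal sketch, assuming IIS 7 or later with the appcmd tool; the site name "Default Web Site" is a placeholder, not something from the thread:

    rem Remove the HTTPS binding so IIS stops answering on port 443 for this site.
    rem "Default Web Site" is a placeholder; substitute the real site name.
    %windir%\system32\inetsrv\appcmd.exe set site /site.name:"Default Web Site" /-bindings.[protocol='https',bindingInformation='*:443:']

After this, HTTPS requests simply fail to connect, so no bot ever fetches an https:// page or an https:// robots.txt.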
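On the second suggestion, here is a fuller sketch of Gumbo's idea. Note his original rule matched port 433, a typo for 443 (the HTTPS port); corrected, and assuming ISAPI_Rewrite 2 with its rules in httpd.ini (an assumption, since the thread never confirms the version), it would look something like this:

    # httpd.ini (ISAPI_Rewrite 2 syntax): serve a different robots.txt over HTTPS
    RewriteCond %SERVER_PORT ^443$
    RewriteRule ^/robots\.txt$ /robots.txt.https [L]

And the contents of robots.txt.https (the file name is just the convention from the thread), which tells every bot to stay out of everything:

    # robots.txt.https: disallow all crawling for all user agents
    User-agent: *
    Disallow: /

Requests for http://example.com/robots.txt still get the normal file, while https://example.com/robots.txt is internally rewritten to the disallow-everything version.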
