My robots.txt has been respected by every bot that visited it in the past three months. I know this because I wrote a page that IP-bans anything that visits it, and I also listed that page as disallowed in the robots.txt file.
I’ve only gotten like, 20 visits in the past three months though, so, very small sample size.
I doubt Google respects any robots.txt
This is fuckin GENIUS
Only if you don’t want any visits except from yourself, because this removes your site from every search engine.
You should write a “Disallow: /juicy-content” rule and then block anything that tries to access that page (only bad bots would follow that path).
That’s exactly what was described…?
Oops. As a non-native English speaker I misunderstood what he meant. I wrongly understood that he had set the server to ban everything that asked for robots.txt.
Just in case it makes you feel any better: I’m a native English speaker who always aced the reading comprehension tests back in school, and I read it the exact same way. Lol! I’m glad I wasn’t the only one. :)
You need to reread what was described, more carefully. Imagine, for example, that by “a page” the person means a page called /juicy-content or something.
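For anyone still unsure how the trap works, here is a minimal sketch in Python. It is not the original poster's code. The `Honeypot` class, the `handle` method and the `/juicy-content` path are all made-up names for illustration, and a real setup would hook this into the web server or firewall rather than keep an in-memory set.

```python
# Hypothetical trap path; it must also appear as a Disallow rule in robots.txt.
TRAP_PATH = "/juicy-content"

# The robots.txt body served to every visitor (assumed content).
ROBOTS_TXT = "User-agent: *\nDisallow: " + TRAP_PATH + "\n"


class Honeypot:
    """Tracks banned IPs and decides how to answer each request."""

    def __init__(self):
        self.banned = set()

    def handle(self, ip, path):
        """Return an (HTTP status, body) pair for a request from `ip`."""
        if ip in self.banned:
            return 403, "banned"
        if path == "/robots.txt":
            # Reading robots.txt is always fine; only ignoring it gets punished.
            return 200, ROBOTS_TXT
        if path == TRAP_PATH:
            # Only a client that ignored the Disallow rule ends up here.
            self.banned.add(ip)
            return 403, "banned"
        return 200, "ok"
```

Note that asking for /robots.txt never triggers a ban here, so well-behaved crawlers and search engines are unaffected. Only a visit to the disallowed page does.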
Thank you for sharing
Interesting way of testing this. Another would be to query the search engines with
site:your.domain
to show results from your site only. Not an exhaustive check, but another tool to test this behavior.
For ordinary people they respect it, and they even warn a webmaster who submits a sitemap that includes paths disallowed in robots.txt.
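If you want to run that sitemap-versus-robots.txt check yourself before submitting anything, a rough sketch using Python's standard `urllib.robotparser` could look like the one below. The `disallowed_urls` helper and the example.com URLs are made up for illustration, and the standard-library parser does not necessarily interpret every rule the way a given search engine does.

```python
from urllib.robotparser import RobotFileParser


def disallowed_urls(robots_txt, urls, user_agent="*"):
    """Return the URLs from `urls` that `robots_txt` forbids for `user_agent`."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt content and sitemap URLs.
robots = "User-agent: *\nDisallow: /juicy-content\n"
sitemap_urls = [
    "https://example.com/",
    "https://example.com/juicy-content",
    "https://example.com/about",
]
print(disallowed_urls(robots, sitemap_urls))
```

Any URL this prints is one that the sitemap advertises while robots.txt disallows it, which is the kind of conflict the warning is about.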