Poster: Katecurl Date: June 29, 2013 01:27:46am
Forum: web Subject: Why can't I access old crawled sites that have recent robots.txt exclusions?

I've been trying to visit some old pages, but the Wayback Machine gives me robots.txt exclusion errors based on a robots.txt file that is recent. That makes no sense. I can understand that the domain excludes the site now, but that exclusion can't possibly have applied 10 or more years ago. Since when does robots.txt time travel?

Poster: nick.clark Date: July 19, 2013 10:09:29am
Forum: web Subject: Re: Why can't I access old crawled sites that have recent robots.txt exclusions?

I am wondering the same thing. Is it just a waiting game until the IA sees a new robots.txt file? I was in the process of rebuilding a site, and I normally turn off robots while I am building. How soon does the IA rescan the domain and put the archive back online?
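
For illustration only, here is a rough sketch of what "turning off robots" during a rebuild might look like, and how a crawler that honors robots.txt would read it. The two robots.txt bodies, the `is_allowed` helper, and the example URL are all hypothetical, and the `ia_archiver` user agent name is an assumption about which agent the IA checks. The sketch uses Python's standard `urllib.robotparser` and says nothing about how or when the Wayback Machine itself re-reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt served while the site is being rebuilt:
# a blanket rule that blocks every crawler from every path.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt served after the rebuild:
# an empty Disallow value means nothing is excluded.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Assumed user agent name and example URL, for illustration only.
URL = "http://example.com/old-page.html"
blocked = is_allowed(BLOCK_ALL, "ia_archiver", URL)
restored = is_allowed(ALLOW_ALL, "ia_archiver", URL)
print("allowed while rebuilding:", blocked)
print("allowed after rebuild:", restored)
```

The point of the sketch is that a blanket `Disallow: /` does not distinguish between new crawling and old captures. Any tool that checks the current file before serving a page would treat every URL on the domain as excluded until the file changes.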
