Can't find robots.txt in File Manager, yet Google still reads it?!

  • Question
  • Updated 9 years ago
  • Answered
I don't have a robots.txt file in my File Manager, yet Google Webmaster Tools is reading one from ujjvalshah.synthasite.com/robot.txt, and it is preventing the files in my resources folder from being indexed.
I don't know how this robots file was created, and I don't want it.

Can anyone please help me out?

Thanks in advance.

ujjval shah

  • 11 Posts
  • 0 Reply Likes
  • frustrated

Posted 9 years ago


Peter

  • 2569 Posts
  • 113 Reply Likes
Hello ujjval shah,

I'm not aware of a robots.txt file being on any Yola website. It can't be created by the user, as it needs to be put in the head section of the page, and users don't have access to that at all.

Would you mind copying the message that you got from Google on this issue? Maybe that will give us a bit of a clue.

ujjval shah

The image shows that two of my files are being blocked.
It is reading the text file from www.ujjvalshah.synthasite.com/robot.txt, and the robots file is also visible in the snapshot. How do I edit or remove it?

ujjval shah

Actually, I read in a couple of threads that the robots file can be found, added, or edited in the File Manager.

Peter

Hi ujjval shah,

I see what you are saying. This file has been inserted by Yola. It prevents indexing of information that's either private or of no consequence to the indexing process. I wouldn't worry about this.

You don't have access to this file to edit it. The Yola staff may be able to add more information to help you understand it.

ujjval shah

Hi Peter,

Thanks for your replies and help.
I need the PDF files in my File Manager to get indexed!
I guess I will contact Yola directly.

Peter

Do you want the content of the PDF to get indexed, or just the title?

ujjval shah

The title is already getting indexed, as it is displayed on the site;
I want the content to get indexed.

Peter

I am not sure whether the content of a PDF can be read by the crawler. Can I get back to you on this?

Peter

I think, from what I've just read, that the crawlers can only read HTML. There is a family of crawlers, but no mention of "PDF" at all.

This is by no means definitive and it's worth your while exploring further. Sorry for not having this info.

I wish you luck. :)

ujjval shah

Hi Peter,

As far as I know, there are several PDF files on the internet that are indexed and listed by Google when they match a search query.
For example, if you google 'amplitude modulation tutorial', you should get a result that is a PDF file (3rd result in my search). You will notice that it points directly to the PDF and also shows a small preview of the file's content.

I have sent a mail to Yola regarding my query, so let's see.

Thanks for your help anyway :)

Peter

Thanks ujjval. I'm grateful for that info.

Please post a summary of your findings back here. I for one would find it valuable.

Regards,

peter

ujjval shah

sure!

ujjval shah

I contacted the Yola guys, and here's what they had to say:

==========================================
Hello,

I'm sorry that that seems confusing! That file gets created when anyone creates a site and it doesn't affect any indexing. The search engines will access and index whatever is on your pages such as mysite.com/page1.php.

Kind Regards,

Emmy

SynthaSite is now Yola!

Visit our Support page for step-by-step instructions | Follow us on Twitter! | For fast, friendly community support join us at Get Satisfaction! | For the best user experience, Yola recommends Firefox
==========================================
Ujjval to Yola (Apr 5):

Hi Emmy,

Thanks for your reply. I agree that all the pages are getting indexed on my site. But the files that are in my resources folder (or the file manager and whose links are on my home page) are being restricted from getting indexed.

I am attaching a snapshot of my Google Webmaster Tools pages to give you a better idea. As you can see, the two PDF files are not getting indexed because they are in the resources folder, whose URL has been disallowed by the generated robots.txt file. I don't want this; I want those PDF files to get indexed. I never made a robots file myself, nor can I find one in the File Manager. So how do I remove or edit it?

I hope this clears up the confusion.
Please help.

Thanks,

Regards,
Ujjval Shah
Attachment: robot google webmaster.JPG (196K)
==========================================
Yola Support to me (Apr 6):

Hello,

I understand what you are saying. Unfortunately, we have made the decision to disallow bots from the /resources folder of a site at this time. I will put this idea forth to our team for future consideration. I'm sorry I don't have better news on this. Please let me know if I can further assist you.

Kind Regards,

Emmy

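
To illustrate the effect of the rule Yola describes, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The robots.txt contents, the `example.com` host, and the file names are assumptions made for illustration; the actual file Yola generates is not quoted in this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; assumed to resemble a rule that blocks /resources.
ROBOTS_TXT = """\
User-agent: *
Disallow: /resources/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


SITE = "http://example.com"  # placeholder host, not the poster's site

# An ordinary page is outside /resources/, so it may be crawled.
page_ok = is_crawlable(ROBOTS_TXT, SITE + "/page1.php")

# A PDF stored under /resources/ is covered by the Disallow rule.
pdf_ok = is_crawlable(ROBOTS_TXT, SITE + "/resources/paper.pdf")

print("page crawlable:", page_ok)
print("pdf crawlable:", pdf_ok)
```

Under these assumptions, this would explain why the pages are indexed while the files linked from the resources folder are reported as blocked.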

Peter

Hello ujjval shah,

Thank you for taking the time to post this to the thread. It's valuable info.

Good luck.

Emmy

  • 5892 Posts
  • 299 Reply Likes
Hey guys,
The indexing of files in your /resources has been put forth as a feature request.

Emmy

Peter

Thank you Emmy.

Could you also request access to the robots.txt file please?

Peter

Emmy,

Would it be possible to ask the Yola people to make changes to the robots.txt file as an ad hoc request?

Emmy

Hi Peter,
That would be a solution and it has been mentioned to our team that the ability to edit the robots.txt file is something that has been requested by users. At this point both are feature requests.

Peter

Thanks again Emmy.

ujjval shah

cool! :)

Donald

  • 2991 Posts
  • 37 Reply Likes
I don't believe the PDF files themselves are crawled. As I understood it, the keywords, description, and title of these documents are crawled, not the actual document (PDF). Would a PDF be considered JavaScript? Is the text read as text or as a picture? If the text is stored as an image, like a scanned page, then Google would not recognize it or its contents.

Peter

There seems to be a PDF-aware robot that can crawl PDF files. The content buried within can be indexed, and it doesn't have to be part of the title. I can't find the name of this specific robot. I will keep looking.

Donald

OK, according to a couple of sources, Google WILL read your PDFs if they are published on your site and your robots.txt file does NOT contain Disallow: /pdf/. To find this out, go to Webmaster Tools, click the Tools tab, and then click the "analyze robots.txt" link. Scroll down and look in the first text box. If Disallow: /pdf/ is present, Google won't crawl the files under that path; otherwise, it will crawl them and can index them.
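
The same check can be sketched outside Webmaster Tools. The Python snippet below lists the Disallow paths that a robots.txt file sets for a given user agent. The sample file contents and the `disallowed_paths` helper are hypothetical, and this is a simplified reader rather than a full implementation of the robots exclusion rules: it matches user-agent names exactly and ignores Allow rules and wildcards.

```python
def disallowed_paths(robots_txt: str, user_agent: str = "*") -> list[str]:
    """Return the Disallow paths in groups that name user_agent exactly."""
    paths = []
    group_agents = []   # user agents named by the group being read
    in_rules = False    # True once the current group has started listing rules
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:
                # A User-agent line after rules starts a new group.
                group_agents = []
                in_rules = False
            group_agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value and user_agent.lower() in group_agents:
                paths.append(value)
    return paths


# Hypothetical robots.txt contents, as they might appear in the text box.
SAMPLE = """\
User-agent: *
Disallow: /pdf/
Disallow: /resources/

User-agent: BadBot
Disallow: /
"""

blocked = disallowed_paths(SAMPLE)
print(blocked)
print("/pdf/ blocked for all bots:", "/pdf/" in blocked)
```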

Peter

Thanks Donald :)

ujjval shah

I just found an alternative way of getting the PDFs indexed.

Upload them to Scribd, take the embeddable code, and paste it on your site.
The paper can then be viewed on your site, it is downloadable, it is indexed by the search engines, and you can see how many times your PDF file was viewed and downloaded :D

It's a great way to keep track of who reads your files, especially if it is your resume ;)

Ruth

  • 2819 Posts
  • 135 Reply Likes
Sounds like a great solution. I'm happy you found it.

ujjval shah

Just one drawback:
when you search for the file, the search engine will display it, but the link will point to a page on Scribd. I wonder how to direct it to my site :(

jeremy, Employee

  • 1349 Posts
  • 90 Reply Likes
Hi Ujjval,
Yes, unfortunately your workaround will mean your PDF will be indexed on Scribd. I am not quite sure how you would direct it to your site. I will look into it for you and see what I can find.