Google and Alexa are saying no one visits my website, even though 78 to 100 new people join the site daily and more than 400 people use the website daily.
Please help..
--------------------------------------------
So I contacted Alexa, and they told me that the problem is my website.
"The robots.txt conflicts with the Robots Exclusion Standard. Found errors with this web page that can cause problems for search engines.
Warnings:
Some crawlers are denied access to this website."
|
Delete the robots.txt file if you don't need it, or post its content here.
For access stats you should not rely on third-party services. Use the stats from your hosting provider.
|
# robots.txt generated at http://www.mcanerin.com
User-agent: twiceler    (I have no clue what "twiceler" is)
Disallow: /
User-agent: baiduspider    (I have no clue what "baiduspider" is)
Disallow: /
User-agent: naverbot    (I have no clue what "Naverbot" is)
Disallow: /
User-agent: yeti    (I have no clue what "yeti" is)
Disallow: /
User-agent: asterias    (and no clue what "asterias" is)
Disallow: /
User-agent: *
Disallow:
Disallow: /cgi-bin/
Sitemap: http://vivestar.com/sitemap.xml
|
Where do I delete robots.txt? How do I remove that text? I removed it, but after a refresh it comes back.
# robots.txt generated at http://www.mcanerin.com
User-agent: twiceler
Disallow: /
User-agent: baiduspider
Disallow: /
User-agent: naverbot
Disallow: /
User-agent: yeti
Disallow: /
User-agent: asterias
Disallow: /
User-agent: *
Disallow:
Disallow: /cgi-bin/
Sitemap: http://vivestar.com/sitemap.xml
|
You need an empty line after the disallow statement.
Delete "Disallow:" (3. last line)
|
They can't be deleted or edited. |
No, I did not create anything.. I just went there after you told me to delete robots.txt.. I have no clue what they are. |
You can't delete it with your FTP client or in your admin panel?
Then you might contact your hosting provider.
|
Sorry, my bad, I was trying to remove them from Google Webmaster Tools.. So delete everything below?
# robots.txt generated at http://www.mcanerin.com
User-agent: twiceler
Disallow: /
User-agent: baiduspider
Disallow: /
User-agent: naverbot
Disallow: /
User-agent: yeti
Disallow: /
User-agent: asterias
Disallow: /
User-agent: *
Disallow:
Disallow: /cgi-bin/
Sitemap: http://vivestar.com/sitemap.xml
|
I would only keep this:
User-agent: *
Disallow: /cgi-bin/
(followed by an empty line)
Sitemap: http://vivestar.com/sitemap.xml
|
Thank you very much for your help.. I hope that fixes the problem... :) |
This is what you do with a robots.txt file: you tell the good bots what they can index on your site, because the bad bots will do whatever they please. To block the bad bots, you need to create a whitelist of who you want to allow into your site and block everything else. It is not easy, because the spammers are constantly working to get around things. Instead of setting up a blacklist and adding to it every time the bad bots change their user agent, set up a whitelist saying that Google and Bing and Yahoo! and any other good bot are allowed in; other bots will be blocked by default (a rough sketch follows this post).
Geeks, making the world a better place |
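For illustration, a minimal sketch of that whitelist approach in Apache terms (assuming Apache 2.4 with mod_setenvif; the bot names and the "bot|crawl|spider" pattern are only examples, not a definitive list, and user-agent strings can be spoofed, so this only catches honest bots):

# .htaccess sketch: deny anything that identifies itself as a crawler,
# unless it is on the whitelist below.
SetEnvIfNoCase User-Agent "bot|crawl|spider" bad_bot
# Un-flag the crawlers we want to allow (order matters: these lines
# must come after the line that sets bad_bot).
SetEnvIfNoCase User-Agent "Googlebot" !bad_bot
SetEnvIfNoCase User-Agent "Bingbot"   !bad_bot
SetEnvIfNoCase User-Agent "Slurp"     !bad_bot

# Allow everyone except requests still flagged as bad_bot.
<RequireAll>
    Require all granted
    Require not env bad_bot
</RequireAll>

Ordinary browsers never match the crawler pattern, so they are unaffected; the trade-off is that new good bots have to be added to the un-flag list by hand.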
Except when the good bots change their names… so you'd best trash the whole robots.txt thingy. |
User-agent: baiduspider    (I have no clue what "baiduspider" is)
Disallow: /
User-agent: naverbot    (I have no clue what "Naverbot" is)
Disallow: /
User-agent: yeti    (I have no clue what "yeti" is)
Disallow: /
User-agent: asterias    (and no clue what "asterias" is)
Baidu = Chinese search engine, like Google, http://www.baidu.com/
Naverbot = Korean search company, http://www.navercorp.com/ko/index.nhn
Yeti = another search company, http://www.yeti.co/
I allow all of these on my "global" sites; IMO it is good to be indexed by all of them.
ManOfTeal.COM a Proud UNA site, six years running strong! |
You need an empty line after the disallow statement.
Delete "Disallow:" (3. last line)
As I said, you set up a whitelist, NOT a blacklist. Don't let any bot in except the ones on the whitelist. It is not just indexing bots that will crawl a site; there are scrapers as well as email harvesters, etc.
Geeks, making the world a better place |
Remember msnbot becoming bingbot?
If you don't have a robots.txt you don't have to do anything - it works.
If you have one you always have to tune it no matter what the strategy is.
|
Remember msnbot becoming bingbot?
If you don't have a robots.txt you don't have to do anything - it works.
If you have one you always have to tune it no matter what the strategy is.
First of all, the robots.txt file tells the good bots which parts of your site they can index. Don't use it as a firewall to try to block bad bots. Next, set up a whitelist; for Apache, this can be done with .htaccess, in which you block all bots and then allow the ones you want to come in (the robots.txt side of this is sketched after this post).
Geeks, making the world a better place |
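For illustration, the robots.txt side of that whitelist might look like the sketch below (the bot names are examples; only bots that actually honor robots.txt will obey it, which is why the .htaccess layer is needed for the rest):

User-agent: Googlebot
Disallow:

User-agent: Bingbot
Disallow:

User-agent: Slurp
Disallow:

# Everyone else is asked to stay out (honored only by cooperative bots)
User-agent: *
Disallow: /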
My point is: this is not a good idea. As always with prohibition, you will punish the good ones.
You don't know about future good ones, but you ban them anyway.
The only good statement in robots.txt is:
User-agent: *
Disallow: /youarewastingyourtimehere/
|
I am not sure the problem is caused by robots.txt.
The sitemap has been giving problems for months, maybe a year, but Alex told me the sitemap was fine: http://www.boonex.com/forums/#topic/Sitemap-stop-working.htm So I told Alex "Thanks Alex, it works now." but it was not working.
Anyway, I contacted HostForWeb for help, and this is what they told me about the sitemap.
In order to know why these sites report no visitors, we need to know how they track them. Your sitemap file is here:
-rw-rw-rw-. 1 branchlo branchlo 0 Dec 9 02:00 sitemap.xml
and it contains no data at all, so it has not been properly generated.
It should not say branchlo, it should say Vivestar; that's one thing I noticed about the email HostForWeb sent me.
|
Alexa gets site stats from its toolbar and other means; it is not reliable for small sites. You should use the stats from your hosting provider (web hosting control panel) or, if they don't offer stats (VPS, dedicated server), generate your own from log files. |