
Blogger's robots.txt and sitemap

If you host your blog on Blogger, a robots.txt file is generated for you automatically, and you can NOT change it.

To find the robots.txt file, open your web browser, type in your Blogger blog’s URL, and add /robots.txt to the end of it.

For example, if the URL of your blog is, then enter
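The same address can also be built in code. This is only a minimal sketch: the blog address example.blogspot.com and the helper name robots_url are made up for illustration.

```python
from urllib.parse import urljoin


def robots_url(blog_url):
    """Return the robots.txt address for the site that serves blog_url."""
    # The leading slash makes urljoin resolve against the site root,
    # so any page of the blog maps to the same robots.txt.
    return urljoin(blog_url, "/robots.txt")


# Hypothetical blog address, used only for illustration.
print(robots_url("https://example.blogspot.com/2010/01/some-post.html"))
```

Whichever page of the blog you start from, the result points at the robots.txt in the root of the site.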

You may find the following entries:
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search


Mediapartners-Google is the AdSense crawler, which crawls pages to determine AdSense content. Google only uses this bot to crawl your site if AdSense ads are displayed on it. So the first two lines mean that your blog allows Mediapartners-Google to crawl all of its content: nothing is disallowed, so the value after "Disallow:" is empty.

"User-agent: *" applies to all search engines; the star sign '*' means all. This part of robots.txt instructs all search engines not to crawl the subdirectory /search. The purpose is to avoid duplicate content: your Blogger posts can be reached by archive date (the normal way) and also by label, and each label on a blog results in a different URL pointing to the same post.

If posts are indexed only by date, there is only one URL for each post, because each and every post has one and only one post date.

But however many labels you assign to a post, that many extra URLs will point to the same post. "If the search engines were allowed to index by label (under /search subdirectory), they would see 6 extra instances of that post, one per label search. Since the post was already indexed by archive date, those 6 label search instances would be considered duplicate content. The search engines would penalise all 7 URLs for having duplicated content." (Blogger help forum)
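To see how a crawler reads these two groups, you can feed the rules to Python's standard urllib.robotparser module. This is a rough sketch rather than a description of how Google itself evaluates the file; the blog address and the paths are made up for illustration, and Googlebot stands in for any crawler that falls under the '*' group.

```python
from urllib.robotparser import RobotFileParser

BLOG = "https://example.blogspot.com"  # hypothetical blog address

# The two groups discussed above.
RULES = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# The AdSense crawler has an empty Disallow, so nothing is blocked for it.
print(parser.can_fetch("Mediapartners-Google", BLOG + "/search/label/seo"))

# Any other crawler falls under '*' and is kept out of /search ...
print(parser.can_fetch("Googlebot", BLOG + "/search/label/seo"))

# ... but can still fetch a post by its normal archive URL.
print(parser.can_fetch("Googlebot", BLOG + "/2010/01/some-post.html"))
```

The first and third checks come back True and the second comes back False, which matches the reading of the rules given above.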
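The arithmetic in the quote can be checked with a few lines of code. This is only an illustration under stated assumptions: the blog address, the post path and the six label names are invented, and label pages are assumed to live at /search/label/<label>.

```python
from urllib.parse import quote

BLOG = "https://example.blogspot.com"  # hypothetical blog address


def urls_for_post(post_path, labels):
    """Return the one archive URL plus one label-search URL per label."""
    archive_url = BLOG + post_path
    # Assumption: each label has its own page at /search/label/<label>.
    label_urls = [BLOG + "/search/label/" + quote(label) for label in labels]
    return [archive_url] + label_urls


labels = ["seo", "blogger", "robots", "sitemap", "google", "tips"]
urls = urls_for_post("/2010/01/some-post.html", labels)

# Everything under /search is covered by "Disallow: /search".
blocked = [u for u in urls if u.startswith(BLOG + "/search")]

print(len(urls), "URLs in total,", len(blocked), "blocked by Disallow: /search")
```

With six labels there are seven URLs, and the Disallow rule leaves only the archive URL open to search engines.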

Now about the sitemap. You can find a line in robots.txt that starts with 'Sitemap:'. The URL after that label is the location of your sitemap.

"Using the example above, the line would look like:


Back in Google’s Webmaster Tools, the domain name part of the URL would already be included, so you would just need to specify the feeds/posts/default?orderby=updated portion of the sitemap URL." (Technically Easy)
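Here is a small sketch of pulling the 'Sitemap:' line out of a robots.txt and cutting it down to the portion mentioned in the quote. The host example.blogspot.com and both helper names are made up for illustration; only the feeds/posts/default?orderby=updated part comes from the quote.

```python
from urllib.parse import urlsplit

# Sample robots.txt text; the host name is hypothetical.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search

Sitemap: https://example.blogspot.com/feeds/posts/default?orderby=updated
"""


def sitemap_urls(robots_txt):
    """Return the URLs of all lines that start with 'Sitemap:'."""
    urls = []
    for line in robots_txt.splitlines():
        # Split on the first colon only, so the URL itself stays intact.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


def path_portion(sitemap_url):
    """Return the part after the domain name, without the leading slash."""
    parts = urlsplit(sitemap_url)
    path = parts.path.lstrip("/")
    return path + "?" + parts.query if parts.query else path


for url in sitemap_urls(ROBOTS_TXT):
    print(path_portion(url))
```

For the sample text this prints feeds/posts/default?orderby=updated, the portion left over once the domain name is taken off.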

You can also skip this step: Google Webmaster Tools will look at the robots.txt file for a sitemap if one isn’t specified.

Google Sitemaps accepts RSS and Atom feeds as well as XML sitemap files, so you can submit your blog feed as the sitemap URL, for example atom.xml.
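A feed can stand in for a sitemap because each entry already carries the address of a post. The sketch below reads post URLs out of a tiny hand-written Atom sample; the feed content and the helper name post_urls are invented for illustration, and a real Blogger feed holds many more elements per entry.

```python
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hand-written sample feed with hypothetical post addresses.
SAMPLE_FEED = """\
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>First post</title>
    <link rel="alternate" href="https://example.blogspot.com/2010/01/first-post.html"/>
  </entry>
  <entry>
    <title>Second post</title>
    <link rel="alternate" href="https://example.blogspot.com/2010/02/second-post.html"/>
  </entry>
</feed>
"""


def post_urls(feed_xml):
    """Return the rel="alternate" link of every entry in an Atom feed."""
    root = ET.fromstring(feed_xml)
    urls = []
    for entry in root.findall("atom:entry", ATOM_NS):
        for link in entry.findall("atom:link", ATOM_NS):
            if link.get("rel") == "alternate":
                urls.append(link.get("href"))
    return urls


for url in post_urls(SAMPLE_FEED):
    print(url)
```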


songwriter said…
Yes, this was helpful as I was trying to fix a problem was Google had indexed which I point I added no follow to robots.txt for and was looking for info about sitemap info in robots.txt file. Thanks for posting. I already have a sitemap registered with Google. But now I'll add a link to it in my robots.txt too. Hope to keep Google from accessing the same content via subdomain that the realdomain points to. Don't want duplicate content.
