Posts Tagged ‘url structure’

Tweaking WordPress – Sitemap

March 31st, 2010

 

A Way to Improve Your Content Overview


I was looking for a sitemap solution that would satisfy my picky needs: it had to be shown as a WP page, not a WP post, and it had to list the categories with the posts inside them. A sitemap is a cool webmaster tool. It helps visitors navigate your site easily, it shows the page structure, and some sitemap WordPress plug-ins also create a sitemap.xml file, which is very useful when search engines map your site.

I found a couple of plug-ins that were partially good enough. The first was WP-Archives by Jim Penaloza Calixto (more info at http://blog.unijimpe.net/, plug-in download URL: http://downloads.wordpress.org/plugin/wp-archives.zip). It was easy and simple to install and configure. It is installed as a post and shows the posts in chronological order with day/month info, but not the categories they belong to. The second was PS Auto Sitemap by Hitoshi Omagari (more info at http://www.web-strategy.jp/wp_plugin/ps_auto_sitemap/, plug-in download URL: http://downloads.wordpress.org/plugin/ps-auto-sitemap.zip). It had visually everything I wanted (categories with the list of posts that belong to them), but it was also implemented as a post, and I didn’t want that (yes, I know I could easily put it in the sidebar as a link, but I wanted it at the top with the other page links).

So I used the WP-Archives plug-in for a while. It looked like this:


Old Sitemap

 

When I managed to find some more free time, I decided to fix this. I used the PS Auto Sitemap plug-in and made a simple change to the theme’s PHP.

After installing the plug-in, I edited header.php of my current theme and put something like this at the place where the pages are listed:

<li <?php if ($post->ID == xxx) echo 'class="current_page_item"'; ?>><a href="http://yoursite/post-name-xxx">Post Name</a></li>
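In a typical theme, the page links are produced by wp_list_pages(), so the new item sits inside that same list. A minimal sketch of the surrounding markup (your theme’s will differ):

<ul>
<?php wp_list_pages('title_li='); ?>
<li <?php if ($post->ID == xxx) echo 'class="current_page_item"'; ?>><a href="http://yoursite/post-name-xxx">Post Name</a></li>
</ul>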

Here, xxx is the ID of the plug-in’s post; you can find it at the end of the post’s URL or in the _posts table of your WordPress database. In my case the ID was 391, the table name was wp_posts, the site URL was www.geekwidget.com and the post name was Sitemap Widget, and it looked like this:


Header Tweak
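Incidentally, instead of digging through the wp_posts table by hand, you can ask WordPress for the ID itself. A minimal sketch using the $wpdb global (the sitemap-widget slug is my assumption; substitute your own post’s slug):

<?php
// Look up the sitemap post's ID by its slug ('sitemap-widget' is a guess).
global $wpdb;
$sitemap_id = $wpdb->get_var("SELECT ID FROM {$wpdb->posts} WHERE post_name = 'sitemap-widget'");
echo $sitemap_id; // prints 391 in my case
?>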

 

Now, it was still a post. In some themes, like the one I am using, a post shows previous-post and next-post links whenever it is opened on its own. Since I was picky, as mentioned, I wanted to change that, too. I edited single.php in the same folder and put something like this:

<?php if ($post->ID != xxx) { ?>

<div>
<div><?php previous_post_link('&laquo; %link'); ?></div>
<div><?php next_post_link('%link &raquo;'); ?></div>
</div>

<?php } ?>

The xxx part was again the plug-in’s post ID. In my case it looked like this:


Single Tweak
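With my ID filled in, the guard condition became simply:

<?php if ($post->ID != 391) { ?>

so the previous/next links are rendered for every post except the sitemap one.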

 

After that I had what I wanted:


New Sitemap


Google’s SEO Starter Guide – Webmaster Tools

March 26th, 2010

 

Make Use of Free Webmaster Tools


Major search engines, including Google, provide free tools for webmasters. Google’s Webmaster Tools help webmasters better control how Google interacts with their websites and get useful information from Google about their site. Using Webmaster Tools won’t help your site get preferential treatment; however, it can help you identify issues that, if addressed, can help your site perform better in search results. With the service, webmasters can:

 

• see which parts of a site Googlebot had problems crawling
• upload an XML Sitemap file (see the sketch after this list)
• analyze and generate robots.txt files
• remove URLs already crawled by Googlebot
• specify the preferred domain
• identify issues with title and description meta tags
• understand the top searches used to reach a site
• get a glimpse at how Googlebot sees pages
• remove unwanted sitelinks that Google may use in results
• receive notification of quality guideline violations and file for a site reconsideration
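For reference, a minimal XML Sitemap file looks something like this (a sketch with a placeholder URL):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.example.com/</loc>
<lastmod>2010-03-26</lastmod>
</url>
</urlset>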

 

Yahoo! (Yahoo! Site Explorer) and Microsoft (Live Search Webmaster Tools) also offer free tools for webmasters.

 

Google’s SEO Starter Guide – robots.txt

March 24th, 2010

 

Make Effective Use of robots.txt


A “robots.txt” file tells search engines whether they can access and therefore crawl parts of your site. This file, which must be named “robots.txt”, is placed in the root directory of your site.



SEO Guide Robots 1


The address of our robots.txt file



SEO Guide Robots 2


All compliant search engine bots (denoted by the wildcard * symbol) shouldn’t access and crawl the content under /images/ or any URL whose path begins with /search.
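Spelled out, the file the caption describes would read:

User-agent: *
Disallow: /images/
Disallow: /search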


You may not want certain pages of your site crawled because they might not be useful to users if found in a search engine’s search results. If you do want to prevent search engines from crawling your pages, Google Webmaster Tools has a friendly robots.txt generator to help you create this file. Note that if your site uses subdomains and you wish to have certain pages not crawled on a particular subdomain, you’ll have to create a separate robots.txt file for that subdomain. For more information on robots.txt, we suggest this Webmaster Help Center guide on using robots.txt files.

There are a handful of other ways to prevent content from appearing in search results, such as adding “NOINDEX” to your robots meta tag, using .htaccess to password-protect directories, and using Google Webmaster Tools to remove content that has already been crawled. Google engineer Matt Cutts walks through the caveats of each URL-blocking method in a helpful video.
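The robots meta tag variant, for instance, is a single line inside a page’s <head>:

<meta name="robots" content="noindex">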


Good practices for robots.txt

 

• Use more secure methods for sensitive content – You shouldn’t feel comfortable using robots.txt to block sensitive or confidential material. One reason is that search engines could still reference the URLs you block (showing just the URL, no title or snippet) if there happen to be links to those URLs somewhere on the Internet (like referrer logs). Also, non-compliant or rogue search engines that don’t acknowledge the Robots Exclusion Standard could disobey the instructions of your robots.txt. Finally, a curious user could examine the directories or subdirectories in your robots.txt file and guess the URL of the content that you don’t want seen. Encrypting the content or password-protecting it with .htaccess are more secure alternatives.
Avoid:
•• allowing search result-like pages to be crawled (users dislike leaving one search result page and landing on another search result page that doesn’t add significant value for them)
•• allowing a large number of auto-generated pages with the same or only slightly different content to be crawled: “Should these 100,000 near-duplicate pages really be in a search engine’s index?”
•• allowing URLs created as a result of proxy services to be crawled