Search Engine Optimization

When the Web was younger, the search engine field was all but wide open. There were lots of major search engines, including AltaVista, Excite, HotBot, and Webcrawler. This proliferation of search engines had both advantages and disadvantages. One disadvantage was that you had to make sure you had submitted your site to several different places. One advantage was that you had several inflows of search-engine-spawned traffic.

Google's Importance to Webmasters

It's becoming ever more important what Google thinks of your site. That means you're going to have to be sure that your site abides by Google's rules or risk not being picked up. If you're very concerned about search engine traffic, you're going to have to make sure that your site is optimized for luring in Google's spiders and being indexed in an effective manner. And if you're concerned that Google should not index some parts of your site, you need to understand the ins and outs of configuring your robots.txt file to reflect your preferences.
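As a rough illustration of what a robots.txt preference looks like in practice, the sketch below feeds a made-up robots.txt to Python's standard urllib.robotparser module and asks which URLs a given robot may fetch. The rules, the www.example.com URLs, and the is_allowed helper are assumptions for the example, not rules Google publishes.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot is kept out of /private/ only;
# every other robot is kept out of /private/ and /drafts/.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /private/
Disallow: /drafts/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/index.html", "/private/notes.html", "/drafts/new.html"):
        url = "http://www.example.com" + path
        print(path, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

Checking your rules this way before you upload them is a cheap way to catch a Disallow line that blocks more than you intended.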

The Mysterious PageRank

You'll hear a lot of people talk about Google's PageRank, bragging about attaining the misty heights of rank seven or eight, talking in hushed tones of sites that have achieved nine or ten. PageRanks range from 0 (sites that haven't been ranked or have been penalized) to 10 (reserved only for the most popular sites like Yahoo! and Google itself). The only place where you can actually see what PageRank a given URL has is the Google Toolbar, though you can get some idea of popularity from the Google Directory. Listings in the Google Directory have a green bar next to them that gives you a good idea of the listing's popularity without providing an exact number.

Google has never provided the entire formula for its PageRank, so all you'll find in this book is conjecture. And it wouldn't surprise me to learn that it's changing all the time; as millions of people try myriad things to make sure their pages rank better, Google has to take these efforts into account and (sometimes) react against them.

Why is PageRank so important? Because Google uses it as one aspect of determining how a given URL will rank among millions of possible search results. But that's only one aspect. The other aspects are determined via Google's ranking algorithm.

The Equally Mysterious Algorithm

If you thought Google was close-mouthed about how it determines PageRank, it's an absolute oyster when it comes to the ranking algorithm, the way that Google determines the order of search results. The articles in the book can give you some ideas, but again it's conjecture and again it's constantly changing. Your best bet is to create a content-rich web site and update it often. Google appreciates good content.

Of course, getting listed in Google's index is not the only way to tell visitors about your site. You also have the option to advertise on Google.

A Webmaster's Introduction to Google

The cornerstone of any good search engine is highly relevant results. Google's unprecedented success has been due to its uncanny ability to match quality information with a user's search terms. The core of Google's search results is based upon a patented algorithm called PageRank.

There is an entire industry focused on getting sites listed near the top of search engines. Google has proven to be the toughest search engine for a site to do well on. Even so, it isn't all that difficult for a new web site to get listed and begin receiving some traffic from Google.

It can be a daunting task to learn the ins and outs of getting your site listed with any search engine. There is a vast array of information about search engines on the Web, and not all of it is useful or proper. This discussion of getting your site into the Google database focuses on long term techniques for successfully promoting your site through Google. It will stay well away from some of the common misconceptions and problems that a new site owner faces.

Search Engine Basics

When you type in a search term at a search engine, it looks up potential matches in its database. It then presents the best web page matches first. How those web pages get into the database, and consequently, how you can get yours in there too, is a three step process:

  1. A search engine visits a site with an automated program called a spider (sometimes they're also called robots). A spider is just a program, similar to a web browser, that downloads your site's pages. It doesn't actually display the page anywhere; it just downloads the page data.
  2. After the spider has acquired the page, the search engine passes the page to a program called an indexer. An indexer is another robotic program that extracts most of the visible portions of the page. The indexer also analyzes the page for keywords, the title, links, and other important information contained in the code.
  3. The search engine adds your site to its database and makes it available to searchers. The greatest difference between search engines is in this final step where rankings or results positions under a particular keyword are determined.
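The three steps above can be sketched in miniature. The example below is a toy under stated assumptions, not how Google's indexer works: the PAGES dictionary stands in for what a spider has already downloaded, PageIndexer (a hypothetical name) pulls out the title, links, and visible words, and build_index files each word in a small lookup table that plays the role of the database.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class PageIndexer(HTMLParser):
    """Toy indexer: collects the title, link targets, and visible words."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self.words = []
        self._in_title = False
        self._skip = 0  # depth inside <script>/<style>, which are not visible

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if self._skip:
            return
        if self._in_title:
            self.title += data.strip()
        # Title words are indexed along with the body text.
        self.words.extend(re.findall(r"[a-z0-9]+", data.lower()))


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, html in pages.items():
        indexer = PageIndexer()
        indexer.feed(html)
        for word in indexer.words:
            index[word].add(url)
    return index


# Stands in for pages a spider has already downloaded (made-up content).
PAGES = {
    "http://www.example.com/": (
        "<html><head><title>Fruit</title></head><body>"
        "<p>Fresh apples and oranges.</p>"
        "<a href='/veggies.html'>Veggies</a></body></html>"
    ),
    "http://www.example.com/veggies.html": (
        "<html><head><title>Veggies</title></head><body>"
        "<p>Fresh carrots.</p></body></html>"
    ),
}

if __name__ == "__main__":
    index = build_index(PAGES)
    print(sorted(index["fresh"]))
```

A real engine differs most in what happens after this point: the lookup table only says which pages match, not in what order to show them.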

Submitting Your Site to Google

For the site owner, the first step is to get your pages listed in the database. There are two ways to get added. The first is direct submission of your site's URL to Google via its add URL or Submission page. To counter programmed robots, search engines routinely move submission pages around on their sites. You can currently find Google's submission page linked from their Help pages or Webmaster Info pages (http://www.google.com/addurl.html).

Just visit the add URL page and enter the main index page for your site into the Google submission page form, and press submit. Google's spider (called GoogleBot) will visit your page usually within four weeks. The spider will traverse all pages on your site and add them to its index. Within eight weeks, you should be able to find your site listed in Google.

The second way to get your site listed in Google is to let Google find you. It does this based upon links that may be pointing to your site. Once GoogleBot finds a link to your site from a page it already has in its index, it will visit your site.

Google has been updating its database on a monthly basis for three years. It sends its spider out in crawler mode once a month, too. Crawler mode is a special mode in which a spider traverses, or crawls, the entire Web. As it runs into links to pages, it indexes those pages in a never-ending attempt to download all the pages it can. Once your pages are listed in Google, they are revisited and updated on a monthly basis. If you frequently update your content, Google may index your search terms more often.

Once you are indexed and listed in Google, the next natural question for a site owner is, “How can I rank better under my applicable search terms?”

The Search Engine Optimization Template

Use your targeted keyword in each of the following places:

  • In META keywords. It's not necessary for Google, but a good habit. Keep your META keywords short (128 characters max, or 10).
  • In META description. Keep keyword close to the left but in a full sentence.
  • In the title at the far left but possibly not as the first word.
  • In the top portion of the page in the first sentence of first full bodied paragraph (plain text: no bold, no italic, no style).
  • In an H3 or larger heading.
  • In bold - second paragraph if possible and anywhere but the first usage on page.
  • In italic - anywhere but the first usage.
  • In subscript/superscript.
  • In URL (directory name, filename, or Domain name). Do not duplicate the keyword in the URL.
  • In an image filename used on the page. In the ALT attribute of that image.
  • In the title attribute of that image.
  • In link text to another site.
  • In an internal link's text.
  • In the title attribute of all links targeted in and out of the page.
  • In the filename of your external CSS (Cascading Style Sheet) or JavaScript file.
  • In an inbound link on site (preferably from your home page).
  • In an inbound link from off site (if possible).
  • In a link to a site that has a PageRank of 8 or better.
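Several items in the template above can be checked mechanically. The sketch below is one possible approach, assuming you have the page's HTML as a string: it uses Python's standard html.parser to report whether a keyword appears in a handful of the listed places. The place names, the PlacementChecker class, and the sample page are all made up for the example, and it covers only a subset of the list.

```python
from html.parser import HTMLParser

# Tag -> the template "place" it counts toward (a subset of the list above).
TAG_PLACES = {
    "title": "title", "h1": "heading", "h2": "heading", "h3": "heading",
    "b": "bold", "strong": "bold", "i": "italic", "em": "italic",
    "a": "link text",
}
ATTR_PLACES = ("meta keywords", "meta description", "img alt")


class PlacementChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.found = {}   # place -> text collected for that place
        self._open = []   # tracked tags that are currently open

    def _add(self, place, text):
        self.found[place] = self.found.get(place, "") + " " + text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = (attrs.get("name") or "").lower()
        if tag == "meta" and name in ("keywords", "description"):
            self._add("meta " + name, attrs.get("content") or "")
        elif tag == "img":
            self._add("img alt", attrs.get("alt") or "")
        elif tag in TAG_PLACES:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if tag in self._open:
            # Close this tag and anything left open inside it.
            while self._open.pop() != tag:
                pass

    def handle_data(self, data):
        for tag in set(self._open):
            self._add(TAG_PLACES[tag], data)


def keyword_placements(html, keyword):
    """Return {place: True/False} for each place this sketch knows about."""
    checker = PlacementChecker()
    checker.feed(html)
    places = sorted(set(TAG_PLACES.values()) | set(ATTR_PLACES))
    return {p: keyword.lower() in checker.found.get(p, "").lower() for p in places}


# Made-up page used only to exercise the checker.
SAMPLE = """<html><head><title>Fresh oranges by mail</title>
<meta name="description" content="We ship fresh oranges anywhere.">
</head><body><h2>Why our oranges?</h2>
<p>Our oranges are picked daily. <b>Oranges</b> keep well.</p>
<img src="oranges.jpg" alt="A crate of oranges">
</body></html>"""

if __name__ == "__main__":
    for place, present in keyword_placements(SAMPLE, "oranges").items():
        print(f"{place:18} {'yes' if present else 'no'}")
```

A report like this only tells you where the keyword is missing; whether each placement actually helps is, as noted earlier, conjecture.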

Other search engine optimization things to consider include

  • Use “last modified” headers if you can.
  • Validate that HTML. Some feel Google's parser has become stricter at parsing instead of milder. It will miss an entire page because of a few simple errors - we have tested this in depth.
  • Use an HTML template throughout your site. Google can spot the template and parse it off. (Of course, this also means they are pretty good at spotting duplicate content.)
  • Keep the page as an .html or .htm extension. Any dynamic extension is a risk.
  • Keep the HTML below 20K. 5-15K is the ideal range.
  • Keep the ratio of text to HTML very high. Text should outweigh HTML by significant amounts.
  • Double check your page in Netscape, Opera, and IE. Use Lynx if you have it.
  • Use only raw HREFs for links. Keep JavaScript far, far away from links. The simpler the link code the better.
  • The traffic comes when you figure out that 1 referral a day to 10 pages is better than 10 referrals a day to 1 page.
  • Don't assume that keywords in your site's navigation template will be worth anything at all. Google looks for full sentences and paragraphs. Keywords just lying around orphaned on the page are not worth as much as when used in a sentence.
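Two of the points above, page size and the ratio of text to HTML, are easy to measure. The sketch below is a rough gauge, not Google's measure: it treats "text" as everything outside tags, scripts, and stylesheets, and reports it as a share of the page's total bytes. The page_stats name and the choice to compare against 20K (20 * 1024 bytes) are assumptions for the example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def page_stats(html: str) -> dict:
    """Return page size, text size, and the text's share of the page."""
    extractor = TextExtractor()
    extractor.feed(html)
    text = " ".join(" ".join(extractor.chunks).split())  # collapse whitespace
    html_bytes = len(html.encode("utf-8"))
    text_bytes = len(text.encode("utf-8"))
    return {
        "html_bytes": html_bytes,
        "text_bytes": text_bytes,
        "text_share": text_bytes / html_bytes if html_bytes else 0.0,
        "under_20k": html_bytes < 20 * 1024,
    }


if __name__ == "__main__":
    print(page_stats("<html><body><p>Hello world</p></body></html>"))
```

Run it against a few of your own pages; a page whose text share is very low is carrying far more markup than content.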

Generating Google AdWords

You've written the copy and you've planned the budget. Now, what keywords are you going to use for your ad?

You've read about it and you've thought about it and you're ready to buy one of Google's AdWords. You've even got your copy together and you feel pretty confident about it. You've only got one problem now: figuring out your keywords, the search words that will trigger your AdWord to appear.

You're probably buying into the AdWords program on a budget, and you definitely want to make every penny count. Choosing the right keywords means that your ad will have a higher click-through rate. Thankfully, the Google AdWords program allows you to do a lot of tweaking, so if your first choices don't work, experiment, test, and test some more!

Choosing AdWords

So where do you get the search keywords for your ad? There are four places that might help you find them:

Log files: Examine your site's log files. How are people finding your site now? What words are they using? What search engines are they using? Are the words they're using too general to be used for AdWords? If you look at your log files, you can get an idea of how people who are interested in your content are finding your site. (If they weren't interested in your content, why would they visit?)

Examine your own site: If you have an internal search engine, check its logs. What are people searching for once they get to your site? Are there any common phrases you could use?

Brainstorm: What do people think of when they look at your site? What keywords do you want them to think of? Brainstorm about the products that are most closely associated with your site. What words come up?

Imagine someone goes to a store and asks about your products. How are they going to ask? What words would they use? Consider all the different ways someone could look for or ask about your product or service, and then consider if there's a set of words or a phrase that pops up over and over again.

Glossaries: If you've brainstormed until wax dribbles out your ears but you're no closer to coming up with words relevant to your site or product, visit some online glossaries to jog your brain. The Glossarist (http://www.glossarist.com) links to hundreds of glossaries on hundreds of different subjects. Check and see if they have a glossary relevant to your product or service, and see if you can pull some words from there.
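The log-file suggestion above is the easiest of the four to automate. The following sketch assumes your server writes the common "combined" log format, in which the referrer is the second-to-last quoted field, and that search engines carry the query in a URL parameter such as q or p. Both are assumptions, the parameter list is not complete, and the sample log lines are invented.

```python
import re
from collections import Counter
from urllib.parse import parse_qs, urlparse

# Query-string parameters assumed to hold the search phrase.
QUERY_PARAMS = ("q", "p", "query")
QUOTED = re.compile(r'"([^"]*)"')


def search_phrases(log_lines):
    """Count the search phrases found in the referrer field of each line."""
    counts = Counter()
    for line in log_lines:
        fields = QUOTED.findall(line)
        if len(fields) < 3:  # need request, referrer, and user agent
            continue
        referrer = fields[-2]
        params = parse_qs(urlparse(referrer).query)
        for name in QUERY_PARAMS:
            for phrase in params.get(name, []):
                counts[phrase.strip().lower()] += 1
    return counts


# Invented sample lines in combined log format.
SAMPLE_LOG = [
    '10.0.0.1 - - [01/Jan/2003:10:00:00 +0000] "GET / HTTP/1.0" 200 5120 '
    '"http://www.google.com/search?q=fresh+oranges" "Mozilla/4.0"',
    '10.0.0.2 - - [01/Jan/2003:10:05:00 +0000] "GET / HTTP/1.0" 200 5120 '
    '"http://www.google.com/search?q=Fresh+Oranges&hl=en" "Mozilla/4.0"',
    '10.0.0.3 - - [01/Jan/2003:10:06:00 +0000] "GET /about.html HTTP/1.0" 200 2048 '
    '"-" "Mozilla/4.0"',
    '10.0.0.4 - - [01/Jan/2003:10:09:00 +0000] "GET / HTTP/1.0" 200 5120 '
    '"http://search.yahoo.com/search?p=orange+citrus+fruit" "Mozilla/4.0"',
]

if __name__ == "__main__":
    for phrase, count in search_phrases(SAMPLE_LOG).most_common():
        print(count, phrase)
```

The phrases that come out on top are candidates for your keyword list; the ones that look too general are the ones to question before you spend money on them.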

Exploring Your Competitors' AdWords

Once you've got a reasonable list of potential keywords for your ad, take them and run them in the Google search engine. Google rotates advertisements based on the spending cap for each campaign, so even after running a search three or four times you may see different advertisements each time. Use the AdWords scraper to save these ads to a file and review them later.

If you find a potential keyword that apparently contains no advertisements, make a note. When you're ready to buy an AdWord, you'll have to check its frequency; it might not be searched often enough to be a lucrative keyword for you. But if it is, you'll have found a potential advertising spot with no other ads competing for searchers' attention.

Inside the PageRank Algorithm

Delving into the inner workings of Google's PageRank algorithm and how it affects results.

What is PageRank?

PageRank is the algorithm used by the Google search engine, originally formulated by Sergey Brin and Larry Page in their paper “The Anatomy of a Large-Scale Hypertextual Web Search Engine.”

It is based on the premise, prevalent in the world of academia, that the importance of a research paper can be judged by the number of citations the paper has from other research papers. Brin and Page have simply transferred this premise to its web equivalent: the importance of a web page can be judged by the number of hyperlinks pointing to it from other web pages.

So What Is the Algorithm?

It may look daunting to non-mathematicians, but the PageRank algorithm is in fact elegantly simple and is calculated as follows:

PR(A) = (1 - d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))

where:

  • PR(A) is the PageRank of a page A.
  • PR(T1) is the PageRank of a page T1.
  • C(T1) is the number of outgoing links from the page T1.
  • d is a damping factor in the range 0 < d < 1, usually set to 0.85.

The PageRank of a web page is therefore calculated as a sum of the PageRanks of all pages linking to it (its incoming links), divided by the number of links on each of those pages (its outgoing links).
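The calculation can be tried on a small scale. The sketch below applies the formula from Brin and Page's paper, PR(A) = (1 - d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)), repeatedly to a made-up three-page site until the values settle. The pagerank function, the page names, and the number of iterations are choices made for the example; it says nothing about how Google computes PageRank at scale.

```python
def pagerank(links, d=0.85, iterations=100, start=1.0):
    """Iterate the PageRank formula over a small link graph.

    links maps each page to the list of pages it links to.
    """
    pages = list(links)
    pr = {page: start for page in pages}
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Sum PR(T) / C(T) over every page T that links to this page.
            incoming = sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_pr[page] = (1 - d) + d * incoming
        pr = new_pr
    return pr


# Hypothetical site: A links to B and C, B links to C, C links back to A.
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}

if __name__ == "__main__":
    for page, value in pagerank(LINKS).items():
        print(page, round(value, 4))
```

In this little graph, page B ends up with the lowest value because its only incoming link comes from a page that splits its PageRank between two outgoing links.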

And What Does This Mean?

From a search engine marketer's point of view, this means there are two ways in which PageRank can affect the position of your page on Google:

  • The number of incoming links. Obviously the more of these, the better. But there is another thing the algorithm tells us: no incoming link can have a negative effect on the PageRank of the page it points at. At worst, it can simply have no effect at all.
  • The number of outgoing links on the page that points to your page. The fewer of these, the better. This is interesting: it means given two pages of equal PageRank linking to you, one with 5 outgoing links and the other with 10, you will get twice the increase in PageRank from the page with only 5 outgoing links.
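The second point is simple arithmetic on one term of the formula: each link passes on d * PR(T) / C(T). A minimal sketch, assuming two linking pages that both have a PageRank of 4 (an arbitrary figure) and the usual damping factor of 0.85:

```python
def link_contribution(pr_of_linking_page, outgoing_links, d=0.85):
    """PageRank passed on by one link: d * PR(T) / C(T)."""
    return d * pr_of_linking_page / outgoing_links


# Two hypothetical pages of equal PageRank, with 5 and 10 outgoing links.
from_five = link_contribution(4.0, 5)
from_ten = link_contribution(4.0, 10)

if __name__ == "__main__":
    print(f"page with 5 links passes on {from_five:.2f}")
    print(f"page with 10 links passes on {from_ten:.2f}")
    print(f"ratio: {from_five / from_ten:.1f}")
```

Whatever PageRank the two pages share, the ratio stays the same, because halving the number of outgoing links doubles each link's share.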

At this point, we take a step back and ask ourselves just how important PageRank is to the position of your page in the Google search results.

The next thing we can observe about the PageRank algorithm is that it has nothing whatsoever to do with relevance to the terms of a search query. It is simply one single (admittedly important) part of the entire Google relevance ranking algorithm.

Perhaps a good way to look at PageRank is as a multiplying factor, applied to the Google search results after all its other computations have been completed. The Google algorithm first calculates the relevance of pages in its index to the search terms, and then multiplies this relevance by the PageRank to produce a final list. The higher your PageRank, therefore, the higher up the results you will be, but there are still many other factors related to the positioning of words on the page that must be considered first.

So What's the Use of the PageRank Calculator?

If no incoming link has a negative effect, surely I should just get as many as possible regardless of the number of outgoing links on its page?

Well, not entirely. The PageRank algorithm is very cleverly balanced. Just like the conservation of energy in physics with every reaction, PageRank is also conserved with every calculation. For instance, if a page with a starting PageRank of 4 has two outgoing links on it, we know that the amount of PageRank it passes on is divided equally between all its outgoing links. In this case, 4 / 2 = 2 units of PageRank is passed on to each of 2 separate pages, and 2 + 2 = 4 - so the total PageRank is preserved!

There are scenarios where you may find that total PageRank is not conserved after a calculation. PageRank itself is supposed to represent a probability distribution, with the individual PageRank of a page representing the likelihood of a “random surfer” chancing upon it.

On a much larger scale, supposing Google's index contains a billion pages, each with a PageRank of 1, the total PageRank across all pages is equal to a billion. Moreover, each time we recalculate PageRank, no matter what changes in PageRank may occur between individual pages, the total PageRank across all 1 billion pages will still add up to a billion.

First, this means that although we may not be able to change the total PageRank across all pages, by strategic linking of pages within our site, we can affect the distribution of PageRank between pages. For instance, we may want most of our visitors to come into the site through our home page. We would therefore want our home page to have a higher PageRank relative to other pages within the site. We should also recall that all the PageRank of a page is passed on and divided equally between each outgoing link on a page. We would therefore want to keep as much combined PageRank as possible within our own site without passing it on to external sites and losing its benefit. This means we would want any page with lots of external links (i.e., links to other people's web sites) to have a lower PageRank relative to other pages within the site to minimize the amount of PageRank that is “leaked” to external sites. Bear in mind also our earlier statement, that PageRank is simply a multiplying factor applied once Google's other calculations regarding relevance have already been calculated. We would therefore want our more keyword-rich pages to also have a higher relative PageRank.

Second, if we assume that every new page in Google's index begins its life with a PageRank of 1, there is a way we can increase the combined PageRank of pages within our site - by increasing the number of pages! A site with 12 pages will therefore start with a combined PageRank of 12, which is then redistributed through its hyperlinks. We can thus improve the PageRank of our site as a whole by creating new content (i.e., more pages), and then control the distribution of that combined PageRank through strategic interlinking between the pages.

And this is the purpose of the PageRank Calculator - to create a model of the site on a small scale including the links between pages, and see what effect the model has on the distribution of PageRank.

How Does the PageRank Calculator Work?

To get a better idea of the realities of PageRank, visit the PageRank Calculator (http://www.markhorrell.com/seo/pagerank.asp).

It's very simple really. Start by typing in the number of interlinking pages you wish to analyze and hit Submit. I have confined this number to just 20 pages to ease server resources. Even so, this should give a reasonable indication of how strategic linking can affect the PageRank distribution.

Next, for ease of reference once the calculation has been performed, provide a label for each page (e.g., Home Page, Links Page, Contact Us Page, etc.) and again hit Submit.

Finally, use the list boxes to select which pages each page links to. You can use Ctrl and Shift to highlight multiple selections.

You can also use this screen to change the initial PageRanks of each page. For instance, if one of your pages is supposed to represent Yahoo!, you may wish to raise its initial PageRank to, say, 3. In fact, however, a page's starting PageRank is irrelevant to its final computed value. In other words, even if one page were to start with a PageRank of 100, after many iterations of the equation (see below), the final computed PageRank will converge to the same value as it would had it started with a PageRank of only 1!
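You can check the claim about starting values without the calculator. The sketch below runs the same published formula, PR(A) = (1 - d) + d * sum of PR(T)/C(T), on a made-up four-page model twice: once with every page starting at 1, and once with the "Yahoo" page starting at 100. The page names, link structure, and iteration count are assumptions for the example. It also prints the total, which in this model equals the number of pages because every page has at least one outgoing link.

```python
def pagerank(links, start, d=0.85, iterations=200):
    """Iterate the PageRank formula from the given starting values."""
    pr = dict(start)
    for _ in range(iterations):
        # Each pass computes every page's new value from the previous pass.
        pr = {
            page: (1 - d) + d * sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            for page in links
        }
    return pr


# Hypothetical four-page model; every page has at least one outgoing link.
LINKS = {
    "Home": ["Links", "Contact"],
    "Links": ["Home", "Yahoo"],
    "Contact": ["Home"],
    "Yahoo": ["Home"],
}

even = pagerank(LINKS, {page: 1.0 for page in LINKS})
skewed = pagerank(LINKS, {"Home": 1.0, "Links": 1.0, "Contact": 1.0, "Yahoo": 100.0})

if __name__ == "__main__":
    for page in LINKS:
        print(f"{page:8} {even[page]:.4f} {skewed[page]:.4f}")
    print("total:", round(sum(even.values()), 4))
```

The two columns agree once the iteration has settled, which is why the initial values you type into the calculator matter so little.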

You can play around with the damping factor d, which defaults to 0.85 as this is the value quoted in Brin and Page's research paper.

26 Steps to 15K a Day

Solid content thoughtfully prepared can make more impact than a decade's worth of fiddling with META tags and building the perfect title page.

Too often, getting visitors from search engines is boiled down to a succession of tweaks that may or may not work. But as Brett Tabke shows in this section, solid content thoughtfully put together can make more impact than a decade's worth of fiddling with META tags and building the perfect title page.

From A to Z, following these 26 steps will build you a successful site, bringing in plenty of visitors from Google.

A. Prep Work

Do your prep work and begin building content. Long before the domain name is settled on, start putting together notes to build at least a 100-page site. That's just for openers. That's 100 pages of “real content,” as opposed to link pages, resource pages, about, copyright - necessary but not content-rich pages.

Can't think of 100 pages' worth of content? Consider articles about your business or industry, Q&A pages, or back issues of an online newsletter.

B. Choose a Brandable Domain Name

Choose a domain name that's easily brandable. You want Google.com and not Mykeyword.com.

Keyword domains are out; branding and name recognition are in. Big time in. The value of keywords in a domain name has never been less to search engines. Learn the lesson of Goto.com becoming Overture.com and why they did it. It's one of the most powerful gut-check calls I've ever seen on the Internet. That took resolve and nerve to blow away several years of branding. (That's a whole 'nuther article, but learn the lesson as it applies to all of us.)

C. Site Design

The simpler your site design, the better. As a rule of thumb, text content should outweigh the HTML content. The pages should validate and be usable in everything from Lynx to leading browsers. In other words, keep it close to HTML 3.2 if you can. Spiders are not yet at the point where they really like eating HTML 4.0 and the mess that it can bring. Stay away from heavy Flash, Java, or JavaScript.

Go external with scripting languages if you must have them, though there's little reason to have them that I can see. They will rarely help a site and stand to hurt it greatly due to many factors most people don't appreciate (the search engines' distaste for JavaScript is just one of them). Arrange the site in a logical manner with directory names hitting the top keywords you wish to emphasize. You can also go the other route and just throw everything into the top level of the directory (this is rather controversial, but it's been producing good long term results across many engines). Don't clutter and don't SPAM your site with frivolous links like “best viewed” or other things like counters. Keep it clean and professional to the best of your ability.

Learn the lesson of Google itself: simple is retro cool. Simple is what surfers want.

Speed isn't everything, it's almost the only thing. Your site should respond almost instantly to a request. If your site has three to four seconds' delay until “something happens” in the browser, you are in long-term trouble. That three to four seconds' response time may vary for sites destined to be viewed in countries other than your native one. The site should respond locally within three to four seconds (maximum) to any request. Longer than that, and you'll lose 10% of your audience for each additional second. That 10% could be the difference between success and not.

D. Page Size

The smaller the page size, the better. Keep it under 15K, including images, if you can. The smaller the better. Keep it under 12K if you can. The smaller the better. Keep it under 10K if you can - I trust you are getting the idea here. Over 5K and under 10K is ideal. It's tough to do, but it's worth the effort. Remember, 80% of your surfers will be at 56K or even less.
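To see why these sizes matter on a 56K modem, here is a back-of-the-envelope calculation. It is a best-case sketch that assumes the modem really delivers 56,000 bits per second and ignores latency, compression, and protocol overhead, so real pages will take longer. The download_seconds name is made up for the example.

```python
def download_seconds(page_bytes, modem_bits_per_second=56_000):
    """Best-case transfer time, ignoring latency and protocol overhead."""
    return page_bytes * 8 / modem_bits_per_second


if __name__ == "__main__":
    for kilobytes in (5, 10, 15, 50):
        seconds = download_seconds(kilobytes * 1024)
        print(f"{kilobytes:>3}K page: about {seconds:.1f} s")
```

Even under these generous assumptions, a 15K page uses up a couple of seconds of the three to four seconds' budget described under Site Design, and a 50K page blows well past it.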

E. Content

Build one page of content (between 200 - 500 words) per day and put it online.

If you aren't sure what you need for content, start with the Overture keyword suggestor (http://inventory.overture.com/d/searchinventory/suggestion/) and find the core set of keywords for your topic area. Those are your subject starts.

F. Keyword Density and Keyword Positioning

This is simple, old fashioned, SEO (Search Engine Optimization) from the ground up.

Use the keyword once in title, once in description tag, once in a heading, once in the URL, once in bold, once in italic, once high on the page, and make sure the density is between 5 & 20% (don't fret about it). Use good sentences and spell check them! Spell checking is becoming important as search engines are moving to auto correction during searches. There is no longer a reason to look like you can't spell.
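Keyword density is easy to measure for yourself. Definitions vary; the sketch below assumes one common one: the number of words that belong to an occurrence of the keyword phrase, divided by the total number of words in the visible text. The keyword_density function and the sample sentence are made up for the example.

```python
import re

WORD = re.compile(r"[a-z0-9']+")


def keyword_density(text, keyword):
    """Share of words in text that belong to an occurrence of keyword (0.0 to 1.0)."""
    words = WORD.findall(text.lower())
    target = WORD.findall(keyword.lower())
    if not words or not target:
        return 0.0
    n = len(target)
    hits, i = 0, 0
    while i <= len(words) - n:
        if words[i:i + n] == target:
            hits += 1
            i += n  # count non-overlapping occurrences only
        else:
            i += 1
    return hits * n / len(words)


if __name__ == "__main__":
    sample = "Fresh oranges daily. We ship oranges."
    print(f"{keyword_density(sample, 'oranges'):.0%}")
```

Feed it the visible text of a page rather than the raw HTML, or the markup will dilute the figure.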

G. Outbound Links

From every page, link to one or two high ranking sites under the keyword you're trying to emphasize. Use your keyword in the link text (this is ultra important for the future).

H. Cross-Link

Cross links are links within the same site.

Link to on-topic quality content across your site. If a page is about food, make sure it links to the apples and veggies page. With Google, on-topic cross-linking is very important for sharing your PageRank value across your site. You do not want an “all star” page that outperforms the rest of your site. You want 50 pages that produce one referral each a day; you don't want one page that produces 50 referrals a day. If you do find one page that drastically outproduces the rest of the site with Google, you need to offload some of that PageRank value to other pages by cross-linking heavily. It's the old share-the-wealth thing.

I. Put it Online

Don't go with virtual hosting; go with a standalone IP address.

Make sure the site is “crawlable” by a spider. All pages should be linked to more than one other page on your site, and not more than two levels deep from the top directory. Link the topic vertically as much as possible back to the top directory. A menu that is present on every page should link to your site's main “topic index” pages (the doorways and logical navigation system down into real content). Don't put it online before you have a quality site to put online. It's worse to have a “nothing” site online than no site at all. You want it fleshed out from the start.
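One way to test crawlability before a spider does is to walk your own link structure. The sketch below reads "two levels deep" as two clicks from the home page, which is an interpretation rather than something the text spells out. It uses a breadth-first search over a hand-written link map to list pages no link reaches and pages that sit deeper than the limit. The site map and function names are hypothetical.

```python
from collections import deque


def click_depths(links, home):
    """Breadth-first search: fewest clicks from home to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def crawl_problems(links, home, max_depth=2):
    """Return (pages a spider cannot reach from home, pages deeper than max_depth)."""
    depths = click_depths(links, home)
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    unreachable = sorted(all_pages - set(depths))
    too_deep = sorted(p for p, depth in depths.items() if depth > max_depth)
    return unreachable, too_deep


# Hypothetical site map: page -> pages it links to.
SITE = {
    "/": ["/fruit.html", "/veggies.html"],
    "/fruit.html": ["/", "/apples.html"],
    "/apples.html": ["/fruit.html", "/apples/granny-smith.html"],
    "/apples/granny-smith.html": ["/"],
    "/veggies.html": ["/"],
    "/orphan.html": ["/"],
}

if __name__ == "__main__":
    unreachable, too_deep = crawl_problems(SITE, "/")
    print("unreachable:", unreachable)
    print("too deep:", too_deep)
```

Anything in the first list needs a link from somewhere a spider will actually visit; anything in the second is a candidate for a link from a topic index page.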

Go for a listing in the ODP (the Open Directory Project, http://dmoz.org/add.html). Getting accepted to the Open Directory Project will probably get your pages listed in the Google Directory.

J. Submit

Submit your main URL to: Google, FAST, AltaVista, Wisenut, Teoma, DirectHit, and Hotbot. Now comes the hard part: forget about submissions for the next six months. That's right, submit and forget.

K. Logging and Tracking

Get a quality logger/tracker that can do justice to inbound referrals based on log files. Don't use a graphic counter; you need a program that's going to provide much more information than that. If your host doesn't support referrals, back up and get a new host. You can't run a modern site without full referrals available 24/7/365 in real time.

L. Spiderings

Watch for spiders from search engines - one reason you need a good logger and tracker! Make sure those that are crawling the full site can do so easily. If not, double-check your linking system to make sure the spider found its way throughout the site. Don't fret if it takes two spiderings to get your whole site done by Google or FAST. Other search engines are pot luck; with them, it's doubtful that you will be added at all if you haven't been added within 6 months.

M. Topic Directories

Almost every keyword sector has an authority hub on its topic. Find it (Google Directory can be very helpful here, because you can view sites based on how popular they are) and submit within the guidelines.

N. Links

Look around your keyword section in the Google Directory; this is best done after getting an Open Directory Project listing - or two. Find sites that have link pages or freely exchange links. Simply request a swap. Put a page of on-topic, in-context links up on your site as a collection spot. Don't worry if you can't get people to swap links - move on. Try to swap links with one fresh site a day. A simple personal E-mail is enough. Stay low key about it and don't worry if site Z won't link to you. Eventually they will.

O. Content

Add one page of quality content per day. Timely, topical articles are always the best. Try to stay away from too much web logging of personal material and look more for “article” topics that a general audience will like. Hone your writing skills and read up on the right style of “web speak” that tends to work with the fast and furious web crowd: lots of text breaks - short sentences - lots of dashes - something that reads quickly.

Most web users don't actually read, they scan. This is why it is so important to keep key pages to a minimum. If people see a huge overblown page, a portion of them will hit the back button before trying to decipher it. They've got better things to do than waste 15 seconds (a stretch) trying to understand your whizbang menu system. Just because some big support site can run Flash-heavy pages is no indication that you can. You don't have the pull factor they do.

Use headers and bold standout text liberally on your pages as logical separators. I call them scanner stoppers where the eye will logically come to rest on the page.

P. Gimmicks

Stay far away from any “fads of the day” or anything that appears spammy, unethical, or tricky. Plant yourself firmly on the high ground in the middle of the road.

Q. Linkbacks

When you receive requests for links, check sites out before linking back to them. Check them through Google for their PageRank value. Look for directory listings. Don't link back to junk just because they asked. Make sure it is a site similar to yours and on topic. Linking to “bad neighborhoods,” as Google calls them, can actually cost you PageRank points.

R. Rounding Out Your Offerings

Use options such as “E-mail a friend,” forums, and mailing lists to round out your site's offerings. Hit the top forums in your market and read, read, read until your eyes hurt. Stay away from “affiliate fads” that insert content onto your site, like banners and pop-up windows.

S. Beware of Flyer and Brochure Syndrome

If you have an economical site or online version of bricks and mortar, be careful not to turn your site into a brochure. These don't work at all. Think about what people want. They aren't coming to your site to view “your content,” they are coming to your site looking for “their content.” Talk as little about your products and yourself as possible in articles (sounds counterintuitive, doesn't it?).

T. Keep Building One Page of Content Per Day

Head back to the Overture suggestion tool to get ideas for fresh pages.

U. Study Those Logs

After a month or two you will start to see a few referrals from places you've gotten listed. Look for the keywords people are using. See any bizarre combinations? Why are people using those to find your site? If there is something you have overlooked, then build a page around that topic. Engineer your site to feed the search engine what it wants. If your site is about oranges, but your referrals are all about orange citrus fruit, then you can get busy building articles around citrus and fruit instead of the generic oranges. The search engines will tell you exactly what they want to be fed; listen closely! There is gold in referral logs, it's just a matter of panning for it.

V. Timely Topics

Nothing breeds success like success of a site. This is where all that time you spend in forums will pay off. Here's the catch-22 about forums: lurking is almost useless. The value of a forum is in the interaction with your fellow colleagues and cohorts. You learn long term by the interaction, not by just reading. Networking will pay off in linkbacks, tips, E-mail exchanges, and will generally put you “in the loop” of your keyword sector.

X. Notes, Notes, Notes

If you build one page per day, you will find that brainstorm-like inspiration will hit you in the head at some magic point. Whether it is in the shower (dry off first), driving down the road (please pull over), or just parked at your desk, write it down! Ten minutes of work later, you will have forgotten all about that great idea you just had. Write it down and get detailed about what you are thinking. When the inspirational juices are no longer flowing, come back to those content ideas. It sounds simple, but it's a lifesaver when the ideas stop coming.

Y. Submission Check at Six Months

Walk back through your submissions and see if you got listed in all the search engines you submitted to after six months. If not, resubmit and forget again. Try those freebie directories again, too.

Z. Keep Building Those Pages of Quality Content!

Starting to see a theme here? Google loves content, lots of quality content. The content you generate should be based around a variety of keywords. At the end of a year's time, you should have around 400 pages of content. That will get you good placement under a wide range of keywords, generate reciprocal links, and overall position your site to stand on its own two feet.

Do those 26 things, and I guarantee you that in one year's time you will call your site a success. It will be drawing between 500 and 2,000 referrals a day from search engines. If you build a good site and achieve an average of 4 to 5 page views per visitor, you should be in the 10-15k page views per day range in one year's time. What you do with that traffic is up to you!

Being a Good Search Engine Citizen

Five don'ts and one do for getting your site indexed by Google.

A high ranking in Google can mean a great deal of traffic. Because of that, there are lots of people spending lots of time trying to figure out the infallible way to get a high ranking from Google. Add this. Remove that. Get a link from this. Don't post a link to that.

Submitting your site to Google to be indexed is simple enough. Google's got a site submission form (http://www.google.com/addurl.html), though they say if your site has at least a few inbound links (other sites that link to you), they should find you that way. In fact, Google encourages URL submitters to get listed on The Open Directory Project (DMOZ, http://www.dmoz.org/) or Yahoo! (http://yahoo.com/).

Nobody knows the holy grail secret of high PageRank without effort. Google uses a variety of elements, including page popularity, to determine PageRank. PageRank is one of the factors determining how high up a page appears in search results. But there are several things you should not be doing, combined with one big thing you absolutely should.

Does breaking one of these rules mean that you're automatically going to be thrown out of Google's index? No; there are over 2 billion pages in Google's index at this writing, and it's unlikely that they'll find out about your rule-breaking immediately. But there's a good chance they'll find out eventually. Is it worth having your site removed from the most popular search engine on the Internet?

Thou shalt not:

Cloak. “Cloaking” is when your web site is set up such that search engine spiders get different pages from those human surfers get. How does the web site know which are the spiders and which are the humans? By identifying the spider's User Agent or IP - the latter being the more reliable method.

An IP (Internet Protocol) address is the computer address from which a spider comes. Everything that connects to the Internet has an IP address. Sometimes the IP address is always the same, as with web sites. Sometimes the IP address changes - that's called a dynamic address. (If you use a dial-up modem, chances are good that every time you log on to the Internet your IP address is different. That's a dynamic IP address.)

A “User Agent” is a way a program that surfs the Web identifies itself. Internet browsers like Mozilla use User Agents, as do search engine spiders. There are literally dozens of different kinds of User Agents; see the Web Robots Database (http://www.robotstxt.org/wc/active.html) for an extensive list.

Advocates of cloaking claim that cloaking is useful to absolutely optimize content for spiders. Anticloaking critics claim that cloaking is an easy way to misrepresent site content - feeding a spider a page that's designed to get the site hits for pudding cups when actually it's all about baseball bats. You can get more details about cloaking and different perspectives on it at http://pandecta.com/search_engines/cloaking.html, http://www.apromotionguide.com/cloaking.html, and http://www.webopedia.com/TERM/C/cloaking.html

Hide text. Text is hidden by putting words or links in a web page that are the same color as the page's background - putting white words on a white background, for example. This is also called “font matching.” Why would you do this? Because a search engine spider could read the words you've hidden on the page while a human visitor couldn't. Again, doing this and getting caught could get you banned from Google's index, so don't.

That goes for other page content tricks too, like title stacking (putting multiple copies of a title tag on one page), putting keywords in comment tags, keyword stuffing (putting multiple copies of keywords in a very small font on a page), putting keywords not relevant to your site in your META tags, and so on. Google doesn't provide an exhaustive list of these types of tricks on their site, but any attempt to circumvent or fool their ranking system is likely to be frowned upon. Their attitude is more like: “You can do anything you want to with your pages, and we can do anything we want to with our index - like exclude your pages.”

Use doorway pages. Sometimes doorway pages are called “gateway pages.” These are pages that are aimed very specifically at one topic, which don't have a lot of their own original content, and which lead to the main page of a site (thus the name doorway pages).

For example, say you have a page devoted to cooking. You create doorway pages for several genres of cooking - French cooking, Chinese cooking, vegetarian cooking, etc. The pages contain terms and META tags relevant to each genre, but most of the text is a copy of all the other doorway pages, and all it does is point to your main site.

This is against Google's rules and annoying to the Google user; don't do it. You can learn more about doorway pages at http://searchenginewatch.com/webmasters/bridge.html or http://www.searchengineguide.com/whalen/2002/0530_jw1.html.

Check your link rank with automated queries. Using automated queries (except for the sanctioned Google API) is against Google's Terms of Service anyway. Using an automated query to check your PageRank every 12 seconds is triple bad; it's not what the search engine was built for and Google probably considers it a waste of their time and resources.

Link to “bad neighborhoods.” Bad neighborhoods are those sites that exist only to propagate links. Because link popularity is one aspect of how Google determines PageRank, some sites have set up “link farms” - sites that exist only for the purpose of building site popularity with bunches of links. The links are not topical, like a specialty subject index, and they're not well-reviewed, like Yahoo!; they're just a pile of links. Another example of a “bad neighborhood” is a general FFA page. FFA stands for “free for all”; it's a page where anyone can add their link. Linking to pages like that is grounds for a penalty from Google.

Now, what happens if a page like that links to you? Will Google penalize your page? No. Google accepts that you have no control over who links to your site.

Thou shalt:

Create great content. All the HTML contortions in the world will do you little good if you've got lousy, old, or limited content. If you create great content and promote it without playing search engine games, you'll get noticed and you'll get links. Remember Sturgeon's Law (“Ninety percent of everything is crud.”). Why not make your web site an exception?

What Happens if You Reform?

Maybe you've got a site that's not exactly the work of a good search engine citizen. Maybe you've got 500 doorway pages, 10 title tags per page, and enough hidden text to make an O'Reilly Pocket Guide. But maybe now you want to reform. You want to have a clean lovely site and leave the doorway pages to Better Homes and Gardens. Are you doomed? Will Google ban your site for the rest of its life?

No. The first thing you need to do is clean up your site - remove all traces of rule breaking. Next, send a note about your site changes and the URL to help@google.com. Note that Google really doesn't have the resources to answer every E-mail about why they did or didn't index a site - otherwise, they'd be answering E-mails all day - and there's no guarantee that they will reindex your kinder, gentler site. But they will look at your message.

What Happens if You Spot Google Abusers in the Index?

What if some other site that you come across in your Google searching is abusing Google's spider and PageRank mechanism? You have two options. You can send an E-mail to SPAMreport@google.com or fill out the form at http://www.google.com/contact/SPAMreport.html. (I'd fill out the form; it reports the abuse in a standard format that Google's used to seeing.)

Cleaning Up for a Google Visit

Before you submit your site to Google, make sure you've cleaned it up to make the most of your indexing.

You clean up your house when you have important guests over, right? Google's crawler is one of the most important guests your site will ever have if you want visitors. A high Google ranking can lead to incredible numbers of referrals, both from Google's main site and from those sites that have search powered by Google.

To make the most of your listing, step back and look at your site. By making some adjustments, you can make your site both more Google-friendly and more visitor-friendly.

If you must use a splash page, have a text link from it. If I had a dollar for every time I went to the front page of a site and saw no way to navigate besides a Flash movie, I'd be able to nap for a living. Google doesn't index Flash files, so unless you have some kind of text link on your splash page (a “Skip This Movie” link, for example, that leads into the heart of your site) you're not giving Google's crawler anything to work with. You're also making it difficult for surfers who don't have Flash or are visually impaired.

Make sure your internal links work. Sounds like a no-brainer, doesn't it? Make sure your internal page links work so the Google crawler can get to all your site's pages. You'll also make sure your visitors can navigate.

Check your title tags. There are few things sadder than getting a page of search results and finding “Insert Your Title Here” as the title for some of them. Not quite as bad is getting results for the same domain and seeing the exact same title tag over and over and over and over.

Look. Google makes it possible to search just the title tags in its index. Further, the title tags are very easy to read on Google's search results and are an easy way for a surfer to quickly get an idea of what a page is all about. If you're not making the most of your title tag, you're missing out on a lot of attention for your site.

The perfect title tag, to me, says something specific about the page it heads, and is readable to both spiders and surfers. That means you don't stuff it with as many keywords as you can. Make it a readable sentence, or - and I've found this useful for some pages - make it a question.
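If your site has more than a handful of pages, a quick script can flag the worst title offenders for you. The sketch below uses Python's standard html.parser module to pull the title out of each page, then reports pages with a missing or placeholder title and titles shared by more than one page. The placeholder list, the function names, and the sample pages are all made up for illustration; it also assumes you already have each page's HTML in hand.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Titles treated as "not really a title". An illustrative list, not a standard.
PLACEHOLDERS = {"", "insert your title here", "untitled document"}

class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.title += data

def page_title(html):
    """Return the page title with runs of whitespace collapsed."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.title.split())

def audit_titles(pages):
    """pages maps URL -> HTML. Returns (placeholder URLs, duplicated titles)."""
    by_title = defaultdict(list)
    placeholders = []
    for url, html in pages.items():
        title = page_title(html)
        if title.lower() in PLACEHOLDERS:
            placeholders.append(url)
        else:
            by_title[title].append(url)
    duplicates = {t: urls for t, urls in by_title.items() if len(urls) > 1}
    return placeholders, duplicates

# Hypothetical pages for demonstration.
pages = {
    "/index.html": "<html><head><title>Fresh Florida Oranges, Shipped Daily</title></head></html>",
    "/about.html": "<HTML><HEAD><TITLE>Insert Your Title Here</TITLE></HEAD></HTML>",
    "/navel.html": "<html><head><title>Our Products</title></head></html>",
    "/blood.html": "<html><head><title>Our  Products</title></head></html>",
    "/contact.html": "<html><head></head><body>Call us</body></html>",
}
placeholders, duplicates = audit_titles(pages)
```

The script only tells you where the problems are; writing a specific, readable title for each flagged page is still your job.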

Check your META tags. Google sometimes relies on META tags for a site description when there's a lot of navigation code that wouldn't make sense to a human searcher. I'm not crazy about META tags, but I'd make sure that at least the front page of my web site had a description and keyword META tag set, especially if the site relies heavily on code-based navigation (like from JavaScript).

Check your ALT tags. Do you use a lot of graphics on your pages? Do you have ALT tags for them so that visually impaired surfers and the Google spider can figure out what those graphics are? If you have a splash page with nothing but graphics on it, do you have ALT tags on all those graphics so a Google spider can get some idea of what your page is all about? ALT tags are perhaps the most neglected aspect of a web site. Make sure yours are set up.

By the way, just because ALT tags are a good idea, don't go crazy. You don't have to explain in your ALT tags that a list bullet is a list bullet. You can just mark it with a *.
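Hunting for graphics without ALT text is tedious by hand, so here is a small sketch of how you might automate it with Python's standard html.parser module. It lists the src of every image that has no alt attribute at all; an image with a short or empty alt (such as the * for a bullet) counts as handled. The class name, function name, and sample markup are hypothetical.

```python
from html.parser import HTMLParser

class AltAudit(HTMLParser):
    """Record the src of every <img> that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)  # attribute names arrive lowercased
            if "alt" not in attributes:
                self.missing.append(attributes.get("src", "(no src)"))

def images_missing_alt(html):
    """Return the src values of images that lack an alt attribute."""
    audit = AltAudit()
    audit.feed(html)
    audit.close()
    return audit.missing

# Hypothetical splash page markup.
splash = (
    '<IMG SRC="logo.gif">'
    '<img src="bullet.gif" alt="*">'
    '<img src="map.gif" alt="Map of our stores" />'
)
missing = images_missing_alt(splash)
```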

Check your frames. If you use frames, you might be missing out on some indexing. Google recommends you read Danny Sullivan's article, “Search Engines and Frames,” at http://www.searchenginewatch.com/webmasters/frames.html. Be sure that Google can either handle your frame setup or that you've created an alternative way for Google to visit, such as using the NOFRAMES tag.

Consider your dynamic pages. Google says they “limit the number of dynamic pages” they index. Are you using dynamic pages? Do you have to?

Consider how often you update your content. There is some evidence that Google indexes popular pages with frequently updated content more often. How often do you update the content on your front page?

Make sure you have a robots.txt file if you need one. If you want Google to index your site in a particular way, make sure you've got a robots.txt file for the Google spider to refer to. You can learn more about robots.txt in general at http://www.robotstxt.org/wc/norobots.html.

If you don't want Google to cache your pages, you can add a line to every page that you don't want cached. Add this line to the <HEAD> section of your page:

<META NAME="ROBOTS" CONTENT="NOARCHIVE">

This will tell all robots that archive content, including engines like Daypop and Gigablast, not to cache your page. If you want to exclude just the Google spider from caching your page, you'd use this line:

<META NAME="GOOGLEBOT" CONTENT="NOARCHIVE">

Getting the Most out of AdWords

Guest commentary by Andrew Goodman of Traffick on how to write great AdWords.

AdWords (https://adwords.google.com/select/?hl=en) is just about the sort of advertising program you might expect to roll out of the big brains at Google. The designers of the advertising system have innovated thoroughly to provide precise targeting at low cost with less work - it really is a good deal. The flipside is that it takes a fair bit of savvy to get a campaign to the point where it stops failing and starts working.

For larger advertisers, AdWords Select is a no-brainer. Within a couple of weeks, a larger advertiser will have enough data to decide whether to significantly expand their ad program on AdWords Select or perhaps to upgrade to a premium sponsor account.

I'm going to assume you have a basic familiarity with how cost-per-click advertising works. AdWords Select ads currently appear next to search results on Google.com (and some international versions of the search engine) and near search results on AOL and a few other major search destinations. There are a great many quirks and foibles to this form of advertising. My focus here will be on some techniques that can turn a mediocre, nonperforming campaign into one that actually makes money for the advertiser while conforming to Google's rules and guidelines.

One thing I should make crystal clear is that advertising with Google bears no relationship to having your web site's pages indexed in Google's search engine. The search engine remains totally independent of the advertising program. Ad results never appear within search results.

I'm going to offer four key tips for maximizing AdWords Select campaign performance, but before I do, I'll start with four basic assumptions:

  • High CTRs (click-through rates) save you money, so that should be one of your main goals as an AdWords Select advertiser. Google has set up the keyword bidding system to reward high-CTR advertisers. Why? It's simple. If two ads are each shown 100 times, the ad that is clicked on eight times generates revenue for Google twice as often as the ad that is clicked on four times over the same stretch of 100 search queries served. So if your CTR is 4% and your competitor's is only 2%, Google factors this into your bid. Your bid is calculated as if it were “worth” twice as much as your competitor's bid.
  • Very low CTRs are bad. Google disables keywords that fall below a minimum CTR threshold (“0.5% normalized to ad position,” which is to say, 0.5% for position 1, and a more forgiving threshold for ads as they fall further down the page). Entire campaigns will be gradually disabled if they fall below 0.5% CTR on the whole.
  • Editorial disapprovals are a fact of life in this venue. Your ad copy or keyword selections may violate Google's editorial guidelines from time to time. Again, it's very difficult to run a successful campaign when large parts of it are disabled. You need to treat this as a normal part of the process rather than giving up or getting flustered.
  • The AdWords Select system is set up like an advertising laboratory; that is to say, it makes experimenting with keyword variations and small variations in ad copy a snap. No guru can prejudge for you what will be your “magical ad copy secrets,” and it would be irresponsible to do so, because Google offers such detailed real-time reporting that can tell you very quickly what does and does not catch people's attention.
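To make the first two assumptions concrete, here is a back-of-the-envelope sketch in Python. It models the CTR weighting as a simple product of bid and click-through rate, which matches the “worth twice as much” description above but is only an assumed simplification, not Google's published formula. The function names and dollar figures are hypothetical, and the 0.5% threshold is applied as for position 1 only.

```python
MIN_CTR = 0.005  # the 0.5% minimum quoted above for position 1

def ctr(clicks, impressions):
    """Click-through rate as a fraction (0.04 means 4%)."""
    return clicks / impressions

def effective_bid(max_bid, click_through_rate):
    """Bid weighted by CTR. A simplified model, not Google's actual formula."""
    return max_bid * click_through_rate

def is_at_risk(clicks, impressions, min_ctr=MIN_CTR):
    """True if a keyword's CTR falls below the minimum threshold."""
    return ctr(clicks, impressions) < min_ctr

# Two advertisers bidding the same 50 cents; yours gets twice the clicks.
yours = effective_bid(0.50, ctr(8, 100))
theirs = effective_bid(0.50, ctr(4, 100))
advantage = yours / theirs

# Under this model the competitor would have to bid a dollar to draw level.
theirs_doubled = effective_bid(1.00, ctr(4, 100))
```

Under this model, doubling your CTR is worth as much as doubling your bid, which is why the tips that follow concentrate on getting more of the right people to click.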

Now onto four tips to get those CTRs up and to keep your campaign from straying out of bounds.

Matching Can Make a Dramatic Difference

You'll likely want to organize your campaign's keywords and phrases into several distinct “ad groups” (made easy by Google's interface). This will help you more closely match keywords to the actual words that appear in the title of your ad. Writing slightly different ads to closely correspond to the words in each group of keywords you've put together is a great way to improve your click-through rates. You'd think that an ad title (say, “Deluxe Topsoil in Bulk”) would match equally well to a range of keywords that mean essentially the same thing. That is, you'd think this ad title would create about the same CTR with the phrase “bulk topsoil” as it would with a similar phrase (“fancy dirt wholesaler”). Not so. Exact matches tend to get significantly higher CTRs. Being diligent about matching your keywords reasonably closely to your ad titles will help you outperform your less diligent competition.

If you have several specific product lines, you should consider better matching different groups of key phrases to an ad written expressly for each product line. If your clients like your store because you offer certain specialized wine varieties, for example, have an ad group with “ice wine” and related keywords in it, with “ice wine” in the ad title. Don't expect the same generic ad to cover all your varieties. Someone searching for an “ice wine” expert will be thrilled to find a retailer who specializes in this area. They probably won't click on or buy from a retailer who just talks about wine in general. Search engine users are passionate about something, and their queries are highly granular. Take advantage of this passion and granularity.

The other benefit of getting more granular and matching keywords to ad copy is that you don't pay for clicks from unqualified buyers, so your sales conversion rate is likely to be much higher.

Copywriting Tweaks Generally Involve Clarity and Directness

By and large, I don't run across major copywriting secrets. Psychological tricks to entice more people to click, after all, may wind up attracting unqualified buyers. But there are times when the text of an ad falls outside the zone of “what works reasonably well.” In such cases, excessively low CTRs kill any chance your web site might have had to close the sale.

Consider using the Goldilocks method to diagnose poor-performing ads. Many ads lean too far to the “too cold” side of the equation. Overly technical jargon may be unintelligible and uninteresting even to specialists, especially given that this is still an emotional medium and that people are looking at search results first and glancing at ad results as a second thought.

The following example is “too cold”:

Faster CWMGT Apps
Build GMUI modules 3X more secure than KLT. V. 2.0 rated as
“best pligtonferg” by WRSS Mag.

No one clicks. Campaign limps along. Web site remains world's best kept secret.

So then a “hotshot” (the owner's nephew) grabs the reins and tries to put some juice into this thing. Unfortunately, this new creative genius has been awake for the better part of a week, attending raves, placing second in a snowboarding competition, and tending to his various piercings. His agency work for a major Fortune 500 client's television spots once received rave reviews. Of course, those were rave reviews from industry pundits and his best friends, because the actual ROI on the big client's TV “branding” campaign was untrackable.

The hotshot's copy reads:

Reemar's App Kicks!
Reemar ProblemSolver 2.0 is the real slim shady. Don't trust
your Corporate security to the drones at BigCorp.

Unfortunately, in a non-visual medium with only a few words to work with, the true genius of this ad is never fully appreciated. Viewers don't click and may be offended by the ad and annoyed with Google.

The simple solution is something unglamorous but clear, such as:

Easy & Powerful Firewall
Reemar ProblemSolver 2.0 outperforms BigCorp
Exacerbate 3 to 1 in industry tests.

You can't say it all in a short ad. This gets enough specific (and true) info out there to be of interest to the target audience. Once they click, there will be more than enough info on your web site. In short, your ads should be clear. How's that for a major copywriting revelation?

The nice thing is, if you're bent on finding out for yourself, you can test the performance of all three styles quickly and cheaply, so you don't have to spend all week agonizing about this.

Be Inquisitive and Proactive with Editorial Policies (But Don't Whine)

Editorial oversight is a big task for Google AdWords staff - a task that often gets them in hot water with advertisers, who don't like to be reined in. For the most part, the rules are in the long-term best interest of this advertising medium, because they're aimed at maintaining consumer confidence in the quality of what appears on the page when that consumer types something into a search engine. Human error, however, may mean that your campaign is being treated unfairly because of a misunderstanding. Or maybe a rule is ambiguous and you just don't understand it.

Reply to the editorial disapproval messages (they generally come from adwords-support@google.com). Ask questions until you are satisfied that the rule makes sense as it applies to your business. The more Google knows about your business, in turn, the more they can work with you to help you improve your results, so don't hesitate to give a bit of brief background in your notes to them. The main thing is, don't let your campaign just sit there disabled because you're confused or angry about being “disapproved.” Make needed changes, make the appropriate polite inquiries, and move on.

Avoid the Trap of “Insider Thinking” and Pursue the Advantage of Granular Thinking

Using lists of specialized keywords will likely help you to reach interested consumers at a lower cost per click and convert more sales than using more general industry keywords. Running your ad on keywords from specialized vocabularies is a sound strategy.

A less successful strategy, though, is to get lost in your own highly specialized social stratum when considering how to pitch your company. Remember that this medium revolves around consumer search engine behavior. You won't win new customers by generating a list of different ways of stating terminology that only management, competitors, or partners might actually use, unless your ad campaign is just being run for vanity's sake.

Break things down into granular pieces and use industry jargon where it might attract a target consumer, but when you find yourself listing phrases that only your competitors might know or buzzwords that came up at the last interminable management meeting, stop! You've started down the path of insider thinking! By doing so, you may have forgotten about the customer and about the role market research must play in this type of campaign.

It sounds simple to say it, but in your AdWords Select keywords selection, you aren't describing your business. You're trying to use phrases that consumers would use when trying to describe a problem they're having, a specific item they're searching for, or a topic that they're interested in. Mission statements from above versus what customers and prospects actually type into search engines. Big difference. (At this point, if you haven't yet done so, you'd better go back and read over The Cluetrain Manifesto to get yourself right out of this top-down mode of thinking.)

One way to find out about what consumers are looking for is to use Wordtracker (http://www.wordtracker.com) or other keyword research tools (such as the one that Google offers as part of the AdWords Select interface, a keyword research tool Google promises it's working on). However, these tools are not in themselves enough for every business; because more businesses are using these “keyphrase search frequency reports,” the frequently searched terms eventually become picked over by competing advertisers - just what you want to avoid if you're trying to sneak along with good response rates at a low cost per click.

You'll need to brainstorm as well. In the future, there will be more sophisticated software-driven market research available in this area. Search technology companies like Ask Jeeves Enterprise Solutions are already collecting data about the hundreds of thousands of customer questions typed into the search boxes on major corporate sites, for example. This kind of market research is underused by the vast majority of companies today.

There are currently many low-cost opportunities for pay-per-click advertisers. As more and larger advertisers enter the space, prices will rise, but with a bit of creativity, granular thinking, and diligent testing, the smaller advertiser will always have a fighting chance on AdWords Select. Good luck!

Removing Your Materials from Google

Some people are more than thrilled to have Google's properties index their sites. Other folks don't want the Google bot anywhere near them. If you fall into the latter category and the bot's already done its worst, there are several things you can do to remove your materials from Google's index. Each of Google's properties - Web Search, Google Images, and Google Groups - has its own set of methodologies.

Google's Web Search

Here are several tips to avoid being listed.

Making sure your pages are never there to begin with. While you can take steps to remove your content from the Google index after the fact, it's always much easier to make sure the content is never found and indexed in the first place.

Google's crawler obeys the “robot exclusion protocol,” a set of instructions you put on your web site that tells the crawler how to behave when it comes to your content. You can implement these instructions in two ways: via a META tag that you put on each page (handy when you want to restrict access to only certain pages) or via a robots.txt file that you put in your site's root directory (handy when you want to block some spiders completely or want to restrict access to whole directories of content). You can get more information about the robots exclusion protocol and how to implement it at http://www.robotstxt.org/.

Removing your pages after they're indexed. There are several things you can have removed from Google's results.

These instructions are for keeping your site out of Google's index only. For information on keeping your site out of all major search engines, you'll have to work with the robots exclusion protocol.

Removing the whole site. Use the robots exclusion protocol, probably with robots.txt.

Removing individual pages. Use the following META tag in the HEAD section of each page you want to remove: <META NAME="GOOGLEBOT" CONTENT="NOINDEX, NOFOLLOW">

Removing snippets. A “snippet” is the little excerpt of a page that Google displays on its search results. To remove snippets, use the following META tag in the HEAD section of each page for which you want to prevent snippets: <META NAME="GOOGLEBOT" CONTENT="NOSNIPPET">

Removing cached pages. To keep Google from keeping cached versions of your pages in their index, use the following META tag in the HEAD section of each page for which you want to prevent caching: <META NAME="GOOGLEBOT" CONTENT="NOARCHIVE">
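Before you ask Google to recrawl anything, it's worth confirming that the tags on a page say what you think they say. The Python sketch below, built on the standard html.parser module, gathers the directives from any GOOGLEBOT META tag and from any general ROBOTS META tag (which addresses all robots, Googlebot included) and returns them as one set. The class and function names are hypothetical, and the sketch only reads the tags; it says nothing about how Google will actually act on them.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect directives from ROBOTS and GOOGLEBOT META tags."""

    def __init__(self):
        super().__init__()
        self.directives = {}  # META name (uppercased) -> set of directives

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").upper()
        if name in ("ROBOTS", "GOOGLEBOT"):
            content = (attributes.get("content") or "").upper()
            found = {part.strip() for part in content.split(",") if part.strip()}
            self.directives.setdefault(name, set()).update(found)

def googlebot_directives(html):
    """Return every directive on the page that is addressed to Googlebot."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    general = parser.directives.get("ROBOTS", set())
    specific = parser.directives.get("GOOGLEBOT", set())
    return general | specific

# Hypothetical page header.
page = (
    "<HEAD>"
    '<META NAME="GOOGLEBOT" CONTENT="NOINDEX, NOFOLLOW">'
    '<META NAME="ROBOTS" CONTENT="NOARCHIVE">'
    "</HEAD>"
)
directives = googlebot_directives(page)
```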

Removing that content now. Once you implement these changes, Google will remove or limit your content according to your META tags and robots.txt file the next time your web site is crawled, usually within a few weeks. But if you want your materials removed right away, you can use the automatic remover at http://services.google.com:882/urlconsole/controller. You'll have to sign in with an account (all an account requires is an E-mail address and a password). Using the remover, you can request either that Google crawl your newly created robots.txt file, or you can enter the URL of a page that contains exclusionary META tags.

Make sure you have your exclusion tags all set up before you use this service. Going to all the trouble of getting Google to pay attention to a robots.txt file or exclusion rules that you've not yet set up will simply be a waste of your time.

Reporting pages with inappropriate content. You may like your content fine, but you might find that even if you have filtering activated you're getting search results with explicit content. Or you might find a site with a misleading title tag and content completely unrelated to your search.

You have two options for reporting these sites to Google. Bear in mind that there's no guarantee that Google will remove the sites from the index, but they will investigate them. At the bottom of each page of search results, you'll see a “Help Us Improve” link; follow it to a form for reporting inappropriate sites. You can also send the URL of explicit sites that show up on a SafeSearch but probably shouldn't to safesearch@google.com. If you have more general complaints about a search result, you can send an E-mail to search-quality@google.com.

Google Images

Google Images' database of materials is separate from that of the main search index. To remove items from Google Images, you should use robots.txt to specify that the Googlebot-Image crawler should stay away from your site. Add these lines to your robots.txt file:

User-agent: Googlebot-Image
Disallow: /

You can use the automatic remover mentioned in the web search section to have Google remove the images from its index database quickly.
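If you'd like to double-check those two robots.txt lines before asking Google to recrawl, Python's standard urllib.robotparser module can interpret them for you. This is only a sketch: the example.com URL is a placeholder, and the parser reflects Python's reading of the robots exclusion protocol, which may not match every detail of how Google's own crawlers behave.

```python
from urllib.robotparser import RobotFileParser

# The two lines given above for keeping the image crawler out.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /
"""

def allowed(robots_txt, user_agent, url):
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Placeholder URL for illustration.
image_url = "http://www.example.com/photos/oranges.jpg"
image_crawler_allowed = allowed(ROBOTS_TXT, "Googlebot-Image", image_url)
web_crawler_allowed = allowed(ROBOTS_TXT, "Googlebot", image_url)
```

With only those two lines in place, the image crawler is shut out while the regular Google spider is still free to index your pages.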

There may be cases where someone has put images on their server for which you own copyright. In other words, you don't have access to their server to add a robots.txt file, but you need to stop Google's indexing of your content there. In this case, you need to contact Google directly. Google has instructions for situations just like this at http://www.google.com/remove.html; look at Option 2, “If you do not have any access to the server that hosts your image.”

Removing Material from Google Groups

As with the Google Web index, you have the option both to prevent material from being archived on Google and to remove it after the fact.

Preventing your material from being archived. To prevent your material from being archived on Google, add the following line to the headers of your Usenet posts:

X-No-Archive: yes

If you do not have the option to edit the headers of your post, make that line the first line in your post itself.
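If you compose your posts with a script rather than a newsreader, both approaches are easy to sketch with Python's standard email package. The example below builds a message with the X-No-Archive header, and also shows the fallback of putting the line first in the body. The function names, address, and newsgroup are made up for illustration, and actually sending the post to a news server is left out.

```python
from email.message import EmailMessage

NO_ARCHIVE_LINE = "X-No-Archive: yes"

def build_post(sender, newsgroup, subject, body, archive=False):
    """Build a Usenet-style message; adds X-No-Archive unless archive is True."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["Newsgroups"] = newsgroup
    msg["Subject"] = subject
    if not archive:
        msg["X-No-Archive"] = "yes"
    msg.set_content(body)
    return msg

def body_with_no_archive(body):
    """Fallback when headers can't be edited: make the line the first line of the post."""
    return NO_ARCHIVE_LINE + "\n\n" + body

# Hypothetical post.
post = build_post(
    "me@example.com",
    "alt.test",
    "Please don't archive this",
    "Just testing.",
)
fallback_body = body_with_no_archive("Just testing.")
```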

Removing materials after the fact. If you want materials removed after the fact, you have a couple of options:

  • If the materials you want removed were posted under an address to which you still have access, you may use the automatic removal tool mentioned earlier in this hack.
  • If the materials you want removed were posted under an address to which you no longer have access, you'll need to send an E-mail to groups-support@google.com with the following information:
    • Your full name and contact information, including a verifiable E-mail address.
    • The complete Google Groups URL or message ID for each message you want removed.
    • A statement that says “I swear under penalty of civil or criminal laws that I am the person who posted each of the foregoing messages or am authorized to request removal by the person who posted those messages.”
    • Your electronic signature.

Removing Your Listing from Google Phonebook

You may not wish to have your contact information made available via the phonebook searches on Google. You'll have to follow one of two procedures, depending on whether the listing you want removed is for a business or for a residential number.

If you want to remove a business phone number, you'll need to send a request on your business letterhead to:

Google PhoneBook Removal
2400 Bayshore Parkway
Mountain View, CA 94043

You'll also have to include a phone number where Google can reach you to verify your request.

If you want to remove a residential phone number, it's much simpler. You'll need to fill out a form at http://www.google.com/help/pbremoval.html. The form asks for your name, city and state, phone number, E-mail address, and reason for removal, a multiple choice: incorrect number, privacy issue, or “other.”