Spamming search engines

What is it?

Before you start using gateway pages and other HTML techniques to improve your search engine ranking, you need to know a little about spam and spamdexing. Spamming the search engines (or spamdexing) is the practice of using unethical or unprofessional techniques to try to improve search engine rankings. You should know what constitutes spamming so that you can avoid trouble with the search engines.

For example, if you have a page with a white background, and on it a table with a blue background and white text, you are actually spamming the Infoseek engine without even knowing it! Infoseek sees white text and a white page background and concludes that your text color and your page color are the same, so you must be spamming. It cannot tell that the white text sits inside a blue table and is perfectly legible. It is silly, but it will cause that page to be dropped from the index. You can get it back by changing the text color in the table to, say, a light gray and resubmitting the page to Infoseek. A small change makes all the difference, yet you had no idea that your page was considered spam!

Generally, it is very easy to know what not to do so as to avoid being labeled a spammer and having your pages or your site penalized. By following a few simple rules, you can safely improve your search engine rankings without unknowingly spamming the engines and getting penalized for it.
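To make the white-text-in-a-blue-table example concrete, here is a minimal sketch of the kind of naive check that would produce this false positive. It is an illustration only: how Infoseek actually inspects pages is not documented here, and the names NaiveColorCheck, looks_like_hidden_text, and build_page are hypothetical. The checker compares each FONT color with the BODY background and deliberately ignores TABLE backgrounds.

```python
from html.parser import HTMLParser


class NaiveColorCheck(HTMLParser):
    """Hypothetical checker: flags text whose color equals the page background.

    It ignores table backgrounds on purpose, which is the blind spot
    described in the example above. This is an assumption, not a real engine's code.
    """

    def __init__(self):
        super().__init__()
        self.page_bg = None
        self.flagged = False

    def handle_starttag(self, tag, attrs):
        # Normalize attribute values so "#FFFFFF" and "#ffffff" compare equal.
        attrs = {name: (value or "").lower() for name, value in attrs}
        if tag == "body":
            self.page_bg = attrs.get("bgcolor")
        elif tag == "font" and self.page_bg is not None:
            if attrs.get("color") == self.page_bg:
                self.flagged = True


def looks_like_hidden_text(html):
    checker = NaiveColorCheck()
    checker.feed(html)
    checker.close()
    return checker.flagged


def build_page(text_color):
    """White page, blue table, text in the given color."""
    return (
        '<html><body bgcolor="#FFFFFF">'
        '<table bgcolor="#0000FF"><tr><td>'
        f'<font color="{text_color}">Legible text on a blue table</font>'
        '</td></tr></table></body></html>'
    )


print(looks_like_hidden_text(build_page("#FFFFFF")))  # True: legible, but flagged
print(looks_like_hidden_text(build_page("#CCCCCC")))  # False: light gray passes
```

Running a check like this over your own pages before submitting them is one way to spot text that a simple-minded engine might mistake for hidden text.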

What constitutes spam?

Some techniques are clearly considered attempts to spam the engines. Where possible, you should avoid these:
  • Keyword stuffing. This is the repeated use of a word to increase its frequency on a page. Search engines now have the ability to analyze a page and determine whether the frequency is above a "normal" level in proportion to the rest of the words in the document.
  • Invisible text. Some webmasters stuff keywords at the bottom of a page and make their text color the same as that of the page background. This is also detectable by the engines.
  • Tiny text. Same as invisible text but with tiny, illegible text.
  • Page redirects. Some engines, especially Infoseek, do not like pages that take the user to another page without his or her intervention, e.g. using META refresh tags, CGI scripts, Java, JavaScript, or server-side techniques.
  • Meta tag stuffing. Do not repeat your keywords in the Meta tags more than once, and do not use keywords that are unrelated to your site's content.
  • Never use keywords that do not apply to your site's content.
  • Do not create too many doorways with very similar keywords.
  • Do not submit the same page more than once on the same day to the same search engine.
  • Do not submit virtually identical pages, i.e. do not simply duplicate a web page, give the copies different file names, and submit them all. That will be interpreted as an attempt to flood the engine.
  • Code swapping. Do not optimize a page for top ranking, then swap another page in its place once a top ranking is achieved.
  • Do not submit doorways to submission directories like Yahoo!
  • Do not submit more than the allowed number of pages per engine per day or week. Each engine has a limit on how many pages you can manually submit to it using its online forms. Currently these are the limits: AltaVista 1-10 pages per day; HotBot 50 pages per day; Excite 25 pages per week; Infoseek 50 pages per day but unlimited when using e-mail submissions. Please note that this is not the total number of pages that can be indexed; it is just the number that can be submitted. If you can only submit 25 pages to Excite, for example, and you have a 1,000-page site, that's no problem. The search engine will come crawling your site and index all pages, including those that you did not submit.
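
The keyword stuffing rule at the top of this list can be checked roughly before you submit a page. The sketch below measures how often a keyword appears relative to all the words in the text. The 10% cutoff (MAX_DENSITY) is a made-up figure for illustration, since the engines do not publish what they consider a "normal" level, and the function names are hypothetical.

```python
import re


def keyword_density(text, keyword):
    """Return the fraction of words in `text` that equal `keyword` (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


# Hypothetical cutoff, chosen only for illustration; real engines do not publish theirs.
MAX_DENSITY = 0.10


def looks_stuffed(text, keyword, max_density=MAX_DENSITY):
    """True when the keyword makes up more than `max_density` of all words."""
    return keyword_density(text, keyword) > max_density


normal = "We sell hiking boots and other outdoor gear for every season and budget."
stuffed = "boots boots cheap boots buy boots boots sale boots"

print(looks_stuffed(normal, "boots"))   # False: 1 word out of 13
print(looks_stuffed(stuffed, "boots"))  # True: 6 words out of 9
```

A check this simple only counts exact single-word matches, so treat it as a quick sanity test rather than a prediction of how any particular engine will judge the page.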

Gray areas

There are certain practices that search engines can treat as spam when they are actually just part of honest web site design. For example, Infoseek does not index any page with a fast page refresh. Yet refresh tags are commonly used by web site designers to produce visual effects or to take people to the new location of a page that has been moved. Also, some engines look at the text color and background color, and if they match, the page is considered spam. But you could have a page with a white background and a black table somewhere with white text in it. Although perfectly legible and legitimate, that page will be ignored by some engines. Another example is that Infoseek advises against (but does not seem to drop from the index) having many pages with links to one page. Even though this is meant to discourage spammers, it also places many legitimate webmasters in the spam region (almost anyone with a large web site, or a web site with an online forum, has pages that link back to the home page). These are just a few examples of gray areas in this business. Fortunately, because the search engine people know that these gray areas exist, they will not penalize your entire site just because of them.
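
Since a fast page refresh is one of the gray areas mentioned above, it can help to find out which of your pages use one. The following is a minimal sketch that looks for META refresh tags and reports whether any delay is below a threshold. The 5-second threshold (FAST_REFRESH_SECONDS) is an assumption made for illustration, because the text does not say how fast is "fast" for Infoseek, and the names RefreshFinder and has_fast_refresh are hypothetical.

```python
from html.parser import HTMLParser


class RefreshFinder(HTMLParser):
    """Collects the delay (in seconds) of every META refresh tag on a page."""

    def __init__(self):
        super().__init__()
        self.delays = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        if (attrs.get("http-equiv") or "").lower() != "refresh":
            return
        # The content attribute looks like "5; url=newpage.html"; the delay comes first.
        delay_text = (attrs.get("content") or "").split(";")[0].strip()
        try:
            self.delays.append(float(delay_text))
        except ValueError:
            pass  # Malformed delay: ignore it rather than guess.


# Hypothetical threshold; the engines do not state what counts as a "fast" refresh.
FAST_REFRESH_SECONDS = 5


def has_fast_refresh(html, threshold=FAST_REFRESH_SECONDS):
    finder = RefreshFinder()
    finder.feed(html)
    finder.close()
    return any(delay < threshold for delay in finder.delays)


moved = '<html><head><meta http-equiv="refresh" content="0; url=new.html"></head></html>'
slow = '<html><head><meta http-equiv="Refresh" content="30"></head></html>'

print(has_fast_refresh(moved))  # True: immediate redirect
print(has_fast_refresh(slow))   # False: 30-second refresh
```

This only catches META refresh tags. Redirects done with JavaScript or on the server would need a different check.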

What are the penalties for spamdexing?

There is a disproportionate amount of fear about the penalties for spamming. Many webmasters fear that they may spam the engines without knowing it and then have their entire site banned from the engines forever. That just doesn't happen that easily! The people who run the search engines know that you can be a perfectly legitimate and honest web site owner who, because of the nature of your web site, has pages that appear to be spam to the engine. They know that their search engines are not smart enough to tell exactly who is spamming and who happens to be in the spam zone by mistake. So they do not generally ban your entire site from their search engine just because some of your pages look like spam. They only penalize the rankings of the offending pages. Any non-offending page is not penalized. Only in the most extreme cases, where you aggressively spam them and go against the recommendations above, flooding their engine with spam pages, will they ban your entire site. Some engines, like HotBot, do not even have a lifetime ban policy on spammers. As long as you are not an intentional and aggressive spammer, you should not worry about your entire site being penalized or banned from the engines. Only the offending pages will have their ranking penalized.

Is there room for responsible search engine positioning?

Yes! Definitely! In fact, the search engines do not discourage responsible search engine positioning. Responsible search engine positioning is good for everybody: it helps the users find the sites they are looking for, it helps the engines do a better job of delivering relevant results, and it gets you the traffic you want!

As a webmaster, you should not be too afraid that you are spamming the search engines in your quest for higher search engine rankings. No question about it, though, spam is something that every webmaster should understand thoroughly. Fortunately, it is easy to understand it. So learn the rules, re-examine your web pages, resubmit to the engines, then create gateway pages to get better ranking on the engines, using the rules above. If you need any more information on search engine spamming and search engine positioning, see