SEO Technical Issues & Solutions

Outlined below is detailed how-to information on adjusting for server and design issues that may be hurting your search engine optimization efforts. It has long been observed (we admit this may at times be coincidence) that certain server issues seem to affect search engine ranking, and that once they are "repaired" the ranking improves significantly. This page discusses ways to address the issues that we feel are most likely to have a negative impact on your ranking.

You have multiple domain names pointing to the same content.
Usually your first reason for purchasing multiple domain names is to cover your product category with similar names, possibly including commonly misspelled variations. Or you may just want to deprive your competition of some other nice domain names. Either way, you now own a bunch of them and have pointed each one at your main web site. This is usually considered spam because you are trying to get multiple domain names indexed that all point to the same physical content on the server. It is common where DNS services are used to "map" multiple sites to the same server files. Most search engines can save and compare file sizes and long text strings, so they will notice the duplicate content and index only one site while throwing the rest out.
How to correct this problem.
By using what we refer to as an IP-funnel, you can have the domains lead to your production site without the problems associated with duplicate content on each domain. The directions below correct the domain spam problem so that a search engine does not view your multiple sites as deceptive or misleading.
Most domain services web sites (domain.com and most others) provide the ability to "point" or "forward" to another site.
The multiple domains will point to a feeder site that is hosted.
The feeder index will include a "meta refresh" and a "no index" statement.
The feeder index file should have an "optimized" title, description, and keyword tag.
The Feeder Site is hosted and only contains an index file and a robots.txt file.
Next, add a 301 Permanently Moved action to the Feeder Site that will redirect to the Main Site so that any links and status are passed. The feeder site will then correctly redirect to your "real" site.
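As a rough illustration of the feeder index described above, the following Python sketch builds a page carrying the title, description, keyword, "no index", and "meta refresh" tags. The function name build_feeder_index, its parameters, and the example URL are assumptions made for illustration only; they are not part of any tool mentioned on this page.

```python
from html import escape


def build_feeder_index(main_url, title, description, keywords, delay_seconds=0):
    """Return the HTML for a feeder-site index page (hypothetical helper).

    The page tells robots not to index it and refreshes the browser to the
    main site after delay_seconds.
    """
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        f"<title>{escape(title)}</title>\n"
        f'<meta name="description" content="{escape(description)}">\n'
        f'<meta name="keywords" content="{escape(keywords)}">\n'
        # The "no index" statement keeps the feeder page out of the index.
        '<meta name="robots" content="noindex">\n'
        # The "meta refresh" sends visitors on to the main site.
        f'<meta http-equiv="refresh" content="{delay_seconds}; url={escape(main_url)}">\n'
        "</head>\n<body>\n"
        f'<p><a href="{escape(main_url)}">Continue to the main site</a></p>\n'
        "</body>\n</html>\n"
    )


if __name__ == "__main__":
    # Example values are placeholders, not real site data.
    print(build_feeder_index("http://www.example.com/", "Blue Widgets",
                             "Hand-made blue widgets", "widgets, blue widgets"))
```

The generated file would be saved as the feeder site's index file; the 301 action itself is still configured on the server, as described in the 301 section of this page.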

How to move a site to a new host
If you are moving your site to a new IP address or ISP, this procedure will help minimize the downtime and confusion during DNS propagation.
Set up the DNS on your new host to point to your existing (old host) site first. This is an important first step.
Now change the TLD (top level domain) information at your domain registrar to point to this new site DNS. Your old site should still show, either by IP or by domain name.
Copy your existing site to your new site and validate that all files have transferred and the links work.
After allowing 4 days for the DNS to be fully propagated, point your new DNS to your new site. Make sure that your old site mailboxes have been emptied before you change any DNS info at this time. Once this DNS change occurs you cannot get to your old mail.
Once everything has been validated you should then point the old DNS to your new site. This is a safety measure in case there is a lingering propagation error.
Search Engine listings or bookmarked pages should transfer to your new site with a 301 redirect.
After everything has been checked you should be able to delete your old site after a sufficient amount of time has passed (not more than 3 months). Note that Google does cache the old DNS address information and until they verify that the site has moved and store the new DNS information they may not visit your new site. The 301 will assist in this area.
* If you are moving from an IIS server to Linux (Apache), you should validate your formmail scripts and any other items that may not be cross-platform compatible.
If you are moving from Linux to IIS, your .htaccess file will not be compatible and you will not be able to CHMOD permissions. Validate all functions with your ISP administrator (some of the following steps may need to be redone on your new server).
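Before pointing DNS at the new site or deleting the old one, it can help to confirm where a domain name currently resolves. The Python sketch below is one way to do that; the helper name dns_points_to and the example addresses are assumptions for illustration, and the answer only reflects what the local resolver sees, which may lag behind other resolvers during propagation.

```python
import socket


def dns_points_to(hostname, expected_ip, resolve=socket.gethostbyname):
    """Return True when hostname currently resolves to expected_ip.

    resolve is injectable so the check can be exercised without a network;
    by default it uses the system resolver via socket.gethostbyname.
    """
    try:
        return resolve(hostname) == expected_ip
    except OSError:
        # Lookup failures (socket.gaierror is an OSError) count as "not yet".
        return False


if __name__ == "__main__":
    # Placeholder values: substitute your own domain and new host IP.
    print(dns_points_to("www.example.com", "192.0.2.10"))
```

Running the check from more than one network location gives a better picture, since each resolver caches the old address independently.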

How to configure a 301 Permanently Moved action
In the above example the IP Funnel page should transfer to the production site via a 301 Permanently Moved action. A 301 is also useful for resolving typographical errors in links from other sites, as well as for creating shortcuts. In UNIX you may also choose to use it to handle improper case in URLs.
How to correct this problem.
In a UNIX/Linux (Apache) environment you would modify the .htaccess file to include the command:
RedirectPermanent /SEO_Technical_Issues.htm http://www.alchemistmedia.com/SEO_Technical_Issues.htm
RedirectPermanent / http://www.alchemistmedia.com/
In a Microsoft IIS environment you would normally open the control panel, select "Home Directory", and select "Redirection to a url".
In both cases you would be well advised to include a custom 404 page. The 301 Moved action (where you do not specify a from-page name, as in the "/" example above) causes http://www.old-domain.com/xyz.htm to be sent to http://www.new-domain.com/xyz.htm, which requires that the page names match or a 404 will result. It is possible to intercept the 404 and turn it into a 301 through programming in IIS or Apache.
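The path-preserving behavior of the "/" redirect can be sketched in a few lines of Python. This is only a model of the mapping for planning purposes, not Apache's or IIS's implementation; redirect_target and missing_after_move are hypothetical helper names, and the domains are the placeholders used above.

```python
from urllib.parse import urlsplit


def redirect_target(old_url, new_base):
    """Model a site-wide 301 from "/": keep the requested path, swap the site."""
    parts = urlsplit(old_url)
    target = new_base.rstrip("/") + (parts.path or "/")
    if parts.query:
        target += "?" + parts.query
    return target


def missing_after_move(old_paths, new_paths):
    """List old page paths with no same-named page on the new site.

    These are the requests that would end in a 404 after the redirect.
    """
    return sorted(set(old_paths) - set(new_paths))


if __name__ == "__main__":
    print(redirect_target("http://www.old-domain.com/xyz.htm",
                          "http://www.new-domain.com/"))
    print(missing_after_move(["/xyz.htm", "/old.htm"], ["/xyz.htm"]))
```

Any path reported by missing_after_move is a candidate for its own page-specific redirect line or for the custom 404 page.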

How to configure a 404 File Not Found action
A customized web page with complete navigation is very helpful for visitors in the event they come across an old web page that has been renamed, moved, or deleted.
How to correct this problem.
In a UNIX/Linux (Apache) environment you would modify the .htaccess file to include the command(s):
ErrorDocument 404 /404.htm
ErrorDocument 403 /404.htm
ErrorDocument 401 /buynowpage.htm
(The error codes are listed in the table near the end of this page.)
In a Microsoft IIS environment you would normally right-click on the appropriate Web service icon in the IIS Management Console and select Properties. Select Custom Errors then select the 404 error, and then select Edit Properties. In the Message Type list select Edit URL being careful (!) to specify the error recovery page URL.
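For readers who serve pages from application code rather than Apache or IIS, the same idea can be sketched as a tiny Python WSGI application: unknown paths get a custom page with navigation, and the response still carries the 404 status. The names app, PAGES, and NOT_FOUND_PAGE and the page contents are assumptions made for this sketch.

```python
# Hypothetical site content, keyed by request path.
PAGES = {"/": b"<h1>Home</h1>"}

# Custom error page with a navigation link back into the site.
NOT_FOUND_PAGE = b'<h1>Page not found</h1><p><a href="/">Home</a></p>'


def app(environ, start_response):
    """Serve known pages; answer anything else with the custom 404 page."""
    path = environ.get("PATH_INFO", "/")
    body = PAGES.get(path)
    if body is None:
        # Keep the 404 status so visitors get help without the missing
        # URL being reported as a normal page.
        status, body = "404 Not Found", NOT_FOUND_PAGE
    else:
        status = "200 OK"
    headers = [("Content-Type", "text/html; charset=utf-8"),
               ("Content-Length", str(len(body)))]
    start_response(status, headers)
    return [body]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server
    make_server("", 8000, app).serve_forever()
```

Run directly, the script serves on port 8000 using the standard library's wsgiref server, which is suitable for local testing only.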

Tables are causing long navigation lists to be seen as the first content
Your site is developed in “tables” and by the time the search engines encounter your “main content” it is far down the page. Generally the text at the top of the page is considered the most important text, and sometimes site design pushes the main body content hundreds of lines from the top of the page source file.
A normal web page usually has a header, a nav bar that is usually on the left side of the page, and the main content on the right. The search engines usually look at many page attributes, i.e., title, description, and at least the first 200 words of your content after your opening <body> tag. The engines do spider your whole page, but if your nav bar lists many products, the search engine may not encounter your main body within the first 200 words.
How to correct this problem.
This technique is also known as the "table trick". We include our description as an example for our clients.
Most of the search engines will read a table a certain way. They will find the opening <table> tag and look for the first "table row" <tr>. They will begin to read each "data set" <td>"data"</td> inside the "table row" from left to right until they find the closing </tr> tag. They will keep going row by row until they find the closing </table> tag, and will continue until they have crawled the entire page. Your "main body content" is usually where you would have most of your keyword phrases and the "relevant" body copy that you would want the search engine to index. Knowing that the spider will often try to figure out your "theme" within the first 200 words of your site, you would want it to see the relevant text as soon as possible. The table technique will "push" your left-side navigation bar (and the like) down below your body content and "pull" up your body content so it will usually be within the first 200 words.
Comparison: normal site without "table technique" versus normal site with "table technique".
Example page before implementation: when you "view source", the page is heavily commented.
In our example we used "includes" to show where most of your body content would normally be located.
The most important part of the technique is to insert an "empty data set" right before your body content "include".
The left nav may have many product links listed that would easily be over 200 words, especially if all of the links are fully qualified.
When you look at the source you will notice that the body is below the nav bar.
There are many variables to contend with. We are assuming that your site header could be mostly pictures, alt attributes, and some navigation. The first area that is usually content intensive is a text-based nav bar. If your nav bar is JavaScript or Flash based, you have a whole new set of concerns.
If your site header is mostly pictures and the table technique "data set" is empty, the second item that is read is really your "main content". Depending on your page architecture, this could take some thinking and major redesign work.
You will notice that the nav bar is down at the bottom of the page source and the only visible difference is that your nav bar is slightly lower. A small price to pay for better ranking.
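The effect of the table technique on source order can be checked with a short Python sketch that reads a page's text in the order it appears in the source, roughly as the description above says a spider does. The two HTML snippets are simplified assumptions: the second uses an empty first cell plus a rowspan on the body cell, which is one common way to build this layout and not necessarily the exact markup of the example pages. The names first_words, WITHOUT_TRICK, and WITH_TRICK are hypothetical.

```python
from html.parser import HTMLParser

# Nav cell comes first in the source, body copy second.
WITHOUT_TRICK = """<table><tr>
<td>Widgets Gadgets Gizmos</td>
<td>Main body copy about blue widgets</td>
</tr></table>"""

# Empty cell first, then the body cell spanning two rows, then the nav
# cell in the second row: the nav still displays on the left, slightly lower.
WITH_TRICK = """<table>
<tr><td></td><td rowspan="2">Main body copy about blue widgets</td></tr>
<tr><td>Widgets Gadgets Gizmos</td></tr>
</table>"""


class _TextCollector(HTMLParser):
    """Collect visible words in the order they appear in the source."""

    def __init__(self):
        super().__init__()
        self.words = []

    def handle_data(self, data):
        self.words.extend(data.split())


def first_words(html, limit=200):
    """Return the first `limit` words of text in source order."""
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    return parser.words[:limit]


if __name__ == "__main__":
    print(first_words(WITHOUT_TRICK, 3))
    print(first_words(WITH_TRICK, 3))
```

Running first_words against your own page source, with the default limit of 200, shows whether the body copy or the navigation fills the opening words.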

The site is frames-based -- What do I do?
Search engines are having increasing problems trying to spider a "frames based" site. We believe these problems will continue and we STRONGLY recommend a redesign to a non-frames environment. Often the URLs included in the frames page are indexed instead of the invoking frames page, so if you include content from another site in your frames page you are not getting ANY credit for that content; you are simply causing the other site to get spidered. It does not help you in the least in this case. If the content is from your own site and you need to use frames, then there is a solution that will help by re-establishing the frames environment for the site.
How to correct this problem.
In the interim, adding this to the top of each page included in a frame will cause that page to detect that it is being loaded outside of the frame and re-establish the frame around itself:
<script type="text/javascript">
// If this page is loaded on its own (outside the frameset), send the
// browser to the frameset page so the frames are re-established.
function changePage() {
  if (window.top === window.self) {
    window.location.replace("http://yourdomain/index.html");
  }
}
changePage();
</script>

Server Side Include Tips
If your pages are .htm or .html and your server does not process your server side includes (SSI), then the server must be configured to parse those extensions. Likewise, if you are using SSI commands but are afraid that .shtml extensions are harmful to your search engine optimization efforts, OR your site uses .html extensions throughout but you want to add SSI commands for tracking or promotional purposes, then the server must be configured appropriately.
How to correct this problem.
If you're on an Apache (Linux) server, your ISP will have to edit the httpd.conf file to include your extensions. They usually know what to do. However, the ISP will need to restart the Apache server, which will impact other sites.
Example:
AddType text/html .shtml
AddHandler server-parsed .shtml .html .htm
If you're on an MS server (IIS), your ISP may have to edit the registry to include your additional file extensions. IIS 6.0 will include a new node that is named Web Service Extensions. Your ISP should be familiar with how to do it. As with Apache (Linux) servers, your ISP has to stop IIS, make the change, and restart IIS.
For additional information on MS servers, please see the following articles in the Microsoft Knowledge Base:
IIS 6.0: Definition of Term Web Service Extensions
Setting up SSI's with different extensions.

Oops, your IP is either dirty or virtual
About 3% of all web sites "own" a private IP number, with the remainder being on virtual, or name-based, servers. Although only 3% are dedicated IPs, we have seen that in many instances well over 90% of the top-50 results in the search engines are sites having dedicated IP numbers. This was so strange that we have repeatedly validated these findings, and have found that switching a site from a virtual IP to a dedicated IP number alone has caused significant ranking increases. Of course, the web is so dynamic that this could be coincidence, but we do not think so.
Likewise, we have found that there are "dirty" IP c-blocks: ranges of IP numbers that have been tarnished by spammers and left to be reassigned to unsuspecting sites. If your site is in the same range as a spammer's IP, then you are equally penalized. We have likewise found instances where simply moving a site has caused the ranking to improve.
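To see whether two addresses sit in the same c-block, the first three numbers of each address can be compared. The Python sketch below does this with the standard ipaddress module, treating a c-block as a /24 network; the helper name same_c_block and the sample addresses (taken from documentation-only ranges) are assumptions for illustration.

```python
import ipaddress


def same_c_block(ip_a, ip_b):
    """Return True when two IPv4 addresses share the same /24 ("c-block")."""
    # strict=False lets a host address stand in for its containing network.
    net_a = ipaddress.ip_network(f"{ip_a}/24", strict=False)
    net_b = ipaddress.ip_network(f"{ip_b}/24", strict=False)
    return net_a == net_b


if __name__ == "__main__":
    print(same_c_block("192.0.2.10", "192.0.2.200"))
    print(same_c_block("192.0.2.10", "198.51.100.10"))
```

This only tells you that two sites are neighbors; whether a given block is actually "dirty" still has to come from a server check or from your ISP.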
How to correct this problem.
First, check your server for problems:
This tool will perform a SOCKET TCP/IP Read for your site, a Request Read, and a Browser-type Get, comparing various header and source data to see if the site is the same and error free for each method.
The purpose of the Check Server Page is to check the configuration of your web server for errors that could create problems for search engines. This information is very helpful since search engines often reduce web site rankings due to web server errors they encounter. At the very least, even if you encounter a common error that will not cause you to be dropped from the index, a "cleaner" site is likely to rank above you in the search engine results if all else is equal.
The following is a listing of the common codes produced by web servers:
| Code | Description |
| 200 | OK - No errors returned from server |
| 206 | Partial Content - Only part of the document was returned (response to a range request) |
| 301 | Permanently Moved - Acceptable to search engines |
| 302 | Document found elsewhere - Redirect code - usually unacceptable to search engines |
| 304 | Not modified since last retrieval - Not likely for a search engine spider |
| 400 | Bad Request - Usually a spider or browser error. |
| 401 | Authentication required - Password protected access |
| 403 | Access forbidden - always protected |
| 404 | Document not found at this location - Page not found |
| 408 | Request timeout - probable server problem |
| 500 | Internal Server Error - Script failed |
Second, if there are ANY errors reported then correct them. The tool has a help file describing many remedies for each section of the report. For instance, an error with robots.txt means that you do not have one or that it is corrupted; add one to the httpdocs directory (the same directory as your home page) even if it is empty. For a dirty IP listing, contact your ISP and complain. For redirects, your server may be misconfigured; read the bottom of the report, since the remedy is well defined right there.
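If you want a quick check of your own alongside the tool's report, the code table above can be turned into a small Python sketch. It fetches a URL without following redirects, so a 301 or 302 is reported as such, and sorts the result into three rough buckets. The names fetch_status and review_status and the bucket labels are assumptions for illustration; the grouping simply follows the table (200 and 301 acceptable, 400 and above a problem, everything else worth a closer look).

```python
import urllib.error
import urllib.request


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so 301/302 stay visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch_status(url, timeout=10):
    """Return the HTTP status code the server sends for url."""
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Redirects (not followed) and 4xx/5xx responses arrive here.
        return err.code


def review_status(code):
    """Sort a status code into 'ok', 'warn', or 'error' per the table above."""
    if code in (200, 301):
        return "ok"
    if code >= 400:
        return "error"
    return "warn"


if __name__ == "__main__":
    # Placeholder URL: substitute a page from your own site.
    status = fetch_status("http://www.example.com/")
    print(status, review_status(status))
```

A "warn" on a page you expect spiders to index, such as a 302 on the home page, is usually the first thing to raise with your ISP.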
If you find the information above helpful and would like to link to this page as a reference, please use the following link code:
<a href="http://www.alchemistmedia.com/SEO_Technical_Issues.htm">SEO Technical Issues - Alchemist Media, Inc.</a> 
Source: BruceClay.com