Ask the Experts: Rewriting Looooooong URLs

Are Short URLs Important To Your Website?

To answer this very important question, Website Services turned to two 'Net experts. The consensus, if you have ever had to deal with the issue, will not be surprising. "My years of experience have shown that short URLs outperform the long URLs in almost every way," said Bret Fencl of FenclWebDesign.com LLC. "There are various reasons that web sites may require long URLs, such as proprietary software that is licensed, but there still may be workarounds for most of your important pages."

While it is those workarounds that the majority of us are most interested in, as always, it is important to start at the beginning to learn what a URL is and then to move on and see why long URLs create problems.

Bret Fencl: Let's start with the basics. URL stands for Uniform Resource Locator. The URL is the web address that you see in the address bar of your browser. It usually begins with HTTP, but may begin with HTTPS for SSL secure pages. Usually your home page URL will look something like this: https://www.example.com. The problem of long URLs usually isn't on the home page, but on the rest of the pages, especially shopping cart pages. Take a close look through your web site to see if you have any URLs that look like the following:

https://www.example.com/shoppingcart/script/contents.php?prod_id=395847&w=9&p=1

If you did find some long URLs, then you may want to read further. Shortening most or all of your URLs to the length shown below would make them easier for your web site visitors to understand, and would help the search engines index your web site properly.

https://www.example.com/widget/green.html

Long URLs can cause many problems. One is that your visitors may not be able to email a working link to your site to a co-worker or family member. The link may wrap onto two lines in the email, or break after the question mark (?), and therefore cannot be clicked properly.

Many search engines may have an easier time navigating and indexing your web site correctly if the URL simply matches the name of the actual product or subject on the page. I am in no way suggesting that you stuff the URL with spam techniques such as:

https://www.example.com/keyword-widget-keyword/keyword-green-keyword.html


Just add the name of the product or subject of the page to the URL, but do not go overboard. Going overboard, as in the keyword example above, could cause your site to be penalized by search engines.
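
As a rough illustration of the advice above, here is a minimal Python sketch that builds a short path from a category and a product name. The function names slugify and product_path, and the /category/product.html layout, are assumptions made for this example; they are not taken from any particular shopping cart.

```python
import re


def slugify(name: str) -> str:
    """Lowercase the name and join its letter/digit runs with hyphens."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return "-".join(words)


def product_path(category: str, product: str) -> str:
    """Build a short path such as /widget/green.html (hypothetical layout)."""
    return f"/{slugify(category)}/{slugify(product)}.html"


if __name__ == "__main__":
    print(product_path("Widget", "Green"))
```

Each name appears once in the path, which keeps the URL descriptive without repeating keywords.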

The simplified readability of short URLs can help your click-through rate on the search engines. When looking for a particular item in the search results of a search engine, do certain web sites stand out more clearly, simply because they are easier to read? Did seeing the item's name in the URL catch your eye? Let's assume that two web sites ranked equally on your favorite search engine, and both web sites carry the exact same product. If you had to choose one web site to click first, which result below would you most likely click?

Example Company Widget Sales
https://www.example.com/shoppingcart/script/contents.php?prod_id=395847&w=9&p=1

Example Company Widget Sales
https://www.example.com/widget/green.html

Why do long URLs create indexing problems for search engines and what can be done about it?
While Bret gave us a great start on the basics of rewriting long URLs, Website Services also turned to Jim Goslin, Director of Optimization at IncreaseVisibility.com.

Jim Goslin: Most major search engines can index sites that use a query string in the address, which is what produces long URLs, but such addresses do present drawbacks. Search spiders tend to crawl and index these pages much more slowly than static HTML pages, and if the query string contains more than 3 parameters, it can confuse the spiders and cause pages not to be indexed. In fact, if the URLs contain session ID information or the string "&id=", they may not be indexed at all. The reason is that long URLs tend to belong to pages that all use the same template, so they can be seen as duplicate content and may not be indexed.
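
The warning signs described above can be checked mechanically. The following Python sketch flags them for a single URL. The function name crawl_warnings and the list of session parameter names are assumptions made for illustration; the three-parameter threshold and the "&id=" string come from the paragraph above, and no search engine publishes these as exact rules.

```python
from typing import List
from urllib.parse import parse_qsl, urlsplit

MAX_PARAMS = 3  # threshold mentioned in the text above
# Assumed examples of session parameter names; adjust for your own site.
SESSION_KEYS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def crawl_warnings(url: str) -> List[str]:
    """Return a list of possible indexing problems for one URL."""
    params = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    keys = {key.lower() for key, _ in params}
    warnings = []
    if len(params) > MAX_PARAMS:
        warnings.append("more than 3 query parameters")
    if "&id=" in url.lower():
        warnings.append('contains "&id="')
    if keys & SESSION_KEYS:
        warnings.append("contains a session ID parameter")
    return warnings


if __name__ == "__main__":
    long_url = ("https://www.example.com/shoppingcart/script/"
                "contents.php?prod_id=395847&w=9&p=1")
    print(crawl_warnings(long_url))
```

Running a check like this over a list of your site's URLs gives a quick shortlist of pages worth rewriting first.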

Static pages, however, are not the only solution to long URLs. Websites can still use dynamic pages and mask the URLs with something called URL rewriting: a simple process of having the server translate short, static-looking URLs into the long dynamic URLs behind them. This can be done on both Windows/IIS and Linux/Unix based systems. Most implementations require some basic knowledge of regular expressions.
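
To show the idea independently of any one server, here is a minimal Python sketch of a regular-expression rewrite rule. It is not the configuration syntax of Apache or IIS. The query parameter names category and name are hypothetical, as is the rewrite function itself; a real shopping cart would have its own parameters.

```python
import re

# Each rule pairs a pattern for the short public path with a template for
# the long internal URL. The parameter names are hypothetical.
REWRITE_RULES = [
    (
        re.compile(r"^/([a-z]+)/([a-z]+)\.html$"),
        r"/shoppingcart/script/contents.php?category=\1&name=\2",
    ),
]


def rewrite(path: str) -> str:
    """Translate a short path to its internal form; leave others alone."""
    for pattern, template in REWRITE_RULES:
        if pattern.match(path):
            return pattern.sub(template, path)
    return path


if __name__ == "__main__":
    print(rewrite("/widget/green.html"))
```

The visitor and the search engine only ever see the short path; the server applies the rule before handing the request to the dynamic script.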

For Unix/Linux Web Servers:
The easiest implementation is on Unix/Linux based systems, using the .htaccess file in your root directory. If speed is a concern, Apache, the most popular web server on 'Nix based systems, includes a built-in module (created exactly for this purpose) called "mod_rewrite".

For ASP Windows/IIS Web Servers:
The best solution here is to use a third-party program like ISAPI_Rewrite from Helicon Tech. This will require having your systems administrator install the program on your server.

For .ASPX Windows/IIS Web Servers:
Windows/IIS doesn't have the same support for this functionality as 'Nix/Apache, but it can be done. The functionality can be coded into your web applications in a few different ways. The simplest way is to have regular expressions in your Global.asax file do the translations; this is best used on a server without heavy load. If speed is a factor, a custom or third-party ISAPI filter is best. This would be an application that you install on the server to do the translations. Some example third-party filters are URL Rewrite by IISMods and URL Rewrite by Smalig|Webworks.

Are there some content management systems that are more user-friendly when it comes to recreating file names?
Many content management systems provide the functionality to create websites that are dynamic and still use search engine friendly URLs. The main thing to look for in a CMS is how customizable the templates are, as well as the URLs. Two good programs that perform well under heavy load are AbleCommerce and Siterefresh from Refreshsoftware. If cost is a factor, there are many free open source tools available, like OpenCMS and Typo3.

Many thanks to Bret Fencl of FenclWebDesign.com, LLC and Jim Goslin, Director of Optimization at IncreaseVisibility.com for their valuable contributions to the Website Services Magazine Ask The Experts section.