:: By Everett Sizemore, Inflow ::
Within the last few years, you’ve viewed the source of a content-rich page and seen only a few lines of code (and no iframes). For example:
You see what looks like server-side code in the meta tags instead of the actual content of the tag. For example:
You’re asked to optimize a site with #! (hashbangs) in the URLs. For example:
If the website uses hashbangs in the URL, you can crawl the site to get the “URLs” but not the content. To get the content, you’ll need to replace the hashbang with the escaped fragment parameter using Excel, and import that new list of URLs for a new crawl. The Screaming Frog SEO crawling tool has a nice feature in the AJAX tab that will fetch the HTML snapshot of the content if you follow these guidelines. You can also do it with DeepCrawl. More on hashbangs and escaped fragments later.
window.location.href = "https://www.mysite.com/page2.php";
#2 According to a study from Builtvisible, Google may stop rendering the content if it takes more than about four seconds to load, leaving the page only partially indexed and cached.
#3 The same study shows that content requiring an “event” may not load at all. Examples include content that gets loaded after clicking a “more” button, image carousels or “tabs” to show further details about a product.
According to the Builtvisible study, content rendered this way may not get indexed.
SEOs have to deal with misinformation from Google about the risk of “cloaking” when it comes to serving a pre-rendered snapshot of the page. It comes from this Google Webmaster Central post in which a Googler answers an important question:
The highlighted portions are quoted by developers when pushing back on serving pre-rendered content to Googlebot.
But, as John Mueller told webmasters two weeks after this post was published, sometimes you have to pre-render content in order for Google to see it.
What’s The Difference Between Server-Side and Client-Side Rendering?
Server-side rendering happens before the page is even loaded into the browser. Client-side rendering typically occurs in your browser. Search engines incorporate “headless browsers” into the crawling routine so they can “see” the page once all of the client-side content has been rendered — otherwise they’d miss out on all of the content that gets rendered client-side.
Basically, you want both of these windows to look the same. Googlebot uses a headless browser to render the page.
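To make the difference concrete, here is a minimal sketch of the two kinds of HTML response. The product data, the function names and the `/app.js` path are all made up for illustration. The point is only that a crawler that does not execute JavaScript finds the content in the first response and nothing in the second.

```javascript
// Hypothetical product data, used only for this illustration.
const product = { name: "Blue Widget", description: "A sturdy widget." };

// Server-side rendering: the HTML sent to the browser already contains the content.
function renderOnServer(item) {
  return `<div id="app"><h1>${item.name}</h1><p>${item.description}</p></div>`;
}

// Client-side rendering: the HTML sent to the browser is an empty shell.
// The content only appears after the script (assumed here to be /app.js) runs.
function renderShell() {
  return '<div id="app"></div><script src="/app.js"></script>';
}

// What "view source" would show in each case.
const serverHtml = renderOnServer(product);
const shellHtml = renderShell();

console.log(serverHtml.includes("Blue Widget")); // content is in the source
console.log(shellHtml.includes("Blue Widget"));  // content is not in the source
```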
But there is another important issue other than the rendering of content — one that is even more confusing than trying to understand Google’s conflicting and misleading advice about pre-rendering. Every “page” needs its own URL.
What Are Hashbangs and Escaped Fragments All About?
When developers first started using AJAX heavily, the URLs were constructed like this:
www.domain.com/#page
A plain # fragment like this is normally an internal anchor link. It is used to skip down the same page rather than to signify a separate URL to be indexed, and the browser does not send it to the server.
Use of the #! URL format instructs the search engine crawler to request an “escaped fragment” version of the URL. The web server then returns the requested content in the form of an HTML snapshot, which is then processed by the search engine. For example:
Crawler sees https://www.example.com/page#!content=123
Crawler requests https://www.example.com/page?_escaped_fragment_=content=123
Server returns a static, crawlable “snapshot” of /page#!content=123
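The three steps above can be sketched as a small URL-rewriting function. This is a simplified illustration rather than any crawler's actual code: the name `toEscapedFragmentUrl` is made up for this example, and it escapes only a handful of special characters.

```javascript
// Convert a hashbang URL into its "escaped fragment" equivalent.
// Simplified sketch: only %, #, &, + and spaces in the fragment are escaped.
function toEscapedFragmentUrl(url) {
  const index = url.indexOf("#!");
  if (index === -1) {
    return url; // no hashbang, nothing to rewrite
  }
  const base = url.slice(0, index);
  const fragment = url.slice(index + 2);
  const escaped = fragment.replace(/[%#&+ ]/g, (ch) => encodeURIComponent(ch));
  // Append to an existing query string if there is one.
  const separator = base.includes("?") ? "&" : "?";
  return base + separator + "_escaped_fragment_=" + escaped;
}

console.log(toEscapedFragmentUrl("https://www.example.com/page#!content=123"));
// https://www.example.com/page?_escaped_fragment_=content=123
```

Mapping a crawled list of hashbang URLs through a function like this is one alternative to doing the replacement by hand in a spreadsheet.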
Answer: No. There are other ways to go about it with better results.
The first way is to have links to content in the HTML, and to pull the URL and route to a specific resource based on that. This method treats routing the same way you'd traditionally do it.
For an example of routing that way, take a look at Pete Wailes’ project: https://wail.es/fly-me-to-the-moon/. This site uses basic routing in JS to serve different pages based on the URL. This means you get a page refresh when changing URLs, just like you'd have on a traditional website.
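As a rough idea of what URL-based routing in JS can look like, here is a minimal sketch. It is not taken from the project linked above. The route table, the `renderRoute` name and the `#app` element are assumptions made for this example, and it presumes the server returns the same script for every path.

```javascript
// Hypothetical route table: each real URL path maps to a function that builds the page.
const routes = new Map([
  ["/", () => "<h1>Home</h1>"],
  ["/page1", () => "<h1>Page 1</h1>"],
  ["/page2", () => "<h1>Page 2</h1>"],
]);

// Pick the content for a path, falling back to a "not found" page.
function renderRoute(pathname) {
  const handler = routes.get(pathname);
  return handler ? handler() : "<h1>Not found</h1>";
}

// In a browser this would run once per page load, since each link is an
// ordinary <a href="/page1"> that triggers a normal page request:
//   document.querySelector("#app").innerHTML = renderRoute(window.location.pathname);
```

Because every page lives at an ordinary path, the links in the HTML are plain anchors that a crawler can discover and request like any other URL.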
Another way is to load content via XMLHttpRequest (XHR), and to update the URL via the HTML5 History API. The logic behind the scenes works the same way as above, but the URL updates happen with JS, rather than through actual page changes. Go have a look at this example. Click on a link and watch the URL change without refreshing the page.
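The History API approach might look roughly like the sketch below. The `showPage` and `navigate` names and the `env` object are made up for this example, and `fetch` stands in for a raw XHR call. The browser pieces are passed in as parameters so the logic stays easy to follow and to try outside a browser.

```javascript
// Show freshly loaded content and record its URL, without a page refresh.
// env.history is expected to behave like window.history.
function showPage(env, path, html) {
  env.render(html);
  env.history.pushState({ path: path }, "", path);
}

// Load the content for a path in the background, then display it.
// env.fetch is expected to behave like the browser's fetch().
async function navigate(env, path) {
  const response = await env.fetch(path);
  showPage(env, path, await response.text());
}

// In a browser, the environment could be wired up like this (the #content
// element is an assumption of this sketch):
//   const env = {
//     fetch: window.fetch.bind(window),
//     history: window.history,
//     render: (html) => { document.querySelector("#content").innerHTML = html; },
//   };
//   navigate(env, "/page2");
```

A real implementation would also listen for the `popstate` event so that the back and forward buttons restore the right content.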
So What’s the Bottom Line Best Practice Right Now?
All “pages” should be accessible on their own, indexable URL. For example: https://www.domain.com/page1.
Everett Sizemore is the director of marketing at Inflow, an e-commerce marketing agency specializing in SEO, paid search and conversion optimization. He has over a decade of experience with e-commerce, and is a Moz Associate. Everett presents on technical SEO at industry events like SMX Advanced, SMX Milan, MozTalk and Confluence.