Getting to Know Structured Data Markup


HTML5, Microdata and the rel Attribute

Familiarizing oneself with the modern markup languages, semantic structures and schemas available today can yield a more engaging, easier-to-manage Web property. Let's get to know markup in this edition of Website Magazine's Small Business Lab.

Search engines and social media networks have taken the Web community a long way in terms of providing access to information. While Web publishers and Internet consumers have lived up to their end of the bargain by publishing content (exabytes of data every day, in fact), much still needs to be done to organize that information and, of course, to filter through it. Structured data markup, which comes in a dizzyingly complex and often overwhelming variety, provides at least part of the answer.

The most significant recent announcement on this front for the broader community of Web professionals was that of Schema.org. Google, Bing, and Yahoo recently entered into a joint initiative to support a common vocabulary for structured data markup on Web pages. For those Internet professionals looking to continue the success of their natural/organic search optimization campaigns, the announcement was of particular interest.

Search engines have long supported a wide array of markup formats, including RDFa, Microformats and Microdata. Now that all three major search engines support a common set of schemas, all that remains for Web professionals is the integration. One of the first steps they should take, however, is in the direction of HTML5.

HTML5 was adopted as the starting point for the work of the W3C (World Wide Web Consortium) HTML working group in 2007 and is expected to become an official W3C Recommendation by 2014. That, of course, affords Web professionals some time, but there are opportunities to get involved right now.

Since the release of the first draft in 2008, most major browsers have implemented support for some of the proposed features, and HTML5 has generated a lot of interest from developers (see sidebar - Brief History of HTML5). While the formal W3C standard is still a few years away, many active sites use HTML5 today, and they aren't all informational sites (see sidebar for an example).

Open Graph Protocol & Social Meta Tags:
A few months ago, Facebook adopted the Open Graph Protocol as a standard for pulling information about pages. By including Open Graph tags on your web pages, you can specify exactly which content is shared. Instead of scraping the web page for data, Facebook uses the data you specify. This gives you more control over the connection between a web page and how it appears when a user shares it on their profile or wall. Google says that Schema.org provides more detail about the entities contained in web pages, and that Open Graph, while well suited to what it does, is somewhat limited and so is not the solution needed for search.
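To make the idea concrete, here is a minimal sketch of how a sharing service might read Open Graph tags instead of scraping the page. It uses only Python's standard library. The og:* property names come from the Open Graph protocol, but the page, its values and the example.com URLs are made up for illustration, and the helper name extract_open_graph is hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical page head. The og:* property names are defined by the
# Open Graph protocol; the values and URLs are invented for this sketch.
SAMPLE_PAGE = """
<html>
<head>
  <title>Getting to Know Structured Data Markup | Example Site</title>
  <meta property="og:title" content="Getting to Know Structured Data Markup">
  <meta property="og:type" content="article">
  <meta property="og:url" content="http://www.example.com/structured-data">
  <meta property="og:image" content="http://www.example.com/images/markup.png">
</head>
<body><p>Article text...</p></body>
</html>
"""


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..."> tags into a dictionary."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        # Only keep properties in the Open Graph namespace.
        if prop.startswith("og:"):
            self.tags[prop] = attributes.get("content") or ""


def extract_open_graph(html):
    """Return a dict mapping og:* property names to their content values."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.tags


if __name__ == "__main__":
    for name, value in extract_open_graph(SAMPLE_PAGE).items():
        print(name, "=", value)
```

Note that the shared title comes from og:title rather than from the longer title element, which is exactly the kind of control the tags are meant to give the publisher.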


The vast majority of Web workers, however, are not making the most of this language's potential (call it slowness to adapt to new approaches to design), which is unfortunate, as the benefits are many - particularly in the realm of information discovery. How can the Web community accelerate the use of HTML5? While Web professionals could focus on the design elements of HTML5, another way is to consider the use of Microdata - a proposed feature of HTML5 intended to provide a way to embed semantic markup into HTML documents. If success is partly reliant on information being found and shared (and for whom is it not?), then HTML5 and the standards recently provided by the Schema.org initiative offer a framework within which Web professionals can work more effectively.

Of particular interest recently is how this markup relates to authorship and the "rel" attribute. The rel attribute sits alongside Microdata in HTML5. Microdata is an HTML5 specification that provides extra semantic markup for web pages so that machines can more readily understand what the content is. Strictly speaking, rel is not part of Microdata, but it plays a similar role: it describes the relationship between a link and the document it appears in.
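As a rough illustration of what Microdata gives a machine to work with, the sketch below marks up a hypothetical article with the itemscope, itemtype and itemprop attributes and then reads the values back with Python's standard library. The Article type and the name and author properties come from Schema.org; the article title, the author "Jane Doe" and the helper name extract_microdata are assumptions for the example. The parser is deliberately simplified: it handles a single flat item whose properties are plain text and ignores nested items.

```python
from html.parser import HTMLParser

# Hypothetical article fragment using Microdata attributes with a
# Schema.org vocabulary. The content is invented for this sketch.
SAMPLE_ARTICLE = """
<div itemscope itemtype="http://schema.org/Article">
  <h1 itemprop="name">Getting to Know Structured Data Markup</h1>
  <span itemprop="author">Jane Doe</span>
</div>
"""


class FlatMicrodataParser(HTMLParser):
    """Reads one flat Microdata item; nested items are not supported."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._current_prop = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # itemscope is a boolean attribute, so only its presence matters.
        if "itemscope" in attributes and self.item_type is None:
            self.item_type = attributes.get("itemtype")
        if attributes.get("itemprop"):
            self._current_prop = attributes["itemprop"]
            self.properties[self._current_prop] = ""

    def handle_data(self, data):
        # Text inside an itemprop element is treated as the property value.
        if self._current_prop is not None:
            self.properties[self._current_prop] += data

    def handle_endtag(self, tag):
        if self._current_prop is not None:
            value = self.properties[self._current_prop]
            self.properties[self._current_prop] = value.strip()
            self._current_prop = None


def extract_microdata(html):
    """Return (item type URL, dict of property name -> text value)."""
    parser = FlatMicrodataParser()
    parser.feed(html)
    return parser.item_type, parser.properties


if __name__ == "__main__":
    item_type, properties = extract_microdata(SAMPLE_ARTICLE)
    print(item_type)
    print(properties)
```

The visible page is unchanged by these attributes; they only add labels that a search engine can read without guessing which text is the title and which is the byline.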



HTML5 In Action:
Pirates Love Daisies
- Check out Pirates Love Daisies for a good example of HTML5 features and how they can be used outside of traditional, information-only sites. Game action takes place within an HTML5 canvas element, and both sessionStorage and localStorage are used to store game data. Sounds used in the game are also played via the audio element.



HTML5 adds a new link type: rel=author. What is interesting about this link type is that it allows you to point from articles to author pages and so indicate the identity of the author to search engines. Ultimately, this could reduce copyright infringement and make it easier to identify sites that scrape content. It will also help search engines identify specific authors more directly and further establish the authority of specific outlets and of individuals within those outlets.
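The following sketch shows what such a link might look like and how a crawler could pick it out, again using only Python's standard library. The rel=author link type is from HTML5; the byline, the example.com and example.org URLs and the helper name find_author_links are hypothetical. Because rel can hold several space-separated values, the code splits the attribute into tokens rather than comparing the whole string.

```python
from html.parser import HTMLParser

# Hypothetical article fragment. Only the first link carries rel="author";
# the second shows that other rel values are left alone.
SAMPLE_ARTICLE_PAGE = """
<article>
  <h1>Getting to Know Structured Data Markup</h1>
  <p>By <a rel="author" href="http://www.example.com/authors/jane-doe">Jane Doe</a></p>
  <p>Read the <a rel="nofollow external" href="http://www.example.org/spec">spec</a>.</p>
</article>
"""


class AuthorLinkParser(HTMLParser):
    """Collects the href of every <a> or <link> whose rel includes 'author'."""

    def __init__(self):
        super().__init__()
        self.author_urls = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attributes = dict(attrs)
        # rel is a space-separated list of link types, compared case-insensitively.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if "author" in rel_tokens and href:
            self.author_urls.append(href)


def find_author_links(html):
    """Return the list of URLs marked as author pages."""
    parser = AuthorLinkParser()
    parser.feed(html)
    return parser.author_urls


if __name__ == "__main__":
    print(find_author_links(SAMPLE_ARTICLE_PAGE))
```

How a given search engine verifies or uses the author page it finds this way is outside the scope of this sketch.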

Including authorship information on your web articles helps search engines provide better search results to their users and enables content marketers to get more of their content in front of prospective customers and users. If you were a search engine, wouldn't you trust an article with a verified author more than an article without?

The rel attribute has been a part of HTML for a long time. With Google taking an interest in this attribute and Yahoo! and Bing throwing their support behind structured data, it is likely that top Web destinations will start concentrating their efforts on it fairly quickly - which means you should consider it too.