Standardizing Your Content Should Be Standard Practice

:: Converting to XML ensures your content is ready for publication on the Web and elsewhere ::


By Mark Gross, Data Conversion Laboratory, Inc.


There are over 1 billion websites on the Internet today, each of which has content. New content is created continuously across the Web on sites for news, retail, sports, health and wellness, education... the list could go on and on. Is this news to you? I doubt it. You might be thinking that this is all rather obvious - that the Internet is full of content. But the key point here isn't putting new content up on the Internet; it's getting older, complex, legacy content ready for the Web, which is a much bigger challenge.


A perfect example of this is the American Academy of Pediatrics (AAP), who needed to convert intricate point-of-care content for their website - no simple task. The content had numerous conversion bugaboos like graphics, footnotes and images. From a formatting perspective this was very challenging, and accuracy was a top priority. What to do? 


The Optical Society of America (OSA) faced a similar challenge when they decided to digitally convert almost 100 years' worth of materials in order to provide their members and others with more robust product offerings. In other words, this historic information (technical journals) needed to be reformatted, reusable and accessible. Like AAP's content, these journals were fairly complex, featuring many equations, tables and algorithms, and they needed to be digital. What to do?


For both of these examples, the answer was a simple one: Convert to XML. This process frees materials from the constraints of paper and other formats without losing essential information. XML is standardized, flexible and future-proof.


After all, every organization wants their content, including valuable legacy content, to be standardized and flexible in ways that make getting it onto the Internet easier. That's the only way to wring true value out of it. Converting your content to XML prepares digital content for electronic publishing, data distribution, and the Web. Further, JATS, DITA, S1000D, SGML, and the rest of the alphabet soup of available schemas serve the same valuable function - making content standardized within a specific subject area, with a high level of accuracy and quality.


However, quality is more than just accuracy of the content; it's also the total cost, speed, and seamlessness of the end-to-end process. Time and again I hear horror stories of conversions gone wrong, but the stories rarely involve the conversion process itself; instead they focus on what happens afterward. Here are four things to think about to avoid XML conversion problems, pitfalls and prolonged pains:


1. Take stock - have a deep understanding of what kind of content you have, how much of it, and how the content varies over the corpus of material.   


2. Identify examples - make sure you have examples of the corpus of your materials ready so you or your vendor can use them to plan, and prevent problems down the road.


3. Know your schema - lock down your technical schema and make sure the variations of your materials are adequately considered, so standardization is that much easier.


4. Test - build testing time into your project plan, both during the conversion process, and after you think you're done.
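
As a rough illustration of steps 1 and 4, here is a minimal Python sketch, not a description of any vendor's actual process. It assumes converted files are already available as strings of XML; the function name check_converted, the sample file names, and the element names are all hypothetical. It checks only that each document is well-formed, not that it is valid against a schema such as JATS or DITA, which would need a separate validating tool.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def check_converted(documents):
    """Inventory element usage and flag documents that fail to parse.

    `documents` maps a (hypothetical) file name to its XML text.
    Returns (tag_counts, failures), where failures is a list of
    (file name, parser error message) pairs.
    """
    tag_counts = Counter()
    failures = []
    for name, xml_text in documents.items():
        try:
            root = ET.fromstring(xml_text)
        except ET.ParseError as err:
            # Not well-formed: record it and move on to the next file.
            failures.append((name, str(err)))
            continue
        # Tally every element so rare or unexpected structures stand out.
        tag_counts.update(elem.tag for elem in root.iter())
    return tag_counts, failures


if __name__ == "__main__":
    # Made-up sample content, for illustration only.
    samples = {
        "article-1.xml": "<article><title>Optics</title><table/><fn>1</fn></article>",
        "article-2.xml": "<article><title>Broken</article>",
    }
    counts, bad = check_converted(samples)
    print(dict(counts))
    print([name for name, _ in bad])
```

Run over a representative sample, a tally like this gives a quick picture of how the content varies, and the failure list gives a first-pass test that can be repeated after every conversion batch.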


Having a copious amount of content is a good start for any company, but it's not the end-game. That content, especially legacy content, needs to be flexible, findable and accessible. Standardizing is a major step towards creating new revenue streams, product offerings, and membership benefits, to name just a few. Further, moving to a standardized format comes down to the basic idea of preserving your content assets, which is essential to having a winning plan going forward.


Standardizing shouldn't be an afterthought. It is an important strategic decision, like hiring new personnel or implementing new technologies. And, like anything of value, it should be protected from being lost or degraded. Standardizing your content is an investment that will pay off for years to come.


Mark Gross, president & CEO - founder of Data Conversion Laboratory (DCL), is a recognized authority on XML implementation and document conversion. Prior to founding DCL in 1981, Mark was with the consulting practice of Arthur Young & Co. Mark has a BS in Engineering from Columbia University and an MBA from New York University. He has also taught at the New York University Graduate School of Business, the New School, and Pace University. He is a frequent speaker and writer on the topic of automated conversions to XML.