In Newspapers, Standards Often Drive Debate but not Progress.

Reprinted from Newspapers and Technology Magazine

by: Barry Schaeffer

As the world lunges near headlong toward complete electrification, newspapers are finding themselves confronted with complexities they never faced before. As if the growth of technology itself weren't enough, the fourth estate must now deal with the challenge of standards. Don't get me wrong, the news and printing business has dealt successfully with standards of many types since before Gutenberg. From paper sizes and weights, to type slug formats, to color and so on, standards have been with us since uniformity and co-operation began. But today's emerging standards are a little different and a lot more challenging. I'm referring to the murky subject of information content standards. If the industry is to avoid the pitfalls that have befallen many of its counterparts, some background on this challenge seems appropriate. This will be my attempt at providing a partial map.

From Notation Standards to Content Standards:

As technology made possible the creation of complex data, the standards world began development in two general directions.

One path, similar to that on which nearly all modern technology evolved, was the development of standards for the physical components of information. From the gauge of rails and angle of screw threads to the bit patterns for characters and the size, shape and rotational logic of disks and CDs, industry has always needed stable targets for its development and manufacturing. In the information world, standards of this type are most often known as "notational standards" because they provide a stable but neutral notation in which content can be expressed. Standards such as ASCII, SGML, and XML fall into this category because they allow users to describe the underlying structure of information but do not define its content or purpose. HTML, while structurally close to SGML, does not fall into this group because it dictates how content must be identified, even partially dictating the behavior of software processing that content.

The other path, one that has grown exponentially since the advent of the Internet, is the attempt to standardize content itself. From e-commerce languages to financial exchange protocols to news story formats, the information industry, with the participation (some would say collusion) of the standards world, is attempting to create and enforce uniformity on the actual content of communication. History and sad experience have shown that this will be difficult, contentious and potentially deadly for those who do not understand it. Perhaps the most important point for newspapers to keep in mind as they confront both the reality and the hype of content standards is that most are not standards at all but one or another group or firm's idea of how content and communication should be structured. Given the nature of business, such efforts are almost always competitive. This, of course, means that they will come and go with the fortunes of their sponsors, creating a moving target for those who would adopt or ignore them. To make matters worse, the developers of these pretenders to the control of content are often able to co-opt the standards world into anointing their efforts with official (or official-sounding) designations. Indeed, the current squabble among the e-commerce standards (Biztalk, RosettaNet, cXML, CBL, OBI, etc.) cannot produce all winners. In time, the fortunes of the market will determine on which standards e-commerce is built and who, among vendors and users, has backed the winner.

Pathways and Pitfalls:

For newspapers, this moving target creates several pathways to creation of better and more saleable information products. Unfortunately, it also contains serious pitfalls to be understood and avoided. The most important of these are described below:

  1. Adding intelligence to content is always a plus. For news organizations in the business of collecting and disseminating news content, the ultimate value of the product varies with the level of intelligence built into it. In news, this means how richly the content is tagged and how easy it is to use that tagging to create information products. This reality transcends the fortunes of any particular content standard, building instead on the use of notation standards that allow the intelligence to be captured and transported with the content. The only critical variable here is the use of a neutral notation standard like SGML or XML (the choice makes less difference than you would think), avoiding dependence on any particular vendor. Once content is captured with a defined level of intelligent tagging, it can be easily transformed to other structures and will carry its enhanced value with it no matter the final application. The message for newspapers here is to begin the process of capturing intelligent content, even if the final use of the intelligence is yet to be fully defined. The selection of notation and structure, the deployment of needed technology and the training of writers and editors are demanding enough that uses for the enhanced content will likely be developed before the capture part is complete. In any case, the product of this first phase will be a more valuable content product and that, whatever the vehicle for harvesting that value, benefits the entire operation.
  2. Don't let the Content Control disagreements confuse you. Every group touting a fledgling content standard wants potential users to wait for its standard and to commit to it exclusively. In a changing sea of competing standards, this can have serious consequences: either waiting too long to start creating better content, or building a complex and expensive technological edifice tied to a vendor and standard that may not survive. Newspapers must understand that their product is their content and that enhancing it, while supported by effective standards, must not be captive to the standardization process. There is no reason to wait for the finalization of a particular content standard before moving to capture content at a higher level of intelligence and value. Neither is there much, if any, advantage to buying systems from a particular vendor just because he touts his adherence to standards beyond the generally accepted notation standards that will make his system capable of creating and processing content at higher levels of intelligence. Putting this in direct terms: don't buy systems from vendors who can't support SGML or XML notation but, beyond that, don't worry about which SGML or XML news content standard a vendor supports in making your decision.
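
As a rough illustration of the first point, the sketch below shows how a story captured once in a neutral notation (XML here) might be turned into two different products without touching the original. The element and attribute names (story, headline, org, place, person, section) are invented for this example and are not drawn from any published news content standard; the two output shapes are likewise assumptions.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical story markup: every element and attribute name below is
# invented for illustration, not taken from any real content standard.
STORY_XML = """
<story id="s-001" section="business">
  <headline>Council Approves Rail Link</headline>
  <byline>Staff Writer</byline>
  <body>
    <p>The <org>City Council</org> voted Tuesday to fund a rail link to <place>Shelbyville</place>.</p>
    <p><person>Jane Doe</person>, the council chair, called the vote overdue.</p>
  </body>
</story>
"""


def to_web_summary(story):
    """Build a small HTML fragment from the headline and first paragraph."""
    headline = story.findtext("headline")
    # itertext() flattens the paragraph, dropping the inline tags.
    first_para = "".join(story.find("body/p").itertext())
    return f"<h1>{html.escape(headline)}</h1><p>{html.escape(first_para)}</p>"


def to_index_record(story):
    """Collect the tagged names into a record a search index could use."""
    body = story.find("body")
    return {
        "id": story.get("id"),
        "section": story.get("section"),
        "organizations": [e.text for e in body.iter("org")],
        "places": [e.text for e in body.iter("place")],
        "people": [e.text for e in body.iter("person")],
    }


story = ET.fromstring(STORY_XML.strip())
print(to_web_summary(story))
print(to_index_record(story))
```

Neither output depends on which content standard eventually prevails; if one does, a third function could map the same captured story onto it.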

To stay abreast of demand in the burgeoning electronic culture, newspapers must move toward the creation of richly tagged, highly flexible content. In that endeavor, the industry can use the standards process to support key parts of its efforts without becoming mired in the internecine quarrels and ego-trips that characterize today's content standards world. Hopefully, this prudence will, over time, percolate into the standards world itself, helping to smooth out what has become more a part of the problem than the solution.