
The XML Paradox

I have been working on my tutorial for the O'Reilly Tools of Change conference. I'm presenting PDF, as an alternative to XML, as a cost-effective way to create revenue from the backlist. As a dedicated markup advocate from the days of SGML, and someone who helped simplify SGML down to XML, I still find it odd to be talking about other kinds of solutions, but I think I learned something from my custom web site customers...

The XML Paradox is this: XML is a high-quality archival medium, so obviously books and scholarly content would make the jump first. It just makes sense that everyone would use the high-value format for the longest-lived, highest-value content. Wrong! The economics of publishing have played out the opposite way.

The more ephemeral the content, the faster production methods can change. So newspapers were doing full-text databases from very early on. In the scholarly markets, journals are now almost all electronic. Books, however, are only starting to move fitfully in the XML direction, and are mostly not digital at all. So the least archivable stuff moves to the best archival format fastest. Serial content has no legacy that needs conversion to make a new channel profitable, so the payoff from a production change can be pretty fast.

A publisher with a rich backfile has items that can earn for 20 years or more, as long as costs can be controlled. So any change to the book production process has to pay off immediately on new books. And for any large-scale change across a publisher's line to be successful, it must be very cheap for old books. And that's where e-books stand: revenue unearned because there's no clear path to get it.

XML is great, and enables the production of an optimized presentation for a new media format, but it's not cheap at all. It's an expensive and tricky management challenge to change editorial production processes for new content, and data-conversion costs for old content are very high.
Once the data is in hand, the development cost to create a new output format (print, web, handheld, or whatever) is not cheap either. Problems like typesetting, layout and display all have to be solved anew for each output format. It takes work to optimize presentation, especially from the level of abstraction that gives good XML its power. So page images (and especially PDF) get a big boost from the XML paradox, because they capture a lot of the production value of the existing process and they're the cheapest searchable format to produce from paper.

So here I am, a guy who courted his wife over conversations about markup, working with page images. We are managing them with very rich metadata at a fine level, to capture much of the commercial benefit of XML, but still, I'm enabling something I used to rail against. And it's not easy to make page images work over the web, let publishers control the presentation, and still be good to readers.

In this discussion I am leaving out the small number of crown-jewel properties that earn large amounts quickly in a new channel, and thus merit technology investment. Projects like that are important, but don't shift the business as a whole. And their emphasis on frequent updates makes them similar to serials in the need for continuous editorial management.

Coming soon: I used to think that page scanning projects were a waste of money in terms of long-term investment, and I hope to post soon about why I no longer believe that either.
