Tizra gets faster

Non-technical summary: things are lots faster at Tizra sites and admin tools. There's certainly more to do, but we've got more tricks up our sleeves! Because the big current speed boost is related to one cause, and it took me a while to track down, the geek appendage to this post describes what we found and how we fixed it.

Geekly details

I spent a bunch of time last week looking at system performance. As we've been adding customers and usage, we were beginning to feel the pinch. Performance always varies, but the range of response times was getting wider as things slowed, which led me to think there might be some systemic issues whose fix would give us a quick improvement (and indeed there was some Linux tuning that helped a bit). But data access seemed to be the real issue, so I spent a bunch of time looking into Hibernate and our caching and querying, and then wound up spending a day or so basically watching all the queries go through Postgres. And you know what? Most of them seemed much slower than they should be, even allowing for how hairy they are.

Of course, the next step was to check the database indexes and how the query plans were using them. In hand testing, the plans looked good and the indexes were sensible. But when run by hand, the queries were also significantly faster than when Hibernate ran them! This was much easier to see now that we have a live load, which is inevitably different from a test setup. So why the difference? Postgres was ignoring our indexes only when Tizra publisher made the queries.

It turns out that there's an old bug in Postgres where it would ignore indexes on bigint fields in prepared statements unless there was an explicit data type cast. (That type confusion was an obscure result of skew between PostgreSQL and the SQL standard.) And that was the behavior I was seeing, even though we were using a much more recent vintage of all the software. This was terrible for us, because we have a multi-tenant publishing system for large document collections and we use bigints as primary object identifiers!
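For the curious, here is a minimal sketch of what an "explicit data type cast" on a prepared-statement parameter can look like. It is not our query generator: the class name, the helper, and the `doc.id` column in the usage note are hypothetical, made up for illustration. It only builds the SQL text, so it runs without a database.

```java
/** Hypothetical helper: builds a bigint equality predicate with an explicit cast. */
final class BigintCast {
    private BigintCast() {}

    /** Returns "<column> = CAST(? AS bigint)" for use in a prepared statement. */
    static String equalsBigint(String column) {
        // The column name is concatenated into SQL, so accept only plain
        // identifiers, optionally qualified with a table name or alias.
        if (column == null
                || !column.matches("[A-Za-z_][A-Za-z0-9_]*(\\.[A-Za-z_][A-Za-z0-9_]*)?")) {
            throw new IllegalArgumentException("not a plain column name: " + column);
        }
        // The cast states the parameter's type in the query text itself,
        // rather than leaving it to the driver or the server to infer.
        return column + " = CAST(? AS bigint)";
    }
}
```

Calling `BigintCast.equalsBigint("doc.id")` yields `doc.id = CAST(? AS bigint)`, which could then be appended to a WHERE clause.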

So, why the old problem if the bug is gone and we are not using Postgres 7? It turns out that we dynamically build those hairy queries in HQL (Hibernate Query Language), using the String trick. But nowadays, instead of making your indexes work, it breaks them! The differences are invisible in the SQL. It turned out that we were in a version "donut hole": our database was recent enough that the String trick worked the opposite way (preventing fast queries for our prepared statements), but the JDBC driver wasn't making the calls in the way the old trick needs in order to work. End result: we're now running the latest JDBC driver with compatibility options set while we update our hairy query generator. And now we can really start tuning our setup!
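To make "invisible in the SQL" concrete, here is a rough sketch at the plain JDBC level (our real code goes through Hibernate, so treat this as an illustration only). It assumes the String trick means binding the bigint id as text. The class and method names are hypothetical. I'm also not listing the exact compatibility options we set: `stringtype=unspecified` is a documented PostgreSQL JDBC driver connection parameter, used here as an assumed example of the kind of option involved.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Properties;

/** Hypothetical illustration of two ways to bind a bigint identifier. */
final class BigintBinding {
    private BigintBinding() {}

    /** The "String trick" (as assumed here): the id travels as text. */
    static void bindAsString(PreparedStatement ps, int index, long id) throws SQLException {
        // If the driver sends setString values typed as varchar, the server
        // sees a text-typed parameter compared against a bigint column.
        ps.setString(index, Long.toString(id));
    }

    /** Typed binding: the driver declares the parameter as a 64-bit integer. */
    static void bindAsLong(PreparedStatement ps, int index, long id) throws SQLException {
        ps.setLong(index, id);
    }

    /**
     * Connection properties with one assumed compatibility option.
     * stringtype=unspecified asks the PostgreSQL JDBC driver to send
     * setString parameters untyped, so the server infers the type.
     */
    static Properties compatibilityProperties(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("stringtype", "unspecified");
        return props;
    }
}
```

Both bind methods work against the same SQL text, say `... WHERE id = ?`, which is why the difference never shows up when you read the generated query. The properties object would be passed to `DriverManager.getConnection(url, props)`.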

If the web had not provided the history of the old bug, I would have had a much harder time even knowing where to look for our somewhat subtle configuration issue. So enjoy the speedup; I sure am!

Comments

Anonymous said…
This is pretty weird. Any chance you can post the exact version numbers of Postgres, Hibernate, and JDBC involved?