Thursday, March 29, 2012

Two methods for optimize performance for finding comments for template usage in html-documents

Ok, so we are using comments in our project as placeholders for our templating system. We want to do that, since we don’t want to add unnecessary elements to our DOM-tree.
However, finding comments is not as easy as finding for example a div with an ID, since they aren’t a part of the DOM model. And that was exactly why we want to use them in the first place, remember!? J
So how do we find comment elements? Well, one way as we found was to traverse the document using jQuery, but instead of using children() we use contents(). Contents return all content, including comments, making it possible to find them.
However, finding an object in a document-tree manually, by traversing the elements one by one, is a very expensive operation that scales badly. That’s not good, since we need our site and code to be fast no matter how large our page is.


So what can be done? Well, it turns out that Firefox and Chrome supports xpath lookups, making the comment finding extremely easy. All you have to do is to use document.evaluate together with comment(): 

But this doesn’t work on IE. Too bad, since it’s extremely fast.
However, we know something about our problem. We aren’t interested in just any comments, but comments to be used with our template system. That means that the comments that we are looking for are more likely to be found close to the root of the document tree. In our first implementation we did a traditional, recursive traverse of the document tree, going from root to branch and leaf, one at a time, finishing the previous branch before starting in the next.
That’s not optimal. For the footer templates that mean that we need to traverse through almost the full tree every time before we find it.

Horizontal versus vertical traversing 

The solution then is to traverse the tree one level at the time, starting with looking for the comments one level down from the document root, then two levels down, then three, etc. But however much faster that it, it still means a lot of tree traversal. And what can we do then? 

Of course, as we all know, the answer is cache. We want to make the comment easy to find. We can’t really store the comment itself, because we can’t insert DOM elements after it. But what we can do is to save the comment’s parent’s location. That way, the next time we need to find the comment, we will start looking directly at the right branch.
Of course, if the templates changes position, if a template placeholder is within another, the comments we’re after might move, or their parents being removed. That’s easy enough to solve; If that happens, we start a new, smart search.


So what are the results of the smart document tree traversal and caching? A great speedup, actually, that gets better as the DOM grows. For a tiny hml-document our optimized version where 40% faster, for a small page (~600 elements in the DOM-tree) the speedup was 150%, and for a normal sized page (~1500 elements) the speedup was 700%!
That’s a good result. Also, as long as the template comments are close to the document root, the optimized version will scale perfectly no matter how large the document is.

(I did this as a part of a project here at Betsson. We do great stuff here and have a lot of fun as well, so check out Betsson's open positions!)

No comments: