March 28, 2011

HTMLCollections & NodeLists

Most of us believed, at least for a while, that DOM scripting always handed us arrays:
var my_links = document.getElementsByTagName('a'); // we have three links
alert(my_links.length); // alerts "3"

We later found out that the things we thought were arrays were instead array-like objects. But how exactly are they like arrays? Most of the time, those "array-like" objects are either HTMLCollections or NodeLists, not native JavaScript arrays. Take a look at what the specifications say about them. The keyword is live:

"An HTMLCollection is a list of nodes. Collections in the HTML DOM are assumed to be live meaning that they are automatically updated when the underlying document is changed."
- DOM Level 1

"The NodeList interface provides the abstraction of an ordered collection of nodes, without defining or constraining how this collection is implemented. NodeList objects in the DOM are live."
- DOM Level 3

But what does that mean? It means that a live collection reflects changes to the document as the program runs. For example, this is an infinite loop:

var i,
    my_links = document.getElementsByTagName('a'); // we have three links

// my_links.length is read again on every pass, and every pass makes it grow
for (i = 0; i < my_links.length; i += 1) {
    document.body.appendChild(document.createElement('a'));
}

We're getting our collection (of three links) and then, on each pass, appending another link to the body. So why is this infinite? Because the collection is live: every appended link also shows up in my_links, so my_links.length grows just as fast as i does and the condition i < my_links.length never becomes false.
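
A loop over a live collection stays finite if the length is read once before the loop starts. The sketch below illustrates that pattern only. Since there is no document outside the browser, makeLiveCollection and the links array are hypothetical stand-ins (not DOM APIs) for getElementsByTagName('a') and the page:

```javascript
// Hypothetical stand-in for a live DOM collection: its length is computed
// from the backing store every time it is read, like my_links.length.
function makeLiveCollection(store) {
    var live = {};
    Object.defineProperty(live, 'length', {
        get: function () { return store.length; }
    });
    return live;
}

var links = ['a', 'a', 'a'];               // pretend these are three <a> elements
var my_links = makeLiveCollection(links);

// Read the length once (j), so later additions cannot move the finish line.
var i, j, appended = 0;
for (i = 0, j = my_links.length; i < j; i += 1) {
    links.push('a');                       // stands in for appendChild(...)
    appended += 1;
}

// The loop ran three times; the live length kept up with the additions.
console.log(appended, my_links.length);    // 3 6
```

Caching the length is also a little cheaper, since the collection is not queried again on every pass.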

So why are they called array-like objects? If it looks like an array and acts like an array, then it must be an array, right? Wrong. DOM collections look like arrays because:
  • Each value in the container has an associated index. But that's something a plain object can have too:
    var my_obj = {
        0: 'zero',
        1: 'one',
        2: 'two',
        3: 'three'
    };
    
    alert( my_obj[0] ); // 'zero'
    alert( my_obj[3] ); // 'three'
    
    The alert statements might look like they ask for the elements at index 0 and 3, but you're really getting the values of the properties named "0" and "3".
  • They have a length property. This is deceptive: arrays have a length, but so do HTMLCollections and NodeLists, and sharing that one property does not make them arrays. Because they are not true arrays, they do not have push, concat, splice or any of the other array methods.
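
When array methods are needed, one option is to copy the collection into a real array first. This is a minimal sketch, not the only way to do it: toArray is a hypothetical helper, and my_obj is a plain object with a length property standing in for an HTMLCollection or NodeList.

```javascript
// Array-like stand-in: indexed properties plus a length, but no array methods.
var my_obj = { 0: 'zero', 1: 'one', 2: 'two', length: 3 };

// Hypothetical helper: copy each indexed value into a true array.
function toArray(arrayLike) {
    var result = [], i, len;
    for (i = 0, len = arrayLike.length; i < len; i += 1) {
        result.push(arrayLike[i]);
    }
    return result;
}

var real = toArray(my_obj);
real.push('three');                  // works: real is a genuine array

console.log(typeof my_obj.push);     // "undefined"
console.log(Array.isArray(my_obj));  // false
console.log(Array.isArray(real));    // true
console.log(real.length);            // 4
```

The copy is a snapshot, so it no longer follows changes to the document. Array.prototype.slice.call(collection) is a common shortcut for the same thing, but it reportedly throws on DOM collections in IE 8 and earlier, which is why the sketch uses a plain loop.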

Be prepared: these are some of the DOM properties and methods (that I know of) that return an HTMLCollection or NodeList:

// DOM Level 1/HTML 4.0
// ---------------------------
// Return an HTMLCollection
document.anchors
document.applets
document.forms 
document.images 
document.links

formElement.elements
selectElement.options
tableElement.rows
tableElement.tBodies
tableRowElement.cells

// Not part of any standard
document.embeds
document.plugins

// DOM Level 2
// ------------------
// Return a NodeList
Node.childNodes

document.getElementsByName
document.getElementsByTagName
document.getElementsByTagNameNS

element.getElementsByTagName
element.getElementsByTagNameNS

// WHATWG Web Applications 1.0
// ---------------------------
// Return a NodeList
document.getElementsByClassName
element.getElementsByClassName

Throughout this post I've been talking about NodeLists as such, and not as live NodeLists, because they are inherently live. There is one exception: static NodeLists, which act as snapshots and do not update when the document is modified:

// Selectors API Level 1
// ---------------------
// Return a static NodeList
document.querySelectorAll
element.querySelectorAll

In conclusion, I think it's important to know the differences between a live DOM collection and a true JavaScript array, because you'll eventually work with both.

I know so far I've been talking mostly about the DOM (sorry, this wasn't the exception) but you cannot say that this was a boring topic, or was it?

Thanks for reading and let me know your comments.

Sources:
Why is getElementsByTagName() faster than querySelectorAll()?
Speed Up Your JavaScript (video)
HTMLCollection - MDN Doc Center, NodeList - MDN Doc Center
DOM Level 1 Specification, DOM Level 2 Specification, DOM Level 3 Specification

March 3, 2011

Using innerHTML

innerHTML is a read/write property of a DOM element that gets/sets the HTML contained in the element.

It's fast
This varies between browsers, but creating and inserting elements with innerHTML is generally faster than doing it with DOM methods. It also makes the script lighter, because less code is needed.

It's clean & readable
Although the name, innerHTML, might seem confusing at first, it is the better choice for code readability, because DOM methods are verbose and can take many lines to do the same job.

It's supported
It was first introduced by Microsoft as a proprietary IE feature, and no spec defines its behavior. Even so, all major browsers have adopted it because of its usefulness, and it works pretty much the same in all of them.

Creating and inserting using DOM methods:
var newDiv = document.createElement('div');
newDiv.setAttribute('id', 'new-div');
newDiv.setAttribute('class', 'big-div');
var text = document.createTextNode('Some text here');
newDiv.appendChild(text);

document.body.appendChild(newDiv); // div is inserted in the tree

Creating and inserting using innerHTML:
var newDiv = '<div id="new-div" class="big-div">Some text here</div>';
document.body.innerHTML = newDiv; // div is parsed and inserted, replacing the body's previous content

Reasons not to use it

  • Not standard. Although it's fast and it works, it is not part of any W3C or DOM standard. However, there are plans to add it to the HTML5 specification.
  • XSS unsafe. Inserting untrusted strings with innerHTML exposes your application to XSS attacks, so you have to know when to use it. Choose DOM methods until you're familiar with the subject.
  • Not implemented everywhere. In IE, some table-related elements can't be modified with it, and the implementation and behavior might vary from browser to browser.
  • Destroys the children. Setting innerHTML destroys every descendant of that element. If any of those descendants had event handlers, that could create a memory leak in some browsers.
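
To make the XSS point concrete, here is a minimal sketch of escaping untrusted text before it is concatenated into a string meant for innerHTML. escapeHTML is a hypothetical helper, not a built-in, and userInput is made-up sample data:

```javascript
// Hypothetical helper: replace the characters that HTML treats specially.
// The ampersand must be handled first so later entities are not double-escaped.
function escapeHTML(text) {
    return String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;');
}

// Made-up hostile input that would run script if inserted as-is.
var userInput = '<img src=x onerror="alert(1)">';

// The escaped text ends up as harmless characters inside the div.
var markup = '<div class="big-div">' + escapeHTML(userInput) + '</div>';

console.log(markup);
// <div class="big-div">&lt;img src=x onerror=&quot;alert(1)&quot;&gt;</div>
```

Escaping like this only covers text placed inside elements or quoted attribute values. It does not make it safe to put untrusted data inside script blocks, event handler attributes or URLs, so creating a text node with DOM methods remains the simpler choice when in doubt.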
Thanks for reading and let me know your comments.

Sources:
innerHTML (Mozilla Developer Network)
innerHTML (Microsoft Developer Network)