Taming silent multithreading into sequentiality
This is the third in a series of articles on aspects of designing XUL applications.
The silently multithreaded nature of JavaScript and DOM programming can result in behaviour that is unintuitive and difficult to cope with. Here I try to better define the problem and then suggest a solution that does not sacrifice code expressiveness.
Some operations that look sequential in the source are not sequential in reality. As an example I will use the simple task of loading a document in a content panel and then displaying its title in an alert box. The code below would seem to accomplish it:
var contentPanel = getBrowser().selectedBrowser;
contentPanel.contentDocument.location.href = 'file:///tmp/test.html';
alert(contentPanel.contentDocument.title)
However, the alert box will come up with the title of the previously
loaded document (or just a blank, if the previously loaded document
happened to be about:blank). Why?
Although the instructions look sequential, a more accurate representation would be:
var contentPanel =
    getBrowser().selectedBrowser;
|Start........|
|..........End|

contentPanel.contentDocument.location.href =
    'file:///tmp/test.html';
|Start........|
|.............|
|.............|   alert(
|.............|       contentPanel.contentDocument.title);
|.............|   |Start........|
|.............|   |..........End|
|.............|
|.............|
|.............|
|..........End|
The second instruction, the one that starts the loading, returns before the document has finished loading, so when the alert box pops up there is no title yet to read.
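The effect can be reproduced outside the browser with a toy model. The sketch below is only an illustration: makeFakePanel() and finishLoading() are hypothetical names invented here, not part of any XUL API. Assigning the location merely records the request and returns at once; the title changes only when the load is completed later.

```javascript
// Toy stand-in for a content panel (hypothetical, not a XUL API).
// Assigning to location.href only *starts* a load; the title of the
// document changes when finishLoading() is called later.
function makeFakePanel() {
    var requestedUrl = null;
    var panel = {
        contentDocument: { title: 'Old page', location: {} },
        // Simulates the moment the browser finishes loading.
        finishLoading: function() {
            if (requestedUrl !== null)
                panel.contentDocument.title = 'Title of ' + requestedUrl;
        }
    };
    Object.defineProperty(panel.contentDocument.location, 'href', {
        set: function(url) { requestedUrl = url; },  // returns immediately
        get: function() { return requestedUrl; }
    });
    return panel;
}

var panel = makeFakePanel();
panel.contentDocument.location.href = 'file:///tmp/test.html';
var titleRightAway = panel.contentDocument.title;  // read before the load ends
panel.finishLoading();
var titleAfterLoad = panel.contentDocument.title;  // read after the load ends
```

Reading the title right after the assignment yields the old title, which is exactly what the alert box shows in the real case.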
It may look wrong and annoying, but it is understandable once one keeps in mind that this code runs in the browser's user interface thread: stopping that thread to wait for a document to finish loading would freeze the interface and prevent the user from doing anything else in the meantime.
The most naïve approach to dealing with this is setTimeout(). Just
delay the call to alert() long enough that the load operation has a
chance to finish:
var contentPanel = getBrowser().selectedBrowser;
contentPanel.contentDocument.location.href = 'file:///tmp/test.html';
setTimeout(function() {
alert(contentPanel.contentDocument.title)
}, 100);
However, while the time to read from a local disk is more or less
predictable, replacing the file:// location with an http://
location is enough to break the approach. Loading the document might
take one second or it might take ten, and no fixed delay covers both.
A step forward is to synchronize with the load event:
var contentPanel = getBrowser().selectedBrowser;
contentPanel.addEventListener(
'load', function(event) {
alert(contentPanel.contentDocument.title);
}, true);
contentPanel.contentDocument.location.href = 'file:///tmp/test.html';
If the content document contains other content panels in turn,
multiple load events will happen and multiple alerts will be shown. A
little guard will make it so alert() is only called when the event
pertains to the original content panel:
var contentPanel = getBrowser().selectedBrowser;
contentPanel.addEventListener(
'load', function(event) {
if(event.target != contentPanel.contentDocument)
return;
alert(contentPanel.contentDocument.title);
}, true);
contentPanel.contentDocument.location.href = 'file:///tmp/test.html';
The third parameter to addEventListener() is set to true because
the load event would not bubble up to the contentPanel, so it has
to be caught in the capture phase.
Some things to note:
- compared with the setTimeout() hack, precision has been gained, but sequentiality has been lost: instructions that describe what happens after the document has loaded appear before those that cause it to load;
- the code is becoming verbose; should the same operation be needed in different places, there would be lots of copying and pasting;
- the second time the code is run, the first event listener is still in effect, so the alert box will be shown twice; thrice the third time, and so forth.
The last problem can be solved easily: one just needs to remove the event listener after it has been called:
var contentPanel = getBrowser().selectedBrowser;
contentPanel.addEventListener(
'load', function(event) {
if(event.target != contentPanel.contentDocument)
return;
contentPanel.removeEventListener('load', arguments.callee, true);
alert(contentPanel.contentDocument.title);
}, true);
contentPanel.contentDocument.location.href = 'file:///tmp/test.html';
(arguments.callee is just a reference to the function from within the
function itself.)
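arguments.callee is not available in strict mode code, so a named function expression is a common alternative. The sketch below makes no assumption about the browser: listenOnce() and its isWanted parameter are hypothetical names introduced here, and the helper works on any object exposing addEventListener() and removeEventListener().

```javascript
// Hypothetical helper: call `handler` for the first wanted event only.
// `target` is anything with addEventListener/removeEventListener;
// `isWanted(event)` plays the role of the guard used above.
function listenOnce(target, type, isWanted, handler) {
    target.addEventListener(type, function onEvent(event) {
        if (!isWanted(event))
            return;  // e.g. a load event coming from a nested panel
        // `onEvent` names the listener from inside itself, which is
        // what arguments.callee was used for.
        target.removeEventListener(type, onEvent, true);
        handler(event);
    }, true);
}
```

With a content panel, isWanted would be the check that event.target is the panel's contentDocument, and handler would show the alert.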
Two problems are left: verbosity and a lack of sequentiality. Both can be solved by factoring out the event handling code, and from it passing the relevant data (in this case, the document) back to our code:
function withDocumentLoaded(url, action) {
var contentPanel = getBrowser().selectedBrowser;
contentPanel.addEventListener(
'load', function(event) {
if(event.target != contentPanel.contentDocument)
return;
contentPanel.removeEventListener('load', arguments.callee, true);
action(contentPanel.contentDocument);
}, true);
contentPanel.contentDocument.location.href = url;
}
withDocumentLoaded(
'http://www.google.com', function(document) {
alert(document.title);
});
Although the withDocumentLoaded() function does not look pretty, it
can be written just once and it is general. (It can be made even more
general by accepting the content panel as a parameter instead of
assuming the programmer wants the currently selected browser.)
The call to withDocumentLoaded() is a significant improvement over
both the setTimeout() hack, because it works precisely and reliably,
and the more verbose example, because it can be called many times
without duplication and expresses the operation clearly and
sequentially.
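As a rough sketch of the more general form mentioned above, the content panel can be passed in as the first argument. The name withDocumentLoadedIn() is made up for this example, and a named function expression stands in for arguments.callee; everything else follows the function shown earlier.

```javascript
// Variant of withDocumentLoaded() that takes the content panel as its
// first argument (withDocumentLoadedIn is a name made up for this sketch).
function withDocumentLoadedIn(contentPanel, url, action) {
    contentPanel.addEventListener(
        'load', function onLoad(event) {
            // Ignore load events coming from nested content panels.
            if(event.target != contentPanel.contentDocument)
                return;
            // Listen in the capture phase and remove with the same flag.
            contentPanel.removeEventListener('load', onLoad, true);
            action(contentPanel.contentDocument);
        }, true);
    contentPanel.contentDocument.location.href = url;
}

// Possible use inside the browser (not run here):
//   withDocumentLoadedIn(
//       getBrowser().selectedBrowser,
//       'http://www.google.com', function(document) {
//           alert(document.title);
//       });
```

Because the panel is a parameter, the same function serves any browser element, not only the selected one.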
The pattern is not limited to DOM events, but can put a
sequential-looking face on processes that rely on nsIObserver,
nsIWebProgressListener, and other asynchronous interfaces.
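To make that concrete, here is a minimal sketch of the same idea applied to observer notifications. It assumes the observerService argument behaves like nsIObserverService, that is, it offers addObserver(observer, topic, ownsWeak) and removeObserver(observer, topic); the helper name withNotification() is hypothetical.

```javascript
// Sketch: wait for one observer notification, then call `action`.
// `observerService` is assumed to behave like nsIObserverService.
// withNotification is a name made up for this example.
function withNotification(observerService, topic, action) {
    var observer = {
        observe: function(subject, aTopic, data) {
            // One-shot: stop observing before handing over control.
            observerService.removeObserver(observer, topic);
            action(subject, data);
        }
    };
    observerService.addObserver(observer, topic, false);
}

// Possible use inside the browser (not run here), assuming the service
// is obtained in the usual way:
//   var service = Components.classes['@mozilla.org/observer-service;1']
//       .getService(Components.interfaces.nsIObserverService);
//   withNotification(service, 'quit-application', function(subject, data) {
//       // clean up here
//   });
```

As with withDocumentLoaded(), the registration and removal details stay in one place and the calling code reads top to bottom.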
Summary
Operations that return before they have finished can be surprising, but there is good reason for them to work this way, and they can be dealt with by careful use of events. Still, an appearance of succinct sequentiality is important for understanding the program flow, and it can be regained by factoring event listening and coordination into a separate function, which ultimately calls back the programmer’s code.
Related readings
- Inversion Of Control on Martin Fowler’s site. Passing actual code to the event handling code so that the latter decides when the former is run is a form of inversion of control.
- The with-* style is very popular in Lisp; see, for example, the Common Lisp Hyperspec master index.
Okay, this is the first example I've seen that comes close to addressing my problem - I keep getting "about:blank" returned as the URL when using XUL in my extension in FireFox.
Obviously I can't use the last instance of your solution because rather than replace the existing URL, I want to simply know what the URL is (the problem many folks are facing is the issue you have raised here - the calls to get the URL information are non-sequential).
Any suggestions about how your last piece of code could be (correctly) adapted to use with a pre-existing URL?
I like the Match Question captcha by the way... not seen that before. ;-)