Sunday, February 8, 2015

An Introduction to Duck Typing using Arrays and Collections

Previously I discussed the ES 5 properties feature and its relationship to the Browser DOM. That post focused on how useful it is when you can wrap, replace and otherwise configure your entire DOM the way you want it, and the Browser simply obeys your commands. While this is possible in Firefox and IE, it isn't yet possible in Chrome. So this post will instead focus on something that is more universally available: duck typing the Array methods and using them with DOM collections.

Array methods

You can interrogate the Array methods by going through the constructor to get the prototype. In our case, let's look at Array.prototype by using Object.getOwnPropertyNames. This will tell us all of the supported Array methods and properties. It's actually quite a large number, and many of them are very helpful.
["length", "constructor", "toString", "toLocaleString", "join", "pop", "push", "concat", "reverse", "shift", "unshift", "slice", "splice", "sort", "filter", "forEach", "some", "every", "map", "indexOf", "lastIndexOf", "reduce", "reduceRight", "entries", "keys"]
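A minimal sketch of how a list like the one above can be produced. The variable names are my own, and the exact names returned will vary with the script engine and its version.

```javascript
// List everything defined directly on Array.prototype (methods and properties).
var arrayMembers = Object.getOwnPropertyNames(Array.prototype);

// Keep only the names whose value is a function, i.e. the methods.
var arrayMethods = arrayMembers.filter(function (name) {
  return typeof Array.prototype[name] === "function";
});

console.log(arrayMembers.length + " members, " + arrayMethods.length + " methods");
console.log(arrayMembers.join(", "));
```

Newer engines will report more names than the list shown here, since later editions of the language kept adding to Array.prototype.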
Most of them are useful for iteration, and several of them (map, filter, slice and concat, for example) return a new array, allowing you to chain operations together. Combining map, filter and reduce operations can solve a large set of problems. It would be great if you could use those on HTMLCollection and other DOM types that aren't themselves Arrays.
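As a small illustration of that chaining, here is a sketch that combines filter, map and reduce. The data and variable names are made up for the example.

```javascript
var words = ["duck", "typing", "with", "arrays", "and", "collections"];

var totalLongWordLength = words
  .filter(function (word) { return word.length > 4; })      // typing, arrays, collections
  .map(function (word) { return word.length; })             // 6, 6, 11
  .reduce(function (sum, length) { return sum + length; }, 0);

console.log(totalLongWordLength); // 23
```

Each step returns a value the next step can consume, which is why the whole thing reads as a single expression.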

HTMLCollection

So what is an HTMLCollection? In the words of Dr Frankenstein, "It's Alive!!!". Unlike most collections that only mutate if you add/remove items from them, the HTMLCollection is a live view of the DOM. For instance, if you call getElementsByTagName("p") to get a collection back, then modify the DOM, your returned collection will already be updated to include the modifications made. No need to query getElementsByTagName again. You can see that here in this fiddle. Note, I also added in StaticNodeList using querySelectorAll to show the differences between a live and static collection. In the case of querySelectorAll, the returned collection does not automatically resolve new DOM mutations and the query must be re-made to get the latest results.

The next bit about HTMLCollection is that it inherits from Object. So by default, it does not get Array methods. It also has its own definition of the length property backed by the Browser itself, and in the case of IE and Firefox this is a true ES 5 property, not a field. So it "looks like" an Array but is not itself an Array. If it "looks like" an Array, is that good enough?

Executing Array methods on HTMLCollection

For this step we can use the call, apply or bind methods to redirect any Array method to use our own object as the this pointer. Further, because those methods are duck typed, and simply work with any object that "looks like" an Array, they'll function just as if you had used an actual Array.
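A minimal sketch of the three redirection styles. So that it runs anywhere, it uses a plain object as a stand-in for an HTMLCollection (my assumption); in a browser, the result of document.getElementsByTagName("p") could be passed in the same way.

```javascript
// Stand-in for an HTMLCollection: indexed entries plus a length property.
var paragraphs = {
  0: { tagName: "P", textContent: "first" },
  1: { tagName: "P", textContent: "second" },
  length: 2
};

// call: run forEach with our object as the this pointer.
var seen = [];
Array.prototype.forEach.call(paragraphs, function (p, index) {
  seen.push(index + ": " + p.textContent);
});

// apply: same idea, but the arguments are passed as an array.
var texts = Array.prototype.map.apply(paragraphs, [function (p) {
  return p.textContent;
}]);

// bind: create a reusable function permanently tied to our object.
var filterParagraphs = Array.prototype.filter.bind(paragraphs);
var secondOnly = filterParagraphs(function (p) {
  return p.textContent === "second";
});

console.log(seen, texts, secondOnly.length);
```

Note that map and filter hand back real Arrays even though the input was only array-like.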

So that is pretty neat, and you'll notice some inefficiencies are avoided by this approach. First, the length property is only read once, at the beginning of the forEach loop. So if you modify the DOM and add more things, your loop is still guaranteed to terminate, since it doesn't fetch length on each iteration. This is similar to lifting the length read out of your for loops, which is common practice among seasoned web developers. Since forEach is a built-in, there may also be JIT optimizations that apply to it but not to your own for loop (though the reverse is also true, so measurement in your scenario can be critical).
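To illustrate the point about length being read once, here is a sketch where the loop body keeps growing the object it is iterating. The plain object stands in for a live collection, which is an assumption made so the example runs outside a browser.

```javascript
var items = { 0: "a", 1: "b", 2: "c", length: 3 };
var visited = [];

Array.prototype.forEach.call(items, function (value) {
  visited.push(value);
  // Grow the object while iterating; push is duck typed as well.
  Array.prototype.push.call(items, value.toUpperCase());
});

console.log(visited);      // only the original three entries were visited
console.log(items.length); // the object itself grew to 6
```

A hand-written for loop that re-reads items.length on every pass would never finish here, because every iteration adds another entry.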

Some other common pitfalls are not avoided. For instance, if you mutate the DOM in a way that changes the contents ahead of you in the iteration, then you will see those changes. In particular, if someone inserts an element ahead of you, then you'll process that element, but you will miss an element that has now been pushed beyond your original length.
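A small sketch of that pitfall. It uses an ordinary Array in place of a live collection (an assumption for the sake of a runnable example), but forEach follows the same steps either way.

```javascript
var list = ["a", "b", "c"];
var processed = [];

list.forEach(function (value, index) {
  if (index === 0) {
    // Something gets inserted ahead of our current position.
    list.splice(2, 0, "new");
  }
  processed.push(value);
});

console.log(processed); // "new" was processed, "c" was never reached
```

The loop still runs exactly three times, so the inserted entry takes the slot that "c" would otherwise have had.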

This latter issue is easy to avoid. You can turn any HTMLCollection into an Array. The number of ways is staggering, but the easiest might be to invoke the Array constructor, passing each element as an argument. But how do you unwrap the collection? You don't need to. It is "like an array", so you can use it with the apply method. (Note: This is in the absence of Array.from, which is an ES 6 API.)
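A sketch of that conversion, again with a plain object standing in for the collection (my assumption). It also shows Array.prototype.slice as one of the other well-known ways.

```javascript
var collection = { 0: "first", 1: "second", 2: "third", length: 3 };

// Spread the array-like into the Array constructor's arguments.
var viaConstructor = Array.apply(null, collection);

// Alternative: slice is duck typed too and copies into a real Array.
var viaSlice = Array.prototype.slice.call(collection);

// The copies are snapshots; later changes to the source don't affect them.
collection[0] = "changed";

console.log(viaConstructor, viaSlice);
```

One caveat with the constructor route: if the collection held exactly one entry and that entry were a number, Array would treat it as a length rather than an element. That can't happen with DOM elements, but it is a reason to prefer slice for general array-likes.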

Browser Built-Ins

So, most JavaScript built-ins can work on a wide range of objects. That is due to their duck typed nature and also due to their being part of the programmable fabric that underlies the script engine instance itself. As we start to build out APIs in the Browser itself, we have to ask how useful those APIs will be on other objects. The stringifier on Location, for instance, returns the current URL. This is useless unless you ARE an instance of Location, because that built-in queries some internal state. So the Browser introduces an additional level of restriction based on the types of objects. It validates the "this" pointer, and if we can't find an object on which our method can execute we throw an exception. In IE we call this the "Invalid Calling Object" exception. It's pretty much just a TypeError to help you know we were unable to process your call. Chrome will throw an "Illegal Invocation" exception. Everyone has their own way of noting this situation. You can test this by invoking the postMessage API on a non-window this pointer. We are telling you that your object is not "window like" enough for us to continue. In this case, the message ports and underlying internal event pump can't be retrieved from the object you provided ;-)
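The same split exists inside the language itself, which makes for an easy experiment. In this sketch Array.prototype.join happily accepts a look-alike, while Date.prototype.getTime insists on a real Date because it reads internal state. The Date case is my own analogy for the Browser's this-pointer validation, not the Browser check itself; the guarded postMessage call at the end only runs where a window exists.

```javascript
// Duck typed: any object with indexed entries and a length will do.
var arrayLike = { 0: "x", 1: "y", length: 2 };
var joined = Array.prototype.join.call(arrayLike, "-");

// Not duck typed: getTime needs the internal state of a real Date.
var rejected = false;
try {
  Date.prototype.getTime.call({});
} catch (e) {
  rejected = e instanceof TypeError;
}

console.log(joined, rejected);

// In a browser, the DOM equivalent of the same experiment.
if (typeof window !== "undefined" && typeof window.postMessage === "function") {
  try {
    window.postMessage.call({}, "hello", "*");
  } catch (e) {
    console.log("Browser refused the this pointer: " + e.message);
  }
}
```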

This leads to an interesting paradigm though. As long as the underlying object can pass the instanceof-style check that we throw at it, the methods can technically be used. Can you use this to your advantage? Of course. All elements in the IE world derive from the same top level class. That includes SVG, XML, XHTML, etc. This means that an HTMLElement prototype method can technically be used on an SVG or XML element type because the DNA is shared between the objects. So let's move the popular classList API from HTMLElement down to Element and then use it on some SVG objects.
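A sketch of what such a move could look like. The moveProperty helper and the BaseNode, HtmlNode and SvgNode constructors are hypothetical stand-ins of mine so the idea can be run outside a browser; they don't model the Browser's internal type check. The guarded block at the end attempts the real move, and does nothing if classList isn't defined on HTMLElement.prototype.

```javascript
// Move a property definition from one prototype to another.
function moveProperty(fromProto, toProto, name) {
  var descriptor = Object.getOwnPropertyDescriptor(fromProto, name);
  if (!descriptor) {
    return false; // nothing to move (it may already live elsewhere)
  }
  Object.defineProperty(toProto, name, descriptor);
  delete fromProto[name];
  return true;
}

// Hypothetical stand-ins for Element, HTMLElement and SVGElement.
function BaseNode(id) { this.id = id; }
function HtmlNode(id) { BaseNode.call(this, id); }
HtmlNode.prototype = Object.create(BaseNode.prototype);
function SvgNode(id) { BaseNode.call(this, id); }
SvgNode.prototype = Object.create(BaseNode.prototype);

// A property that initially exists only for the HTML branch.
Object.defineProperty(HtmlNode.prototype, "label", {
  get: function () { return "node#" + this.id; },
  enumerable: true,
  configurable: true
});

var circle = new SvgNode("circle");
var before = circle.label; // undefined: SVG nodes don't have it yet
var moved = moveProperty(HtmlNode.prototype, BaseNode.prototype, "label");
var after = circle.label;  // now inherited from the shared base

console.log(before, moved, after);

// In a browser that still defines classList on HTMLElement.prototype:
if (typeof HTMLElement !== "undefined" && typeof Element !== "undefined") {
  moveProperty(HTMLElement.prototype, Element.prototype, "classList");
}
```

Because HtmlNode inherits from BaseNode, HTML nodes keep the property after the move; only the place where it is defined has changed.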

Conclusion

For Interop and Compat reasons the initial shape of the DOM is sort of fixed to the expectations of the general web. But that shouldn't stop you from molding it to the needs of your application. While removing things from the DOM can cause mashable components to stop working, adding things can often cause them to START working, especially if there is a lack of Interop in that area or if the APIs in question are new. Also, some of the mutations you might make may already be in scope for the Browser vendors. If you look at classList for instance, most browsers already support this on the Element prototype, so you can imagine any remaining browsers might follow suit there.

Saturday, January 31, 2015

ES 5 Properties and the Browser Object Model

In ES 5 a very powerful concept was added to the JavaScript language. The concept was properties. A property is a getter/setter pair of functions that are executed depending on the syntax used when working with the property. This is very similar to other languages such as C# and the C++/CX extensions. Actually it is very similar to almost every other language since properties are kind of a mainstream thing.
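A minimal sketch of such a getter/setter pair using Object.defineProperty. The temperature object and its property names are invented for the example.

```javascript
var temperature = { celsius: 20 };

Object.defineProperty(temperature, "fahrenheit", {
  // Runs when the property is read.
  get: function () { return this.celsius * 9 / 5 + 32; },
  // Runs when the property is assigned.
  set: function (value) { this.celsius = (value - 32) * 5 / 9; },
  enumerable: true,
  configurable: true
});

var reading = temperature.fahrenheit; // getter runs: 68
temperature.fahrenheit = 212;         // setter runs
console.log(reading, temperature.celsius); // 68 100
```

From the caller's side it is ordinary property syntax; which function runs depends only on whether the property is being read or assigned.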

Behind the scenes, even in ES 3, the language itself had the capability built in (in the form of something we called built-ins ;-). You probably saw this when you manipulated the length property of an Array instance. You could see a side effect by setting the property versus simply retrieving it. Behind the scenes two different pieces of code were running. Also, in the browser, all of the DOM objects behaved the same, as if they had separate code running when you performed a get versus a set.
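The Array length behavior is easy to see for yourself. A short sketch, with variable names of my own:

```javascript
var letters = ["a", "b", "c", "d"];
var countBefore = letters.length; // a plain read: 4

letters.length = 2;               // a write with a side effect: the array is truncated
var afterTruncate = letters.slice();

letters[5] = "z";                 // writing an index also updates length
console.log(countBefore, afterTruncate, letters.length); // 4 [ 'a', 'b' ] 6
```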

But the vanilla objects you wrote yourself didn't have this cool functionality. In fact, it was non-trivial to determine when your objects changed. For this reason, tons of patterns were introduced to simply have two named functions such as getFoo and setFoo along with a backing field to store the actual value. If the value was computed then the backing field wasn't necessary. This still led to unnatural syntax though as the familiar property syntax no longer worked and instead you had to perform a function call. But still, only for vanilla objects, not for the script engine OR the browser...

Fields versus Properties

In ES 3 your vanilla object was configured as a set of fields. Each name represents a storage location on the object that can store any other JavaScript value. That might be a number, string, function or other object. The platform and browser on the other hand only had properties, with no concept of a field. Fields, it turns out, had much better performance (memory accesses) versus properties (find and execute code to do memory accesses). So something really needed to change, since we needed both fields and properties, and we needed both concepts everywhere.

Enter ES 5 and IE 9. When implementing ES 5 in the browser, we worked closely with the Chakra team, who were upgrading their object model as well. Instead of using built-ins, we stored our type system in property descriptors, another ES 5 concept, and could now create fields when performance was critical and there were no side effects when setting the value. We could also wrap our getter and setter in a pair of JavaScript function objects.

A side benefit of the property descriptor approach is that we also enabled features like configurability (changing the getters/setters for a property or swapping it to a field), sealing (changing the extensibility and locking objects down), and most importantly, wrapping. In actuality the most IMPORTANT reason for web developers and the primary reason we did all of this, was to enable proper wrapping, which is a critical component for proper extensibility.
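Here is a sketch of what those descriptors look like from script, covering a field, an accessor, a swap from accessor to field, and sealing. The point object is made up for the example.

```javascript
var point = { x: 1 };
Object.defineProperty(point, "doubled", {
  get: function () { return this.x * 2; },
  configurable: true
});

// A field is described by value/writable.
var fieldDescriptor = Object.getOwnPropertyDescriptor(point, "x");
// An accessor property is described by get/set.
var accessorDescriptor = Object.getOwnPropertyDescriptor(point, "doubled");

// Configurability: swap the accessor for a plain field.
Object.defineProperty(point, "doubled", { value: 10, writable: true });
var swappedDescriptor = Object.getOwnPropertyDescriptor(point, "doubled");

// Sealing: no new properties, and existing ones become non-configurable.
Object.seal(point);

console.log(fieldDescriptor, accessorDescriptor, swappedDescriptor);
console.log(Object.isSealed(point), Object.isExtensible(point));
```

Once sealed, the swap shown above is no longer possible, which is the locking down mentioned in the text.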

Wrapping and Extensibility

So what is this wrapping thing and why does it matter? It matters because the browser has a default behavior for a property that already exists. Take something like the <img> tag, which has a src property. It also has an attribute accessible through setAttribute, but for now let's pretend that doesn't exist. The only way to change the source of an image is to use the src property.


So now imagine that we want to sniff and provide some new functionality for the src property. What if we wanted to find some way to simulate the data protocol or some new protocol by rewriting the underlying src to something else? Well, to do that we can just configure the property. We can configure the property for all images or just for the instance. The following demonstrates how to do this for IE. Firefox should work here, but Chrome will not since they define "src" to be a field descriptor on the instance of the image. Ouch, that's two major infractions making it hard for developers to provide their own customized functionality.
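A minimal sketch of that kind of reconfiguration. FakeImage is a hypothetical stand-in of mine for the image type so the code runs outside a browser; in IE or Firefox the same defineProperty call would be aimed at HTMLImageElement.prototype instead.

```javascript
// Hypothetical stand-in: a type whose src is a real getter/setter pair.
function FakeImage() {}
Object.defineProperty(FakeImage.prototype, "src", {
  get: function () { return this._src || ""; },
  set: function (value) { this._src = String(value); }, // "loads" the image
  enumerable: true,
  configurable: true
});

// Reconfigure src for every instance by replacing it on the prototype.
var requestedSources = [];
Object.defineProperty(FakeImage.prototype, "src", {
  get: function () { return "intercepted"; },
  set: function (value) { requestedSources.push(value); }, // original setter never runs
  enumerable: true,
  configurable: true
});

var img = new FakeImage();
img.src = "photo.png";

console.log(requestedSources); // our code saw the assignment
console.log(img._src);         // undefined: nothing was actually loaded
```

The replacement sees every assignment, but since it never hands the value on, the default behavior is gone.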


Notice how we just BROKE images ;-) Wouldn't it be nicer if we could just get the browser's default implementation and use that instead? Basically just do a pass through from our function to the original browser function. Also, if someone else comes along, since we've added our getter/setter functions to the object or prototype the wrapping can work many levels deep allowing true composition of the object model. Here is the last example for wrapping that puts everything together.
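A sketch of pass-through wrapping under the same assumption as before: FakeImage is a hypothetical stand-in for the image type, and wrapSetter is a helper name of my own. Each wrapper captures the descriptor that was there before it and delegates to it, so wrappers stack.

```javascript
// Hypothetical stand-in for an image type with an accessor-based src.
function FakeImage() {}
Object.defineProperty(FakeImage.prototype, "src", {
  get: function () { return this._src || ""; },
  set: function (value) { this._src = String(value); },
  enumerable: true,
  configurable: true
});

// Wrap an accessor: rewrite the incoming value, then pass it through.
function wrapSetter(proto, name, rewrite) {
  var original = Object.getOwnPropertyDescriptor(proto, name);
  if (!original || !original.get || !original.set || !original.configurable) {
    return false; // a field, a missing property, or one that is locked down
  }
  Object.defineProperty(proto, name, {
    get: function () { return original.get.call(this); },
    set: function (value) { original.set.call(this, rewrite(value)); },
    enumerable: original.enumerable,
    configurable: true
  });
  return true;
}

// First wrapper: simulate a made-up "demo:" protocol.
wrapSetter(FakeImage.prototype, "src", function (url) {
  return url.replace(/^demo:/, "https://example.com/");
});
// Second wrapper, added by "someone else": tidy up whitespace first.
wrapSetter(FakeImage.prototype, "src", function (url) {
  return url.trim();
});

var picture = new FakeImage();
picture.src = "  demo:cat.png ";
console.log(picture.src); // https://example.com/cat.png
```

The early return is where a browser that exposes src as a field on the instance would bail out, since there is no getter or setter on the prototype to delegate to.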


More Type System Benefits

There are many more benefits to this type system construction. For instance, in IE today, as of IE 9 mode, you can inspect our entire type system. We don't hide anything. By walking the prototype objects and constructors on the window object you can discover every method and property that we support. You can determine if they are read-only or writable. And there is basically nothing that is defined on the instances, meaning you can easily rewrite the entire type system and create the object model that you want for your site.

To see how powerful this is we can write some code that walks the entire IE DOM and determines how many browser specific features we implement. A similar script is how I created the data for this Tweet showing our reduction in ms properties and our increase in webkit properties.
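A sketch of what such a walk could look like. The countPrefixedMembers helper is my own name for it, and the fakeWindow object with its Widget and Gadget constructors is a hypothetical stand-in so the code runs anywhere. In a browser you would pass window as the root.

```javascript
// Count prototype members whose names start with a vendor prefix,
// e.g. "ms" followed by a capital letter (msFoo, but not msecs).
function countPrefixedMembers(root, prefix) {
  var pattern = new RegExp("^" + prefix + "[A-Z]");
  var count = 0;
  Object.getOwnPropertyNames(root).forEach(function (globalName) {
    var candidate;
    try {
      candidate = root[globalName]; // some globals may throw when read
    } catch (e) {
      return;
    }
    if (typeof candidate !== "function" || !candidate.prototype) {
      return; // only constructors with a prototype are interesting
    }
    Object.getOwnPropertyNames(candidate.prototype).forEach(function (member) {
      if (pattern.test(member)) {
        count++;
      }
    });
  });
  return count;
}

// Hypothetical stand-in for the window object.
function Widget() {}
Widget.prototype.msRequestThing = function () {};
Widget.prototype.webkitThing = function () {};
Widget.prototype.msecs = 0; // not a vendor prefix
function Gadget() {}
Gadget.prototype.msOther = 1;
var fakeWindow = { Widget: Widget, Gadget: Gadget, notAConstructor: 42 };

console.log(countPrefixedMembers(fakeWindow, "ms"));     // 2
console.log(countPrefixedMembers(fakeWindow, "webkit")); // 1

if (typeof window !== "undefined") {
  console.log("ms members in this browser: " + countPrefixedMembers(window, "ms"));
}
```

This only sees what lives on prototypes, which is exactly why the approach depends on the type system not hiding things on instances.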


I was hoping to do that for Chrome as well and show the interoperability convergence of our two browsers, but unfortunately putting everything on instances defeats this type of simple inspection.

ES 6 Futures

As we move towards ES 6, web developers are going to discover properties and their useful characteristics because we are making them easier and easier to use. New syntax in the language removes the structured data syntax that you have to write today in combination with defineProperty/defineProperties. TypeScript has offered some of this for a while, but native support in the browser without intermediate compilers is really required for broader adoption.
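To make the contrast concrete, here is a sketch of the same accessor written both ways: first with the descriptor object passed to defineProperty, then with ES 6 class syntax. The counter types are invented for the example.

```javascript
// ES 5: the property is described as structured data.
function CounterEs5() { this._count = 0; }
Object.defineProperty(CounterEs5.prototype, "count", {
  get: function () { return this._count; },
  set: function (value) { this._count = Math.max(0, value); },
  enumerable: false,
  configurable: true
});

// ES 6: the getter and setter are part of the class body.
class CounterEs6 {
  constructor() { this._count = 0; }
  get count() { return this._count; }
  set count(value) { this._count = Math.max(0, value); }
}

var oldStyle = new CounterEs5();
var newStyle = new CounterEs6();
oldStyle.count = -5;
newStyle.count = 7;
console.log(oldStyle.count, newStyle.count); // 0 7
```

Both end up as accessor descriptors on the prototype, so everything said earlier about wrapping applies equally to the class version.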

It will become increasingly important for the browser defined objects to behave precisely like the native script objects. Divergences like what we have in Chrome, where everything is a field on the instance, no matter what the reasoning might be, will simply be unacceptable. Both Firefox and IE have set the bar here, produced a highly extensible object model and proven it could be done.

Saturday, January 10, 2015

The Perl Black Book and Adding Value through Curated Content

I've been holding off for some time wondering how I was going to start this particular blog. After all, it's called the HTML 5 Black Book. It's meant to tell hundreds if not thousands of tiny little details about the JavaScript language, HTML 5, CSS 3 and other web technologies. The inspiration for this blog is also quite daunting. It's the Perl Black Book, which undertook to explain Perl bit by bit in one-page articles and samples. Over the course of a thousand or more of these articles you quickly grasp how to get stuff done in Perl, without having to read extensive chapters on very specific subjects. It's truly an amazing book, and it got me from zero to hero on Perl over the course of a few months. Well that, and the fact that I wanted better log parsing for my Half-Life server. Never underestimate how much learning can transpire when games are involved.

That all aside, the concepts that make the Black Book so successful are the brevity of the articles and the fact that you learn something from each one. They in turn build on each other until all of the small things you've learned add up to a working understanding. So that is precisely where and how I'll start for the layout of my content.

For the content itself, just as I've been posting things that I'm learning about on my personal blog, here I'll post more on things I tend to know something about or have a strong opinion on. Things that should help you evolve your prowess in using HTML 5. For example, some areas that I find are not well covered when I browse the web or review questions on Stack Overflow:
  1. Code organization and protections that are now available to ES 5 and where possible how they can degrade gracefully to ES 3 (though most of us should not worry about such degradation).
  2. Useful polyfills for new web features that still have not achieved a great deal of interoperability.
  3. Various performance notes that I've collected over time.
  4. And hopefully some real life interactions that I have with people around me on the subject. In fact if you run into something interesting that you think deserves a black book entry then contact me since I'd be interested in covering it.
I'm also looking to fill out a few key reference pages, so if you have some strong recommendations for people whom I should follow or reference materials that are just too good to pass up, then please feel free to forward those along. An example of a great page might be the Learning WebGL Lessons, though I'll likely share links like that on my HTML 5 Game Gems blog instead ;-)