Securing jQuery against unintended XSS

We, like an overwhelming majority of the Internet, use jQuery on the Box web application. We use it primarily to make our lives easier as it effectively abstracts away cross-browser API differences. The plugin infrastructure makes it easy to extend and fill in any gaps we may have. For all that jQuery offers, though, there is one downside: jQuery makes XSS easier than if you use native methods.

The Problem

The main problem is that jQuery will evaluate <script> tags when an HTML string is passed into it. For example, the following will display an alert:
$('#myDiv').html('<script>alert("Hi!");</script>');
However, do the same thing using the browser's native innerHTML, and nothing happens:
var myDiv = document.getElementById('myDiv');
myDiv.innerHTML = '<script>alert("Hi!");</script>';
That's because native browser behavior is to ignore <script> elements that are added via an HTML string. It's a security feature, and one that jQuery effectively circumvents, introducing a new security hole. This same security hole exists in all places that accept HTML in jQuery. Here's a sampling:
$('#myDiv').html('<script>alert("Hi!");</script>');
$('#myDiv').before('<script>alert("Hi!");</script>');
$('#myDiv').after('<script>alert("Hi!");</script>');
$('#myDiv').append('<script>alert("Hi!");</script>');
$('#myDiv').prepend('<script>alert("Hi!");</script>');
$('<script>alert("Hi!");</script>').appendTo('#myDiv');
$('<script>alert("Hi!");</script>').prependTo('#myDiv');
The jQuery team is aware of the problem but, as with any popular API, it's hard to make a change that breaks backwards compatibility. There are a lot of applications that depend on this functionality to work, so what can you do?

Securing html() and friends

The first five methods, html(), before(), after(), append(), and prepend(), all basically do the same thing in different ways: they take an HTML string and insert it into the DOM. Methods other than html() also accept a selector string, DOM element, or jQuery reference, but those are less problematic. We decided to treat all five of these methods the same way - we overwrite them with a custom function that filters the input and then passes the result into the original function. The code looks like this:
// our custom purifier
function purifyHTML(html) {
    /* code that returns purified HTML */
}

// store references to original methods
var original = {
    html: $.fn.html,
    before: $.fn.before,
    after: $.fn.after,
    append: $.fn.append,
    prepend: $.fn.prepend
};

/**
 * Purifies each item in an array. Any array item that is a string is
 * purified and all others are passed through as-is. If an array item
 * is an array, then recursively purify that too.
 * @param {Array} values The values to check.
 * @return {Array} A new array with purified strings.
 */
function purifyArray(values) {

    var purified = [];

    for (var i = 0, len = values.length; i < len; i++) {

        // if it is an array, check the array items
        if ($.isArray(values[i])) {
            purified.push(purifyArray(values[i]));
        } else {
            // if the argument is a string, purify it
            purified.push(
                (typeof values[i] === 'string') ?
                    purifyHTML(values[i]) :
                    values[i]
            );
        }

    }

    return purified;
}

/**
 * Creates a function that purifies any string elements and then calls the
 * original method.
 * @private
 * @param {String} originalName The original method name to call.
 * @return {Function} The function to take the original's place.
 */
function createPurifiedFunction(originalName) {
    return function() {
        return original[originalName].apply(this, purifyArray(arguments));
    };
}

// purified versions of native jQuery methods
$.fn.extend({
    html: createPurifiedFunction('html'),
    before: createPurifiedFunction('before'),
    after: createPurifiedFunction('after'),
    append: createPurifiedFunction('append'),
    prepend: createPurifiedFunction('prepend')
});
With this code applied, the five methods all have their arguments passed through our custom purifyHTML() function. This function reads in the HTML and then returns a version that has been purified - meaning that any offensive code has been removed or otherwise rendered inert. The createPurifiedFunction() function produces the same basic pattern for each method: it intercepts the arguments and purifies any strings that are found. The purifyHTML() function is smart enough to not touch CSS selectors, so we don't bother filtering those out. Each method also accepts any number of arguments, and any argument can be an array, so the purifyArray() function recursively purifies arrays. The replacements for the jQuery methods pass the purified arguments into the original methods, preserving the standard jQuery behavior.
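To see the wrapping pattern in isolation, here is a minimal sketch that runs without jQuery or a DOM. Everything outside the pattern itself is an assumption made for illustration: fakeFn is a hypothetical stand-in for $.fn, node is a stand-in for a DOM element, and this purifyHTML() is a naive placeholder that only strips <script> blocks with a regular expression. It is not a safe sanitizer and is not the purifier described in this post.

```javascript
// Placeholder purifier for illustration ONLY: strips <script> blocks with a
// regular expression. This is NOT a safe sanitizer; a real purifyHTML()
// needs a vetted, allowlist-based approach.
function purifyHTML(html) {
    return html.replace(/<script\b[\s\S]*?<\/script\s*>/gi, '');
}

// Hypothetical stand-in for $.fn so the sketch runs without jQuery.
var fakeFn = {
    append: function () {
        // record what the "original" method actually received
        this.received = Array.prototype.slice.call(arguments);
        return this; // jQuery methods return `this` for chaining
    }
};

// keep a reference to the original before replacing it
var original = { append: fakeFn.append };

// purify strings, recurse into arrays, pass everything else through
function purifyArray(values) {
    var purified = [];
    for (var i = 0, len = values.length; i < len; i++) {
        if (Array.isArray(values[i])) {
            purified.push(purifyArray(values[i]));
        } else {
            purified.push(
                (typeof values[i] === 'string') ?
                    purifyHTML(values[i]) :
                    values[i]
            );
        }
    }
    return purified;
}

// build a replacement that purifies arguments, then delegates
function createPurifiedFunction(originalName) {
    return function () {
        return original[originalName].apply(this, purifyArray(arguments));
    };
}

fakeFn.append = createPurifiedFunction('append');

var node = { nodeType: 1 }; // stand-in for a DOM element
var result = fakeFn.append(
    '<p>Hi</p><script>alert(1)</script>',
    node,
    ['<script>x()</script><b>ok</b>']
);

console.log(result.received);
```

The string arguments reach the original method with their <script> blocks removed, the non-string argument is passed through untouched, and the nested array is purified item by item. Because the replacement uses apply() with the same this value, chaining keeps working.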

Securing prependTo() and appendTo()

The prependTo() and appendTo() methods are a bit of a different story because you don't pass HTML directly into them. Instead, they are most problematic when HTML is passed into $(). Under the covers, jQuery takes that HTML and converts it into DOM elements - meaning that the dangerous part happens long before prependTo() or appendTo() is ever called. Securing this took a bit of digging, but upon inspecting the jQuery source code, we found this:
// from jQuery
jQuery = function( selector, context ) {
    // The jQuery object is actually just the init constructor 'enhanced'
    return new jQuery.fn.init( selector, context, rootjQuery );
},

// 
It turns out that jQuery.fn.init() really does the heavy lifting in this case. So all we really needed to do was overwrite this method to intercept the first argument (the only one that can have HTML). We did so like this:
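// store a reference to the original constructor
var original = {
    init: $.fn.init
};

$.fn.extend({
    init: function(text, context, rootjQuery) {

        // only purify strings that contain a less-than symbol (possible HTML)
        if (typeof text === 'string' && text.indexOf('<') > -1) {
            text = purifyHTML(text);
        }

        // returning an object from a constructor makes `new` yield that object
        return new original.init(text, context, rootjQuery);
    }
});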
In this case, we just purify the first argument if it's a string and has a less-than symbol inside. Then, we continue on to call the original method.
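The interception relies on a JavaScript detail: when a constructor returns an object, new yields that object instead of the freshly created one. The following sketch shows that mechanism on a hypothetical miniature library, lib, that mimics only the jQuery/jQuery.fn.init relationship shown above. The names lib and originalInit are assumptions for illustration, and purifyHTML() is again a naive placeholder, not a safe sanitizer.

```javascript
// Placeholder purifier for illustration ONLY; not a safe sanitizer.
function purifyHTML(html) {
    return html.replace(/<script\b[\s\S]*?<\/script\s*>/gi, '');
}

// Hypothetical miniature of jQuery's structure: lib() delegates to lib.fn.init.
function lib(selector, context) {
    return new lib.fn.init(selector, context);
}

lib.fn = lib.prototype = {
    init: function (selector, context) {
        this.selector = selector;
        this.context = context;
    }
};

// instances created by init share the library prototype, as in jQuery
lib.fn.init.prototype = lib.fn;

// store a reference to the original constructor
var originalInit = lib.fn.init;

// replace init: purify possible HTML, then hand off to the original
lib.fn.init = function (text, context) {
    if (typeof text === 'string' && text.indexOf('<') > -1) {
        text = purifyHTML(text);
    }
    // the returned object replaces the one `new` created for this call
    return new originalInit(text, context);
};

var a = lib('#myDiv');
var b = lib('<div>ok</div><script>alert("Hi!");</script>');

console.log(a.selector);
console.log(b.selector);
```

The plain selector contains no less-than symbol, so it reaches the original constructor unchanged, while the HTML string is purified first. Both results are still created by the original constructor, so they keep the library's prototype and methods.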

Conclusion

jQuery is a great tool but has some serious security concerns related to inserting new HTML into the DOM. We went through several iterations of purifying before we arrived at this approach. We hope that future versions of jQuery will make it easier to filter HTML that is passed into any method, so that XSS attack vectors can be locked down more easily and completely. As it stands, we had to do a bunch of digging into the jQuery source to build a solution that covered all of the cases we wanted to cover. We'll need to repeat this process every time we upgrade to make sure there are no new ways that HTML is being processed. We've already started discussions with the jQuery team about making this easier in the future, and hopefully we'll see some positive movement in this area soon.

Further Reading

  1. Secure Coding Practices by David Tong