Using the Library: Custom Whitelists

Kevin Cheng edited this page Jan 20, 2018 · 3 revisions

The default whitelists for tags and attributes are based on W3Schools' exhaustive HTML Element Reference page. Only tags and attributes considered dangerously prone to XSS vulnerabilities are excluded, for example the script tag or the onclick attribute. If the default lists are too broad for a specific need, you can define custom lists. There are two ways to accomplish this.

The first way is for one-off needs. The SanitizeHtml() function has overloads that accept List<String> values containing your custom lists. When you use custom whitelists this way, both the tag list and the attribute list must be supplied.

The sample below uses the default whitelists.

String inputValue = "<a href=\"www.google.com\">Click Me</a>";
String cleanValue = inputValue.SanitizeHtml();
Console.WriteLine(cleanValue);

The output is

<a href="www.google.com">Click Me</a>

With the following custom whitelists (note the new arguments passed to the function)

var myTags = new List<String>() { "a", "strong", "p" };
var myAttributes = new List<String>() { "href", "src" };

String inputValue = "<a href=\"www.google.com\">Click Me</a>";
String cleanValue = inputValue.SanitizeHtml(myTags, myAttributes);
Console.WriteLine(cleanValue);

The output is still the same, because the tag a is whitelisted, as well as the attribute href.

<a href="www.google.com">Click Me</a>

If the custom attributes whitelist has no elements

var myTags = new List<String>() { "a", "strong", "p" };
var myAttributes = new List<String>();

String inputValue = "<a href=\"www.google.com\">Click Me</a>";
String cleanValue = inputValue.SanitizeHtml(myTags, myAttributes);
Console.WriteLine(cleanValue);

then the output becomes <a>Click Me</a>: the a tag is whitelisted, but the attribute whitelist is empty, so in effect every attribute is rejected.

Given the modified sample below

var myTags = new List<String>() { "a", "strong", "p" };
var myAttributes = new List<String>() { "src" };

String inputValue = "<a href=\"www.google.com\">Click Me</a>";
String cleanValue = inputValue.SanitizeHtml(myTags, myAttributes);
Console.WriteLine(cleanValue);

the output is still <a>Click Me</a>, because the custom attribute whitelist contains src but not href.
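
The attribute behaviour in the three custom-whitelist samples above can be pictured with a small stand-alone sketch. This is not MarkupSanity's implementation; the class AttributeFilterSketch, the method KeepWhitelisted, and the case-insensitive comparison are all assumptions made for illustration. It only shows the rule the samples describe: an attribute survives when its name is on the whitelist, and is dropped otherwise.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: a hypothetical helper, not part of MarkupSanity.
public static class AttributeFilterSketch
{
    // Keeps only the attributes whose names appear in the whitelist.
    // Case-insensitive matching is an assumption of this sketch.
    public static Dictionary<string, string> KeepWhitelisted(
        IDictionary<string, string> attributes, IEnumerable<string> whitelist)
    {
        var allowed = new HashSet<string>(whitelist, StringComparer.OrdinalIgnoreCase);
        return attributes
            .Where(pair => allowed.Contains(pair.Key))
            .ToDictionary(pair => pair.Key, pair => pair.Value);
    }

    public static void Main()
    {
        var attributes = new Dictionary<string, string> { { "href", "www.google.com" } };

        // Whitelist contains href: the attribute survives.
        Console.WriteLine(KeepWhitelisted(attributes, new[] { "href", "src" }).Count); // 1
        // Empty whitelist: every attribute is dropped.
        Console.WriteLine(KeepWhitelisted(attributes, new string[0]).Count);           // 0
        // Whitelist without href: the attribute is dropped as well.
        Console.WriteLine(KeepWhitelisted(attributes, new[] { "src" }).Count);         // 0
    }
}
```

The three calls in Main mirror the three samples: href and src whitelisted, an empty list, and src only.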

Checking Attributes for Scripts

Another overload lets you define which attributes are inspected for known scripting patterns. For example, the href attribute is typically used to define the target URL of an a tag, but it can also contain JavaScript. The link is desirable, but the script is not. Giving MarkupSanity the list of attributes to inspect adds an extra check whenever one of these attributes is found.

var myTags = new List<String>() { "a", "strong", "p" };
var myAttributes = new List<String>() { "href", "src" };
var myScriptableAttributes = new List<String>() { "href", "src" };

String inputValue = "<a href=\"www.google.com\">Click Me</a><a href=\"javascript:alert('gotcha!');\">Click me too</a>";
String cleanValue = inputValue.SanitizeHtml(myTags, myAttributes, myScriptableAttributes);
Console.WriteLine(cleanValue);

The above example returns <a href="www.google.com">Click Me</a><a>Click me too</a>. The first a passes completely because the href is not a script, but the second a has its href removed because it contains a script.
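
To make the idea of inspecting an attribute value more concrete, here is a minimal stand-alone sketch of one such check. It is not the library's actual logic: the class ScriptPatternSketch, the method LooksLikeScript, and the single pattern (a value that begins with a javascript: or vbscript: scheme) are assumptions chosen for illustration, and the patterns MarkupSanity really uses are not documented on this page.

```csharp
using System;
using System.Text.RegularExpressions;

// Illustrative only: a hypothetical check, not MarkupSanity's implementation.
public static class ScriptPatternSketch
{
    // Assumed pattern: optional leading whitespace, then a scripting URL scheme.
    private static readonly Regex ScriptScheme = new Regex(
        @"^\s*(javascript|vbscript)\s*:",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    // Returns true when the attribute value starts with a scripting scheme.
    public static bool LooksLikeScript(string attributeValue)
    {
        if (string.IsNullOrEmpty(attributeValue)) return false;
        return ScriptScheme.IsMatch(attributeValue);
    }

    public static void Main()
    {
        Console.WriteLine(LooksLikeScript("www.google.com"));          // False
        Console.WriteLine(LooksLikeScript("javascript:alert('x');"));  // True
    }
}
```

A check this simple misses obfuscated values (for example, character entities or embedded control characters), so treat it as a picture of the concept rather than a substitute for the library's own inspection.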

A Warning

Custom whitelists can bypass the removal of tags like script and event attributes like onclick: once these are added to a whitelist, they are no longer stripped. This gives you the flexibility to make exemptions for special cases where they are wanted.
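
If such exemptions should never happen by accident, one option is to check custom lists before passing them to SanitizeHtml(). The sketch below is a hypothetical guard, not a MarkupSanity feature: the class WhitelistGuardSketch, the method FindRiskyEntries, and the rule it applies (the script tag, plus any attribute whose name starts with "on") are assumptions based only on the two examples named in this warning.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: a hypothetical pre-check, not part of MarkupSanity.
public static class WhitelistGuardSketch
{
    // Assumed deny list; extend it to match your own policy.
    private static readonly HashSet<string> RiskyTags =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "script" };

    // Returns the whitelist entries that would re-enable risky markup:
    // deny-listed tag names and event attributes (names starting with "on").
    public static List<string> FindRiskyEntries(
        IEnumerable<string> tags, IEnumerable<string> attributes)
    {
        var risky = tags.Where(t => RiskyTags.Contains(t)).ToList();
        risky.AddRange(attributes.Where(
            a => a.StartsWith("on", StringComparison.OrdinalIgnoreCase)));
        return risky;
    }

    public static void Main()
    {
        var myTags = new List<string> { "a", "strong", "script" };
        var myAttributes = new List<string> { "href", "onclick" };

        foreach (var entry in FindRiskyEntries(myTags, myAttributes))
        {
            Console.WriteLine("Risky whitelist entry: " + entry);
        }
    }
}
```

Whether to reject such lists or merely log them depends on whether the exemption is intentional.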