[GHSA-5jfw-gq64-q45f] HTML Cleaner allows crafted scripts in special contexts like svg or math to pass through #5031

Open
wants to merge 1 commit into
base: byt3n33dl3/advisory-improvement-5031
@@ -6,12 +6,12 @@
"aliases": [
"CVE-2024-52595"
],
"summary": "HTML Cleaner allows crafted scripts in special contexts like svg or math to pass through",
"details": "### Impact\n\nThe HTML Parser in lxml does not properly handle context-switching for special HTML tags such as `<svg>`, `<math>` and `<noscript>`. This behavior deviates from how web browsers parse and interpret such tags. Specifically, content in CSS comments is ignored by lxml_html_clean but may be interpreted differently by web browsers, enabling malicious scripts to bypass the cleaning process. This vulnerability could lead to Cross-Site Scripting (XSS) attacks, compromising the security of users relying on lxml_html_clean in default configuration for sanitizing untrusted HTML content.\n\n### Patches\n\nUsers employing the HTML cleaner in a security-sensitive context should upgrade to lxml 0.4.0, which addresses this issue.\n\n### Workarounds\n\nAs a temporary mitigation, users can configure lxml_html_clean with the following settings to prevent the exploitation of this vulnerability:\n* `remove_tags`: Specify tags to remove - their content is moved to their parents' tags.\n* `kill_tags`: Specify tags to be removed completely.\n* `allow_tags`: Restrict the set of permissible tags, excluding context-switching tags like `<svg>`, `<math>` and `<noscript>`.\n\n### References\n\n* https://github.com/fedora-python/lxml_html_clean/pull/19\n* https://github.com/fedora-python/lxml_html_clean/pull/19/commits/c5d816f86eb3707d72a8ecf5f3823e0daa1b3808\n",
"summary": "LXML HTML Cleaner allows crafted scripts in special contexts like svg or math to pass through",
"details": "### Impact\n\nThe HTML parser in lxml, on which lxml_html_clean relies, does not handle context-switching for certain special HTML tags (`<svg>`, `<math>` and `<noscript>`) the way web browsers do. Attackers can exploit this discrepancy to craft HTML payloads that pass through the cleaning process intact, leading to Cross-Site Scripting (XSS).\n\nThe issue arises specifically in the treatment of CSS comments. lxml_html_clean ignores content inside CSS comments, but web browsers may interpret that content differently inside these special contexts, so a script hidden there can survive cleaning. This poses a significant risk for users relying on lxml_html_clean in its default configuration to sanitize untrusted HTML content.\n\n### Exploit Scenario\n\nAn attacker could embed an obfuscated script within `<svg>` or `<math>` tags so that it bypasses the cleaning logic. A possible payload might look like this:\n\n```\n<svg><style><!--/*<![CDATA[*/body{background:url(javascript:alert('XSS'))}/*]]>*/--></style></svg>\n```\n\nWhen this payload is passed through lxml_html_clean, the malicious script may not be sanitized properly. 
Upon rendering in a browser, the payload may then trigger the embedded JavaScript, allowing the attacker to execute arbitrary scripts in the victim's browser.\n\n### Patches\n\nUsers employing the HTML cleaner in a security-sensitive context should upgrade to lxml_html_clean 0.4.0 or later, which handles special HTML tags more strictly and aligns the library's behavior with how web browsers parse them.\n\n### Workarounds\n\nAs a temporary mitigation, users can configure lxml_html_clean with the following settings to prevent the exploitation of this vulnerability:\n* `remove_tags`: Specify tags to remove - their content is moved to their parents' tags.\n* `kill_tags`: Specify tags to be removed completely.\n* `allow_tags`: Restrict the set of permissible tags, excluding context-switching tags like `<svg>`, `<math>` and `<noscript>`.\n\nFor example (`allow_tags` cannot be combined with the default `remove_unknown_tags=True`, so that option is turned off here):\n\n```\nfrom lxml_html_clean import Cleaner\ncleaner = Cleaner(allow_tags=['p', 'a', 'div', 'span'], kill_tags=['svg', 'math', 'noscript'], remove_unknown_tags=False)\n```\n\n### References\n\n* https://github.com/fedora-python/lxml_html_clean/pull/19\n* https://github.com/fedora-python/lxml_html_clean/pull/19/commits/c5d816f86eb3707d72a8ecf5f3823e0daa1b3808\n* GitHub Issue: [fedora-python/lxml_html_clean#19](https://github.com/fedora-python/lxml_html_clean/issues/19)\n* GitHub Commit: [fedora-python/lxml_html_clean@c5d816f](https://github.com/fedora-python/lxml_html_clean/commit/c5d816f)\n* CVE ID: [CVE-2024-52595](https://nvd.nist.gov/vuln/detail/CVE-2024-52595)\n* OWASP XSS Prevention Cheat Sheet\n\n### Additional Context\n\n#### Why This Matters\n\nImproper sanitization of HTML content remains one of the most prevalent vulnerabilities in web applications. 
Attackers often exploit discrepancies in a library's parsing logic to inject malicious code, leading to significant security breaches.\n\nThis vulnerability is especially critical in applications that:\n\n* Accept and display user-generated content.\n* Operate in multi-tenant environments.\n* Have stringent security requirements, such as financial or healthcare systems.\n\n#### Similar Known Issues\n\n* Improper handling of `<iframe>` tags in older versions of sanitize-html.\n* Inadequate sanitization of nested tags in the Bleach library prior to version 3.0.0.",
"severity": [
{
"type": "CVSS_V3",
"score": "CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:L/A:H"
"score": "CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:H"
}
],
"affected": [
@@ -44,10 +44,18 @@
"type": "ADVISORY",
"url": "https://nvd.nist.gov/vuln/detail/CVE-2024-52595"
},
{
"type": "WEB",
"url": "https://github.com/fedora-python/lxml_html_clean/issues/19"
},
{
"type": "WEB",
"url": "https://github.com/fedora-python/lxml_html_clean/pull/19"
},
{
"type": "WEB",
"url": "https://github.com/fedora-python/lxml_html_clean/commit/c5d816f"
},
{
"type": "WEB",
"url": "https://github.com/fedora-python/lxml_html_clean/commit/c5d816f86eb3707d72a8ecf5f3823e0daa1b3808"
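The workaround in the advisory text amounts to keeping the context-switching tags (`<svg>`, `<math>`, `<noscript>`) out of cleaned output. As a rough illustration of that idea, the sketch below flags untrusted input that contains any of those tags before it reaches a cleaner. It uses only the Python standard library; the helper name `find_context_switching_tags` and the class `_TagCollector` are hypothetical and not part of lxml_html_clean. Python's `html.parser` is not a browser-grade parser either, so this is an audit aid under those assumptions, not a sanitizer or a substitute for upgrading.

```python
from html.parser import HTMLParser

# Tags the advisory names as context-switching (taken from the advisory text).
CONTEXT_SWITCHING_TAGS = frozenset({"svg", "math", "noscript"})


class _TagCollector(HTMLParser):
    """Hypothetical helper: records context-switching start tags in the input."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.found = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names already lowercased.
        if tag in CONTEXT_SWITCHING_TAGS:
            self.found.add(tag)


def find_context_switching_tags(html_text):
    """Return the set of context-switching tag names present in html_text."""
    collector = _TagCollector()
    collector.feed(html_text)
    collector.close()
    return collector.found


if __name__ == "__main__":
    payload = "<svg><style>/* hidden */</style></svg>"
    flagged = find_context_switching_tags(payload)
    if flagged:
        print("Input uses context-switching tags:", sorted(flagged))
```

An application could reject or log flagged input, or route it through a stricter `Cleaner` configuration such as the `kill_tags` setting described in the workarounds.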