Adds sanitize_html, a whitelist-based HTML sanitizer. #171
I am unsure how to fix this.
* * attribute_whitelist_json: a json_encode()'d list of HTML attributes to allow in the final string.
* * tag_whitelist_json: a json_encode()'d list of HTML tags to allow in the final string.
*/
#define rustg_sanitize_html(text, attribute_whitelist_json, tag_whitelist_json) RUSTG_CALL(RUST_G, "sanitize_html")(text, attribute_whitelist_json, tag_whitelist_json)
Semicolon?
I don't understand, am I missing something here?
* * attribute_whitelist_json: a json_encode()'d list of HTML attributes to allow in the final string.
* * tag_whitelist_json: a json_encode()'d list of HTML tags to allow in the final string.
The interface should take a list and json_encode it itself.
I deliberately didn't do that, so that callers can store pre-encoded global lists and save on perf.
Wouldn't that mean it's encoding on every call? The thing is that this will likely be called many times with only one or a few lists, so this introduces extra overhead.
.link_rel(Some("noopener")) // https://mathiasbynens.github.io/rel-noopener/
.url_schemes(prune_url_schemes)
.generic_attributes(attribute_whitelist)
.tags(tag_whitelist)
Wouldn't it make sense to keep this around rather than build it anew on every invocation?
I'd have to hash the arguments and such and that's out of my skill set presently.
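For what it's worth, the hashing does not have to be done by hand: a `HashMap` keyed on the two JSON strings hashes them for you. The following is only a rough sketch of that idea using the standard library. `Sanitizer`, `Sanitizer::build`, and `cached_sanitizer` are hypothetical names that do not appear in this PR; the stand-in struct just records its inputs, whereas real code would parse the JSON and configure the Ammonia builder there, and whether that builder can be stored this way is an assumption.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, OnceLock};

/// Hypothetical stand-in for a fully configured sanitizer.
/// Real code would hold the configured builder instead of the raw strings.
#[derive(Debug)]
pub struct Sanitizer {
    pub attribute_whitelist_json: String,
    pub tag_whitelist_json: String,
}

impl Sanitizer {
    /// Placeholder for the expensive step (JSON parsing plus builder setup).
    pub fn build(attribute_whitelist_json: &str, tag_whitelist_json: &str) -> Self {
        Sanitizer {
            attribute_whitelist_json: attribute_whitelist_json.to_owned(),
            tag_whitelist_json: tag_whitelist_json.to_owned(),
        }
    }
}

/// Cache key: the two whitelist arguments exactly as passed in.
type Key = (String, String);

/// Process-wide cache, created lazily on first use.
fn cache() -> &'static Mutex<HashMap<Key, Arc<Sanitizer>>> {
    static CACHE: OnceLock<Mutex<HashMap<Key, Arc<Sanitizer>>>> = OnceLock::new();
    CACHE.get_or_init(|| Mutex::new(HashMap::new()))
}

/// Returns the sanitizer for this pair of whitelists, building it only
/// the first time the pair is seen.
pub fn cached_sanitizer(attribute_whitelist_json: &str, tag_whitelist_json: &str) -> Arc<Sanitizer> {
    let key = (
        attribute_whitelist_json.to_owned(),
        tag_whitelist_json.to_owned(),
    );
    let mut map = cache().lock().unwrap();
    let entry = map
        .entry(key)
        .or_insert_with(|| Arc::new(Sanitizer::build(attribute_whitelist_json, tag_whitelist_json)));
    Arc::clone(entry)
}
```

Since the function is expected to be called many times with only one or a few lists, the map would stay small. The key strings are still cloned on every call, so this trades a rebuild for two allocations; whether that is a net win would need measuring.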
Looks about right.
looks about right :+2:
mods? mergies? @ZeWaka
Adds a customizable HTML sanitizer function using the Ammonia crate. Out of the box, it will:
By providing JSON-encoded lists, you can whitelist given attributes or tags so that they are not pruned. I have included a curated tag list in the DM source file for this module that will whitelist most safe CSS attributes.
It occurred to me that a lot of servers run things like old papercode, which does not sanitize on the server side before being viewable by a client. Sanitizing strings with DM would be an absolute performance nuke, assuming you could even make it bulletproof in the first place.
Here is a recommended default tag whitelist