Use markupsafe for escaping? #107

Open
aarondewindt opened this issue Jan 17, 2021 · 2 comments

Comments

aarondewindt commented Jan 17, 2021

Because of issue #106, I forked this project and am changing it to use markupsafe for escaping. This not only allows text to be marked as safe, it also handles any object with an __html__ method, which some projects out there use instead of _repr_html_. markupsafe can also deal with the Python 2/3 string type shenanigans, so it'll help with issue #63, although I think IPython will still be needed for the event stuff.
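
To make the __html__ idea concrete, here is a minimal stdlib-only sketch of the protocol. SafeText, escape and Badge are hypothetical stand-ins written for this illustration (SafeText plays the role of markupsafe's Markup); the real markupsafe functions may differ in details such as the exact entities emitted for quotes.

```python
import html


class SafeText(str):
    """Hypothetical stand-in for markupsafe.Markup: text already safe to embed."""

    def __html__(self):
        return self


def escape(value):
    """Escape value unless it supplies its own HTML through __html__."""
    if hasattr(value, "__html__"):
        # The object vouches for its own markup, so it is used as-is.
        return SafeText(value.__html__())
    # Everything else is treated as untrusted plain text.
    return SafeText(html.escape(str(value)))


class Badge:
    """Hypothetical object exposing __html__ instead of _repr_html_."""

    def __html__(self):
        return "<span class='badge'>new</span>"


plain = escape("<script>alert(1)</script>")  # untrusted text gets escaped
trusted = escape(SafeText("<b>bold</b>"))    # marked-safe text passes through
badge = escape(Badge())                      # __html__ result is used directly
```

Because the result of escape() is itself marked safe, escaping it a second time leaves it unchanged, which is what avoids double-escaping.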

The issue is that I don't think I can make these changes completely backwards compatible. At this point I think I only have two options, neither of which is backwards compatible.

1. Escape text when rendering (in _repr_html_)

The issue here is that the DOM will contain both safe and unsafe text. That is not a problem as long as they stay Python objects; the problem arises when they are converted to JSON. The safe marking will be lost, since both string types are written as-is into the JSON.
One way to solve this is to escape all text in to_dict() and then assume that all text handled in from_dict() is already escaped. However, the current implementation assumes the text passed to from_dict() is unescaped.
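
The loss of the safe marking can be reproduced with the standard json module alone. In this sketch SafeText is again a hypothetical stand-in for markupsafe's Markup, and to_wire / from_wire are made-up helpers illustrating the "escape in to_dict(), trust in from_dict()" idea; they are not part of vdom.

```python
import html
import json


class SafeText(str):
    """Hypothetical stand-in for markupsafe.Markup."""

    def __html__(self):
        return self


children = [SafeText("<b>trusted</b>"), "<i>untrusted</i>"]

# A plain round trip: both strings come back as ordinary str objects,
# so nothing distinguishes the trusted one from the untrusted one.
restored = json.loads(json.dumps(children))


def to_wire(items):
    """Escape everything that is not already marked safe before serializing."""
    return [item if isinstance(item, SafeText) else html.escape(item) for item in items]


def from_wire(items):
    """Assume all text coming off the wire has already been escaped."""
    return [SafeText(item) for item in items]


round_tripped = from_wire(json.loads(json.dumps(to_wire(children))))
```

With this convention the serialized form only ever contains escaped text, which is exactly the assumption that a from_dict() expecting unescaped input does not make.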

2. Escape text and objects while initializing a VDOM instance

The advantage is that I can then always assume that all text in the DOM is already escaped. It also makes it possible to safely handle objects implementing __html__ (and possibly _repr_html_), since they are evaluated at that moment rather than later, when the object could have changed.
The issue is again with the dictionary and JSON serialization. The current implementation of from_dict() assumes the text is unescaped.
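
A rough sketch of this second option, under the same assumptions as before: SafeText and escape are hypothetical stand-ins for markupsafe, and Node is a much simplified, made-up element class rather than the real VDOM class. It shows that an object implementing __html__ is captured at construction time, so later mutation does not affect the rendered output.

```python
import html


class SafeText(str):
    """Hypothetical stand-in for markupsafe.Markup."""

    def __html__(self):
        return self


def escape(value):
    """Escape value unless it supplies its own HTML through __html__."""
    if hasattr(value, "__html__"):
        return SafeText(value.__html__())
    return SafeText(html.escape(str(value)))


class Node:
    """Hypothetical, simplified element that escapes its children on creation."""

    def __init__(self, tag, *children):
        self.tag = tag
        # Escaping happens once, here, so every stored child is safe text.
        self.children = tuple(escape(child) for child in children)

    def to_html(self):
        return f"<{self.tag}>{''.join(self.children)}</{self.tag}>"


class Counter:
    """Hypothetical mutable object that renders itself through __html__."""

    def __init__(self):
        self.value = 0

    def __html__(self):
        return f"<b>{self.value}</b>"


counter = Counter()
node = Node("p", "1 < 2 ", counter)
counter.value = 99  # changes after construction are not picked up
rendered = node.to_html()
```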

Plan

I would like to take the second approach, because it can reliably handle any object implementing __html__ and _repr_html_.

Any opinions?

@aarondewindt
Author

Ok, this is more complicated than I thought. From what I can see, Jupyter renders VDOM objects from the dictionary instead of the result of .to_html(). It does this using the @nteract/transform-vdom npm package, whose repository link is broken. My guess is that this project (vdom) is dead by now?

So in order to apply my changes I would have to modify @nteract/transform-vdom so that it assumes strings are already escaped, or expand the vdom spec to have a marker for safe text.

For now I'll treat my fork as a separate project and remove 'application/vdom.v1+json' from _repr_mimebundle_, which I presume breaks all event-related features.
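
As a rough illustration of that last step, an object that only advertises HTML could look like the sketch below. HtmlOnlyNode is a hypothetical class written for this example, not the fork's actual code; the only part taken from IPython is the _repr_mimebundle_(include, exclude) hook, which returns a dict mapping MIME types to data.

```python
class HtmlOnlyNode:
    """Hypothetical element that only offers an HTML representation."""

    def __init__(self, html_text):
        self._html = html_text

    def to_html(self):
        return self._html

    def _repr_mimebundle_(self, include=None, exclude=None):
        # 'application/vdom.v1+json' is deliberately left out, so a frontend
        # falls back to the pre-escaped HTML instead of the dictionary form.
        return {"text/html": self.to_html()}


bundle = HtmlOnlyNode("<p>hi</p>")._repr_mimebundle_()
```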

@rgbkrk
Member

rgbkrk commented Mar 3, 2021

@nteract/transform-vdom is now inside the outputs repository (not part of the nteract/nteract monorepo): https://github.com/nteract/outputs/tree/master/packages/transform-vdom

I've stepped away from these repos for a bit so you'll have to bear with me as I'm diving back in again.
