Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Put data URL dedicated workers in their own agent cluster #5476

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

annevk
Copy link
Member

@annevk annevk commented Apr 21, 2020

Tests:

Fixes #5254.

  • At least two implementers are interested (and none opposed):
    • Firefox, I guess? This effectively falls out of prior discussion, but I'm not sure we all realized what we were in for.
  • Tests are written and can be reviewed and commented upon at:
    • See above.
  • Implementation bugs are filed:

(See WHATWG Working Mode: Changes for more details.)


💥 Error: Wattsi server error 💥

PR Preview failed to build. (Last tried on Jan 15, 2021, 8:00 AM UTC).

More

PR Preview relies on a number of web services to run. There seems to be an issue with the following one:

🚨 Wattsi Server - Wattsi Server is the web service used to build the WHATWG HTML spec.

🔗 Related URL

Parsing MDN data...
Parsing...



If you don't have enough information above to solve the error by yourself (or to understand to which web service the error is related to, if any), please file an issue.


/webappapis.html ( diff )
/workers.html ( diff )

@annevk
Copy link
Member Author

annevk commented Apr 21, 2020

cc @nhiroki @hiroshige-g

source Outdated

<li><p>Let <var>createsCluster</var> be true if <var>is shared</var> is true or <var>url</var>'s
<span data-x="concept-url-scheme">scheme</span> is "<code data-x="">data</code>"; otherwise
false.</p></li>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Longer term this should move to after the fetch happened so that the response can also force an opaque origin through sandboxing. For now it shouldn't matter (a redirect to a data URL results in a network error).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth including the latter part in a note. E.g.

It is OK to check the request URL url here, instead of waiting for the fetch to complete and checking the response URL, because it is impossible to redirect to a data: URL.

@annevk
Copy link
Member Author

annevk commented Apr 21, 2020

Question: should we update the examples in the agent clusters section to talk about non-data-URL-dedicated workers and data-URL-dedicated workers or some such? Or just leave them be as they are generally correct?

@annevk annevk added the needs implementer interest Moving the issue forward requires implementers to express interest label Apr 21, 2020
annevk added a commit to web-platform-tests/wpt that referenced this pull request Apr 21, 2020
@annevk
Copy link
Member Author

annevk commented Apr 21, 2020

Per web-platform-tests/wpt#23137 implementations do not implement this as written (Chrome and Firefox fail both tests). However, that means different sites end up in the same agent cluster, which is somewhat fundamentally wrong. Or does anyone want to argue that data URLs are special enough that they can be an exception to that rule?

cc @whatwg/security

@annevk

This comment has been minimized.

Copy link
Member

@domenic domenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with improvement, but it sounds like we should get some implementer discussion first...

source Outdated

<li><p>Let <var>createsCluster</var> be true if <var>is shared</var> is true or <var>url</var>'s
<span data-x="concept-url-scheme">scheme</span> is "<code data-x="">data</code>"; otherwise
false.</p></li>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth including the latter part in a note. E.g.

It is OK to check the request URL url here, instead of waiting for the fetch to complete and checking the response URL, because it is impossible to redirect to a data: URL.

@nhiroki
Copy link
Contributor

nhiroki commented Apr 23, 2020

Let me comment as a Chromium implementer. This spec change looks reasonable. I'm interested in implementing this change, but not sure if I or someone in my team can start working on this soon. Anyway, I created a crbug.

@annevk
Copy link
Member Author

annevk commented Apr 24, 2020

@hiroshige-g posted in w3c/webappsec-csp#279 suggesting that Chrome already supports sandboxing in workers. Properly specifying that would be helped by #5362. Because then the model really needs to be

  1. Go to in parallel land
  2. Fetch something
  3. Based on the response, instantiate a new agent/agent cluster.

Is everyone okay with landing this still and deferring that to a follow-up? I think Chrome/Firefox both want all of this so implementation support isn't lacking, but I'm not sure I want to sign up for all of this work myself.

For data URL dedicated workers I filed https://bugzilla.mozilla.org/show_bug.cgi?id=1632837 and for sandboxed workers I filed https://bugzilla.mozilla.org/show_bug.cgi?id=1632840.

cc @elkurin

@annevk annevk removed the needs implementer interest Moving the issue forward requires implementers to express interest label Apr 24, 2020
@annevk annevk requested a review from domenic April 24, 2020 13:47
@nhiroki
Copy link
Contributor

nhiroki commented Apr 27, 2020

How do we treat Worklets that have an opaque origin but should be in the owner's agent cluster? (e.g., AudioWorklets need to be in the same agent cluster to access SharedArrayBuffer)

@annevk
Copy link
Member Author

annevk commented Apr 27, 2020

That's a good question. I don't think they need to have their own origin as they do not have access to networking or storage. I think I did push for it at the time, but I had not really considered what the proper interaction between agent clusters and origins ought to be.

However, we might also want to consider the different types of worklets differently. For worklets that do not have messaging, it might well make sense to define them as being in their own agent cluster, to give maximum flexibility to implementations.

To bring this back to dedicated workers, while I could see putting data URL dedicated workers in the same agent cluster as their owner, I cannot see that for sandboxed dedicated workers. And I think we want data URL and sandboxed dedicated workers to be more or less aligned, but if you see that differently I could be pursuaded I suppose.

@wanderview
Copy link
Member

That's a good question. I don't think they need to have their own origin as they do not have access to networking or storage. I think I did push for it at the time, but I had not really considered what the proper interaction between agent clusters and origins ought to be.

If we make worklets no longer opaque-origin, then I think we would need to do something about clients.matchAll(). In theory they should start showing up in that list if we changed them to be same-origin.

I also seem to recall that the discussion around whether to treat worklet script loads as subresource requests of the parent or as non-subresource loads hinged on this. At least for myself I was ok with them being subresource loads since they would not show up as clients.

@annevk
Copy link
Member Author

annevk commented Apr 27, 2020

I think worklets should remain subresources and therefore not show up there. I don't think that's related to their origin.

@wanderview
Copy link
Member

Well, I've always thought whether something created a global as being more relevant to the "subresource vs non-subresource" question. Is something creates a global its pretty clearly not a subresource to me. I gave into that point on worklets because they were opaque origins and couldn't be observed by service workers.

For example, if we change the origin, it seems really weird now that worklet create a global but don't get their own service worker controller.

And then if we make that exception I start wondering about all the other things that get inherited to subresources (CSP, etc) that we have agreed do not get inherited to dedicated workers because they are not subresources, etc.

@annevk
Copy link
Member Author

annevk commented Apr 27, 2020

I think for dedicated workers we could have gone both ways, but I think consistency with shared workers makes more sense for them as otherwise folks get really surprised with shared workers.

Tests:

* There should be a test for this where you create a data URL worker and try to message it SAB and WebAssembly.Module and check that it results in messageerror events.
* web-platform-tests/wpt#21146 and web-platform-tests/wpt#22928 test that inside a data URL worker you can share memory with a further nested data URL worker that is same-origin with the opaque origin of the data URL worker.

Fixes #5254.
@annevk annevk force-pushed the annevk/data-url-dedicated-workers branch from 12842ad to 70a35c4 Compare March 11, 2023 10:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

Data URL dedicated worker needs its own agent cluster
4 participants