Add explainer for FileSystemHandle.getUniqueId() #46

a-sully · 2022-08-10T22:49:10Z

At least two implementers are interested (and none opposed):
- Chrome
- …
Tests are written and can be reviewed and commented upon at:
- https://github.com/web-platform-tests/wpt/blob/master/fs/script-tests/FileSystemBaseHandle-getUniqueId.js
Implementation bugs are filed:
- Chrome: https://crbug.com/1342961
- Firefox: …
- Safari: …

domenic · 2022-08-11T03:43:36Z

Minor nit: should be Id, not ID. https://w3ctag.github.io/design-principles/#casing-rules

martinthomson

Why "Id" and not "Uri"?

UniqueID.md

a-sully · 2022-08-17T01:09:57Z

Why "Id" and not "Uri"?

This identifier the browser provides is not "universal", since it will provide different strings to different sites for the same file. "Id" is a bit more generic, so I went with that.

Happy to consider other names for this method, though!

martinthomson · 2022-08-17T05:45:26Z

The "universality" in URIs does not uniqueness in correspondance, either from the reference to the referred object or the converse. file:///etc/passwd is universal, even if the same reference refers to different objects. And it is very common to see multiple references (or aliases) for the same resource on the web.

a-sully · 2022-08-20T01:50:02Z

I'm not well-versed in this debate so bear with me here, but my intuition is that "URI" implies it can be used to fetch the resource, which is not the case here. The browser provides the identifier, but after that it's only useful as a tool for the developer to structure their databases.

If we planned to expose a getHandleFrom{Id|Uri}() method, then I agree that "URI" might be the right term. But that's explicitly a non-goal of this proposal...

annevk · 2022-08-29T12:20:12Z

This repository is not a good place to host explainers I think. At least that's not something we've traditionally done for WHATWG standards.
If for some reason we wanted this field to be a URL, we shouldn't call it URI: https://url.spec.whatwg.org/#url-apis-elsewhere.

a-sully · 2022-08-29T16:13:52Z

This repository is not a good place to host explainers I think. At least that's not something we've traditionally done for WHATWG standards.

Happy to move the explainer elsewhere. Where would you prefer it be hosted?

If for some reason we wanted this field to be a URL, we shouldn't call it URI: https://url.spec.whatwg.org/#url-apis-elsewhere.

Since this method provides a completely opaque identifier, I would assume we don't want it called URL or URI? (which to me implies use for fetching)

annevk · 2022-08-31T07:43:59Z

I'm not sure. Maybe @domenic has thoughts on this. Personal repository seems fine to me. Eventually the examples and specification text should become part of the specification through a PR.

(As for URL vs ID, I was mainly replying to the earlier conversation. I'm not sure I see the use for a URL either.)

domenic · 2022-08-31T08:33:10Z

I personally think explainers are fine in WHATWG repos, e.g. we have them in whatwg/streams. They can be a crutch, however, leading to people avoiding putting the useful text into the spec itself; so please do avoid that failure mode.

IMO it should be up to the repo editors to decide on things like that.

jesup · 2022-08-31T18:27:42Z

So... "IDs should not be stable after clearing browsing data. In other words, this
should not be a truly persistent, unclearable identifier"

But hashing the output of resolve() provides a unique identifier already (a full path to the root of the OPFS store), and would remain the same regardless of clearing.

We should be clear about what we're buying with this, and what it actually accomplishes. Obfuscating the hash value here doesn't actually add any security/privacy, at least in the OPFS case. Adding restrictions that don't actually help users just complicates the implementations with no benefit. Are there any risks here?

Also, if we were to expose our internal IDs (with or without hashing/salting/etc) for objects, that would lock us into the existing ID scheme since these need to be stable; to be safe we'd implement an entirely independent ID scheme which could be guaranteed to be stable for all time (or we'd have to blow away older OPFS files on an internal ID scheme change). But that's more complexity yet (though we could defer the complexity there until we need to do the first change to the ID scheme, if ever)

domenic · 2022-08-31T23:43:59Z

Expose our internal IDs

If at all possible, I would recommend that the standard require IDs to be returned as v4 (random) UUIDs, which act as a facade over any internal IDs. This guarantees a cross-browser-compatible format which developers can't take any dependency on.

a-sully · 2022-09-01T23:36:45Z

You're correct that in the OPFS case this doesn't add anything. The scenarios where this does matter are for files outside OPFS, where this adds significant new capabilities in a privacy-preserving way (i.e. without exposing the full file path).

So... "IDs should not be stable after clearing browsing data. In other words, this should not be a truly persistent, unclearable identifier"

But hashing the output of resolve() provides a unique identifier already (a full path to the root of the OPFS store), and would remain the same regardless of clearing.

As mentioned in the explainer, it's currently not possible to distinguish between two files of the same name in different directories for files outside of the OPFS (since the site may not have access to a shared ancestor directory).

We should be clear about what we're buying with this, and what it actually accomplishes. Obfuscating the hash value here doesn't actually add any security/privacy, at least in the OPFS case. Adding restrictions that don't actually help users just complicates the implementations with no benefit. Are there any risks here?
Also, if we were to expose our internal IDs (with or without hashing/salting/etc) for objects, that would lock us into the existing ID scheme since these need to be stable; to be safe we'd implement an entirely independent ID scheme which could be guaranteed to be stable for all time (or we'd have to blow away older OPFS files on an internal ID scheme change). But that's more complexity yet (though we could defer the complexity there until we need to do the first change to the ID scheme, if ever)

The idea is that the hash is based on things that are stable: the handle type and path, the origin, and some salt that hangs around until the user clears site data. The biggest risk I see is that some future change (something analogous to the Origin -> StorageKey migration, say) would necessitate changing the ID to incorporate this new information... but users can invalidate the ID by clearing site data anyways. For sites using the ID in a client-side database, this is behavior they'll have to be resilient against regardless. So this doesn't feel like a huge risk?

Your concerns do make sense. That said, we do have a need for this for outside-OPFS scenarios. I appreciate your thinking through this with pragmatic suggestions like deferring the complexity until we do need to change the scheme. But before we commit to anything, do you have alternate suggestions?

a-sully · 2022-09-01T23:39:27Z

If at all possible, I would recommend that the standard require IDs to be returned as v4 (random) UUIDs, which act as a facade over any internal IDs. This guarantees a cross-browser-compatible format which developers can't take any dependency on.

Seems reasonable to me. I'll add this to the explainer soon if no one objects

jesup · 2022-09-30T16:50:40Z

random UUIDs could help avoid accidental exposure of info, and as mentioned avoid accidental dependence on a particular hashing format by developers using this.

Does this buy any advantage to spec and implement for OPFS? As you say,

You're correct that in the OPFS case this doesn't add anything. The scenarios where this does matter are for files outside OPFS, where this adds significant new capabilities in a privacy-preserving way (i.e. without exposing the full file path).

So what is the reasoning to add this to OPFS?

Is it "developers can assume getUniqueId() is available to use whether using OPFS or WICG FS Access?" Does that usecase matter to developers?

a-sully · 2022-10-04T18:30:58Z

Thinking about this a bit more, I do think there's value to this method within the OPFS. At least there might be, depending on what we decide getUniqueId() returns a unique ID to. As mentioned here #39 (comment), it's unspecified at this point whether a FileSystemHandle is conceptually a path or a reference...

If it's just a path, getUniqueId() provides no value within the OPFS, since calling resolve() on the root directory generates a unique ID (the file path).

If it's a reference, getUniqueId() could track a file even it's moved.

Is it "developers can assume getUniqueId() is available to use whether using OPFS or WICG FS Access?" Does that usecase matter to developers?

I can't speak for all developers, but I would imagine this does. At the very least it would be surprising if the method was only available for handles outside of the OPFS. By adding the getUniqueId() to the WICG spec, the method would be available on all handles, including those in the OPFS.

add explainer for unique ID

97bcadc

a-sully added the addition/proposal New features or enhancements label Aug 10, 2022

a-sully changed the title ~~Add explainer for FileSystemHandle::getUniqueID()~~ Add explainer for FileSystemHandle::getUniqueId() Aug 11, 2022

getUniqueID -> getUniqueId

0d30916

This was referenced Aug 12, 2022

Update security-privacy-questionnaire.md for getUniqueId() WICG/file-system-access#383

Merged

FileSystemHandle Unique ID w3ctag/design-reviews#764

Closed

martinthomson reviewed Aug 17, 2022

View reviewed changes

UniqueID.md Outdated Show resolved Hide resolved

Update example

9063e17

a-sully mentioned this pull request Oct 10, 2022

What is a FileSystemHandle? #59

Closed

a-sully mentioned this pull request Nov 29, 2022

Add FileSystemHandle.move method #10

Open

3 tasks

a-sully mentioned this pull request Mar 16, 2023

Moving local files with the File System Access API w3ctag/design-reviews#805

Closed

1 task

a-sully changed the title ~~Add explainer for FileSystemHandle::getUniqueId()~~ Add explainer for FileSystemHandle.getUniqueId() Mar 16, 2023

tomayac mentioned this pull request Mar 30, 2023

Permission to get file paths in order to create hyperlinks WICG/file-system-access#407

Closed

a-sully mentioned this pull request Apr 3, 2023

Meta: Add proposals/ folder #108

Merged

jimmywarting mentioned this pull request May 31, 2023

Add cloud file handling proposal WICG/file-system-access#411

Merged

a-sully mentioned this pull request Jun 21, 2023

Support Use Case : store FileSystemDirectoryHandle or FileSystemHandle in Redux Store WICG/file-system-access#368

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add explainer for FileSystemHandle.getUniqueId() #46

Add explainer for FileSystemHandle.getUniqueId() #46

a-sully commented Aug 10, 2022 •

edited

Loading

domenic commented Aug 11, 2022

martinthomson left a comment

a-sully commented Aug 17, 2022

martinthomson commented Aug 17, 2022

a-sully commented Aug 20, 2022

annevk commented Aug 29, 2022

a-sully commented Aug 29, 2022 •

edited

Loading

annevk commented Aug 31, 2022

domenic commented Aug 31, 2022

jesup commented Aug 31, 2022

domenic commented Aug 31, 2022

a-sully commented Sep 1, 2022

a-sully commented Sep 1, 2022

jesup commented Sep 30, 2022

a-sully commented Oct 4, 2022

Add explainer for FileSystemHandle.getUniqueId() #46

Are you sure you want to change the base?

Add explainer for FileSystemHandle.getUniqueId() #46

Conversation

a-sully commented Aug 10, 2022 • edited Loading

domenic commented Aug 11, 2022

martinthomson left a comment

Choose a reason for hiding this comment

a-sully commented Aug 17, 2022

martinthomson commented Aug 17, 2022

a-sully commented Aug 20, 2022

annevk commented Aug 29, 2022

a-sully commented Aug 29, 2022 • edited Loading

annevk commented Aug 31, 2022

domenic commented Aug 31, 2022

jesup commented Aug 31, 2022

domenic commented Aug 31, 2022

a-sully commented Sep 1, 2022

a-sully commented Sep 1, 2022

jesup commented Sep 30, 2022

a-sully commented Oct 4, 2022

a-sully commented Aug 10, 2022 •

edited

Loading

a-sully commented Aug 29, 2022 •

edited

Loading