Discovery makes implementation assumptions for listing TDs via pagination parameters #278

thjaeckle · 2022-02-24T15:36:39Z

I just had a look of what to do in order to provide a WoT Discovery compliant Listing of TDs and wanted to provide feedback.

I think it is very problematic that this chapter has quite some bias of how a TD Directory implements its persistence:

the limit / offset based pagination is a very inefficient mechanism for pagination (e.g. see articles here or here) which nowadays is replaced by cursor based pagination instead
- I don't see a need to specify how pagination is concretely done, it must only be possible to provide a "next" page somehow
the "canonical" Link header is specified to include an additional mandatory etag
- which also is only needed for "limit/offset" based pagination - where adding a new TD at e.g. the "first" position shifts the result array
- in a cursor based pagination approach, the cursor for the next page would include an identifier for the first TD of the next page in its "where clause" - which would not be affected by adding new TDs during pagination
- also, specifying the required etag puts quite a lot of effort to the TD Directory, e.g. it would have to calculate a hash of all matching TDs for each "next" page, which could be a very expensive operation

In my opinion, the specification forces the implementer to choose an inefficient persistence.
Does it add much benefit of specifying something like limit and offset query parameters?
Or would it not be sufficient to define (which even already is defined) that a next link header must be returned when there are more results to load than returned in the response? And that this link must be followed to load the next page?

One other thing I noticed:
The query parameters which have to be provided by an implementation are very hidden in the text, I would have expected a table of the required/optional query parameters to support for the "Listing" API.

The text was updated successfully, but these errors were encountered:

mmccool · 2022-04-11T14:32:03Z

@farshidtz can you comment? I think at this point we are out of time to make big changes in how this is done, but we should definitely look into improving this in Discovery 2.0 (which we def need to do anyway for other reasons, i.e. to intercept the JSONPath spec when it is published by IETF).

mmccool · 2022-08-23T13:38:36Z

I'm also wondering if we can fix this in a way that preserves compatibility. Suppose we introduce a new way to do pagination, but keep the old API around. That gives compatibility, but forces a system to implement pagination twice, and one way is inefficient. Do we do that for 1.1 then deprecate the "old way" for 2.0? Or do we just drop the old way and go straight for a new, cursor-based approach and publish a 2.0 spec?

farshidtz · 2022-09-05T08:52:03Z

The reason for using limit and offset was to allow programmatic generation of page links and allow parallel queries. Some databases have O(n) complexity for offset, and are more efficient when paginating with cursor-based queries. But there are other issues:

With cursor-based, the client has to receive and process the request to get the link and issue a follow up request. This adds significant latency when querying many pages.
Cursor-based pagination only works reliably if the identifier is unique. We use id by default but allow sorting by other fields too.

For the etag, calculating the hash of the whole dataset is inefficient, but the spec is suggesting something different:

The etag value could be a revision number, timestamp, or UUID Version 4, set whenever the TD collection changes in a way that affects the ordering of the TDs. The clients may rely on the etag value to know whether the collection remains consistent across paginated retrieval of the collection. For example, creation or deletion of TDs or update of TD fields used for ordering may shift the calculated paging window.

With that said, we could make the use of limit, offset query arguments, and etag in canonical link header optional if there is consensus.

mmccool added the defer to Discovery 2.0 label Apr 11, 2022

mmccool removed the defer to Discovery 2.0 label Mar 4, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Discovery makes implementation assumptions for listing TDs via pagination parameters #278

Discovery makes implementation assumptions for listing TDs via pagination parameters #278

thjaeckle commented Feb 24, 2022

mmccool commented Apr 11, 2022

mmccool commented Aug 23, 2022 •

edited

Loading

farshidtz commented Sep 5, 2022

Discovery makes implementation assumptions for listing TDs via pagination parameters #278

Discovery makes implementation assumptions for listing TDs via pagination parameters #278

Comments

thjaeckle commented Feb 24, 2022

mmccool commented Apr 11, 2022

mmccool commented Aug 23, 2022 • edited Loading

farshidtz commented Sep 5, 2022

mmccool commented Aug 23, 2022 •

edited

Loading