feat: allow spread operators in to-many relationships #3640

Open
wants to merge 1 commit into
base: main
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -12,6 +12,8 @@ This project adheres to [Semantic Versioning](http://semver.org/).
- #2858, Performance improvements when calling RPCs via GET using indexes in more cases - @wolfgangwalther
- #3560, Log resolved host in "Listening on ..." messages - @develop7
- #3727, Log maximum pool size - @steve-chavez
- #3041, Allow spreading one-to-many and many-to-many embedded resources - @laurenceisla
+ The selected columns in the embedded resources are aggregated into arrays

### Fixed

46 changes: 39 additions & 7 deletions docs/references/api/aggregate_functions.rst
@@ -180,6 +180,8 @@ You will then get the summed amount, along with the embedded customer resource:
.. note::
The previous example uses a has-one association to demonstrate this functionality, but you may also use has-many associations as grouping columns, although there are few obvious use cases for this.

.. _aggregate_functions_embed_context:

Using Aggregate Functions Within the Context of an Embedded Resource
--------------------------------------------------------------------

@@ -228,13 +230,13 @@ Continuing with the example relationship between ``orders`` and ``customers`` fr

In this example, the ``amount`` column is summed and grouped by the ``order_date`` *within* the context of the embedded resource. That is, the ``name``, ``city``, and ``state`` from the ``customers`` table have no bearing on the aggregation performed in the context of the ``orders`` association; instead, each aggregation can be seen as being performed independently on just the orders belonging to a particular customer, using only the data from the embedded resource for both grouping and aggregation.

-Using Columns from a Spreaded Resource
---------------------------------------
+Using Columns from a To-One Spreaded Resource
+---------------------------------------------

-When you :ref:`spread an embedded resource <spread_embed>`, the columns from the spreaded resource are treated as if they were columns of the top-level resource, both when using them as grouping columns and when applying aggregate functions to them.
+When you :ref:`spread a to-one embedded resource <spread_embed>`, the columns from the spreaded resource are treated as if they were columns of the top-level resource, both when using them as grouping columns and when applying aggregate functions to them.

-Grouping with Columns from a Spreaded Resource
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Grouping with Columns from a To-One Spreaded Resource
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For instance, assume you want to sum the ``amount`` column from the ``orders`` table, using the ``city`` and ``state`` columns from the ``customers`` table as grouping columns. To achieve this, you may select these two columns from the ``customers`` table and spread them; they will then be used as grouping columns:

@@ -259,8 +261,8 @@ The result will be the same as if ``city`` and ``state`` were columns from the `
}
]

-Aggregate Functions with Columns from a Spreaded Resource
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Aggregate Functions with Columns from a To-One Spreaded Resource
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Now imagine that the ``customers`` table has a ``joined_date`` column that represents the date that the customer joined. You want to get both the most recent and the oldest ``joined_date`` for customers that placed an order on every distinct order date. This can be expressed as follows:

@@ -286,3 +288,33 @@ The result will be the same as if the aggregations were applied to columns from
"min": "2016-02-11"
}
]

Using Aggregates in a To-Many Spread Resource
---------------------------------------------

Unlike the to-one spreads, the columns inside a :ref:`to-many spread relationship <spread_to_many_embed>` are not treated as if they were part of the top-level resource.
The aggregates will be done :ref:`in the context of the to-many spread resource <aggregate_functions_embed_context>`.
For example:

.. code-block:: bash

curl "http://localhost:3000/customers?select=name,city,state,...orders(amount.sum(),order_date)"

.. code-block:: json

[
{
"name": "Customer A",
"city": "New York",
"state": "NY",
"sum": [215.22, 905.73],
"order_date": ["2023-09-01", "2023-09-02"]
},
{
"name": "Customer B",
"city": "Los Angeles",
"state": "CA",
"sum": [329.71, 425.87],
"order_date": ["2023-09-01", "2023-09-03"]
}
]
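
The response shape above can be approximated with a short Python sketch. It is not PostgREST code: the rows, the individual order amounts, and the `spread_orders` helper are all invented for illustration. The sketch assumes that the aggregation runs per customer inside the embedded `orders` resource, and that each selected column is then collected into its own array.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical data; only the totals match the example response above.
customers = [
    {"id": 1, "name": "Customer A", "city": "New York", "state": "NY"},
]
orders = [
    {"customer_id": 1, "order_date": "2023-09-01", "amount": Decimal("215.22")},
    {"customer_id": 1, "order_date": "2023-09-02", "amount": Decimal("500.00")},
    {"customer_id": 1, "order_date": "2023-09-02", "amount": Decimal("405.73")},
]


def spread_orders(customer):
    """Sum amounts per order_date for one customer, then emit one array per column."""
    totals = defaultdict(Decimal)  # order_date -> summed amount
    for order in orders:
        if order["customer_id"] == customer["id"]:
            totals[order["order_date"]] += order["amount"]
    # Sorted here only to keep the sketch deterministic; the real order is unspecified.
    dates = sorted(totals)
    return {
        "name": customer["name"],
        "city": customer["city"],
        "state": customer["state"],
        "sum": [totals[d] for d in dates],
        "order_date": dates,
    }


result = [spread_orders(c) for c in customers]
```

Note that `city` and `state` play no part in the grouping: the top-level columns are copied through unchanged, and only `order_date` acts as a grouping column inside the spread.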
79 changes: 76 additions & 3 deletions docs/references/api/resource_embedding.rst
@@ -1150,7 +1150,19 @@ For example, to arrange the films in descending order using the director's last
Spread embedded resource
========================

-On many-to-one and one-to-one relationships, you can "spread" the embedded resource. That is, remove the surrounding JSON object for the embedded resource columns.
+The ``...`` operator lets you "spread" an embedded resource.
+That is, it removes the surrounding JSON object for the embedded resource columns.
+
+.. note::
+
+The spread operator ``...`` is borrowed from the Javascript `spread syntax <https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Spread_syntax>`_.
+
+.. _spread_to_one_embed:
+
+Spread One-To-One and Many-To-One relationships
+-----------------------------------------------
+
+Take the following example:

.. code-block:: bash

@@ -1196,6 +1208,67 @@ You can use this to get the columns of a join table in a many-to-many relationsh
}
]

-.. note::
+.. _spread_to_many_embed:

-The spread operator ``...`` is borrowed from the Javascript `spread syntax <https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Spread_syntax>`_.
+Spread One-To-Many and Many-To-Many relationships
+-------------------------------------------------

The spread columns in these relationships will show the data in arrays.

.. code-block:: bash

# curl -g "http://localhost:3000/directors?select=first_name,...films(film_titles:title,film_years:year)&first_name=like.Quentin*"

curl --get "http://localhost:3000/directors" \
-d "select=first_name,...films(film_titles:title,film_years:year)" \
-d "first_name=like.Quentin*"

.. code-block:: json

[
{
"first_name": "Quentin",
"film_titles": [
"Pulp Fiction",
"Reservoir Dogs"
],
"film_years": [
1994,
1992
]
}
]

Note that there is no "films" array of objects.

By default, the order of the values inside the resulting array is unspecified.
But `it is safe to assume <https://www.postgresql.org/message-id/15950.1491843689%40sss.pgh.pa.us>`_ that all the columns return the values in the same unspecified order.
From the previous result, we can say that "Pulp Fiction" premiered in 1994 and "Reservoir Dogs" in 1992.
You can still order all the resulting arrays explicitly. For example, to order by the release year:

.. code-block:: bash

# curl -g "http://localhost:3000/directors?select=first_name,...films(film_titles:title,film_years:year)&first_name=like.Quentin*&films.order=film_years"

curl --get "http://localhost:3000/directors" \
-d "select=first_name,...films(film_titles:title,film_years:year)" \
-d "first_name=like.Quentin*" \
-d "films.order=film_years"

.. code-block:: json

[
{
"first_name": "Quentin",
"film_titles": [
"Reservoir Dogs",
"Pulp Fiction"
],
"film_years": [
1992,
1994
]
}
]

Note that the field must be selected in the spread relationship for the order to work.
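
The behaviour described above can be sketched in Python. This is an illustrative model only, not the PostgREST implementation: the `spread` helper and its parameters are hypothetical. It assumes that the selected columns are first projected under their aliases, that ordering is applied to those projected rows, and that each column is then collected into its own array.

```python
# Hypothetical rows standing in for the films of one director.
films = [
    {"title": "Pulp Fiction", "year": 1994},
    {"title": "Reservoir Dogs", "year": 1992},
]


def spread(rows, select, order=None):
    """Project `select` (alias -> column), optionally order, then build one array per alias."""
    projected = [{alias: row[col] for alias, col in select.items()} for row in rows]
    if order is not None:
        # Ordering sees only the projected aliases, so an unselected field raises KeyError.
        projected.sort(key=lambda r: r[order])
    return {alias: [r[alias] for r in projected] for alias in select}


result = spread(
    films,
    {"film_titles": "title", "film_years": "year"},
    order="film_years",
)

# All arrays share the same order, so they can be zipped back into rows.
pairs = list(zip(result["film_titles"], result["film_years"]))
```

In this model, ordering by `film_years` works because that alias is part of the projection, while ordering by a name that was not projected fails.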
Comment on lines +1258 to +1274

@laurenceisla (Member, Author) commented on Nov 23, 2024:

Since the json_agg(col) aggregate is done outside of the subquery selection (to avoid cases like json_agg(sum(col))), we cannot order the json_agg by columns that are not selected in that subquery. Here's the generated query for this example:

Query

WITH pgrst_source AS
  
  -- Subquery for the current example
  (SELECT "public"."directors"."first_name",
          "directors_films_1"."film_titles",
          "directors_films_1"."film_years"
   FROM "public"."directors"
   LEFT JOIN LATERAL
     (SELECT json_agg("directors_films_1")::jsonb AS "directors_films_1",
             COALESCE(
               json_agg("directors_films_1"."film_titles" ORDER BY "directors_films_1"."film_years")
               ,'[]'
             )::jsonb AS "film_titles",
             COALESCE(
               json_agg("directors_films_1"."film_years" ORDER BY "directors_films_1"."film_years")
               ,'[]'
             )::jsonb AS "film_years"
      FROM
        (SELECT "films_1"."title" AS "film_titles",
                "films_1"."year" AS "film_years"
         FROM "public"."films" AS "films_1"
         WHERE "films_1"."director_id" = "public"."directors"."id") AS "directors_films_1") AS "directors_films_1" ON TRUE
   WHERE "public"."directors"."first_name" LIKE $1)
   --

SELECT NULL::bigint AS total_result_set,
       pg_catalog.count(_postgrest_t) AS page_total,
       coalesce(json_agg(_postgrest_t), '[]') AS body,
       nullif(current_setting('response.headers', TRUE), '') AS response_headers,
       nullif(current_setting('response.status', TRUE), '') AS response_status,
       '' AS response_inserted
FROM
  (SELECT *
   FROM pgrst_source) _postgrest_t

Maybe selecting all the columns in the non-aggregated subquery could be an alternative? (computed columns still won't work, I think).

Just noticed there's also an issue when using aliases in the columns. In the example, order=film_years (the alias) works, but order=year does not. This needs to be fixed.

@wolfgangwalther (Member) commented on Nov 23, 2024:

Any fundamental reason this can't become SELECT jsonb_agg(... ORDER BY ...) FROM public.films WHERE .., i.e. without the subquery in FROM?

Edit: Ah, this, I think:

> (to avoid cases like json_agg(col.sum()))

Not sure whether I understand that part, yet.

Member commented:

> > (to avoid cases like json_agg(col.sum()))
>
> Not sure whether I understand that part, yet.

No, I don't. I'm especially confused by the mixed syntax of SQL and PostgREST-request here. Why exactly did you decide to use the subquery?

@laurenceisla (Member, Author) commented on Nov 23, 2024:

> Edit: Ah, this, I think:

Yes. For example, ...films(years.max()), would try to do this:

SELECT json_agg(max(years)) FROM public.films WHERE ...

Which returns ERROR: calls to aggregate functions cannot be nested.

Edit: Fix syntax 🤦

Member commented:

> For example, ...films(max(years)), would try to do this:

This time the syntax was mixed again, but the other way around :D

So, I guess you mean: ...films(years.max()).

Ok, I see that now, yes. It makes sense to treat the spread as another query layer, so I guess the requirement to have the columns selected for ordering is OK.
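
The nesting problem discussed in this thread can be reproduced in a small, self-contained way. The sketch below uses SQLite through Python's `sqlite3` module as a stand-in for PostgreSQL, with `group_concat` standing in for `json_agg`. Both substitutions are assumptions made so the example runs without a server, and the table contents are invented. SQLite also rejects an aggregate nested directly inside another aggregate, and accepts the same computation once the inner aggregate is moved into a subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE films (director_id INTEGER, title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO films VALUES (?, ?, ?)",
    [(1, "Pulp Fiction", 1994), (1, "Reservoir Dogs", 1992)],
)

# Nesting one aggregate directly inside another is rejected.
try:
    conn.execute("SELECT group_concat(max(year)) FROM films").fetchall()
    nested_ok = True
except sqlite3.OperationalError:
    nested_ok = False

# Treating the inner aggregate as its own query layer works.
layered = conn.execute(
    "SELECT group_concat(max_year) FROM (SELECT max(year) AS max_year FROM films)"
).fetchone()[0]

conn.close()
```

The second query has the same layering as the generated query shown earlier in the thread: the inner `SELECT` computes the per-resource values, and the outer aggregate only collects them.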

4 changes: 0 additions & 4 deletions docs/references/errors.rst
@@ -241,10 +241,6 @@ Related to the HTTP request elements.
| | | there is no many-to-one or one-to-one relationship between |
| PGRST118 | | them. |
+---------------+-------------+-------------------------------------------------------------+
-| .. _pgrst119: | 400 | Could not use the spread operator on the related table |
-| | | because there is no many-to-one or one-to-one relationship |
-| PGRST119 | | between them. |
-+---------------+-------------+-------------------------------------------------------------+
| .. _pgrst120: | 400 | An embedded resource can only be filtered using the |
| | | ``is.null`` or ``not.is.null`` :ref:`operators <operators>`.|
| PGRST120 | | |
1 change: 0 additions & 1 deletion src/PostgREST/ApiRequest/Types.hs
@@ -86,7 +86,6 @@ data ApiRequestError
| PutLimitNotAllowedError
| QueryParamError QPError
| RelatedOrderNotToOne Text Text
-| SpreadNotToOne Text Text
| UnacceptableFilter Text
| UnacceptableSchema [Text]
| UnsupportedMethod ByteString
10 changes: 1 addition & 9 deletions src/PostgREST/Error.hs
@@ -80,7 +80,6 @@ instance PgrstError ApiRequestError where
status PutLimitNotAllowedError = HTTP.status400
status QueryParamError{} = HTTP.status400
status RelatedOrderNotToOne{} = HTTP.status400
-status SpreadNotToOne{} = HTTP.status400
status UnacceptableFilter{} = HTTP.status400
status UnacceptableSchema{} = HTTP.status406
status UnsupportedMethod{} = HTTP.status405
@@ -176,12 +175,6 @@ instance JSON.ToJSON ApiRequestError where
(Just $ JSON.String $ "'" <> origin <> "' and '" <> target <> "' do not form a many-to-one or one-to-one relationship")
Nothing

-toJSON (SpreadNotToOne origin target) = toJsonPgrstError
-ApiRequestErrorCode19
-("A spread operation on '" <> target <> "' is not possible")
-(Just $ JSON.String $ "'" <> origin <> "' and '" <> target <> "' do not form a many-to-one or one-to-one relationship")
-Nothing
-
toJSON (UnacceptableFilter target) = toJsonPgrstError
ApiRequestErrorCode20
("Bad operator on the '" <> target <> "' embedded resource")
@@ -628,7 +621,7 @@ data ErrorCode
| ApiRequestErrorCode16
| ApiRequestErrorCode17
| ApiRequestErrorCode18
-| ApiRequestErrorCode19
+-- | ApiRequestErrorCode19 -- no longer used (used to be mapped to SpreadNotToOne)
| ApiRequestErrorCode20
| ApiRequestErrorCode21
| ApiRequestErrorCode22
@@ -677,7 +670,6 @@ buildErrorCode code = case code of
ApiRequestErrorCode16 -> "PGRST116"
ApiRequestErrorCode17 -> "PGRST117"
ApiRequestErrorCode18 -> "PGRST118"
-ApiRequestErrorCode19 -> "PGRST119"
ApiRequestErrorCode20 -> "PGRST120"
ApiRequestErrorCode21 -> "PGRST121"
ApiRequestErrorCode22 -> "PGRST122"