Add table_kwargs context manager to make pandas/Dask support CrateDB's special SQL DDL options #139

Merged
amotl merged 1 commit into main from amo/support-table-kwargs
Jun 24, 2024

Conversation

amotl
Member

@amotl amotl commented Jun 23, 2024

Problem

When SQLAlchemy orchestration is implemented inside a framework, for example in pandas' SQLTable._create_table_setup (see footnote 1), there is no easy way to forward SQLAlchemy dialect options at table creation time.

Idea

Unlock SQLAlchemy ORM's __table_args__ on the pandas/Dask to_sql() interface, in order to support CrateDB's special SQL DDL options.

Solution

To make the generated SQL DDL statement honor database-specific dialect options, the only way to work around this limitation is to monkey-patch the call to sa.Table() at runtime. The patch relays additional dialect options through keyword arguments in their original <dialect>_<kwarg> format (see footnote 2).

Synopsis

A context manager invocation such as with table_kwargs(crate_partitioned_by="time"): renders a PARTITIONED BY ("time") SQL clause without touching the call site of sa.Table(...).

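import dask.dataframe as dd
from sqlalchemy_cratedb.support import table_kwargs

# Load data into database, using Dask.
# `df`, `npartitions`, `tablename`, `dburi`, `index`, `chunksize`,
# `if_exists`, and `method` are expected to be defined by the caller.
ddf = dd.from_pandas(df, npartitions=npartitions)
with table_kwargs(crate_partitioned_by="time"):
    ddf.to_sql(
        tablename,
        uri=dburi,
        index=index,
        chunksize=chunksize,
        if_exists=if_exists,
        method=method,
        parallel=True,
    )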
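
The keyword name used above follows the <dialect>_<kwarg> convention: the part before the first underscore names the dialect, and the rest names the option. The following sketch shows that decomposition. The helper `split_dialect_kwarg` is a hypothetical name for illustration, and the regular expression is an assumption about how such names can be split, not a quotation of SQLAlchemy's code.

```python
import re


def split_dialect_kwarg(name):
    """Split a `<dialect>_<kwarg>` name at its first underscore."""
    # The non-greedy first group stops at the first underscore.
    match = re.match(r"^(.+?)_(.+)$", name)
    if match is None:
        raise ValueError(f"Not in <dialect>_<kwarg> format: {name!r}")
    return match.group(1), match.group(2)


dialect, option = split_dialect_kwarg("crate_partitioned_by")
```

Here, `dialect` is "crate" and `option` is "partitioned_by", which matches the PARTITIONED BY clause described in the synopsis.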

Documentation

Preview: https://sqlalchemy-cratedb--139.org.readthedocs.build/support.html#context-manager-table-kwargs

References

Backlog

  • Software tests.
  • Documentation.

Footnotes

  1. https://github.com/pandas-dev/pandas/blob/v2.2.2/pandas/io/sql.py#L1282-L1285

  2. https://docs.sqlalchemy.org/en/20/core/foundation.html#sqlalchemy.sql.base.DialectKWArgs.dialect_kwargs

@amotl amotl force-pushed the amo/polyfill branch 2 times, most recently from 11076e8 to 5e39bbf Compare June 23, 2024 23:08
@amotl amotl force-pushed the amo/support-table-kwargs branch 2 times, most recently from b9af9da to 7c9bd96 Compare June 23, 2024 23:59
@amotl amotl changed the title from "Add table_kwargs context manager to make pandas/Dask support dialect" to "Add table_kwargs context manager to make pandas/Dask support CrateDB's special SQL DDL options" Jun 24, 2024
@amotl amotl requested review from seut, matriv and surister June 24, 2024 00:07
@amotl amotl force-pushed the amo/support-table-kwargs branch 3 times, most recently from 96d10a3 to a8a61b9 Compare June 24, 2024 00:40
@amotl amotl marked this pull request as ready for review June 24, 2024 00:42
Contributor

@matriv matriv left a comment

Left some comments, thx!

@amotl amotl requested a review from matriv June 24, 2024 08:09
Contributor

@matriv matriv left a comment

LGTM, thx!

@amotl amotl force-pushed the amo/polyfill branch 3 times, most recently from ad9b870 to a2898da Compare June 24, 2024 14:15
Base automatically changed from amo/polyfill to main June 24, 2024 14:27
Unlock SQLAlchemy ORM's `__table_args__` on the pandas/Dask `to_sql()`
interface, in order to support CrateDB's special SQL DDL options.

Co-authored-by: Marios Trivyzas <5058131+matriv@users.noreply.github.com>
@amotl amotl merged commit 67b2e32 into main Jun 24, 2024
27 checks passed
@amotl amotl deleted the amo/support-table-kwargs branch June 24, 2024 14:33