
Extremely long delay grabbing type info for string array (and likely other types) on CockroachDB #1158

Open
JamesHutchison opened this issue Jun 5, 2024 · 13 comments

Comments

@JamesHutchison commented Jun 5, 2024

I was seeing atrocious, unacceptable delays in my production environment (CockroachDB Cloud) that I wasn't seeing locally. The cause turned out to be asyncpg's introspection of types. I'm using the latest version of the CockroachDB drivers.

[screenshot: recursion-statements]

I wrote this hack to work around the issue. It caches the result in local memory and also in Redis, so that new instances don't pay the cost either. You can change the Redis cache key via an environment variable so that new versions aren't locked to old values.

If someone wants to turn this into part of the product, please be my guest; I won't have time for it. In the meantime, here's the hack, which does the monkey patching:

import asyncio
import copy
import os
from typing import Any, Self

import asyncpg
import orjson

# NOTE: redis_client() is an application-provided factory returning an async
# Redis client (e.g. redis.asyncio.Redis); it is not defined in this snippet.

# In-process cache keyed by (host, port, database)
introspection_result_cache: dict[tuple[str, int, str], Any] = {}

orig_introspection_types = asyncpg.Connection._introspect_types

# Changing the env var invalidates previously cached values for new deployments
INTROSPECTION_KEY = os.environ.get(
    "ASYNCPG_INTROSPECTION_CACHE_KEY", "ASYNCPG_INTROSPECTION_CACHE_KEY"
)

introspection_lock = asyncio.Lock()


# Mimics the Record rows returned by Connection._introspect_types; the
# _binary_fields are decoded to str for JSON and re-encoded on load.
class FauxResult:
    _binary_fields = ("kind", "elemdelim")
    column_order = [
        "oid",
        "ns",
        "name",
        "kind",
        "basetype",
        "elemtype",
        "elemdelim",
        "range_subtype",
        "attrtypoids",
        "attrnames",
        "depth",
        "basetype_name",
        "elemtype_name",
        "range_subtype_name",
    ]

    def __init__(self, row=None, data: dict | None = None) -> None:
        if row:
            self.data = dict(row)
        else:
            assert data
            self.data = data

    def __getattr__(self, name: str) -> Any:
        return self.data[name]

    def __getitem__(self, idx_or_column_name: int | str) -> Any:
        if isinstance(idx_or_column_name, int):
            return self.data[self.column_order[idx_or_column_name]]
        return self.data[idx_or_column_name]

    def for_serialization(self) -> dict:
        result = copy.copy(self.data)
        for field in self._binary_fields:
            if (value := self.data.get(field)) is not None:
                result[field] = value.decode()
        return result

    @classmethod
    def from_serialization(cls, data: dict) -> Self:
        for field in cls._binary_fields:
            if (value := data.get(field)) is not None:
                data[field] = value.encode()
        return cls(data=data)


# Minimal stand-in for the prepared-statement state returned alongside the
# rows; only its .name survives the serialization round trip.
class FauxPreparedStatementState:
    def __init__(self, name) -> None:
        self.name = name


# Persist the introspection rows plus the prepared statement's name to Redis.
async def to_redis_cache(
    host: str, port: int, database: str, inspection_types: tuple[list, Any]
) -> None:
    pss = FauxPreparedStatementState(inspection_types[1].name)
    results = [FauxResult(row) for row in inspection_types[0]]
    await redis_client().set(
        INTROSPECTION_KEY + f"-{host}-{port}-{database}",
        orjson.dumps([[result.for_serialization() for result in results], pss.name]),
    )


# Load a previously cached introspection result from Redis, if present.
async def from_redis_cache(host: str, port: int, database: str) -> tuple[list, Any] | None:
    data = await redis_client().get(INTROSPECTION_KEY + f"-{host}-{port}-{database}")
    if data is None:
        return None
    results, pss_name = orjson.loads(data)
    pss = FauxPreparedStatementState(pss_name)
    return [FauxResult.from_serialization(row) for row in results], pss


def apply_introspection_caching():

    async def new_introspect_types(self, *args, **kwargs) -> Any:
        host: str
        port: int
        database: str
        host, port = self._addr
        database = self._params.database
        # Fast path: already cached in this process
        if (cached_val := introspection_result_cache.get((host, port, database))) is not None:
            return cached_val
        # Serialize Redis lookups so concurrent new connections don't stampede
        async with introspection_lock:
            redis_cached_value = await from_redis_cache(host, port, database)
        if redis_cached_value is not None:
            introspection_result_cache[(host, port, database)] = redis_cached_value
            return redis_cached_value
        # Miss everywhere: run the real introspection, then populate both caches
        result = await orig_introspection_types(self, *args, **kwargs)
        introspection_result_cache[(host, port, database)] = result
        await to_redis_cache(host, port, database, result)
        return result

    asyncpg.Connection._introspect_types = new_introspect_types
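
To enable the hack, call apply_introspection_caching() once at startup, before any connections are opened. A minimal usage sketch (the DSN is a placeholder, and the query assumes a string array triggers introspection on CockroachDB, as in the issue title):

apply_introspection_caching()  # patch asyncpg before opening any connections

async def main() -> None:
    pool = await asyncpg.create_pool("postgresql://user:pass@my-cluster:26257/defaultdb")
    async with pool.acquire() as conn:
        # The first connection pays the introspection cost once; later
        # connections reuse the in-memory/Redis cache.
        await conn.fetch("SELECT $1::text[]", ["a", "b"])

asyncio.run(main())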

@elprans (Member) commented Jul 17, 2024

The introspection query isn't that complex and typically runs in under a millisecond on regular Postgres. This is something you might want to raise with Cockroach Labs people.
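
One way to see where the time actually goes is to compare the first statement on a fresh connection (which can trigger type introspection) against a repeat of the same statement. A rough sketch, with a placeholder DSN:

import asyncio
import time

import asyncpg

async def measure(dsn: str) -> None:
    conn = await asyncpg.connect(dsn)
    for label in ("first fetch (may include introspection)", "second fetch (codecs cached)"):
        start = time.perf_counter()
        await conn.fetch("SELECT $1::text[]", ["x"])  # string array, as in the issue title
        print(f"{label}: {time.perf_counter() - start:.3f}s")
    await conn.close()

asyncio.run(measure("postgresql://user:pass@my-cluster:26257/defaultdb"))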

@tamis-laan commented Jul 25, 2024

@elprans

I also get a 7-second delay every time a connection is opened. This is really annoying given that I use a connection pool. It only happens with CockroachDB; it doesn't happen with PostgreSQL.

On the other hand, the problem does not occur when using SQLAlchemy, so I think the problem is on the asyncpg side.

@JamesHutchison (Author)

I reached out to CockroachDB about this.

@inata4 commented Jul 25, 2024

@JamesHutchison could you please reach out to CockroachDB support via our ticketing system (https://support.cockroachlabs.com/hc/en-us)? That way we can provide you with the best support.
It looks like you emailed another organization within CRL.
Thank you!

@JamesHutchison (Author)

Created ticket

@mohan-crdb

@JamesHutchison - I can see the ticket has been created, but we are unsure which organization you are using. Could you please share the organization details in the existing ticket, or create the ticket under the correct organization?

@tamis-laan

Any new information on this? I still get delays everywhere in my app.

@JamesHutchison (Author)

They are aware of the issue:

cockroachdb/cockroach#113292

My recommendation is to use my workaround in the meantime.

@tamis-laan commented Aug 12, 2024

@JamesHutchison
I can't; I get a 7-second delay on each connection opened to CockroachDB, not just when grabbing type info.
Not sure if we have the same problem, but they are probably related.

@tamis-laan

So apparently this issue might be related to asyncpg introspection queries:
cockroachdb/cockroach#128774

_TYPEINFO_13: typing.Final = '''\
(
    SELECT
        t.oid AS oid,
        ns.nspname AS ns,
        t.typname AS name,
        t.typtype AS kind,
        (CASE WHEN t.typtype = 'd' THEN
            (WITH RECURSIVE typebases(oid, depth) AS (
                SELECT
                    t2.typbasetype AS oid,
                    0 AS depth
                FROM
                    pg_type t2
                WHERE
                    t2.oid = t.oid
                UNION ALL
                SELECT
                    t2.typbasetype AS oid,
                    tb.depth + 1 AS depth
                FROM
                    pg_type t2,
                    typebases tb
                WHERE
                    tb.oid = t2.oid
                    AND t2.typbasetype != 0
            ) SELECT oid FROM typebases ORDER BY depth DESC LIMIT 1)
            ELSE NULL
        END) AS basetype,
        t.typelem AS elemtype,
        elem_t.typdelim AS elemdelim,
        range_t.rngsubtype AS range_subtype,
        (CASE WHEN t.typtype = 'c' THEN
            (SELECT
                array_agg(ia.atttypid ORDER BY ia.attnum)
            FROM
                pg_attribute ia
                INNER JOIN pg_class c
                    ON (ia.attrelid = c.oid)
            WHERE
                ia.attnum > 0 AND NOT ia.attisdropped
                AND c.reltype = t.oid)
            ELSE NULL
        END) AS attrtypoids,
        (CASE WHEN t.typtype = 'c' THEN
            (SELECT
                array_agg(ia.attname::text ORDER BY ia.attnum)
            FROM
                pg_attribute ia
                INNER JOIN pg_class c
                    ON (ia.attrelid = c.oid)
            WHERE
                ia.attnum > 0 AND NOT ia.attisdropped
                AND c.reltype = t.oid)
            ELSE NULL
        END) AS attrnames
    FROM
        pg_catalog.pg_type AS t
        INNER JOIN pg_catalog.pg_namespace ns ON (
            ns.oid = t.typnamespace)
        LEFT JOIN pg_type elem_t ON (
            t.typlen = -1 AND
            t.typelem != 0 AND
            t.typelem = elem_t.oid
        )
        LEFT JOIN pg_range range_t ON (
            t.oid = range_t.rngtypid
        )
)
'''

Is there a way to roll back to a version without this issue until the problem is fixed?

@JamesHutchison (Author)

If you're using Redis, or really anything that can store the serialized result of the query, my hack works pretty well.

@tamis-laan commented Aug 30, 2024

@JamesHutchison
Not using Redis, I'm afraid; I hope this issue gets resolved soon. It surprises me that more people aren't hitting this issue.

@JamesHutchison (Author)

Just save it to a file or something, whatever non-DB persistence you have available.
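
For example, a file-backed variant of the Redis helpers from the hack above might look like this (hypothetical sketch: to_file_cache / from_file_cache would replace to_redis_cache / from_redis_cache in new_introspect_types, and it reuses FauxResult, FauxPreparedStatementState, INTROSPECTION_KEY, os, and orjson from that snippet):

import pathlib

CACHE_DIR = pathlib.Path(os.environ.get("ASYNCPG_INTROSPECTION_CACHE_DIR", "/tmp"))

def _cache_path(host: str, port: int, database: str) -> pathlib.Path:
    return CACHE_DIR / f"{INTROSPECTION_KEY}-{host}-{port}-{database}.json"

# Kept async so the call sites in new_introspect_types stay unchanged.
async def to_file_cache(host, port, database, inspection_types) -> None:
    # Same serialized payload the Redis version stores, written to local disk
    results = [FauxResult(row) for row in inspection_types[0]]
    payload = orjson.dumps(
        [[result.for_serialization() for result in results], inspection_types[1].name]
    )
    _cache_path(host, port, database).write_bytes(payload)

async def from_file_cache(host, port, database):
    path = _cache_path(host, port, database)
    if not path.exists():
        return None
    results, pss_name = orjson.loads(path.read_bytes())
    return (
        [FauxResult.from_serialization(row) for row in results],
        FauxPreparedStatementState(pss_name),
    )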
