[Crash] use TiDB Vector Store in LlamaIndex, tidb_vector throw exception #58

Open
wph95 opened this issue Aug 21, 2024 · 8 comments

@wph95

wph95 commented Aug 21, 2024

Exception ignored in: <function TiDBVectorClient.__del__ at 0x1693e63e0>
Traceback (most recent call last):
  File "/Users/justwph/Library/Caches/pypoetry/virtualenvs/concave-KQBUMHDg-py3.11/lib/python3.11/site-packages/tidb_vector/integrations/vector_client.py", line 177, in __del__
AttributeError: 'NoneType' object has no attribute 'Connection'

Information that might help:

My TiDB connection URL:

import os

TIDB_PASSWORD = os.environ.get("TIDB_PASSWORD", "")
TIDB_USER = os.environ.get("TIDB_USER", "")

TIDB_DATABASE_URL = f"mysql+pymysql://{TIDB_USER}.root:{TIDB_PASSWORD}@gateway01.us-west-2.prod.aws.tidbcloud.com:4000/test?ssl_ca=/etc/ssl/cert.pem&ssl_verify_cert=false&ssl_verify_identity=false"

@wph95
Author

wph95 commented Aug 22, 2024

I applied a simple local fix for this issue using Concave (AI autonomous software engineering):
https://github.com/concave-ai/fleet/blob/main/test/logs/tidb_vector_python_58/agents/IssueAnalysis/2024-08-21T18%3A09%3A41.864974.json#L13

For your reference.

Option 1:

    def __del__(self) -> None:
        # Close the bind only if it exists and exposes a callable close().
        if self._bind is not None and callable(getattr(self._bind, "close", None)):
            self._bind.close()

Option 2:

from sqlalchemy.engine import Connection
...
    def __del__(self) -> None:
        # Close the bind only if it is a SQLAlchemy Connection.
        if self._bind is not None and isinstance(self._bind, Connection):
            self._bind.close()

@wph95 wph95 changed the title [Crash] use TiDB Vector Store, tidb_vector throw exception [Crash] use TiDB Vector Store in LlamaIndex, tidb_vector throw exception Aug 22, 2024
@wph95
Author

wph95 commented Aug 22, 2024

The same problem was reported in #51 (comment).

@IANTHEREAL
Collaborator

@wph95 Will upgrading tidb-vector to version 0.0.11 resolve this error?

@wph95
Author

wph95 commented Aug 23, 2024

> @wph95 Will upgrading tidb-vector to version 0.0.11 resolve this error?

No, I am already using tidb-vector 0.0.11, and it still raises this bug.

@IANTHEREAL
Collaborator

IANTHEREAL commented Aug 23, 2024

@wph95 It's weird. In version 0.0.11, we check that self._bind is not None (also in https://github.com/pingcap/tidb-vector-python/pull/59/files).

I tried to reproduce the error but had no luck. Can you provide your script and environment info to help me reproduce it?
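
A minimal sketch of one way to gather that environment info, using only the standard library. The package list is an assumption based on the names mentioned in this thread, and `collect_env_info` is a hypothetical helper, not part of tidb-vector:

```python
import platform
import sys
from importlib import metadata

# Package names taken from this thread; adjust the list as needed.
PACKAGES = [
    "tidb-vector",
    "llama-index",
    "llama-index-vector-stores-tidbvector",
    "llama-index-embeddings-voyageai",
    "SQLAlchemy",
]


def collect_env_info(packages=PACKAGES):
    """Return the interpreter version, platform, and installed package versions."""
    info = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_env_info().items():
        print(f"{key}: {value}")
```

Pasting the printed lines into the issue alongside the reproduction script should be enough to compare environments.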

@wph95
Author

wph95 commented Aug 24, 2024

https://github.com/concave-ai/tidb_vector_client_bug_reproduction

I have written an MVP that consistently reproduces the bug (based on version 0.0.11
https://github.com/concave-ai/tidb_vector_client_bug_reproduction/blob/main/poetry.lock#L2004).

[tool.poetry.dependencies]
python = "^3.11"
llama-index-embeddings-voyageai = "^0.1.4"
llama-index = "^0.10.65"
llama-index-vector-stores-tidbvector = "^0.1.2"

This issue seems to occur in all versions of LlamaIndex 0.10.x. However, upgrading to 0.11.x resolves the problem.

Please note, the crash is not due to self._bind being None.

The actual failure occurs in __del__: by the time it runs, sqlalchemy.engine is None, so the attribute lookup sqlalchemy.engine.Connection raises the AttributeError.

This also implies that #59 will likely hit the same problem, since the problem persists with LlamaIndex 0.10.x.
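
The failure mode described in this comment can be simulated without shutting down the interpreter. The sketch below is an illustration only: `fake_sqlalchemy` is a hypothetical stand-in module, and the `isinstance` check is an assumption about the rough shape of the code in `__del__`, not a copy of it.

```python
import types

# Hypothetical stand-in for the real package.
fake_sqlalchemy = types.ModuleType("fake_sqlalchemy")
# Mimic the submodule reference having been cleared during shutdown.
fake_sqlalchemy.engine = None


class Client:
    def __init__(self, bind):
        self._bind = bind

    def cleanup(self):
        # Assumed shape of the check: the attribute lookup on the module
        # happens before isinstance() is ever called.
        if isinstance(self._bind, fake_sqlalchemy.engine.Connection):
            self._bind.close()


def simulate():
    """Run cleanup with a non-None bind and return the error message, if any."""
    try:
        Client(bind=object()).cleanup()
    except AttributeError as exc:
        return str(exc)
    return None


message = simulate()
print(message)
```

Note that `self._bind` is not None here, yet the lookup still fails, which matches the point that guarding on `self._bind` alone does not help.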

@IANTHEREAL
Collaborator

Thanks a lot.

This issue seems to be related to garbage collection. For example, when I try to print self._bind after the error is raised, another error occurs: by the time the program executes the line at https://github.com/zzzeek/sqlalchemy/blob/main/lib/sqlalchemy/engine/url.py#L628, the quote function has already been destroyed (set to None), which leads to errors such as TypeError: 'NoneType' object is not callable. Related error logs include:

  • ImportError: sys.meta_path is None, Python is likely shutting down
  • UnboundLocalError: cannot access local variable 'quote' where it is not associated with a value

The reason is that during interpreter shutdown, Python's cleanup mechanism starts unloading modules and global variables. If an object's __del__ method relies on those modules or functions (such as quote) after the interpreter has already cleaned them up, calling them produces these errors. That is why quote is no longer callable and the exception is raised.
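
One standard-library alternative to `__del__` that avoids this ordering problem is `weakref.finalize`: by default its callbacks that are still pending at exit are run through `atexit`, before module globals are torn down. The sketch below is an assumption about how it could be applied, not the library's implementation; `FakeConnection` and `Client` are hypothetical names.

```python
import weakref


class FakeConnection:
    """Hypothetical stand-in for a database connection."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class Client:
    def __init__(self, bind):
        self._bind = bind
        # The callback references bind.close, not self, so it does not
        # keep the Client alive and needs no module-level names later.
        self._finalizer = weakref.finalize(self, bind.close)

    def close(self):
        # Safe to call repeatedly; the callback runs at most once.
        self._finalizer()


conn = FakeConnection()
client = Client(conn)
client.close()
print(conn.closed)
```

The finalizer also fires when the client is garbage collected, so callers who forget to call `close()` still get cleanup.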

After upgrading LlamaIndex to version 0.11.1, the issue no longer occurs. That is puzzling, though, because I couldn't find any related change in LlamaIndex.

Given this situation, I plan to remove the __del__ method to avoid conflicts during cleanup.
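
With `__del__` removed, cleanup becomes the caller's responsibility. A minimal sketch of what explicit cleanup could look like, assuming a `close()` method plus context-manager support; `FakeConnection` and `Client` are hypothetical names, and this is not the actual tidb-vector API:

```python
class FakeConnection:
    """Hypothetical stand-in for a database connection."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class Client:
    def __init__(self, bind):
        self._bind = bind

    def close(self):
        # Idempotent: drop the reference after closing it once.
        if self._bind is not None:
            self._bind.close()
            self._bind = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions from the with-body


conn = FakeConnection()
with Client(conn) as client:
    pass  # use the client here
print(conn.closed)
```

Because `close()` runs while the program is still executing normally, it never races with interpreter shutdown.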

@IANTHEREAL
Collaborator

@wph95 We released 0.0.12. Would you like to give it a try?
