Releases: chr1st1ank/narrow-down

Release v1.1.0

01 May 10:58

Changed

  • Remove dependency on the protobuf package by using a Rust implementation for serialization

Fixed

  • Tests were failing because of a breaking change in Nox

Release v1.0.1

30 Apr 14:31

Changed

  • Update pre-commit hooks and dependencies
  • Also allow the use of protobuf 4

Release v1.0.0

17 May 07:49

Added

  • The storage backends now have a method query_documents() to leverage economies of scale when
    querying multiple documents at once.

Changed

  • The public interface of the library is declared stable, hence it is ready for version 1.0.
  • char_ngrams() is now fully implemented in Rust, giving a speedup of 2x.
  • minhash LSH uses the new query_documents() of the storage backends instead of running concurrent
    queries.

Fixed

  • Wrong operator precedence in the minhash implementation, which led to incorrect results.
  • Incorrect parsing of tokenize argument for SimilarityStore for char_ngrams without padding.
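
For readers unfamiliar with the term, the difference between character n-grams with and without padding can be sketched as follows. The function name, its parameters, and the padding character are hypothetical and chosen for illustration only; this is not the library's Rust implementation of char_ngrams().

```python
def char_ngrams_sketch(text, n, pad_char=None):
    """Return the set of character n-grams of text (illustrative only)."""
    if pad_char is not None:
        # Pad both ends with n - 1 padding characters (assumed convention).
        pad = pad_char * (n - 1)
        text = pad + text + pad
    # Slide a window of width n over the (possibly padded) text.
    return {text[i:i + n] for i in range(len(text) - n + 1)}


print(sorted(char_ngrams_sketch("abcd", 3)))
print(sorted(char_ngrams_sketch("ab", 3, pad_char="$")))
```

Without padding, a text shorter than n produces no n-grams at all in this sketch, which is one reason a padded variant can be useful for short documents.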

Release v0.10.0

08 May 18:06

Changed

  • Improved performance of SimilarityStore.query_top_n()

Release v0.9.3

04 Apr 23:54

Fixed

  • Fixes #63 which led to Exceptions in case of empty documents.

Release v0.9.2

29 Mar 11:59

Fixed

  • Fixes #62 which led to TypeErrors in case of multiple identical results.

Release v0.9.1

25 Mar 08:54

Changed

  • Minimum number of hash permutations for Minhash LSH set to 16 to avoid artifacts as described
    in #61.

Release v0.9.0

13 Mar 19:59

Added

  • ScyllaDBStore now accepts a table_prefix setting.

Changed

  • The classes in narrow_down.data_types were moved to narrow_down.storage.
  • The initialize() method of the storage backends can now be called multiple times without issues.

Fixed

  • A use of collections.Counter as typehint broke mypy checks.

Release v0.8.0

23 Feb 22:31

Added

  • Direct InMemoryStore file serialization in the Rust backend.
    This avoids a memory peak and also improves the performance of the operation compared to
    (de-)serialization via the detour of a Python bytes object.

Release v0.7.0

06 Feb 01:29

Added

  • InMemoryStore can be serialized to and deserialized from MessagePack.
  • SimilarityStore.top_n_query() now allows finding a limited number of the most similar documents.
  • SimilarityStore offers the option to validate the similarity score if the document is available
    to avoid false positives.

Changed

  • SimilarityStore objects can now be created by a factory coroutine create() instead of
    first calling __init__() and then initialize(). This makes the usage of the class more
    straightforward.
  • The exact_part of a document is now also stored in storage level "Document".
  • The InMemoryStore no longer uses Python dictionaries as storage, but rather a class in the Rust
    extension, which considerably reduces the memory footprint.
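
The factory coroutine pattern mentioned above can be sketched in general terms as follows. ExampleStore is a hypothetical stand-in and not the actual SimilarityStore; only the method names create() and initialize() are taken from the entry above, and their real signatures may differ.

```python
import asyncio


class ExampleStore:
    """Hypothetical stand-in, not the narrow_down SimilarityStore."""

    def __init__(self):
        self.initialized = False

    async def initialize(self):
        # Placeholder for asynchronous setup work, e.g. preparing a storage backend.
        await asyncio.sleep(0)
        self.initialized = True

    @classmethod
    async def create(cls):
        # Factory coroutine: construct and initialize in a single awaitable call.
        store = cls()
        await store.initialize()
        return store


store = asyncio.run(ExampleStore.create())
```

Because __init__() cannot be awaited, a factory coroutine of this kind is a common way to guarantee that callers never hold a half-initialized object.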

Fixed

  • The number of partitions is now stored in the database for the SQLite backend. This way the DB
    is self-contained and the user doesn't have to keep the number elsewhere.