Releases: chr1st1ank/narrow-down
Release v1.1.0
Changed
- Remove dependency on the protobuf package by using a Rust implementation for serialization
Fixed
- Tests were failing because of a breaking change in Nox
Release v1.0.1
Changed
- Update pre-commit hooks and dependencies
- Allow protobuf 4 as well
Release v1.0.0
Added
- The storage backends now have a method query_documents() to leverage economies of scale when
querying multiple documents at once.
Changed
- The public interface of the library is declared stable, hence it is ready for version 1.0.
- char_ngrams() is now fully implemented in Rust, giving a speedup of 2x.
- minhash LSH uses the new query_documents() of the storage backends instead of running concurrent
queries.
Fixed
- Wrong operator precedence in the minhash implementation, which led to incorrect results.
- Incorrect parsing of tokenize argument for SimilarityStore for char_ngrams without padding.
Release v0.10.0
Changed
- Improved performance of SimilarityStore.query_top_n()
Release v0.9.3
Fixed
- Fixes #63 which led to Exceptions in case of empty documents.
Release v0.9.2
Fixed
- Fixes #62 which led to TypeErrors in case of multiple identical results.
Release v0.9.1
Changed
- Minimum number of hash permutations for Minhash LSH set to 16 to avoid artifacts as described
in #61.
Release v0.9.0
Added
- ScyllaDBStore now accepts a table_prefix setting.
Changed
- The classes in narrow_down.data_types were moved to narrow_down.storage.
- The initialize() method of the storage backends can now be called multiple times without issues.
Fixed
- A use of collections.Counter as typehint broke mypy checks.
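
The initialize() change listed under Changed for this release describes an idempotent setup step. As a rough illustration only, the sketch below shows one common way to get that behaviour with the standard-library sqlite3 module. ExampleBackend, its settings table, and the synchronous initialize() are hypothetical and are not taken from the narrow_down storage backends.

```python
import sqlite3


class ExampleBackend:
    """Hypothetical backend, not one of the narrow_down storage classes."""

    def __init__(self, path=":memory:"):
        self.connection = sqlite3.connect(path)

    def initialize(self):
        # IF NOT EXISTS turns a repeated call into a no-op instead of an error.
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS settings (key TEXT PRIMARY KEY, value TEXT)"
        )
        self.connection.commit()


backend = ExampleBackend()
backend.initialize()
backend.initialize()  # A second call is safe.

# List the tables that exist after both calls.
tables = backend.connection.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
```

Without the IF NOT EXISTS clause, the second call would raise sqlite3.OperationalError because the table already exists.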
Release v0.8.0
Added
- Direct InMemoryStore file serialization in the Rust backend.
This avoids a memory peak and also improves the performance of the operation compared to
(de-)serialization via the detour of a Python bytes object.
Release v0.7.0
Added
- InMemoryStore can be serialized to and deserialized from MessagePack.
- SimilarityStore.top_n_query() now allows finding a limited number of the most similar documents.
- SimilarityStore offers the option to validate the similarity score if the document is available
to avoid false positives.
Changed
- SimilarityStore objects can now be created by a factory coroutine create() instead of
first calling __init__() and then initialize(). This makes the usage of the class more
straightforward.
- The exact_part of a document is now also stored in storage level "Document".
- The InMemoryStore no longer uses Python dictionaries as storage, but rather a class in the Rust
extension, which substantially reduces the memory footprint.
Fixed
- The number of partitions is now stored in the database for the SQLite backend. This way the DB
is self-contained and the user doesn't have to keep the number elsewhere.
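
The factory coroutine mentioned under Changed for v0.7.0 follows a general asyncio pattern: because __init__() cannot be awaited, an async classmethod builds the object and runs the asynchronous setup in one call. The sketch below is a generic illustration of that pattern. ExampleStore and its ready attribute are hypothetical stand-ins, and the real SimilarityStore.create() takes arguments that are not shown here.

```python
import asyncio


class ExampleStore:
    """Hypothetical stand-in, not the narrow_down SimilarityStore."""

    def __init__(self):
        # Only cheap, synchronous setup belongs here.
        self.ready = False

    async def initialize(self):
        # Placeholder for asynchronous setup such as preparing a database.
        await asyncio.sleep(0)
        self.ready = True

    @classmethod
    async def create(cls):
        # Factory coroutine: construct and initialize in one awaited call.
        store = cls()
        await store.initialize()
        return store


async def main():
    store = await ExampleStore.create()
    return store.ready


result = asyncio.run(main())
```

With this pattern a caller cannot forget the initialization step, which was possible with the two-step construction.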