Releases: GlobalMaksimum/sadedegel
Length Summarizer & Final Release before Due
0.11.3
- FIX: Fix a severe bug causing sadedegel summarizers not to preserve sentence order in sadedegel.server calls.
- ADD: Add Community Contributors section to reference
- ADD: Freeze dependency versions in requirements.txt
- UPDATE: If SBD (sentence boundary detection) can't find a sentence, treat the whole document as a single sentence.
- UPDATE: Reduce precision in summarizer evaluation table.
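
To illustrate the sentence order fix, here is a minimal sketch of order-preserving selection. It is not the code used in sadedegel.server; the function name `select_top_k` and its parameters are hypothetical and chosen only for illustration. The idea is to rank sentences by score, keep the best k, and then restore document order before returning them.

```python
from typing import List, Sequence


def select_top_k(sentences: Sequence[str], scores: Sequence[float], k: int) -> List[str]:
    """Return the k highest-scoring sentences in their original document order.

    Hypothetical helper for illustration; not part of the sadedegel API.
    """
    if len(sentences) != len(scores):
        raise ValueError("sentences and scores must have the same length")
    k = max(k, 0)
    # Rank sentence indices by score, highest first (ties keep the earlier sentence).
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    # Re-sort the chosen indices so the summary follows the document order.
    keep = sorted(ranked[:k])
    return [sentences[i] for i in keep]
```

Returning `ranked[:k]` directly would hand back sentences in score order, which is the kind of behaviour this release fixes.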
0.11.1
- ADD: Evaluation results of yet another baseline LengthSummarizer 👍 👍 👍
- ADD: Several less critical pending pull requests
- FIX: Documentation errors
Code Coverage & Code Organization
While we were adding lots of things, we significantly messed up
- Code organization
- Unit test coverage (dropped to 69% 👎 )
A sequence of maintenance releases on top of the base minor release 0.10 helped us climb back in coverage and tidy up (current coverage is above 90% 👍 )
A few more things
- ADD: A Notebook Explaining Basic Summarization with sadedeGel
- CHANGE: We introduced two new modules called sadedegel.bblock (sadedeGel Building Blocks) and sadedegel.ml, and moved all basic things into sadedegel.bblock. sadedegel.ml turned out to be required to prevent cyclic references.
Introduction of ML Based Unsupervised Summarizers
Another big release
- ADD: 3 summarization algorithms based on clustering of BERT embeddings
- ADD: Related capabilities are added to Doc and Sent classes to support those algorithms.
Summarizer Performance Evaluation
- ADD: info subcommand for sadedegel command. This returns the currently installed version, the highest available version and the web service version on Heroku.
- ADD: Introduction of Summarizer Performance Evaluation
- ADD: sadedegel.summarize evaluate is introduced to regenerate the results we provide.
- UPDATE: Summarizer abstract class is introduced.
Introduction to Annotated Corpus and several integrations
The major feature introduced with version 0.7 is the GA (general availability) of the human-annotated summary corpus in sadedegel.dataset
- ADD: Annotated json file corpus for validating various summarizers
- ADD: Corpus validation for annotated corpus
- UPDATE: file_paths in sadedegel.dataset now works for all corpus types.
Other changes in this release
Document Summary Service on sadedegel.server
ADD: /doc/statistics service to calculate various document metrics to be used by sadedeGel Chrome Extension
REST API & Public Heroku API @ Free-tier
- ADD: sadedegel.server is now hosted on Heroku
- UPDATE: We have significantly improved our sadedegel.server services.
- ADD: wpm (word per minute) based duration calculation to calculate total number of sentences to be filtered out by summarizer services.
- ADD: Several service routes
  - /api/info: Returning sadedegel metadata information
  - /: Redirection request to sadedegel.ai
  - UPDATED: /api/summarizer/random: New route for RandomSummarizer
  - UPDATED: /api/summarizer/firstk: New route for firstK (PositionSummarizer)
  - /api/summarizer/rouge1: First non-baseline summarizer. Rouge1Summarizer is an unsupervised summarizer using rouge1 score of sentences to obtain a score for each sentence in a document.
- UPDATE: sadedegel now supports Python 3.6+ because of fastapi dependency
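
The wpm-based duration calculation can be pictured roughly as follows. This is a sketch under stated assumptions, not the server's implementation: the function name `sentences_within_duration`, its parameters, and the greedy "stop at the first sentence that overflows the budget" rule are all hypothetical. It assumes the sentences are already in the order the summarizer wants to emit them and that a reading budget is simply `duration_minutes * wpm` words.

```python
from typing import Sequence


def sentences_within_duration(word_counts: Sequence[int], duration_minutes: float, wpm: int) -> int:
    """Return how many leading sentences fit into a reading-time budget.

    Hypothetical helper for illustration; not part of sadedegel.server.
    word_counts holds the number of words of each candidate sentence.
    """
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    if duration_minutes < 0:
        raise ValueError("duration_minutes must be non-negative")
    budget = duration_minutes * wpm  # total words readable in the given time
    used = 0
    kept = 0
    for n in word_counts:
        # Stop at the first sentence that would exceed the word budget.
        if used + n > budget:
            break
        used += n
        kept += 1
    return kept
```

The number of sentences filtered out is then `len(word_counts)` minus the returned value.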
Standard Interface for Summarizers & Extended Dataset
ADD: Github Action for master branch tests all supported Python versions.
ADD: codecov badge is added.
ADD: Features are documented.
ADD: Extended dataset (sadedegel.dataset.extended) now has more than 35K documents from various sources.
UPDATE: Extraction based summarizers now share a common prototype returning a score for each sentence in a given document.
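
A common prototype of this kind might look like the sketch below. The class and method names (`SentenceScorer`, `predict`, `FirstKScorer`, `LengthScorer`) are hypothetical and are not claimed to match sadedegel's actual classes; sentences are plain strings here for simplicity. Each scorer only has to return one score per sentence, and the shared base class turns those scores into a summary.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class SentenceScorer(ABC):
    """Hypothetical shared interface: every summarizer yields one score per sentence."""

    @abstractmethod
    def predict(self, sentences: Sequence[str]) -> List[float]:
        """Return one score per sentence; higher means more important."""

    def __call__(self, sentences: Sequence[str], k: int) -> List[str]:
        scores = self.predict(sentences)
        # Pick the k best indices, then restore document order.
        ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
        return [sentences[i] for i in sorted(ranked[:max(k, 0)])]


class FirstKScorer(SentenceScorer):
    """Position baseline: earlier sentences receive higher scores."""

    def predict(self, sentences: Sequence[str]) -> List[float]:
        n = len(sentences)
        return [float(n - i) for i in range(n)]


class LengthScorer(SentenceScorer):
    """Length baseline: score is the whitespace-separated word count."""

    def predict(self, sentences: Sequence[str]) -> List[float]:
        return [float(len(s.split())) for s in sentences]
```

With a shared `predict`, evaluation code can treat every summarizer the same way, for example `FirstKScorer()(doc, 2)` and `LengthScorer()(doc, 2)` on the same list of sentences.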
0.3.3: Added CORS to sadedegel server
- CORS added to sadedegel server
- Test cases for sadedegel server
Maintenance Release
Integrating sadedegel with Github Actions revealed several issues that we had not detected due to the lack of a CI flow
ADD: Github Actions integration.
ADD: sadedeGel HTTP server is introduced.
ADD: ROUGE1 score is added into sadedegel.metrics.
ADD: CONTRIBUTING.md, borrowed from SpaCy project, shows our guidelines for contributing to sadedeGel.
FIX: Missing development and production dependencies are added.
FIX: NLTK ML based model is downloaded before unit testing.