Added notebook to showcase quantization of Sentence Transformers model #955
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thanks @AlexKoff88 this is a great example.
```
],
"source": [
    "# FP32 baseline model\n",
    "!benchmark_app -m all-MiniLM-L6-v2/openvino_model.xml -shape \"input_ids[1,384],attention_mask[1,384],token_type_ids[1,384]\" -api sync -niter 200"
]
```
This reshapes the model to static shapes, which gives a large speedup, especially for INT8. But most people will not use static shapes in practice, and padding/truncating to 384 is not always desired. IMO it is fairer to compare performance by looping over a dataset (e.g. modifying the evaluate function to add timings), but then the performance difference is not as large. If we keep benchmark_app, it would be good to at least explain the static shapes. (Using data_shape instead of shape in benchmark_app does not reshape the model, but you would still use the same sequence length everywhere, so it is still not a standard use case.)
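The "loop over a dataset and add timings" alternative could be sketched roughly like this. This is a minimal illustration, not code from the notebook: `run_inference` is a hypothetical stand-in for the model forward pass, and the timing helper is an assumption about how one might extend the evaluate function.

```python
import time

def run_inference(sample):
    # Hypothetical stand-in for the model forward pass; illustrative only.
    return sum(ord(c) for c in sample) % 97

def evaluate_with_timings(dataset):
    """Run inference over a dataset, collecting per-sample latency."""
    latencies = []
    for sample in dataset:
        start = time.perf_counter()
        run_inference(sample)
        latencies.append(time.perf_counter() - start)
    latencies.sort()
    return {
        "samples": len(latencies),
        "median_ms": 1000 * latencies[len(latencies) // 2],
        "total_s": sum(latencies),
    }

stats = evaluate_with_timings(["short text", "a somewhat longer input text"])
```

Because each sample keeps its natural length, this measures the dynamic-shape case that most users would actually hit, at the cost of less controlled conditions than benchmark_app.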
Thanks, Helena. I would disagree in this specific case: the tokenizer truncates the data anyway, so the model effectively runs with a static shape. But I can add information about it.
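The point about the tokenizer can be illustrated with a small sketch. The `pad_or_truncate` helper and `MAX_LEN` below are hypothetical, not from the notebook; they just mimic what a tokenizer does with `padding="max_length"` and `truncation=True`: every sequence comes out with the same length, so the model only ever sees one input shape.

```python
MAX_LEN = 384  # hypothetical, matching the [1,384] shape passed to benchmark_app
PAD_ID = 0

def pad_or_truncate(token_ids, max_len=MAX_LEN, pad_id=PAD_ID):
    """Mimic tokenizer behavior with padding to max_length and truncation."""
    token_ids = token_ids[:max_len]                     # truncate long inputs
    return token_ids + [pad_id] * (max_len - len(token_ids))  # pad short ones

short = pad_or_truncate([101, 2023, 102])   # a short sequence gets padded
long = pad_or_truncate(list(range(1000)))   # a long sequence gets truncated
assert len(short) == len(long) == MAX_LEN   # same static shape either way
```

Under that preprocessing, reshaping the model to a static `[1, 384]` matches what the model receives at inference time anyway.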
Co-authored-by: Helena Kloosterman <helena.kloosterman@intel.com>
Why is `DATASET_NAME = "squad"` used in `dataset = datasets.load_dataset(DATASET_NAME)`?
Thanks @l-bat. Fixed.
PR is ready.
Looks great, thanks for the addition @AlexKoff88. It could also be added to https://github.com/huggingface/optimum-intel/blob/v1.20.0/notebooks/openvino/README.md
Will do in the follow-up PR.