instant-id: added deleting of full precision pipeline before quantization #2539

nikita-savelyevv · 2024-11-19T17:30:15Z

Changes:

- To reduce the peak memory footprint, the full precision OV pipeline is now deleted right after the calibration dataset is collected, so it no longer takes up memory during quantization.
- For the same reason, comparing the inference speed of the original and optimized pipelines is now optional (disabled by default).

After applying the changes and updating to openvino-nightly, the peak memory is observed to drop from 120 GB to 60 GB.
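
The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: `FullPrecisionPipeline`, `collect_calibration_data`, `quantize` and the `COMPARE_SPEED` flag are hypothetical stand-ins, and the real notebook works with OpenVINO models rather than a `bytearray`.

```python
import gc
import weakref


class FullPrecisionPipeline:
    """Hypothetical stand-in for the full precision OV pipeline."""

    def __init__(self):
        # Pretend this buffer is a set of large model weights held in memory.
        self.weights = bytearray(10 * 1024 * 1024)

    def __call__(self, prompt):
        return f"latents for {prompt}"


def collect_calibration_data(pipe, prompts):
    """Run the pipeline once per prompt and keep only the outputs."""
    return [pipe(prompt) for prompt in prompts]


def quantize(calibration_data):
    """Placeholder for the real quantization step."""
    return {"num_samples": len(calibration_data)}


# Hypothetical flag name: the speed comparison is opt-in because it needs
# the full precision pipeline in memory again.
COMPARE_SPEED = False

pipe = FullPrecisionPipeline()
pipe_ref = weakref.ref(pipe)  # only used to check that the object is freed

calibration_data = collect_calibration_data(pipe, ["a", "b", "c"])

# Drop the last reference and force a collection so the pipeline's memory
# is released before the memory-hungry quantization step starts.
del pipe
gc.collect()
pipeline_freed = pipe_ref() is None

quantized = quantize(calibration_data)

if COMPARE_SPEED:
    # Re-create the full precision pipeline only when explicitly requested.
    pipe = FullPrecisionPipeline()
```

The important detail is that no other variable still references the pipeline or its submodels when `del` runs; otherwise the memory stays allocated and `gc.collect()` has nothing to free.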

Related ticket: 146016

review-notebook-app · 2024-11-19T17:30:20Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

nikita-savelyevv · 2024-11-19T17:33:27Z

@l-bat, it may be beneficial to consider logic like this when implementing quantization of large pipelines in the future.

eaidova · 2024-11-20T05:22:46Z

@nikita-savelyevv please fix code formatting

notebooks/instant-id/instant-id.ipynb

Add deleting of full precision pipeline before quantization

32cae92

Spellchecker

2c603c6

eaidova approved these changes Nov 20, 2024

Black

0da3259

eaidova reviewed Nov 20, 2024

notebooks/instant-id/instant-id.ipynb

nikita-savelyevv added 2 commits November 20, 2024 10:05

Added path arguments to create_ov_pipe function

998ffdc

Fix to_quantize default value

0e7dd3c

eaidova merged commit 8d79475 into openvinotoolkit:latest Nov 20, 2024
16 checks passed