Use staged builds to minimize final image sizes #1031

eero-t · 2024-10-25T17:33:05Z

Description

Use staged (multi-stage) image builds so that the final images do not contain redundant things such as:

- Git tool and its deps (e.g. Perl)
- Git repo history
- Test directories

And drop the following from the Dockerfiles:

- Explicit installation of langchain_core: GenAIComps installs langchain, which already depends on it
- Explicit installation of jemalloc & GLX: nothing uses them (in any of the ChatQnA services), and for testing[1] it's trivial to create a separate image that adds them on top
- File descriptor limit increase written to ~/.bashrc: these images run Python programs directly, not through Bash scripts, so ~/.bashrc is never read

=> This demonstrates that only 2-3 lines in each Dockerfile are unique, and everything preceding those lines could be replaced with a common base image.
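To make the staged-build idea concrete, here is a minimal sketch of what such a Dockerfile could look like. It is not the exact content of this PR: the stage names, the copied paths (`comps/`, `requirements.txt`) and the `chatqna.py` entry point are assumptions made for illustration.

```dockerfile
# Sketch only: stage names, copied paths and the entry point are illustrative assumptions.

# Stage 1: common setup shared by the later stages
FROM python:3.11-slim AS base

RUN useradd -m -s /bin/bash user
WORKDIR /home/user

# Stage 2: fetch GenAIComps sources; Git and its deps (e.g. Perl) stay in this stage only
FROM base AS git

RUN apt-get update && \
    apt-get install -y --no-install-recommends git && \
    rm -rf /var/lib/apt/lists/*
RUN git clone --depth 1 https://github.com/opea-project/GenAIComps.git

# Stage 3: final image, built without Git, repo history or test directories
FROM base AS final

# Copy only what is needed at run time (assumed repo layout)
COPY --from=git /home/user/GenAIComps/comps /home/user/GenAIComps/comps
COPY --from=git /home/user/GenAIComps/requirements.txt /home/user/GenAIComps/

RUN pip install --no-cache-dir --upgrade pip && \
    pip install --no-cache-dir -r /home/user/GenAIComps/requirements.txt

ENV PYTHONPATH=/home/user/GenAIComps

# The 2-3 lines that are unique to each application
COPY ./chatqna.py /home/user/chatqna.py
USER user
ENTRYPOINT ["python", "chatqna.py"]
```

Only the last stage ends up in the tagged image, so anything installed or cloned in the `git` stage is discarded unless it is explicitly copied with `COPY --from=git`.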

[1] I assume those files were there to test this: https://pytorch.org/tutorials/recipes/recipes/tuning_guide.html#switch-memory-allocator
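A separate test image that adds the allocator on top of an application image could be sketched as below. The image name `opea/chatqna:latest`, the `user` account and the library path (Debian on x86_64) are assumptions, not something verified in this PR.

```dockerfile
# Sketch only: APP_IMAGE and the "user" account are assumptions about the application image.
ARG APP_IMAGE=opea/chatqna:latest
FROM ${APP_IMAGE}

# Package installation needs root
USER root
RUN apt-get update && \
    apt-get install -y --no-install-recommends libjemalloc2 && \
    rm -rf /var/lib/apt/lists/*

# Preload jemalloc for the Python process (path assumed for Debian on x86_64)
ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2

# Switch back to the unprivileged user
USER user
```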

Issues

Fixes: #225

Type of change

Others (image size improvement / Dockerfile cleanup)

Dependencies

n/a (this removes redundant Git, Perl, jemalloc, GLX dependencies from final images)

Tests

This is a draft / example for fixing #225.

I have not tested it apart from verifying that images still build.

Notes

In a proper fix, the non-unique part of the Dockerfiles would be a separate base image, generated with a Dockerfile in the GenAIComps repo, and the Dockerfiles in this repository would depend on that image instead of python-slim.

However, that requires co-operation between these two repositories (unless the components base image Dockerfile is also in this repo) and:

- CI handling this dependency, i.e. building the base image first, when relevant
- That base image being in a registry accessible when building the application images
  - E.g. in the OPEA Docker Hub project

(I.e. it needs to be done by a member of this project, I cannot do it.)
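With such a base image in place, an application Dockerfile in this repository could shrink to roughly the following. This is a sketch under stated assumptions: the image name `opea/comps-base:latest` is hypothetical, and the base image is assumed to already provide the `user` account, the GenAIComps code and `PYTHONPATH`.

```dockerfile
# Sketch only: BASE_IMAGE is a hypothetical name; it is assumed to contain
# the user account, GenAIComps and its Python dependencies.
ARG BASE_IMAGE=opea/comps-base:latest
FROM ${BASE_IMAGE}

# The only application-specific lines
COPY ./chatqna.py /home/user/chatqna.py
ENTRYPOINT ["python", "chatqna.py"]
```

Keeping the base image name in a build argument would let CI point the build at a freshly built base image when the GenAIComps side changes.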

eero-t · 2024-10-25T21:24:08Z

None of the test failures are due to my changes.

The CodeGen Gaudi test's TGI failure is due to it trying to load a HuggingFace model it has no access rights for:

Access to model meta-llama/CodeLlama-7b-hf is restricted and you are not in the authorized list.
Visit https://huggingface.co/meta-llama/CodeLlama-7b-hf to ask for access.

The CodeGen Xeon test's TGI seems to fail due to: Could not import SGMV kernel from Punica, which may be a similar issue.

The VisualQnA Gaudi & Xeon test failures are due to an NPM dependency conflict in its Node.js Svelte UI container build (whose spec is not touched by this PR).

eero-t · 2024-11-14T13:35:31Z

Rebased this example to the latest main, on the assumption that the CI issues have been fixed in the meantime.

Note: I did not update the Dockerfiles for applications that were added after this PR was created:

- GraphRAG
- EdgeCraftRAG

eero-t · 2024-11-18T11:46:59Z

Also updated the new ChatQnA/Dockerfile.wrapper to a staged build.

Rebased to the latest main, as the previously used main failed in CI.

eero-t · 2024-11-22T10:26:47Z

No idea why guardrails times out:

Waiting for deployment "chatqna-tgi" rollout to finish: 0 of 1 updated replicas are available...
deployment "chatqna-tgi-guardrails" successfully rolled out
error: deployment "chatqna-tgi" exceeded its progress deadline
+ echo 'Timeout waiting for chatqna_guardrail pod ready!'
+ exit 1
Timeout waiting for chatqna_guardrail pod ready!

And translation fails:

curl: (18) transfer closed with outstanding read data remaining
Validate Translation failure!!!

CI does not provide enough information to tell why either of these fails.

So that redundant things do not end in final image:
- Git repo history
- Test directories
- Git tool and its deps

And drop explicit installation of:
- jemalloc & GLX: nothing uses them (in ChatQnA at least), and for testing it's trivial to create image adding those on top: https://pytorch.org/tutorials/recipes/recipes/tuning_guide.html#switch-memory-allocator
- langchain_core: GenAIComps install langchain which already depends on that

This demonstrates that only 2-3 lines in the Dockerfiles are unique, and everything before those can be removed with a common base image.

Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>

eero-t · 2024-11-22T10:52:36Z

Rebased to main, and also updated the GraphRAG Dockerfile.

EdgeCraftRAG was not updated because it uses the comps-base package from pip, instead of cloning the GenAIComps repo.

eero-t · 2024-11-22T10:59:19Z

@lvliang-intel CI seems to be in a rather bad state, as CMake is segfaulting on image builds:

 [vllm build 5/7] RUN --mount=type=cache,target=/root/.cache/pip     --mount=type=cache,target=/root/.cache/ccache     --mount=type=bind,source=.git,target=.git     VLLM_TARGET_DEVICE=cpu python3 setup.py bdist_wheel &&     pip install dist/*.whl &&     rm -rf dist:
...
4.853 subprocess.CalledProcessError: Command '['cmake', '/workspace/vllm', '-G', 'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-DVLLM_TARGET_DEVICE=cpu', '-DCMAKE_C_COMPILER_LAUNCHER=ccache', '-DCMAKE_CXX_COMPILER_LAUNCHER=ccache', '-DCMAKE_CUDA_COMPILER_LAUNCHER=ccache', '-DCMAKE_HIP_COMPILER_LAUNCHER=ccache', '-DVLLM_PYTHON_EXECUTABLE=/usr/bin/python3', '-DVLLM_PYTHON_PATH=/workspace/vllm:/usr/lib/python310.zip:/usr/lib/python3.10:/usr/lib/python3.10/lib-dynload:/usr/local/lib/python3.10/dist-packages:/usr/lib/python3/dist-packages:/usr/local/lib/python3.10/dist-packages/setuptools/_vendor', '-DFETCHCONTENT_BASE_DIR=/workspace/vllm/.deps', '-DCMAKE_JOB_POOL_COMPILE:STRING=compile', '-DCMAKE_JOB_POOLS:STRING=compile=152']' returned non-zero exit status 1.
5.371 Segmentation fault (core dumped)

eero-t requested a review from lvliang-intel as a code owner October 25, 2024 17:33

eero-t marked this pull request as draft October 25, 2024 17:33

eero-t mentioned this pull request Oct 25, 2024

Why containers use hundreds of MBs for Vim/Perl/OpenGL? #225

Open

eero-t force-pushed the staged-images branch from 734bfd0 to 3e49050 Compare October 25, 2024 18:39

eero-t force-pushed the staged-images branch from 3e49050 to 07051a7 Compare November 14, 2024 13:30

eero-t force-pushed the staged-images branch from 07051a7 to f43ab84 Compare November 18, 2024 11:44

eero-t force-pushed the staged-images branch from f43ab84 to 95e0c76 Compare November 22, 2024 10:50

ashahba self-assigned this Nov 22, 2024
