OPEA Release Notes v1.0

What’s New in OPEA v1.0

Highlights
- Improve the RAG performance through microservice optimizations (e.g., Hugging Face TGI, vLLM) and megaservice tuning
- Provide the experimental LLM model training support, includes full fine-tuning and parameter-efficient fine-tuning (PEFT)
- Improve RAG with Knowledge Graph based on Neo4j
- Improve VisualQnA and provide multi-modality RAG support
- Faster microservice launch through removal of some dispatch overhead
- Enable Gateway with guardrail, and integrate nginx with CORS protection and data preparation
- Enable HorizontalPodAutoscaler (HPA) for better resource management
- Define the metrics of RAG performance and enable accuracy evaluation for more GenAI examples
- Further improvement on documentation and developer experience
Other features
- Enable OpenAI compatible format on applicable microservices
- Support microservice launch from ModelScope to address China ecosystem need
- Support Red Hat OpenShift Container Platform (RHOCP)
- Refactor the code and CI/CD pipeline to provide better support for contributors
- Improve Docker versioning to avoid the potential conflict
- Enhance GenAI Microservice Connector (GMC), including improvements such as router performance optimizations and other updates
- Introduce Memory Bandwidth Exporter that integrates with Kubernetes Node Resource Interface
Learn more about OPEA at
- Getting Started: https://opea-project.github.io/latest/index.html
- Github: https://github.com/opea-project
- Docker Hub: https://hub.docker.com/u/opea
Release Documentation:
- Landing Page: https://opea.dev/
- Release Notes: https://github.com/opea-project/docs/tree/main/release_notes

Details

GenAIExamples

Deployment
- Add ui/nginx support in K8S manifest for ChatQnA/CodeGen/CodeTrans/Docsum(ba94e01)
- K8S manifest: Update ChatQnA/CodeGen/CodeTrans/DocSum(0629696)
- Update mount path in xeon k8s(2a6af64)
- Add Nginx - k8s manifest in CodeTrans(6a679ba)
- Add Nginx - docker in CodeTrans(cc84847)
- watch more docker compose files changes(4b0bc26)
- Add chatQnA UI manifest(758d236)
- Revert the LLM model for kubernetes GMS(f5f1e32)
- [ChatQnA] Update retrieval & dataprep manifests(6730b24)
- [ChatQnA]Update manifests(3563f5d)
- [ChatQnA] Update benchmarking manifests(36fb9a9)
- [ChatQnA] udate OOB & Tuned manifests(ac34860)
- Add nginx and UI to the ChatQnA manifest(05f9828)
- [ChatQnA] Update OOB with wrapper manifests.(933c3d3)
- [Translation] Support manifests and nginx(1e13031)
- update V1.0 benchmark manifest (e5affb9)
- update image name(e2a74f7)
- K8S manifest: Update ChatQnA/CodeGen/CodeTrans/DocSum(0629696)
- Change megaservice path in line with new file structure(5ab27b6)
- Add ui/nginx support in K8S manifest for ChatQnA/CodeGen/CodeTrans/Docsum(ba94e01)
- Add chatQnA UI manifest(758d236)
- Yaml: add comments to specify gaudi device ids.(63406dc)
- add tgi bf16 setup on CPU k8s.(ba17031)
Documentation
- [ChatQnA] Update README for ModelScope(aebc23f)
- Update README.md(4bd7841)
- [ChatQnA] Update README for without Rerank Pipeline(6b617d6)
- [ChatQnA] Update Benchmark README for w/o rerank(4a51874)
- Fix readme for nv gpu(43b2ae5)
- [ChatQnA] Update Benchmark README to Fix Input Length(55d287d)
- Refine ChatQnA README for TGI(afc3341)
- Add default model for VisualQnA README(07baa8f)
- Update readme for manifests of some examples(adb157f)
- doc: use markdown table in supported_examples(9cf1d88)
- doc: remove invalid code block language(c6d811a)
- add AudioQnA readme with supported model(f4f4da2)
- add more code owners(7f89797)
- doc: fix headings(7a0fca7)
- [Codegen] Refine readme to prompt users on how to change the model.(814164d)
- Update README.md and remove some open-source details(2ef83fc)
- Add issue template(84a781a)
- doc: fix headings and indenting(67394b8)
- Add default model in readme for FaqGen and DocSum(d487093)
- Change docs of kubernetes for curl commands in README(4133757)
- Update v0.9 RAG release data(947936e)
- Explain Default Model in ChatQnA and CodeTrans READMEs(2a2ff45)
- Update docker images list.(a8244c4)
- refactor the network port setting for AWS(bc81770)
- Add validate microservice details link(bd811bd)
- [ChatQnA] Add Nginx in Docker Compose and README(6c36448
- [Doc] Update CodeGen and Translation READMEs(a09395e)
- [Doc] Refine READMEs(372d78c)
- Remove marketing materials(d85ec09)
- doc PR to main instead of of v1.0r(dc94026)
- Update README.md for Multiplatforms(b205dc7)
- Refine the quick start of ChatQnA(3b70fb0)
- Update supported_examples(96d5cd9)
- [Doc] doc improvement(e0b3b57)
- Fix README issues(bceacdc)
- doc: fix broken image reference and markdown(d422929)
- doc: give document meaningful title(a3fa0d6)
- doc: fix incorrefine readme for reorg(d2bab99)
- doc: fix incorrect path to png image files (d97882e)
- update doc according to comments(f990f79)
- doc: fix headings and indenting(67394b8)
- Update README.md(4bd7841)
- refine readme for reorg(d2bab99)
- Update README with new examples(2d28beb)
- README: fix broken links(ff6f841)
- Update v0.9 RAG release data(947936e)
- Update README.md of pdf file(87e51d5)
- [ChatQnA] Update README for ModelScope(aebc23f)
- Add table to list port, endpoint, framework, model, serving, and hardware for each microservice in ChatQnA(1a934af)
- Update SearchQnA document and compose.yaml(5c67204)
- Update invalid link(7b2194f)
- AgentQnA: Fix erroneous link in the README(1144fae)
- Fix Xeon reference per its trademark(e1b8ce0)
- Provide the method to get nke-10k-2023.pdf(a2745b2)
- adopted tech writing style(558ea3b)
- Improve ChatQnA flowchat according to feedback(375ea7a)
- Fix BACKEND_SERVICE_ENDPOINT variable value in the VideoQnA instructions(79e947e)
- [Doc] Refine ChatQnA README(7eaab93)
Functionalities and Bug Fix
- Fix refactor bug(7c13f2c)
- Provide the method to get nke-10k-2023.pdf(a2745b2)
- Integrate visualQnA backend(fa12083)
- Enable nginx for VisualQnA(def19b4)
- Add Settings and Update system Prompt option(1d1e1f9)
- Refactor folder to support different vendors(d73129c)
- Add rerank finetuning example(71857f5)
- remove logs for benchmark(e0bc5f2)
- update image build for 2 new examples(0869029)
- fix comps/nginx image build content(22d066a)
- react-ui: Add support to display Chinese(8c40204)
- [VisualQnA] Update compose.yaml to fix the endpoint url issue in UI(fbaa024)
- Add megaservice definition without microservice wrappers(ebe6b47)
- Add instruction tuning example(4c78f8c)
- fix token name(1e47444)
- Modify the handling of detected warnings to only prompt.(e6f5d13)
- Always upload scan artifacts(6f3e54a)
- Update ChatQnA env (32afb65)
- Yinghu5 patch 1(beda609)
- Update ollama run command(10c81f1)
- weekly update images tag(035f39f)
- Fix port conflict in llava-tgi-service in VisualQnA(993688a)
- Remove 'vim' from all Dockerfiles(1874dfd)
- enhance image publish action(5fde666)
- Update port in set_env.sh for TGI endpoint(e5ec38c)
- move evaluation scripts(f04f061)
- Handle uncontrolled data path for MultimodalQnA v1.0 release(872e93e)
- Align parameters for "max_token, repetition_penalty,presence_penalty,frequency_penalty"(2f03a3a)
- Remove useless folder.(88829c9)
- Enable nginx for VisualQnA(def19b4)
- Refactor folder to support different vendors(d73129c)
- fix path bug for reorg(264759d)
- fix reorg bug(504228e)
- update image build for 2 new examples(0869029)
- Add megaservice definition without microservice wrappers(ebe6b47)
- Add hyperlinks picture paths validation.(0611707)
- Added gaudi example for rerank model finetuning(edcc50f)
- Add VideoRAGQnA as MMRAG usecase in Example(2dd69dc)
- Agent example for v1.0 release(262a6f6)
- Fix issues with the VisualQnA instructions (bc4bbfa)
- Made cogen react ui to use runtime environment variables(b84c989)
- add image build for new examples(3f2e7b7)
- fix image build issue on push(88fde62)
- Add Settings and Update system Prompt option(1d1e1f9)
- [ChatQnA] Add no_wrapper benchmarking and update legacy manifests(06696c8)
- ProviIntegrate visualQnA backend(fa12083)
- Integrate visualQnA backend(fa12083)
- Add imagePrompt to display default image hint(e48532e)
- BUGFIX: rename videoragqna to videoqna to align with other examples(e102291)
- Fix megaservice ulimit issue under high concurrency(4112fd0)
CI/CD/UT
- Add new test cases for VisualQnA(995a62c)
- docker image cd workflow enhance (675ea4a)
- optimize image scan cd workflow(dba908a)
- Refine code scan output and remove opea_release_data.md.(21e215c)
- Fix other repo issue.(412a0b0)
- [DocIndexRetriever] Add xeon test and fix gaudi test (62dbb6d)
- watch more docker compose files' changes(4b0bc26)
- fix typo in test script in AgentQnA(10fe3c6)
- Fix InstructionTuning and RerankFinetuning tests(be8e283)
- Fix issue(0bb0abb)
- print image build test commit(3ce3955)
- Fix SearchQnA tests bug(daf2a4f)
- [ProductivitySuite] Fix CD Issue(d55a33d)

GenAIComps

Cores
- Optimize mega flow by removing microservice wrapper(0bb69ac)
- Fix guardrails out handle logics for space linebreak and quote(e38ed6d)
- fix mismatched response format w/wo streaming guardrails(b6c0785)
Fine-tuning/Pre-training
- Added finetuned model deployment tutorial in readme(2931147)
- Add LLM pretraining support(58e9972)
- updates to containers for finetuning composite(f4d123c)
- enable embedding finetuning(7e1a2e5)
- update finetuning doc(7d2cd6b)
- Support rerank model finetuning(7d9265f)
- remove Update checkpoint format(8369fbf)
- finetuning models limitation.(a924579)
- Update checkpoint format(8369fbf)
- update upload_training_files format(3367b76)
- refine logging code.(5b3053f)
- Added finetuned model deployment tutorial in readme(2931147)
- enable embedding finetuning(7e1a2e5)
LVM/Video RAG
- Fix lvms videl-llama code issue(38abaab)
- Fix LVM streaming issue(fb4b8d2)
- Add schema to Redis initialization & Improve LVM-TGI For Multimodal Retriever Microservice(23cc3ea)
- Retriever and lvm update for multimodal rag on videos(1513998)
- BUG FIX: LVM security fix(3e548f3)
- Add Megaservice support for MMRAG VideoRAGQnA usecase(2c48bc8)
- Add local Rerank microservice for VideoRAGQnA(5fb4a38)
- Add Megaservice support for MMRAG - MultimodalRAGQnAWithVideos usecase(99be1bd)
- Bugfix for PR 496 to add format_video_name function(54aa943)
- Prediction Guard LVM component(1249c4f)
- Fix LVM streaming issue(fb4b8d2)
- Fix lvms videl-llama code issue(38abaab)
- Fix vLLM components images building(161c338)
- Add schema to Redis initialization & Improve LVM-TGI For Multimodal Retriever Microservice(23cc3ea)
LLM/Rerank/Retrieval
- fix vllm llamaindex stream bug(ca94c60)
- Support Llama index for llms native(2e41dcf)
- Prediction Guard LLM component(391c4a5)
- update vllm to latest version for hpu(599a58f)
- Align parameters for "max_token, repetition_penalty,presence_penalty,frequency_penalty"(3a31295)
- optimize rerank with backend ref(d76751a)
- add VDMS retriever microservice for v0.9 Milestone(445c9b1)
- Fix the Retriever README error(1d761fa)
- optimize rerank with backend ref(d76751a)
- unify default reranking model with BAAI/bge-reranker-base(48d4e53)
- Fix Ollama langchain upgrade issue(8adbcce)
- vllm langchain: Add Document Retriever Support(0f2c2b1)
- Support Llama index for vLLM(8e3f553)
- Changes to comps/llms/text-generation/README(18092f3)
- Fix security problem(a672569)
DataPrep/vector stores
- Fix the loading error of jsonl file(2fbce3e)
- To avoid port conflicts change port to others.(89197e5)
- Dataprep fetch page fix(01886fe)
- Multimodal dataprep(6d4b668)
- Refine Dataprep Milvus MS(7686cfa)
- dataprep: Fix issue in uploading docx with embedding image(b873cf8)
- add: Pathway vector store and retriever as LangChain component(2c2322e)
- adding lancedb to langchain vectorstores(2360e5a)
- adding dataprep support for CLIP based models for VideoRAGQnA example for v1.0(f84d91a)
- Fix the loading error of jsonl file(2fbce3e)
Other Components
- Fix intent detection code issue(4c0f527)
- clear some unnecessary scripts and Dockerfile commands.(824a7e2)
- Update CODEOWNERS(5537b7f)
- doc: fix heading levels in markdown content(a8a46bc)
- [Reorg] Reorg Folder to Support Different Vendors(bea9bb0)
- unify default reranking model with BAAI/bge-reranker-base(48d4e53)
- feedback_management: Remove 'vim' from Dockerfile(b2e64d2)
- switch to using upstream 'tgi-gaudi' on HuggingFace(90cc44f)
- Using Pip '--no-cache-dir' within all Dockerfiles(f1f866f)
- Change image tag.(2093558)
- add code owners(0379aeb)
- Remove revision for TEI Embedding(d609071)
- BUGFIX: fix SearchedMultimodalDoc in docarray(ed44b44)
- Feedback management microservice component(72123b2)
- bump version into v1.0(9a1af76)
- Add Scan Container.(0d49244)
- Remove 'vim' from all Dockerfiles(25174c0)
- update image build yaml(b541fd8)
- ollama: Update curl proxy.(f510b69)
- Embedding Runtime on NeuralSpeed(0292355)
- add microservice for intent detection(84a7e57)
- Update README.md for Multiplatforms(ef90fbb)
- doc: fix heading levels(f8f8854)
- Prediction Guard embeddings component(191061b)
- [ChatQnA] Support K8S Python Client to export ChatQnA E2E manifests(af4e0f8)
- Add Megaservice support for MMRAG VideoRAGQnA usecase(2c48bc8)
- replace langchain/langchain:latest with python:3.11-slim(6ce6551)
- Support for UI of MultimodalRAGWithVideos in GenAIExamples(7664578)
- [Reorg] Reorg Folder to Support Different Vendors(bea9bb0)
- Remove fixed version in requirements.txt(f416f84)
- Update README.md for broken/missing readme(00227b8)
- adding embedding support for CLIP based models for VideoRAGQnA example for v0.9(2a53e25)
- same PR as #694 but on main branch(4b5d85b)
- doc: Fix headings(f6ae4fa)
- Fix all the microservices which affected by langchain version upgrade(04385c9)
- update version freeze for requirements-runtime.txt(1e4c382)
- add contributing section to main readme(2ba3516)
- Update embedding svc test port number(574fecf)
- Enable GraphRAG with Neo4J(29fe569)
- Refine READMEs after reorg(7e40475)
- Support export megaservice yaml to docker compose file(cff0a4d)
- Rename videoragqna to videoqna to align with other examples(2b68323)
- Update example name into MultimodalQnA and update image names(2ca56f3)
- Fix Reorg Issues(a3da7c1)
- Move neuralspeed embedding rerank and vllm-xft to catalog(98c62a0)
- fix ragagent text generator bug(42cde68)
- Add Bias Detection Microservice(812c85c)
- Fix intent detection code issue(4c0f527)
- Update README.md of Table in markdown(849cac9)
- update dependency version(4eee716)
CI/CD/UT
- add PREDICTIONGUARD_API_KEY for CI(94eb60f)
- update CI test log achieve(960f66c)
- expand CI timeout(6c24078)
- image scan and publish cd enhance(341f97a)
- add resume finetuning checkpoint ut.(c718602)
- Bug_fix.(2a91903)
- Optimize the content of the alerts.(8a11413)
- Add compose file.(7a21d09)
- Remove duplicate code(8325d5d)
- Fix image build fail issue.(3ce387a)
- Bug fix(12fd97a)
- enhance image publish job(9007212)
- Dockerflie check(2705e93)
- Make the scanning method optional.(ae71eee)
- Modify output messages.(3e87c3b)
- minor fix for CI detect(1785149)
- Add OpenAI client access OPEA microservice UT cases(1b69897)
- optimize ci test scope(4165c7d)
- Fixed CI yaml(3ac391a)
- Move fintuning test script path(267fb02)
- Add E2E test for bias detection of guardrails(e29865e)
- Add hyperlinks and paths validation.(ccdd2d0)
- Update manual test.(2794abd)
- Opt filecheck(61b8fa9)
- add PREDICTIONGUARD_API_KEY for CI(94eb60f)
- update ci action(b4a7f26)
- update image build compose(3d00a33)
- Adding Bias Detection Container to CI(6617e22)
- update cd workflow(3c5fc80)
- update torch cpu installation(0458443)
- Fix error.(887ca75)
- temp remove dockerfile check(2d5130f)
- Bug_fix.(2a91903)
- add resume finetuning checkpoint ut.(c718602)
- Optimize the content of the alerts.(8a11413)

GenAIEvals

Accuracy
- add audioqna asr wer eval scripts(cf8bd83)
- update llm-as-judge doc.(102fcdd)
- [v1.0] Add docker metric support(cff0a36)
- fix issue because of ragas changes(6abbe40)
- Add README for codegen acc test.(77bb66c)
- Update chatqna input to fix input length(4f46a12)
- Support bigcode eval for codegen v0.1(02b60b5)
- Add FaqGen Accuracy scripts & Refine Ragas(4df6438)
- update rag_eval readme(425b423)
- fix bigcode version when python>=3.11(1d3a502)
- add acc tuning script.(a6fd418)
Performance
- [ChatQnA] Support the replica tuning for ChatQnA(484b69a)
- Fix rerank benchmark script(8edda1c)
- Support service-list for metrics collection in benchmark.py(58502c5)
- Support benchmark file for w/o rerank pipeline(17d35e3)
- Update configuration in benchmark README(514a6d6)
- Support P50, P90, P99 for next token latency(6ac555c)
- Support microservice level benchmark(626d269)
- Support stresscli for codegen(907dc19)
- Align llm microservice parameters with end to end test(476a327)
- Fix microservice level benchmark issue(211b560)
- Add benchmark part into top README(ac52f79)
- Add CRAG benchmark(a9b087f)
- [ChatQnA] Support the replica tuning for ChatQnA(484b69a)
- add file for w/o rerank(17d35e3)
- add bench-target as the prefix of output folder(3f0ceaf)
Others
- doc: fix headings and indents(65a0a5b)
- doc: add title to new FaqGen README(52a540d)
- add code owners(047c479)
- doc: fix heading level(d5dbbf0)
- doc: fix JSON example(7318fb8)
- Update CODEOWNERS(4db9fb3)
- doc: update platform optimization document(d982681)
- doc: add title to new FaqGen README(52a540d)
- remove examples.(340f507)
- Add hyperlinks and paths validation(df58fe5)
- Remove useless file(0af532a)

GenAIInfra

GMC
- GMC: Add a CR for switch mode on one NV GPU card(02412e7)
- Update the GMC README based on current changes.(6f7a24e)
- fix GMC crashes in e2e (5a2b306)
- Add unit test for new function in GMC router(0343a2f)
- GMC: add UT for reconcile filters(6442127)
- Enable gmc build workflow on push(19fe1a2)
- Doc: Fix some typos to run GMC more smoothly(59000c5)
- Improve the performance of GMC router(68a2011)
- GMC: enhance log(a18404e)
HelmChart
- e2e helm chart: Add ui for codegen/codetrans/docsum(267d828)
- helm: Add guardrails llama_guard support(8206a8c)
- Enable guardrail case in helm e2e tests(491c2e2)
- helm chart: add nginx to avoid CORS issue(353f3a5)
- helm-chart/common: Add logging config for service components(b80ae50)
- helm-chart/data-prep: Add the missing config for dataprep-redis(b70b914)
- helm: use latest image tag on main branch(65b04dc)
- helm/manifest: Update to release v0.9(182183e)
- Add topologySpreadConstraints support(af9e1b6)
- Add TGI additional options(bf10bdd)
- Add vLLM inference engine support(0094f52)
- Remove unused values and change GenAIExamples default(26f9b16)
- 'ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu' is intel cpu(c84ac4c)
Documentation
- add code owner(59ce505)
- doc: fix headings and indenting(c10bca1)
- doc: fix headings, spelling, inter-doc references(22d012e)
- doc: fix image references(0a3e006)
- Add docs for all 3 use cases of ChatQnA examples and change models for switch case(987870f)
- doc: restructure authN-authZ directory(b9bc034)
- Update README(9480afc)
- doc: fix markdown issues(a339a87)
- Doc: Fix broken links(032ddbc)
- Enhance helm chart repo usage in README(0de5535)
- Create troubleshooting.md(d55ded4)
Others
- Fix CI bug #417(56d7d5d)
- disable hpa-values test in chart e2e in CI(9b38302)
- Add unit test for memory bandwidth exporter.(43adcc6)
- Enable unit test for memory-bandwidth-exporter in CI(923c1f3)
- add Observability for OPEA(8d304ac)
- fix a badcommit in #383(406bbc2)
- Add dataprep CR for NV platform(fa9788d)
- Add memory bandwidth exporter for AI workload.(9107af9)
- authN-authZ: update configs(0f5cef1)
- E2E: exclude terminating pods when wait_util_all_pod_ready(39fb55e)
- Add gateway guardrails(b22fc52)
- fix #314(f9204f0)
- v0.9 charts release(b2328b8)
- Restructure the directory of config sample and update the e2e test(326a637)
- Enhance ut(96cd929)
- improve cd workflows and add release document(a4398b0)
- Add HPA support to ChatQnA(cab7a88)
- Add some NVIDIA platform support docs and scripts(cad2fc3)
- Expose options of memory bandwidth exporter in k8s manifests and docker for user configuration(2517e79)
- Update the image version for ChatQnA examples(593458c)
- Update top level README(b224b65)
- Enable OIDC based Authentication with apisix(ee907d6)
- HPA improvements(8d86fff)
- authn-authz: fix CORS issue and refine doc(994250c)
- Add hyperlinks and paths validation(d8cd3a1)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Generative AI Examples v1.0 Release Notes

OPEA Release Notes v1.0

What’s New in OPEA v1.0

Details