Releases: deepjavalibrary/djl
DJL v0.31.0 Release
Key Changes
- Engine Updates:
- Added Android support for HuggingFace Tokenizers @naveen521kk in #3531
Enhancements
- [api] Use encoder/decoder for Segment anything2 translator by @frankfliu in #3487
- [api] alternative NDArray should not be closed in NDScope by @frankfliu in #3490
- [api] Adds sam2 model to onnxruntime model zoo by @frankfliu in #3492
- [api] Standardizes CV output format by @frankfliu in #3493
- [api] Visualize sam2 output for Sam2ServingTranslator by @frankfliu in #3494
- [api] Improve Sam2Translator for PyTorch traced model by @frankfliu in #3495
- [android] Update pytorch version to 2.4.0 by @xyang16 in #3474
- [tokenizers] Use tokenizers from rust.io by @xyang16 in #3476
- [rust] Remove unnecessary clone in cublaslt by @xyang16 in #3482
- [api] Makes Sam2 input consistent with other CV model by @frankfliu in #3498
- [api] Adds serving support for some CV models by @frankfliu in #3499
- HuggingFaceTokenizer: add support for Android by @naveen521kk in #3531
- [tokenizer] Updates tokenizer to 0.20.3 by @xyang16 in #3514
- [tokenizer] Updates tokenizer to 0.20.3 in libs.versions.toml by @xyang16 in #3515
- [pytorch] Adds Yolo11 model to model zoo by @xyang16 in #3516
- [pytorch] Updates PyTorch to 2.5.1 by @xyang16 in #3517
- [converter] Trim jit output token_str by @xyang16 in #3527
Bug Fixes
- [PyTorch] Fixes sam2 model version by @frankfliu in #3496
- [api] Fixes QaServingTranslator output format by @frankfliu in #3500
- [djl-convert] Fix huggingface converter by @xyang16 in #3505
Documentation
- [docs] Updates PyTorch engine README for 2.4.0 by @frankfliu in #3472
- Docs: Added a link by @operagxoksana in #3510
- [doc] Add blogposts index to docs by @xyang16 in #3519
- [doc] Add blogposts link to README by @xyang16 in #3520
- add lmi breaking changes document to docs site by @siddvenk in #3529
- Update troubleshooting.md - UnsatisfiedLinkError issue by @ThiloteE in #3512
CI/CD
- [fix][ci] configure aws creds manually to avoid node20 issues on inco… by @siddvenk in #3469
- [ci] Updates gradle to 8.10.1 by @frankfliu in #3470
- [ci] configure aws creds manually to avoid node20 issues for AL2 by @frankfliu in #3471
- [ci] Update tensorrt native build container by @xyang16 in #3477
- [ci] Fixes fasttext native build for nodejs20 issue by @xyang16 in #3478
- [ci] Fixes sentencepiece native build for nodejs20 issue by @xyang16 in #3479
- Increase build version to 0.31.0 by @xyang16 in #3475
- [mxnet] Fixes build error on JDK 22 by @frankfliu in #3485
- [ci] Fixes java 21 compile error by @frankfliu in #3497
- Delete serving publish from DJL repo by @brndysgit in #3501
- Bump com.google.protobuf:protobuf-java from 3.25.4 to 3.25.5 by @dependabot in #3502
- dependency updates for djl by @siddvenk in #3528
- Add CI for the hugging face tokenizers android by @naveen521kk in #3532
- CI: setup cargo-ndk for huggingface tokenizers android builds by @naveen521kk in #3535
- update DJL version to 0.31.0 in docs by @siddvenk in #3534
- Fix a typo in `android/tokenizer-native/build.gradle` by @naveen521kk in #3537
- fix dependency versions in examples/pom.xml by @siddvenk in #3539
- [android] fix djl version for android by @siddvenk in #3526
New Contributors
- @brndysgit made their first contribution in #3501
- @operagxoksana made their first contribution in #3510
- @ThiloteE made their first contribution in #3512
- @naveen521kk made their first contribution in #3531
Full Changelog: v0.30.0...v0.31.0
DJL v0.30.0 Release
Key Changes
- Engine Updates:
- Added mask generation task for SAM2 model #3450
- Text Embedding Inference:
- Added Mistral, Qwen2, GTE, Camembert embedding model support
- Added reranker model support
Enhancement
- [api] Avoid non-ascii characters by @frankfliu in #3395
- [djl-converter] Exit with error if convert model failed by @frankfliu in #3399
- [api] Support TEI input format to reranking model by @frankfliu in #3400
- [rust] Adds sigmoid and softmax operator for Rust engine by @frankfliu in #3407
- [test] Detect GPUs with specified engine by @frankfliu in #3409
- [api] Adds Criteria.isDownload() api by @frankfliu in #3403
- [rust] Build .so file for each cuda arch by @frankfliu in #3410
- [rust] Add mistral embedding model by @xyang16 in #3412
- [tokenizers] Add supported arch in djl-convert by @xyang16 in #3416
- [tokenizers] Replace pt file names to safetensors by @xyang16 in #3417
- [rust] Load model on given device by @xyang16 in #3419
- [rust] Add qwen2 model by @xyang16 in #3420
- [rust] Support pre-downloaded rust shared library by @frankfliu in #3421
- [pytorch] Adds pad operator by @frankfliu in #3423
- [rust] Provides better error message for unsupported ops by @frankfliu in #3424
- [api] Adds center fit image operation for Yolo by @frankfliu in #3425
- [rust] Add GTE and Gemma2 model by @xyang16 in #3422
- [djl-convert] Sets default max model size limit for importing by @frankfliu in #3428
- [djl-import] Includes requires version when importing model by @frankfliu in #3431
- [android] Upgrade DJL version to 0.30.0 by @xyang16 in #3432
- [rust] Make cublaslt wrapper non static by @xyang16 in #3434
- [djl-convert] Exclude models in includeTokenTypes by @xyang16 in #3435
- [rust] Make tensor contiguous in rotary embedding by @xyang16 in #3436
- [rust] Allows -1 dim for normalize() by @xyang16 in #3442
- Refactored Identifiers by @congyuluo in #3381
- [rust] Adds text classification models to Rust model zoo by @frankfliu in #3444
- [examples] Adds segment anything 2 example by @frankfliu in #3449
- [api] Refactor ImageFeatureExtractor by @frankfliu in #3455
- [api] Adds base64 image support for ImageTranslator by @frankfliu in #3456
- [djl-import] Improve model import speed by @frankfliu in #3457
- [api] Updates dependencies version to latest by @frankfliu in #3454
- [api] Optimized text embedding post processing performance by @frankfliu in #3459
- add drawMarks to android BitMapImageFactory by @sindhuvahinis in #3460
- [ci] moving to temporary iam credentials for publishing steps by @siddvenk in #3462
- [OnnxRuntime] Update debug log message by @frankfliu in #3463
- Increase DJL version to 0.30.0 by @xyang16 in #3465
- [examples] Adds gradle tasks for each example by @frankfliu in #3466
- Upgrade dependency versions by @xyang16 in #3467
- [tokenizers] Converting encoding to int32 NDList by @xyang16 in #3468
Bug Fixes
- [api] Fixes logging calling convention by @frankfliu in #3394
- [djl-converter] Fixes import text embedding model from local folder by @frankfliu in #3388
- [djl-converter] Fixes djl-convert command line return code by @frankfliu in #3406
- [rust] Fix camembert and distilbert model loading by @xyang16 in #3415
- [rust] Fix camembert model loading by @xyang16 in #3418
- [rust] Fixes memory leak by @frankfliu in #3433
- [djl-convert] Fixes huggingface model converter by @frankfliu in #3440
- [rust] Fix bert model classifier loading by @xyang16 in #3441
- [xgb] Fixes alternative NDArray conversion issue by @frankfliu in #3453
- [djl-import] Fixes missing arguments for onnx import by @frankfliu in #3458
- [ci][fix] use v2 for aws credentials due to glib issues with node 20 by @siddvenk in #3464
Documentation
- [examples] Moves nlp examples into nlp folder by @frankfliu in #3393
- [docs] Build versions.json before mike deploy by @Varun-Dutta in #3392
- [example] Enable PyTorch for some training example by @frankfliu in #3398
- [docs] Updates docs website url by @frankfliu in #3404
- [docs] Fixes broken links in markdown files. by @frankfliu in #3408
- [djl-import] Fixes missing trust-remote-code arg for import model zoo by @frankfliu in #3427
- [docs] Updates trace whisper model document by @frankfliu in #3426
- [tensorflow] Updates tensorflow document by @frankfliu in #3430
- [docs] Adds segment anything document by @frankfliu in #3451
CI/CD
- [ci] Fixes serving publish for awscurl release version by @frankfliu in #3411
- [ci] Remove no_response workflow by @xyang16 in #3429
Full Changelog: v0.29.0...v0.30.0
DJL v0.28.0 Release
Key Changes
- Upgrades for engines
- Enhancements for engines and API
- Adds experimental Rust engine #3078
Enhancement
- [api] Automatically detect translatorFactory based on task by @frankfliu in #3136
- [api] Adds OnesBlockFactory to make it easy for testing by @frankfliu in #3140
- Ensure the alternative ND manager can use GPUs by @david-sitsky in #3138
- [api] Tries to use the same device for alternative NDManager by @frankfliu in #3146
- [api] Supports serialize NaN in json by @frankfliu in #3156
- [rust] Add rust engine implementation by @frankfliu in #3078
- [rust] Adds Rust model zoo by @frankfliu in #3132
- [rust] Support load DJL model for RsModel by @frankfliu in #3147
- [rust] RsModel delete model in close by @xyang16 in #3170
- [tokenizers] Updates tokenizer to 0.19.1 by @frankfliu in #3143
- [tokenizer] Allows use HF_TOKEN to access gated model by @frankfliu in #3150
- [tokenizers] Create djl_converter package by @xyang16 in #3172
- [tokenizer] Refactor djl_convert python code by @frankfliu in #3179
- Updates on djl_converter by @xyang16 in #3187
- [pytorch] Updates PyTorch to 2.2.2 by @frankfliu in #3155
- [pytorch] Update PyTorch engine README for version 2.2.2 by @frankfliu in #3165
- [pytorch] optimize memory copy cost for pytorch NDArray by @ewan0x79 in #3137
- [pytorch] Updates PyTorch to 2.3.0 by @frankfliu in #3192
- [sentencepiece] Updates sentencepiece to 0.2.0 by @frankfliu in #3163
- [huggingface] Adds more option to convert onnx model by @frankfliu in #3180
Bug Fixes
- [gitignore] Avoid checking binary files. by @frankfliu in #3134
- [api] Closes file stream by @frankfliu in #3130
- [api] Fixes logging invoke convention by @frankfliu in #3148
- [api] Fixes Criteria.toString() bug by @frankfliu in #3151
- [api] Fixes tarslip issue by @frankfliu in #3075
- [examples] Fixes TextGeneration EOS bug by @frankfliu in #3177
- [tokenizer] Fixes model zoo import script by @frankfliu in #3126
- [Lgbm] fix LgbmNDArray replaced.close() release data problem by @ewan0x79 in #3174
- [rust] Fixes compile warnings by @frankfliu in #3189
- [ci] Fixes pytorch jni build for 1.13.1 by @frankfliu in #3184
- [ci] Fixes awscurl publish location by @frankfliu in #3182
- [ci] Fixes build on macOS aarch64 machine by @frankfliu in #3191
- [ci] Fixes nightly pytorch jni build by @frankfliu in #3196
Documentation
- [examples] Re-organize CV examples by @frankfliu in #3135
- [examples] Prepare for MXNet deprecation by @frankfliu in #3157
- [doc] Removes mention of future lab by @zachgk in #3154
- [docs] Updates docs for setup java on mac by @frankfliu in #3188
- [website] Remove live demo from djl.ai web page by @frankfliu in #3171
- Fixed Typo in Docs by @fensch in #3193
- Update README.md by @elect86 in #3195
CI/CD
- [ci] Update github action runner to macOS x86_64 instance by @frankfliu in #3144
- [ci] Updates google code formatter to 1.22.0 by @frankfliu in #3149
- [ci] Upgrades gradle to 8.5 by @frankfliu in #3153
- [ci] Updates dependencies version by @frankfliu in #3164
- [ci] Adds cuda version as github actions parameter for Pytorch JNI build by @frankfliu in #3185
New Contributors
- @david-sitsky made their first contribution in #3138
- @elect86 made their first contribution in #3195
Full Changelog: v0.27.0...v0.28.0
DJL v0.29.0 Release
Key Changes
- Upgrades for engines
- Upgrades PyTorch engine to 2.3.1
- Upgrades TensorFlow engine to 2.16.1
- Introduces Rust engine CUDA support
- Upgrades OnnxRuntime version to 1.18.0 and added CUDA 12.4 support
- Upgrades javacpp version to 1.5.10
- Upgrades HuggingFace tokenizer to 0.19.1
- Fixes several issues for LightGBM engine
- Deprecated llamacpp engine
- Enhancements for engines and API
- Adds Yolov8 segmentation and pose detection support
- Adds metric type to Metric class
- Improves drawJoints and drawMask behavior for CV model
- Improves HuggingFace model importing and conversion tool
- Improves HuggingFace NLP model batch inference performance
- Adds built-in ONNX extension support
- Adds several NDArray operators in PyTorch engine
- Adds fp16 and bf16 support for OnnxRuntime engine
- Adds CrossEncoder support for NLP models
Enhancements
- Adds metric type to Metric class by @frankfliu in #3244
- Improves drawJoints behavior by @frankfliu in #3305
- [api] Allows to control json pretty print with env var by @frankfliu in #3288
- [api] Avoid null dimensions for Metric by @frankfliu in #3246
- [api] Improve NDArray.toDebugString() output by @frankfliu in #3290
- [api] Loads native engine in deterministic order by @frankfliu in #3300
- [api] Refactor drawMask() for instance segmentation by @frankfliu in #3304
- [api] Refactor nms for yolo translator by @frankfliu in #3297
- add close method to all nd manager by @lanking520 in #3225
- ported tools/stats.gradle by @elect86 in #3219
- use standard GSON output by @lanking520 in #3284
- [enhancement] Optimize memory copy overhead to enhance performance. by @ewan0x79 in #3289
- Gradle Kotlin script plus other stuff by @elect86 in #3167
- Improved incremental build by @benjie332 in #3231
- Refactored Identifiers by @congyuluo in #3276
- Refactored Identifiers by @congyuluo in #3282
- [gradle] Remove unused gradle files by @frankfliu in #3280
- [jacoco] exclude spark extension since it does not contain test by @frankfliu in #3230
- [Lgbm] support multi classification by @ewan0x79 in #3234
- [Lgbm] support multi type prediction by @ewan0x79 in #3237
- [llamacpp] Removing llamacpp support in DJL by @frankfliu in #3312
- [mxnet-model-zoo] Adds missing translatorFactory in metadata by @frankfliu in #3279
- [onnx] Adds fp16 and bf16 support for OnnxRuntime by @frankfliu in #3281
- [onnxruntime] Add debug message for OnnxRuntime by @xyang16 in #3217
- [onnxruntime] Adds yolov8n pose model for OnnxRuntime by @frankfliu in #3309
- [onnxruntime] Adds yolov8n-seg model to onnxruntime model zoo by @frankfliu in #3310
- [onnxruntime] Load onnx extension if available by @frankfliu in #3333
- [pytorch] Adds Yolov8n-seg model to model zoo by @frankfliu in #3308
- [pytorch] Adds back PyTorch 2.1.2 support by @frankfliu in #3285
- [pytorch] Adds yolov8n pose estimation model by @frankfliu in #3298
- [pytorch] Implements gammaln operator for PyTorch by @frankfliu in #3262
- [pytorch] Split maven publish into two parts by @frankfliu in #3273
- [rust] Add tokenizer cuda build workflow by @xyang16 in #3322
- [rust] Allows -2 as dims for sum() by @frankfliu in #3221
- [rust] Change logging level to debug by @xyang16 in #3336
- [rust] Download cu124 jni library for cuda by @xyang16 in #3327
- [rust] Remove 0-dimension tensor compare in NDArrayTests by @xyang16 in #3320
- [rust] Update gpu build pipeline to cu122 by @xyang16 in #3334
- [rust] Upgrade candle version by @xyang16 in #3248
- [rust] Use fused layer by @xyang16 in #3260
- [spark] Do not support model_url by @xyang16 in #3224
- [spark] Update dependency versions by @xyang16 in #3241
- [spark] Updates spark version to 3.5.1 by @frankfliu in #3240
- [spark] Use batch predict API by @xyang16 in #3242
- [text-embedding] Remove CrossEncoderTranslatorFactory in favor of TextEmbeddingTranslatorFactory by @frankfliu in #3239
- [tokenizer] Adds macos-13 support back by @frankfliu in #3328
- [tokenizer] Ensure GPU is used in TextEmbeddingTranslator by @david-sitsky in #3212
- [tokenizer] Process text embedding input and output in stacked NDArray by @xyang16 in #3213
- [tokenizer] Recover accidentally deleted file by @frankfliu in #3311
- [tokenizer] Supports cross encoder for text classification model by @frankfliu in #3338
- [tokenizers] Download jni lib files for cuda by @xyang16 in #3326
Bug Fixes
- [api] Fix unit test in GPU docker running on CPU case by @frankfliu in #3228
- [api] Fixes IdEmbedding memory leak by @frankfliu in #3257
- [api] Fixes nightly tests on GPU machine by @frankfliu in #3302
- [api] Fixes unit test by @frankfliu in #3210
- [fix] fix lgbm bytebuffer native order by @ewan0x79 in #3258
- Fix `Application.of` missing some applications by @tadayosi in #3277
- [mxnet] Fixes GloveWordEmbeddingTranslator bug by @frankfliu in #3287
- [pytorch-model-zoo]: fix PtSsdTranslator.Builder.self() by @eversnarf in #3204
- [pytorch] Fixes PyTorch 2.3.1 windows dependencies by @frankfliu in #3269
- [pytorch] Fixes PyTorch 2.3.1 windows dependencies by @frankfliu in #3270
- [pytorch] Fixes uploadS3 gradle task by @frankfliu in #3263
- [rust] Fix NDArrayTests failure on cuda by @xyang16 in #3319
- [rust] Fix deleteModel error by @xyang16 in #3229
- [rust] Fix output tensor dtype by @xyang16 in #3249
- [rust] Fix tokenizer cuda pipeline name by @frankfliu in #3325
- [rust] Fixes test failure on GPU by @frankfliu in #3301
- [timeseries] Fixes contentLength issue for inference by @frankfliu in #3306
- [timeseries] Fixes duration format issue by @frankfliu in #3307
- [tensorrt] Fixes gradle build script by @frankfliu in #3253
- [tokenizer] Fixes detect include token type logic by @frankfliu in #3318
- [tokenizer] Fixes tokenizer build workflow by @frankfliu in #3323
- [tokenizers] Fixes huggingface build for Windows by @frankfliu in #3330
- [tokenizers] Fixes memory leak when there is overflowing tokens by @baldersheim in #3317
- [xgb] Fixes gradle build script by @frankfliu in #3254
Documentation
- [doc] add output formatter schema to LMI docs.djl.ai by @sindhuvahinis in #3268
- [doc] add release notes to docs.djl.ai by @sindhuvahinis in #3266
- [docs] Bump up DJL version to 0.28.0 by @frankfliu in #3247
- [docs] Update example reference by @emmanuel-ferdman in #3275
- [docs] add dark theme and fixed broken link by @Varun-Dutta in #3295
- [example] Adds PyTorch action recognition model to model zoo...
DJL v0.24.0 Release
Key Features
- Engine Upgrades
- SafeTensors support #2763
- YoloV8 Support #2776
Enhancement
- [spark] Update djl version in dockerfile by @sindhuvahinis in #2712
- [pytorch] Makes PyTorch 2.0.1 default version for DJL 0.24.0 by @frankfliu in #2710
- pytorch support inference on separate cuda stream by @jiyuanq in #2706
- [spark] Update javacv version to 1.5.9 for spark docker by @frankfliu in #2713
- [pytorch] Upgrade pytorch android to 2.0.1 by @frankfliu in #2717
- [api] Makes getNeuronDevices() public by @frankfliu in #2721
- [api] Log warning message if failed to load specified class by @frankfliu in #2724
- [api] Workaround detect neuron issue on SageMaker by @frankfliu in #2729
- Setup custom ft build for Llama support by @rohithkrn in #2732
- [api] Fixes NeuronUtils issue when running as non-root user by @frankfliu in #2735
- Adds Utils.getEnvOrSystemProperty with default by @zachgk in #2742
- Issue #2693 Implement PtNDArrayEx.multiBoxPrior with validation by @juliangamble in #2715
- [api] Implements NDArray.toType() for NDArrayAdapter by @frankfliu in #2746
- [onnxruntime] Upgrades onnxruntime version to 1.15.1 by @frankfliu in #2743
- [api] Output endPosition induced by reaching EOS token by @KexinFeng in #2730
- [api] Adds Safetensors support by @frankfliu in #2763
- [SentencePiece] Make SpProcessor public by @frankfliu in #2765
- [tokenizer] Print out warning in model_zoo_importer by @frankfliu in #2759
- To support Yolov8 by @SidneyLann in #2776
- [onnxruntime] Upgrades OnnxRuntime to 1.16.0 by @frankfliu in #2784
- Build FT for sm90 by @rohithkrn in #2785
- PtndArrayEx.multiboxDetection() implementation by @juliangamble in #2769
Bug fixes
- [api] Fixes ChunkedBytesSupplier read timeout bug by @frankfliu in #2716
- [fix] Set past_kv name for corner case. by @KexinFeng in #2722
- Fix AmazonReviews by @zachgk in #2725
- Fix issue with setPadding and setTruncation overriding configurations… by @siddvenk in #2741
- Fixes #2744, support onnx model for TextEmbeddingTranslator by @bryanktliu in #2749
- [api] Fixes NDArray.toDevice() missing name issue by @frankfliu in #2751
- [pytorch] Avoid toByteBuffer() crash for large tensor by @frankfliu in #2780
Documentation and Examples
- Update DJL version to 0.23.0 in documents by @sindhuvahinis in #2694
- [docs] Updates README for pytorch 2.0.1 by @frankfliu in #2705
- Update docs and Bump up version to 0.24.0 by @sindhuvahinis in #2708
- [docs] Updates troubleshooting README to remove outdated content by @frankfliu in #2734
- [docs] Update IntelliJ debug view image by @frankfliu in #2747
- [examples] Fixes whisper model on GPU machine by @frankfliu in #2752
CI
- [api] Restore Lm search unittest to recover coverage rate by @KexinFeng in #2723
- [ci] Fixes PMD warnings by @frankfliu in #2764
- [ci] Fixes gradle deprecation warnings by @frankfliu in #2774
New Contributors
- @jiyuanq made their first contribution in #2706
- @rohithkrn made their first contribution in #2732
- @SidneyLann made their first contribution in #2776
Full Changelog: v0.23.0...v0.24.0
DJL v0.9.0 release note
DJL 0.9.0 brings MXNet inference optimization, extensive new PyTorch feature support, TensorFlow Windows GPU support, and an experimental DLR engine that supports TVM models.
Key Features
- Add experimental DLR engine support. You can now run TVM models with DJL
MXNet
- Improve the MXNet JNA layer by reusing String, String[] and PointerArray through an object pool, which reduces GC time significantly
PyTorch
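- You can easily create a COO sparse tensor with the following code snippet:
// requires java.nio.FloatBuffer and ai.djl.ndarray.types.Shape
long[][] indices = {{0, 1, 1}, {2, 0, 2}};
float[] values = {3, 4, 5};
FloatBuffer buf = FloatBuffer.wrap(values);
// reuse the buffer created above instead of wrapping the array a second time
manager.createCoo(buf, indices, new Shape(2, 4));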
- If your TorchScript model takes a List or Dict as input, DJL now provides simple one-dimensional support for it.
// assume your TorchScript model takes model({'input': input_tensor})
// you pass this information to DJL by setting the name
NDArray array = manager.ones(new Shape(2, 2));
array.setName("input1.input");
- Loading ExtraFilesMap is now supported
// saving ExtraFilesMap
Criteria<Image, Classifications> criteria = Criteria.builder()
...
.optOption("extraFiles.dataOpts", "your value") // <- pass in here
...
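To make the COO layout from the sparse tensor snippet above concrete, here is a minimal sketch in plain Java with no DJL dependency. The class and method names (`CooToDense`, `toDense`) are illustrative only and not part of DJL. It assumes the usual COO convention: `indices[0]` holds row positions, `indices[1]` holds column positions, and duplicate coordinates are summed.

```java
import java.util.Arrays;

public class CooToDense {

    /**
     * Expands a 2-D COO tensor into a dense matrix.
     * indices[0][i] is the row and indices[1][i] the column of values[i].
     */
    public static float[][] toDense(long[][] indices, float[] values, int rows, int cols) {
        float[][] dense = new float[rows][cols];
        for (int i = 0; i < values.length; i++) {
            int r = (int) indices[0][i];
            int c = (int) indices[1][i];
            // Assumption: duplicate coordinates accumulate, as in most COO implementations
            dense[r][c] += values[i];
        }
        return dense;
    }

    public static void main(String[] args) {
        // Same data as the release-note snippet: three non-zero entries in a 2 x 4 tensor
        long[][] indices = {{0, 1, 1}, {2, 0, 2}};
        float[] values = {3, 4, 5};
        System.out.println(Arrays.deepToString(toDense(indices, values, 2, 4)));
        // prints [[0.0, 0.0, 3.0, 0.0], [4.0, 0.0, 5.0, 0.0]]
    }
}
```

Only the three listed entries are stored; every other position in the 2 x 4 shape is an implicit zero.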
TensorFlow
- Windows GPU is now supported
Several Engines upgrade
| Engine | Version |
|---|---|
| PyTorch | 1.7.0 |
| TensorFlow | 2.3.1 |
| fastText | 0.9.2 |
Enhancement
- Add docker file for serving
- Add Deconvolution support for MXNet engine
- Support PyTorch COO Sparse tensor
- Add CSVDataset, you can find a sample usage here
- Upgrade TensorFlow to 2.3.1
- Upgrade PyTorch to 1.7.0
- Add randomInteger operator support for MXNet and PyTorch engine
- Add PyTorch Profiler
- Add TensorFlow Windows GPU support
- Support loading the model from jar file
- Support 1-D list and dict input for TorchScript
- Remove the Pointer class being used for JNI to relieve Garbage Collector pressure
- Combine several BertVocabulary into one Vocabulary
- Add loading the model from Path class
- Support ExtraFilesMap for PyTorch model inference
- Allow both int32 & int64 for prediction & labels in TopKAccuracy
- Refactor MXNet JNA binding to reduce GC time
- Improve PtNDArray set method to use ByteBuffer directly and avoid copy during tensor creation
- Support experimental MXNet optimizeFor method for accelerator plugin.
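As an illustration of the TopKAccuracy item in the list above, the following is a rough sketch of a top-k accuracy computation that accepts either `int` or `long` labels. It is plain Java and does not use DJL's `TopKAccuracy` class; `TopKAccuracySketch` and its methods are hypothetical names, and the tie-breaking rule (ties count in the label's favor) is an assumption of this sketch.

```java
import java.util.Arrays;

public class TopKAccuracySketch {

    /** True if the label's score is among the k highest scores of one sample. */
    public static boolean inTopK(float[] scores, long label, int k) {
        float target = scores[(int) label];
        int higher = 0;
        for (float s : scores) {
            if (s > target) {
                higher++; // count classes that strictly outrank the true label
            }
        }
        return higher < k;
    }

    /** Top-k accuracy with int64 labels. */
    public static double accuracy(float[][] scores, long[] labels, int k) {
        int correct = 0;
        for (int i = 0; i < labels.length; i++) {
            if (inTopK(scores[i], labels[i], k)) {
                correct++;
            }
        }
        return (double) correct / labels.length;
    }

    /** Top-k accuracy with int32 labels: widen to long and reuse the same logic. */
    public static double accuracy(float[][] scores, int[] labels, int k) {
        return accuracy(scores, Arrays.stream(labels).asLongStream().toArray(), k);
    }

    public static void main(String[] args) {
        float[][] scores = {{0.1f, 0.7f, 0.2f}, {0.5f, 0.3f, 0.2f}, {0.2f, 0.3f, 0.5f}};
        System.out.println(accuracy(scores, new int[] {2, 0, 1}, 2));
        System.out.println(accuracy(scores, new long[] {2L, 0L, 1L}, 1));
    }
}
```

Widening `int` labels to `long` once at the boundary keeps a single code path for both label types.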
Documentation and examples
- Add Amazon Review Ranking Classification
- Add Scala Spark example code on Jupyter Notebook
- Add Amazon SageMaker Notebook and EMR 6.2.0 examples
- Add DJL benchmark instruction
Bug Fixes
- Fix PyTorch Android NDIndex issue
- Fix Apache NiFi issue when loading multiple native in the same Java process
- Fix TrainTicTacToe not training issue
- Fix Sentiment Analysis training example and FixedBucketSampler
- Fix NDArray from DataIterable not being attached to NDManager properly
- Fix WordPieceTokenizer infinite loop
- Fix randomSplit dataset bug
- Fix convolution and deconvolution output shape calculations
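For reference on the last item above, here is a small sketch of the standard per-axis output-size formulas for convolution and deconvolution (transposed convolution), as documented by most deep learning frameworks. It is not taken from the DJL fix itself, and `ConvShape` and its methods are illustrative names.

```java
public class ConvShape {

    /** Output length of a convolution along one axis. */
    public static long convOut(long in, long kernel, long stride, long pad, long dilation) {
        // floor((in + 2*pad - dilation*(kernel-1) - 1) / stride) + 1
        return Math.floorDiv(in + 2 * pad - dilation * (kernel - 1) - 1, stride) + 1;
    }

    /** Output length of a deconvolution (transposed convolution) along one axis. */
    public static long deconvOut(
            long in, long kernel, long stride, long pad, long dilation, long outPad) {
        // (in-1)*stride - 2*pad + dilation*(kernel-1) + outPad + 1
        return (in - 1) * stride - 2 * pad + dilation * (kernel - 1) + outPad + 1;
    }

    public static void main(String[] args) {
        System.out.println(convOut(28, 3, 1, 1, 1));      // prints 28
        System.out.println(convOut(224, 7, 2, 3, 1));     // prints 112
        System.out.println(deconvOut(14, 2, 2, 0, 1, 0)); // prints 28
    }
}
```

Applying the formula once per spatial axis gives the full output shape; the batch and channel dimensions are set by the input batch size and the number of filters.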
Contributors
Thank you to the following community members for contributing to this release:
Frank Liu(@frankfliu)
Lanking(@lanking520)
Kimi MA(@kimim)
Lai Wei(@roywei)
Jake Lee(@stu1130)
Zach Kimberg(@zachgk)
0xflotus(@0xflotus)
Joshua(@euromutt)
mpskowron(@mpskowron)
Thomas(@thhart)
DocRozza(@docrozza)
Wai Wang(@waicool20)
Trijeet Modak(@uniquetrij)
DJL v0.7.0 release notes
DJL 0.7.0 brings SentencePiece for tokenization, GraalVM support for the PyTorch engine, a new set of neural network operators, a BOM module, a Reinforcement Learning interface and an experimental DJL Serving module.
Key Features
- Now you can leverage powerful SentencePiece to do text processing including tokenization, de-tokenization, encoding and decoding. You can find more details on extension/sentencepiece.
- Engine upgrade:
- MXNet engine: 1.7.0-backport
- PyTorch engine: 1.6.0
- TensorFlow: 2.3.0
- MXNet multi-GPU training now uses MXNet KVStore by default, which saves much of the overhead of GPU memory copies.
- GraalVM is fully supported for both regular execution and native image with the PyTorch engine. You can find more details in the GraalVM example.
- Added a new set of neural network operators that give full control over parameters for the CV domain, similar to PyTorch's nn.functional module. You can find each operator method in its Block class.
Conv2d.conv2d(NDArray input, NDArray weight, NDArray bias, Shape stride, Shape padding, Shape dilation, int groups);
- A Bill of Materials (BOM) is introduced to manage dependency versions for you. In DJL, the engine you use is usually tied to a specific version of its native package. By adding the BOM dependency like this, you no longer need to track versions yourself.
<dependency>
<groupId>ai.djl</groupId>
<artifactId>bom</artifactId>
<version>0.7.0</version>
<type>pom</type>
<scope>import</scope>
</dependency>
implementation platform("ai.djl:bom:0.7.0")
- JDK 14 is now supported
- New Reinforcement Learning interface including RlAgent, RlEnv, etc. You can see a comprehensive TicTacToe example.
- Added the DJL Serving module. With a single command, you can deploy your model without writing server code or configuration such as a server proxy.
cd serving && ./gradlew run --args="-m https://djl-ai.s3.amazonaws.com/resources/test-models/mlp.tar.gz"
Documentation and examples
- We wrote the D2L book from chapter 1 to chapter 7 with DJL. You can learn basic deep learning concepts and classic CV model architectures with DJL. Repo
- We launched a new doc website that hosts abundant documents and tutorials for quick search and copy-paste.
- New Online Sentiment Analysis with Apache Flink.
- New CTR prediction using Apache Beam and Deep Java Library(DJL).
- New DJL logging configuration document which includes how to enable slf4j, switch to other logging libraries and adjust log level to debug the DJL.
- New Dependency Management document that lists DJL internal and external dependencies along with their versions.
- New CV Utilities document as a tutorial for Image API.
- The Cache Management document is updated with more detail on the different cache categories.
- Update Model Loading document to describe loading model from various sources like s3, hdfs.
Enhancement
- Add archive file support to SimpleRepository
- ImageFolder supports nested folder
- Add singleton method for LambdaBlock to avoid redundant function reference
- Add Constant Initializer
- Add RMSProp, Adagrad, Adadelta Optimizer for MXNet engine
- Add new tabular dataset: Airfoil Dataset
- Add new basic dataset: CookingExchange, BananaDetection
- Add new NumPy like operators: full, sign
- Make prepare() method in Dataset optional
- Add new Image augmentation APIs where you can add to Pipeline to enrich your image dataset
- Add new handy fromNDArray to Image API for converting NDArray to Image object quickly
- Add interpolation option for Image Resize operator
- Support archive file for s3 repository
- Import new SSD model from TensorFlow Hub into DJL model zoo
- Import new Sentiment Analysis model from HuggingFace into DJL model zoo
Breaking changes
- Drop CUDA 9.2 support for all the platforms including linux, windows
- The arguments of several blocks are changed to align with the signature of other widely used Deep Learning frameworks, please refer to our Java doc site
- FastText is no longer a full Engine; it becomes part of the NLP utilities in favor of `FastTextWordEmbedding`
- Move the WarmUp out of the existing Tracking and introduce the new `WarmUpTracker`
- `MxPredictor` now doesn’t copy parameters by default; please make sure to use `NaiveEngine` when you run inference in a multi-threaded environment
Bug Fixes
- Fixing Validation Epoch Result bug
- Fix multiple process downloading the same model bug
- Fix potential concurrent write bug while downloading metadata.json
- Fix URI parsing error on Windows
- Fix multi-GPU training crash when the batch size is smaller than the number of devices
- Fix not setting number of inter-op threads for PyTorch engine
Contributors
Thank you to the following community members for contributing to this release:
Christoph Henkelmann, Frank Liu, Jake Cheng-Che Lee, Jake Lee, Keerthan Vasist, Lai Wei, Qing Lan, Victor Zhu, Zach Kimberg, aksrajvanshi, gstu1130, 蔡舒起
DJL v0.27.0 Release
Key Changes
- Upgrades for engines
- OnnxRuntime 1.17.1 #3019
- Enhancements for engines and API
Enhancement
- Suppress serial warning for JDK21 by @zachgk in #2935
- [api] Moves commons-compress dependency to standalone class. by @frankfliu in #2951
- [api] Allows to load .pt or .onnx file from jar url by @frankfliu in #2955
- [tokenizer] Return if exceed max token length by @frankfliu in #2957
- [tokenizer] Adds getters for HuggingfaceTokenizer by @frankfliu in #2958
- [pytorch] Upgrade android build to 0.26.0 by @frankfliu in #2975
- [pytorch] Avoid loading .lib file from PYTORCH_LIBRARY_PATH by @frankfliu in #2987
- [api] Adds utility method to Model for accessing properties by @frankfliu in #3007
- [api] Adds suffix to percentile metric name by @frankfliu in #3011
- [api] Adds dimension for prediction metric by @frankfliu in #3013
- Thread-safe FaceDetectionTranslator by @StefanOltmann in #3016
- [api] Upgrades commons compress to 1.26.0 for CVE by @frankfliu in #3018
- Avoid duplicated loading native library by @frankfliu in #3020
- [api] Allows to use relative jar uri for cache folder name by @frankfliu in #3026
- support includeTokenTypes in TextEmbeddingBatchTranslator by @morokosi in #3032
- [tokenizer] Adds includeTokenTypes for all translators by @frankfliu in #3035
- Updates dependencies version to latest by @frankfliu in #3040
- [pytorch] Allows to exclude certain DLL from pytorch directory by @frankfliu in #3043
- Update checkstyle tool version to 10.14.2 by @xyang16 in #3047
- Upgrade dependency version by @xyang16 in #3049
Bug Fixes
- [fix][ci] fix typo in publish metric workflow by @siddvenk in #2976
- [fix][ci] avoid early exit of script for failure case by @siddvenk in #2979
- [ci][fix] update path to android sdk manager cli by @siddvenk in #2980
- [dataset] Fixes broken link for mnist dataset by @frankfliu in #2984
- [database] Fixes mnist URL for local unit test by @frankfliu in #2988
- fix #2968 by @SidneyLann in #2986
- [dataset] Fixes wikitext-2 by @zachgk in #2996
- [spark] Fixes python tarslip security concern by @frankfliu in #2995
- Fixes failing CI by @ydm-amazon in #3001
- Fixes cases where the getEngine method in the EngineProvider class returns null when called concurrently. by @onaple in #3005
- [api] Fixes typo in CudaUtils by @frankfliu in #3008
- [model-zoo] Fixes typo in README by @fensch in #3009
- [ci] Fixes nightly build for onnx 1.17.1 by @frankfliu in #3021
- [pytorch] Fixes detecting wrong flavor on macOS issue by @frankfliu in #3027
- [bom] Fixes djl-serving packages in BOM by @frankfliu in #3039
Documentation
- Bump DJL version to 0.27.0 by @siddvenk in #2933
- [doc] include trtllm convert manual by @sindhuvahinis in #2941
- [docs] Updates README by @frankfliu in #2954
- [doc] Make LMI a separate tab and include I/O schema by @sindhuvahinis in #2960
- [docs] Fixes cuda version for pytorch native library by @frankfliu in #2963
- docs: add AWS Graviton3 PyTorch inference tuning details by @snadampal in #2982
- [docs] Update Huggingface tokenizer cache directory document by @frankfliu in #2994
- [docs] Disable progress bar for jupyter notebook conversion by @frankfliu in #3017
- [example] Adds document about how to trace gpt2 model by @frankfliu in #3028
- [docs] update mkdocs structure for new lmi documentation by @siddvenk in #3029
CI/CD
- removing pytorch 2.0.1 from 0.27.0 by @siddvenk in #2940
- Moves to Actions hosted M1 runner by @zachgk in #2948
- [ci] Disable run scheduled github actions in fork by @frankfliu in #2943
- [ci] add cloudwatch metrics for scheduled workflow failures by @siddvenk in #2966
- [ci] Upgrade github actions nodejs 16 to nodejs 2 by @frankfliu in #2967
- [ci] Upgrade codeql-actions to v3 by @frankfliu in #2973
- [ci] Upgrade aws-actions/configure-aws-credentials to v4 by @frankfliu in #2972
- [ci] refactor cloudwatch metric publishing to avoid needing changes i… by @siddvenk in #2974
- [ci] Downgrade github actions version for centos7 and amazonlinux by @frankfliu in #2977
- [ci] move cw publish step to github hosted runner by @siddvenk in #2978
- [CI] downgrade the version to V3 by @lanking520 in #2990
- [CI] change to cache v3 for the versions by @lanking520 in #2991
- Uses gradle dependency submission by @zachgk in #2983
- Excludes test dependencies from dependency submission by @zachgk in #2999
- Update continuous OSX to 13 by @zachgk in #3004
- Removes dependency submission by @zachgk in #3006
New Contributors
- @snadampal made their first contribution in #2982
- @ydm-amazon made their first contribution in #3001
- @onaple made their first contribution in #3005
- @fensch made their first contribution in #3009
- @StefanOltmann made their first contribution in #3016
- @morokosi made their first contribution in #3032
Full Changelog: v0.26.0...v0.27.0
DJL v0.26.0 Release
Key Changes
- LlamaCPP Support. You can use DJL to run supported LLMs using the LlamaCPP engine. See the Chatbot example to learn more.
- Manual Engine Initialization. You can configure DJL to not load any engines at startup, and query/register engines programmatically at runtime.
- Engine Updates:
- PyTorch 2.1.1
- Huggingface Tokenizers 0.15.0
- OnnxRuntime 1.16.3
- XGBoost 2.0.3
Enhancement
- Add erf and atan2 by @TalGrbr in #2842
- Add FFT2 and FFT2 inverse by @TalGrbr in #2845
- [tokenizer] Update import script for huggingface_hub api change by @frankfliu in #2850
- [tokenizer] Not returns overflow tokens by default by @frankfliu in #2857
- [pytorch] Updates PyTorch engine to 2.1.1 by @frankfliu in #2864
- Adds Device.getDevices() for all Device by @zachgk in #2820
- Creates DJL manual engine initialization by @zachgk in #2885
- [pytorch] Allows to load libstdc++.so.6 from a different location by @frankfliu in #2929
- Add Evaluator support to update multiple accumulators by @petebankhead in #2894
- Adds llama.cpp engine by @bryanktliu in #2904
- Yolov8 Translator optimization by @gevant in #2908
- [pytorch] Adds Yolov8n model to pytorch model zoo. by @frankfliu in #2910
- [onnx] Adds yolov8n to model zoo by @frankfliu in #2909
- [llama.cpp] Adds unit-test and standardize input parameters by @frankfliu in #2905
- [llama.cpp] Adds llama.cpp huggingface model zoo by @frankfliu in #2911
- [XGBoost] Updates XGBoost to 2.0.3 by @frankfliu in #2915
- [pytorch] Upgrade pytorch android to 2.1.1 by @frankfliu in #2914
- add awscurl release by @lanking520 in #2917
- [awscurl] change build to jar by @lanking520 in #2918
- [bom] Adds llama engine to BOM by @frankfliu in #2916
- [api] Adds ModelZooResolver interface by @frankfliu in #2922
- [api] Use forked java process to avoid JVM consuming GPU memory by @frankfliu in #2882
- [onnxruntime] Updates OnnxRuntime to 1.16.3 by @frankfliu in #2888
- Tokenizers: Updated huggingface_models.py to support Safetensors models as well as pytorch by @dameikle in #2880
- [tokenizer] Uses fp32 for TextEmbeddingTranslator clip() by @frankfliu in #2881
- [tokenizer] Updates huggingface tokenizer to 0.15.0 by @frankfliu in #2867
Bug Fixes
- [tokenizer] Fixes tokenizer bug by @frankfliu in #2843
- Fixes archiveBaseName in native builds by @zachgk in #2859
- [pytorch] Ensure shared library loading order for aarch64 by @frankfliu in #2892
- [api] Handles both JNA conflict and missing case by @frankfliu in #2896
- Minor fixes to improve Apple Silicon MPS support by @petebankhead in #2873
- [tokenizer] Handles import huggingface model zoo exception case by @frankfliu in #2872
- [api] Update offline property name to avoid conflict with other app. by @frankfliu in #2877
- [tensorflow] Revert InstanceHolder for TensorFlow engine by @frankfliu in #2884
- [pytorch] Revert InstanceHolder for PyTorch engine by @frankfliu in #2876
- [pytorch] Fixes windows load nvfuser_codegen bug by @frankfliu in #2868
Documentation
- [docs] Update serving configuration nav by @zachgk in #2853
- Updates DJL version to 0.25.0 by @frankfliu in #2860
- Bump up DJL version to 0.26.0 by @frankfliu in #2861
- [docs] Move jupyter notebooks to DJL Demo by @zachgk in #2854
- [docs] Include LMI documents by @sindhuvahinis in #2870
- [docs] Updates documents to use JDK 17 by @frankfliu in #2898
- Updates DJL version to 0.26.0 by @siddvenk in #2930
- update master branch on the website to have large model inference guide by @lanking520 in #2865
CI/CD
- [ci] Allows build project with JDK 21 by @frankfliu in #2903
- [ci] Fixes pytorch android build by @frankfliu in #2921
- [ci] Fix build failure for build-pytorch-jni-linux by @maaquib in #2920
- [ci] Fixes native ci build failure by @frankfliu in #2924
- [CI] Fixes flaky early stopping test by @zachgk in #2866
- [ci] Fixes flaky early stopping training test by @frankfliu in #2879
- [ci] Use JDK 17 for github actions workflow by @frankfliu in #2897
- [ci] Fixes github action for centos and amazonlinux by @frankfliu in #2913
- [ci] Use macos-13 to avoid flaky test by @frankfliu in #2927
- [test] Fixes EarlyStopping flaky test by @frankfliu in #2926
- [api] Updates dependencies to latest version by @frankfliu in #2928
- [api] Updates common-compress version to address CVE issues by @frankfliu in #2871
- only build triton binaries by @lanking520 in #2847
New Contributors
- @TalGrbr made their first contribution in #2842
- @petebankhead made their first contribution in #2873
- @dameikle made their first contribution in #2880
- @gevant made their first contribution in #2908
- @maaquib made their first contribution in #2920
Full Changelog: v0.25.0...v0.26.0
DJL v0.25.0 Release
Key Changes
- Engine Upgrades
- Early Stopping support for Training by @jagodevreede #2806
Enhancement
- [tokenizer] Allows import non-english model by @frankfliu in #2797
- [api] Allows cancel Input by @frankfliu in #2805
- [huggingface] Adds CrossEncoderTranslator by @frankfliu in #2817
- Creates MultiDevice by @zachgk in #2819
- [api] Refactor PublisherBytesSupplier.java by @frankfliu in #2831
- [api] Replace double-check singleton with lazy initialization by @frankfliu in #2826
Bug Fixes
- [api] Fixed NDList decode numpy file bug by @frankfliu in #2804
Documentation and Examples
- Updates doc versions to 0.24.0 by @zachgk in #2829
- [docs] Fixes markdown headers by @zachgk in #2812
- Bump up DJL version to 0.25.0 by @frankfliu in #2809
- Update README with release update by @zachgk in #2823
CI
- [FT Deps] allow to just build for 1 flow by @lanking520 in #2798
- [ci] Fixes out of diskspace issue by @frankfliu in #2808
- Add Triton gpu flag build on by @lanking520 in #2815
New Contributors
- @jagodevreede made their first contribution in #2806
Full Changelog: v0.24.0...v0.25.0