Issues: deepjavalibrary/djl-serving
#2567 [bug] Support SageMaker async endpoint deployment (opened Nov 17, 2024 by yinsong1986)
#2543 [enhancement] Support extra-parameters in vLLM openai_compatible_server APIs (opened Nov 12, 2024 by yinsong1986)
#2498 [bug] TensorRT-LLM (TRT-LLM) LMI model format artifacts not found when deploying (opened Oct 28, 2024 by joshight)
#2486 [bug] AWQ with Marlin kernel erroring out while loading the model in DJL 0.29 with vLLM (opened Oct 24, 2024 by guptaanshul201989)
#2417 [enhancement] [Doubt] Inflight batching support in T5 (opened Oct 3, 2024 by vguruju)
#2389 [enhancement] Upgrade to support latest vLLM version (max_lora_rank) (opened Sep 16, 2024 by dreamiter)
#2385 [bug] docker 0.29.0-pytorch-inf2 with meta-llama/Meta-Llama-3.1-8B-Instruct fails (opened Sep 13, 2024 by yaronr)
#2378 [enhancement] NeuronX compiler: specify data type (opened Sep 11, 2024 by CoolFish88)
#2377 [enhancement] Transformers NeuronX continuous batching support for Mistral 7B Instruct V3 (opened Sep 11, 2024 by CoolFish88)
#2365 [bug] Model conversion process failed. Unable to find bin files (opened Sep 5, 2024 by joshight)
#2362 [bug] Mistral 7B custom inference with LMI not working: java.lang.IllegalStateException: Read chunk timeout (opened Sep 5, 2024 by jeremite)
#2340 [bug] awscurl: Missing token metrics when -t option specified (opened Aug 25, 2024 by CoolFish88)
#2339 [bug] awscurl: WARN maxLength is not explicitly specified, use modelMaxLength: 512 (opened Aug 25, 2024 by CoolFish88)
#2293 [bug] djl-inference:0.29.0-tensorrtllm0.11.0-cu124 regression: has no attribute 'to_word_list_format' (opened Aug 7, 2024 by lxning)
#2093 [bug] Llama 2 7B chat model output quality is low (opened Jun 21, 2024 by ghost)
#1911 [bug] Error running multi-model endpoints in SageMaker (opened May 15, 2024 by Najib-Haq)
#1905 [bug] Document the /invocations endpoint (opened May 14, 2024 by tenpura-shrimp)
#1827 [enhancement] Better support Prometheus metrics and/or allow custom Prometheus metrics (opened Apr 27, 2024 by glennq)
#1816 [bug] DJL-TensorRT-LLM bug: TypeError: Got unsupported ScalarType BFloat16 (opened Apr 25, 2024 by rileyhun)
#1792 [bug] DJL-TRTLLM: Error while detokenizing output response of teknium/OpenHermes-2.5-Mistral-7B on SageMaker (opened Apr 20, 2024 by omarelshehy)
#1785 [bug] Question about "Model conversion process failed" error (opened Apr 17, 2024 by geraldstanje)
#1532 [bug] snap installer for djlbench doesn't work for arm64 platform (opened Feb 6, 2024 by snadampal)
#1470 [enhancement] Plan to use Attention Sinks? (opened Jan 10, 2024 by spring1915)