Concurrency limiter from Netflix to avoid throttling storage #2169
Conversation
…lector is either Kafka or RabbitMQ
thanks for giving this a go.
first question: have you tried this successfully?
second comment: I think the real constraint is storage, so this should be a property at the global level, not nested under each collector. Otherwise your constraints won't apply if both HTTP and Kafka spans are accepted.
we can work through other design things after considering the above two points.
Hi Adrian, do you have any design concerns about how the concurrency limiter should be applied to the HTTP collector? The HTTP collector is a special case, because the HTTP client knows synchronously whether collection succeeded or not, while the current ConcurrentLimiter works as a pool of threads, processing requests asynchronously. |
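For readers following along, here is a minimal sketch of the primitive under discussion: the Netflix concurrency-limits Limiter wrapped around a synchronous storage write. The builder and class names follow the library's newer Java 8 API, so treat them as illustrative rather than exactly what the 0.0.49 version pinned in this PR exposes.

  import com.netflix.concurrency.limits.Limiter;
  import com.netflix.concurrency.limits.limit.VegasLimit;
  import com.netflix.concurrency.limits.limiter.SimpleLimiter;
  import java.util.Optional;

  public class LimitedWrite {
    // Adaptive limit that shrinks when observed latency rises (illustrative choice of algorithm).
    private final Limiter<Void> limiter =
        SimpleLimiter.newBuilder().limit(VegasLimit.newDefault()).build();

    boolean tryWrite(Runnable storageWrite) {
      Optional<Limiter.Listener> listener = limiter.acquire(null);
      if (!listener.isPresent()) {
        return false; // over the current limit: shed load instead of queueing forever
      }
      try {
        storageWrite.run();
        listener.get().onSuccess(); // the sample feeds back into the limit estimate
        return true;
      } catch (RuntimeException e) {
        listener.get().onDropped(); // failures signal the limiter to reduce the limit
        throw e;
      }
    }
  }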
Hi again, Regards, |
yep, this is where I was hoping it would be (it will apply to all collectors, not just HTTP, but it's a good start). Once you have your results in, we can chat further. Thanks so far. |
@malonso1976 were you able to try this out in a real environment? |
Not really, but I am on it. I stopped working on it because I had a family emergency that took me away from work for a couple of months. I am back and want to test it soon.
|
@malonso1976 No problem at all. Let us know when you have a chance to test it out. |
@@ -64,6 +64,7 @@
  <!-- Be careful to set this as a provided dep, so that it doesn't interfere with dependencies
       from other projects. For example, cassandra and spring boot set guava versions -->
  <guava.version>19.0</guava.version>
+ <netflix.concurrency.limits.version>0.0.49</netflix.concurrency.limits.version>
@adriancole, concurrency-limits v0.2.0 is the latest tag but uses Java 8 features (such as Optional). Do you know of any current plans to get io.zipkin.zipkin2:zipkin off of 1.6 and up to at least 1.8? I saw #777 but it doesn't appear to have actually upped the main.signature.artifact.
It also looks like retrolambda does not support recompiling dependencies, or I'd explore that option further.
Never mind, I made a new artifact outside of zipkin-core so it's not subject to this limitation. It affords me some other luxuries around not shading dependencies as well :)
zipkin-collector is Java 8 for this reason. We can shade some smaller deps if needed.
|
Adding storage-throttle module/etc. to contain logic for wrapping other storage implementations and limiting the number of requests that can go through to them at a given time. Elasticsearch storage's maxRequests can be overridden by throttle properties if the throttle is enabled. Making sure RejectedExecutionExceptions are "first class" citizens since they are used to reduce the throttle. Removing HttpCall's Semaphore in favor of the throttle (same purpose, different implementations). Inspired by work done on openzipkin#2169.
#2502 will address this use case. Thanks |
Adding ThrottledStorageComponent/etc. to contain logic for wrapping other storage implementations and limiting the number of requests that can go through to them at a given time. Elasticsearch storage's maxRequests can be overridden by throttle properties if the throttle is enabled. Inspired by work done on openzipkin#2169.
#2502 now includes test instructions. please give a try! |
note: concurrency-limits-core has been abandoned by Netflix for over a year at this point. We may have to revisit this |
I opened Netflix/concurrency-limits#157 after noticing there has been no release since last July; even though there were a couple more months of changes, nothing was released. |
I've heard resilience4j mentioned lately - I've never used it, just offering it as a reference |
Resilience4j and Failsafe are interesting options, but when evaluating them, make sure they solve the problem this is looking to address. Namely, storage gets backed up, so we need to slow down how much we're writing to it, but do so in a way that doesn't cause the server to exhaust memory. In this particular case, a limited-size queue is used to buffer some data while we work through a small bottleneck. Some simple retries with an appropriate backoff policy could address the same basic problem of slow storage, but may be harder to tame with respect to pending actions. Meaning, I can't say how hard it is to prevent an "infinite queue" with the retry handlers. FWIW, the ExecutorService that backs this implementation is pretty universal and supported by Java. The only part that is not is the one that adjusts how many threads are allowed to be active in it. To a certain extent, that logic could be directly incorporated into Zipkin instead of pulled in via Netflix. |
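A minimal sketch of the JDK-only portion described above: a bounded-queue ThreadPoolExecutor that rejects work when the buffer is full. The adjustLimit method here is hypothetical, standing in for the adaptive logic that Netflix's library provides.

  import java.util.concurrent.ArrayBlockingQueue;
  import java.util.concurrent.ThreadPoolExecutor;
  import java.util.concurrent.TimeUnit;

  public class ThrottledWriter {
    // Bounded queue: buffers a little work during a short bottleneck, but never grows without limit.
    private final ThreadPoolExecutor executor = new ThreadPoolExecutor(
        4, 4,                                  // start with 4 concurrent writes
        0L, TimeUnit.MILLISECONDS,
        new ArrayBlockingQueue<>(100),         // small, fixed buffer
        new ThreadPoolExecutor.AbortPolicy()); // full queue -> RejectedExecutionException

    // Submits a write; a full queue throws RejectedExecutionException, which callers treat as back-pressure.
    void submitWrite(Runnable storageWrite) {
      executor.execute(storageWrite);
    }

    // Hypothetical hook for the adaptive part: shrink or grow the number of active writers.
    void adjustLimit(int newConcurrency) {
      // Keep core <= max at every step to satisfy ThreadPoolExecutor's invariants.
      if (newConcurrency >= executor.getMaximumPoolSize()) {
        executor.setMaximumPoolSize(newConcurrency);
        executor.setCorePoolSize(newConcurrency);
      } else {
        executor.setCorePoolSize(newConcurrency);
        executor.setMaximumPoolSize(newConcurrency);
      }
    }
  }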
@Logic-32 Thanks for summarizing the problem, it'll help if we look into any migration! I was indeed incorrectly thinking the https://github.com/resilience4j/resilience4j#632-threadpoolbulkhead |
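For comparison, a rough sketch of the resilience4j ThreadPoolBulkhead referenced above, assuming its standard API; the names and numbers are illustrative. Its pool and queue sizes are fixed by configuration, so it bounds memory and concurrency but does not adapt the limit to storage latency the way the Netflix limiter does.

  import io.github.resilience4j.bulkhead.BulkheadFullException;
  import io.github.resilience4j.bulkhead.ThreadPoolBulkhead;
  import io.github.resilience4j.bulkhead.ThreadPoolBulkheadConfig;

  public class BulkheadSketch {
    public static void main(String[] args) {
      ThreadPoolBulkheadConfig config = ThreadPoolBulkheadConfig.custom()
          .coreThreadPoolSize(2)
          .maxThreadPoolSize(4)  // fixed ceiling on concurrent storage writes
          .queueCapacity(100)    // fixed buffer, so memory stays bounded
          .build();
      ThreadPoolBulkhead bulkhead = ThreadPoolBulkhead.of("storage-writes", config);

      try {
        // Runs the task on the bulkhead's own pool and returns a CompletionStage.
        bulkhead.executeRunnable(() -> {
          // a storage write would go here
        });
      } catch (BulkheadFullException e) {
        // pool and queue are both full: shed the load rather than queue indefinitely
      }
    }
  }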