High memory consumption #94

Open · i7an opened this issue Jun 19, 2023 · 8 comments

Comments

i7an commented Jun 19, 2023

Hi,

I encountered some very strange behavior with Knot Resolver. For some reason this config causes the kresd process to grow linearly (~10 MB/hour) and consume hundreds of megabytes of memory even without any load:

cache.size = 100 * MB
cache.open(100 * MB, 'lmdb://./tmp/knot-cache')
cache.max_ttl(300)

But when I set max_ttl before opening the cache file, the problem disappears and the memory footprint stays at ~17 MB:

cache.size = 100 * MB
cache.max_ttl(300)
cache.open(100 * MB, 'lmdb://./tmp/knot-cache')

Here is the Dockerfile I used:

Dockerfile
FROM debian:11-slim

RUN apt update
RUN apt install -y wget

RUN wget https://secure.nic.cz/files/knot-resolver/knot-resolver-release.deb
RUN dpkg -i knot-resolver-release.deb
RUN apt update
RUN apt install -y knot-resolver

COPY config/knot-resolver/kresd.conf /etc/knot-resolver/kresd.conf

ENTRYPOINT ["kresd"]
CMD ["-c", "/etc/knot-resolver/kresd.conf", "-n"]

I would be grateful for any ideas and debugging suggestions.

UPD: Apparently, the lower the max_ttl, the quicker RAM is consumed. Calling cache.clear() does nothing. Running kres-cache-gc does nothing.
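
For context, a minimal sketch of how such growth can be observed from the host (the container name knot-resolver below is an assumption, not taken from the report):

# Log the container's memory usage once a minute.
while true; do
  docker stats --no-stream --format '{{.Name}} {{.MemUsage}}' knot-resolver
  sleep 60
done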

vcunat (Member) commented Jun 19, 2023

cache.open() resets the TTL limits.

i7an (Author) commented Jun 20, 2023

@vcunat could you please elaborate on how that may cause constant memory growth? A 5-minute TTL seems harmless to me.

vcunat (Member) commented Jun 20, 2023

No, the growth itself does sound like a bug. Reducing the TTL will make the resolver do more work, etc., but otherwise it's probably just a coincidence that it triggers that bug/growth.

I just wanted to point out that swapping the lines is basically the same as not changing the TTL limit.
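
To make that concrete, a minimal sketch (it relies on the documented getter form of cache.max_ttl(), which returns the current limit when called without an argument):

cache.max_ttl(300)                               -- limit TTL to 5 minutes
print(cache.max_ttl())                           -- prints 300
cache.open(100 * MB, 'lmdb://./tmp/knot-cache')  -- resets the TTL limits
print(cache.max_ttl())                           -- back to the built-in default, not 300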

i7an (Author) commented Jun 20, 2023

Thanks for pointing that out. It was not obvious to me.

vcunat (Member) commented Jun 21, 2023

I see two plausible options:

  1. The allocator (jemalloc in this case) still does not like the resulting allocation patterns and produces a very sparse heap (lots of RAM taken from the OS, but only a small percentage of it actually allocated by kresd): https://gitlab.nic.cz/knot/knot-resolver/-/merge_requests/1353#note_265895

  2. A genuine leak (unreachable memory), but we haven't heard of any significant one so far (in terms of the amount of RAM consumed).
    It will probably be easiest to recognize by setting the variable MALLOC_CONF=prof_leak:true,lg_prof_sample:0,prof_final:true and later inspecting the details according to the docs; see the sketch after this list.
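
A hedged sketch of what that could look like when launching kresd as in the Dockerfile above (it assumes the packaged libjemalloc was built with profiling support; the jeprof.*.f.heap file name follows jemalloc's default output pattern):

# Enable jemalloc leak profiling and dump a final heap profile when kresd exits.
MALLOC_CONF=prof_leak:true,lg_prof_sample:0,prof_final:true \
  kresd -c /etc/knot-resolver/kresd.conf -n
# Inspect the dump afterwards; the file name pattern is jemalloc's default.
jeprof --show_bytes "$(command -v kresd)" jeprof.*.f.heap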

i7an (Author) commented Jun 21, 2023

I'll definitely investigate your suggestions. Thanks for sharing. 🙇‍♂️

@vcunat But I am still puzzled by the fact that such a simple setting as max_ttl causes this problem and that it was not noticed before... Can you advise what else I can check to rule out the possibility of a simple error in my configuration? As I mentioned in the UPD section, I tried clearing the cache with cache.clear() and running kres-cache-gc, with no effect on the memory footprint.

vcunat (Member) commented Jun 21, 2023

Cache size is unrelated; that's always exactly a 100 MiB file, mapped into memory (according to your config).

vcunat (Member) commented Jun 21, 2023

I mean, the cache file will be part of the RAM usage that you see, but it has that hard upper limit.
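
One way to see that split (a hedged suggestion; it assumes the LMDB cache file is the usual data.mdb inside the configured cache directory, and that pmap and pidof are available in the container):

# Compare the file-backed cache mapping against the process total.
pmap -x "$(pidof kresd)" | grep -E 'data\.mdb|total'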
