
Clean Expired Messages before Experiments Start #10578

Open
nablabits opened this issue Oct 3, 2024 · 5 comments
Labels: A: experiments (Related to dvc exp), p2-medium (Medium priority, should be done, but less important)


@nablabits

This is a follow-up from:

where a minimal fix was shipped. As discussed in this comment, it may be a good idea to explore the possibility and implications of cleaning the expired messages before an experiment starts.

@shcheklein added the p2-medium and A: experiments labels on Oct 3, 2024
@nablabits
Author

Just a quick heads up in case someone lands on this issue and thinks it's unattended: I'm working on it and will post an update soon 🐌. Thanks for the patience.

@nablabits
Author

Context

Let's take a step back on this to make sure it still makes sense.

  • As part of the previous issue we discovered that kombu will put messages in the processing directory located in .dvc/tmp/exps/celery/broker/in/ stamped one second in the future (Source).
  • This may prevent some messages from being properly cleaned once the first worker has finished (Source).
  • We may not want to add an extra 2 seconds to that check, as that could cause one worker to remove the shutdown message before the other worker has consumed it.
  • It may be a good idea to clean the directory upfront to remove any messages that have remained because of the above (see the sketch after this list).
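To make the timing issue concrete, here is a minimal sketch of how a message stamped one second in the future can survive a cleanup that runs right at shutdown. This is not DVC's or kombu's actual code; the file naming, the JSON layout and the expires property are assumptions based on the points above.

```python
import json
import time
from pathlib import Path

# Broker "in" directory discussed above.
BROKER_IN = Path(".dvc/tmp/exps/celery/broker/in")


def clean_expired(now: float) -> None:
    """Remove message files whose (assumed) 'expires' property is already in the past."""
    for msg_file in BROKER_IN.glob("*.msg"):  # file naming assumed
        payload = json.loads(msg_file.read_text())
        expires = payload.get("properties", {}).get("expires")
        if expires is not None and expires <= now:
            msg_file.unlink()


# A pidbox message written with expires = now + 1 survives a cleanup that runs
# immediately at shutdown, because its expiry has not passed yet; only a later
# cleanup (or an upfront clean on the next run) would remove it.
if __name__ == "__main__":
    clean_expired(time.time())
```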

What I Have Discovered

  • I couldn't reproduce any scenario where a message is not cleaned because of the 1-second offset with the fix we shipped in the previous issue, but who knows, there may be similar situations.
  • There's a scenario where a significant number of files may remain. It goes as follows:
    • There's a big imbalance between experiments, say one experiment takes a minute longer to complete than the other. For example --set-param "train.fine_tune_args.epochs=2,20"
    • The short experiment is placed in the first worker. This is not always the case, as they seem to be picked at random
    • Because the --clean flag is attached to the first worker (source), it will start cleaning right after shutdown while the second worker is still running.
    • The second worker is effectively putting messages in .dvc/tmp/exps/celery/broker/in/, but they seem to be for the 1@localhost.celery.pidbox queue, which suggests a fanout. These are the messages that remain (see the inspection sketch below).
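A rough sketch of how one could inspect what remains in the broker's in directory and group it by destination, assuming the filesystem broker stores each message as a JSON payload whose properties.delivery_info carries the exchange and routing key:

```python
import json
from collections import Counter
from pathlib import Path

BROKER_IN = Path(".dvc/tmp/exps/celery/broker/in")

# Count leftover message files per (exchange, routing_key) to see which queue
# they were destined for, e.g. the 1@localhost.celery.pidbox fanout mentioned above.
destinations = Counter()
for msg_file in BROKER_IN.iterdir():
    try:
        payload = json.loads(msg_file.read_text())
    except (OSError, ValueError):
        continue  # skip anything that isn't a readable JSON message
    info = payload.get("properties", {}).get("delivery_info", {})
    destinations[(info.get("exchange"), info.get("routing_key"))] += 1

for (exchange, routing_key), count in destinations.most_common():
    print(f"{count:3d} message(s) exchange={exchange!r} routing_key={routing_key!r}")
```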

Unknowns

  • It's not clear to me yet what the impact of these remaining files is on subsequent runs.
  • It's also not clear what the implications are of triggering clean over the messages in the second worker. Looking at dvc queue status, both experiments are reported as successful.

What do you think @shcheklein, is there a strong case to clean upfront?

@shcheklein
Member

The second worker is effectively putting messages in .dvc/tmp/exps/celery/broker/in/, but they seem to be for the 1@localhost.celery.pidbox queue which suggests a fanout. These are the messages that remain.

@nablabits could you give a little bit more details please?

From your description it seems the problem is that the first worker is cleaning up too early, not that there are files left over? Or am I missing something?

What do you think @shcheklein, is there a strong case to clean upfront?

I don't know yet, it might well be that we don't need it. It would be great to get a bit more details.

@nablabits
Author

Just a tiny update: I'm trying to figure out the message lifecycle, which will shed light on the two unknowns, namely the impact of remaining files on subsequent runs and cleaning files while the second worker is still active.

The process is quite involved, as Celery and Kombu are deeply intertwined and poorly documented/annotated, so one has to reverse engineer the methods to fully make sense of the data structures. The fact that they operate in different threads of the process is making the debugging awkward, as I can't just put breakpoints; I'd really appreciate it if anyone has a clever solution to this 🙂. All in all, I'm enjoying the learning experience so far.

@nablabits
Author

Hi @shcheklein, hope you are doing great!

TL;DR

So, there were a couple of unknowns that I'm now in a solid position to answer:

  1. What are the implications of triggering clean over the messages in the second worker?
  2. What is the impact of remaining files in subsequent runs?

Now I can confidently say that for the first there's only a marginal chance of any effect, whereas for the second there's no chance whatsoever. See the details below, from the background onwards.

There are a number of options that come to mind now:

  • Close this ticket, hoping that no one will scratch their head trying to figure out what all those in files are about
  • Clean the files upfront anyway, as we can confidently assume that it won't affect experiments
  • Add docstrings to the clean ecosystem so future people won't need to reinvent the wheel and reverse engineer the methods.

What do you think?


Background

We need to lay some groundwork first so we're clear on the terms.

Exchanges:
Celery will create 3 exchanges (think of them as mailboxes) to track its stuff:

  • celery: holds the tasks themselves.
  • celery.pidbox: holds messages that track the progress of the worker, something like, hey Celery, how are you getting along with this stuff?
  • reply.celery.pidbox: holds the replies to the above question

Pidbox Message Lifecycle:
As we will see, the celery exchange won't be affected by the clean, but the control process, i.e. the pidbox, will. So it would be a good idea to know how the pidbox message lifecycle works:

```mermaid
flowchart LR
  a[Mailbox._publish] --> b[Channel.queue_bind] --> c[celery.worker.control.meth] --> d[Mailbox._publish_reply] --> e[Channel.queue_delete] --> a
```
  • A message for the celery.pidbox is published via Mailbox._publish() with a method, say 'active', in the message body and a reply_to property. Mailbox._publish() sort of iterates through these celery control methods: active, schedule, ping & reserved, which can be found here
  • The in directory is iterated looking for messages, first for the celery.pidbox and then for a reply queue that is created for that reply_to property and has a very unique name, {uuid}.reply.celery.pidbox (source). If a message does not match, it is skipped (source)
  • The celery.pidbox message is found and consumed, which tells kombu to use active() as a callback (source) via Node.dispatch() (source)
  • A reply is _put with the outcome of that active callback via Mailbox._publish_reply().
  • Again, the in directory is iterated looking for messages, first for the celery.pidbox and then for the reply queue.
  • This time there are no celery.pidbox messages, but there is a reply on {uuid}.reply.celery.pidbox
  • The reply is consumed via Mailbox._collect.on_message, which appears to have no effect as there is no callback anywhere in the process (source). This is crucially important, as clean will wipe these messages.
  • Finally, the queue is simply deleted (source) and the cycle starts again with Mailbox._publish(). A condensed sketch of this loop follows below.
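To condense the loop above into something executable, here is a heavily simplified sketch. The function names mirror the real ones mentioned in the list, but the bodies are hypothetical stand-ins that only reproduce the control flow, not kombu's or celery's actual implementation.

```python
import uuid

# Hypothetical stand-ins for the pidbox round trip described above; only the
# control flow mirrors the real thing.

def publish(method: str) -> dict:
    """Mailbox._publish(): drop a control message on celery.pidbox with a reply_to."""
    return {
        "exchange": "celery.pidbox",
        "method": method,  # e.g. 'active', 'schedule', 'ping', 'reserved'
        "reply_to": f"{uuid.uuid4()}.reply.celery.pidbox",
    }


def dispatch(message: dict) -> dict:
    """Node.dispatch(): consume the pidbox message and run the control method as a callback."""
    outcome = f"result of {message['method']}()"  # celery.worker.control.<method>
    return {"exchange": message["reply_to"], "body": outcome}


def collect_reply(reply: dict) -> None:
    """Mailbox._collect.on_message(): the reply is read, but nothing else hangs off it,
    which is why wiping replies with a clean is harmless."""
    _ = reply["body"]


# One full cycle: publish -> dispatch -> reply -> (reply queue deleted) -> publish again.
msg = publish("active")
reply = dispatch(msg)
collect_reply(reply)
# Channel.queue_delete(msg["reply_to"]) would then remove the {uuid}.reply.celery.pidbox queue.
```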

1. Implications of Triggering Clean Ahead of Time

Of course, @pmrowla knew what he was doing and wisely defined these two cleaning processes:

  • _gc, which will clean expired messages in the in and processed directories, excluding the celery exchange, that is, the queued tasks. It will remove files that either:
    • Have an expired key: only celery.pidbox messages contain the expires key, which as per the other issue is 1 second in the future. These messages, with the control method they contain, will be lost. From my understanding, new messages with these control methods will eventually be created as long as a shutdown hasn't been triggered. This is the marginal impact I talked about above.
    • Are a ticket: the meaning of this ticket thing is not clear to me, but only replies contain it. Still, this is innocuous, as we saw above that replies don't trigger any action in the pidbox lifecycle.
  • clean_pidbox, in turn, will clean all the replies in the in directory by selecting the reply exchange. Again innocuous, as we saw above that replies don't trigger any action in the pidbox lifecycle. (An approximation of both selections is sketched below.)
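To make the selection above concrete, here is an approximation of what the two cleanups would pick up. This is a sketch, not DVC's actual _gc / clean_pidbox code; the file layout and the location of the expires and ticket properties are assumptions based on the description above.

```python
import json
import time
from pathlib import Path

BROKER_IN = Path(".dvc/tmp/exps/celery/broker/in")


def is_safe_to_remove(msg_file: Path, now: float) -> bool:
    """Approximate the selection described above: expired pidbox messages and
    replies (the ones carrying a 'ticket') can go; anything on the 'celery'
    exchange, i.e. the queued tasks, must stay."""
    payload = json.loads(msg_file.read_text())
    props = payload.get("properties", {})
    if props.get("delivery_info", {}).get("exchange") == "celery":
        return False  # queued tasks are never touched
    expires = props.get("expires")
    if expires is not None and expires <= now:
        return True  # expired celery.pidbox control message
    if "ticket" in props:  # location of 'ticket' assumed
        return True  # a reply: nothing in the lifecycle depends on it
    return False


def clean_upfront() -> None:
    now = time.time()
    for msg_file in BROKER_IN.iterdir():
        if msg_file.is_file() and is_safe_to_remove(msg_file, now):
            msg_file.unlink()
```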

2. Implications of Remaining Files in Subsequent Runs

The files that remain in in when one does not perform a clean are replies. As we saw in the lifecycle, these are quite specific due to their unique name, {uuid}.reply.celery.pidbox, so they are unlikely to be matched when the in directory is iterated, and even if they were matched by some planetary alignment, they would have no effect due to the lack of a callback.

😅 Phew!
