-
Notifications
You must be signed in to change notification settings - Fork 235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Graceful stop in repeat mode - part 2 #833
base: master
Are you sure you want to change the base?
Conversation
doc/unison-manual.tex
Outdated
Unison would keep reading all the input looking for the stop condition. | ||
an existing regular file larger than a few tens of bytes or a device or | ||
a pipe that produces large quantities of data). In such cases, Unison | ||
would keep reading all the input looking for the stop condition. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not following. What is the stop condition then.
This seems like a one-off ad hoc mechanism. It feels like the beginning of a control socket to be able to do things to a long-running background process that has no UI. There's a long history of that in various programs, and (unix-centric view I know) and the usual solution is some simple protocol on a unix-domain socket. That leads to wanting to write the pid, and figure out naming because one could run lots of copies.
I lean to adding a mechanism and using it vs a non-extendable mechanism. But really if you can send SIGINT twice, or SIGUSR2 once, that might be enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not following. What is the stop condition then.
Byte 0x03 or 0x04 in the input (arbitrarily chosen, and 0x03 kind of goes against my previous comment about common idiom).
This seems like a one-off ad hoc mechanism. It feels like the beginning of a control socket to be able to do things to a long-running background process that has no UI. There's a long history of that in various programs, and (unix-centric view I know) and the usual solution is some simple protocol on a unix-domain socket. That leads to wanting to write the pid, and figure out naming because one could run lots of copies.
Yes, that's not a far fetched analogy because that's basically what this is.
I lean to adding a mechanism and using it vs a non-extendable mechanism. But really if you can send SIGINT twice, or SIGUSR2 once, that might be enough.
If it wasn't for Windows...
You may have noticed a common theme in my comments. Windows. A solution based on signals is fine but that leaves Windows completely out. Except for SIGINT which Windows knows about... but which is sent to all processes, so not really helpful. |
I guess the $64,000 question is: Are we willing to have unix do things the wrong way because there is no right way on windows. I will think more. |
I've removed the SIGQUIT method. It really was redundant and only made the code more complex. Now the diff is smaller and easier to review. In principle I'm also ok with allowing /dev/null as stdin but I didn't add the code just yet because it's going to make the code more complex again. I'm waiting with this until after the concept has been reviewed and agreed upon. |
Status update. What is currently implemented (not everything is merged yet). To interrupt/terminate immediately (already merged)
To stop the repeat mode without interrupting ongoing sync (this PR) Continue with any ongoing sync, just don't start the next repeat loop after this one.
|
All the changes are now pushed in this PR. |
@@ -310,9 +310,23 @@ or | |||
.Sy SIGTERM | |||
works the same way as described above and will additionally stop the continuous | |||
process. To stop only the continuous process and let the last synchronization | |||
complete normally, send signal | |||
complete normally, press | |||
.Sy Ctrl-D |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this talking about sending a literal ^D 0x4, which one would do on BSD by ^V^D, or about typing the character coded as EOF to the tty so that the reader gets EOF?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that was a question, but maybe I mean "it would be nice if this were clearer, even though I realize being precise is harder to understand for many". Perhaps "Send an EOF by pressing the key configured for that, typically ^D. Or, send a literal 0x4."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will try rewording it. Literal ^D and EOF, either will work
.Sy SIGUSR2 | ||
instead. | ||
Closing the input, receiving an EOF, or receiving a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really do not understand "closing the input". stdin is file descriptor zero and unison is in charge of closing/not-closing that. I think you mean closing the other half of some split pty/fifo special file to cause read(2) in unison to return EOF.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're right. There may have been some situation where a close on the other end would not produce EOF on the read end but I can't remember what it was. And thinking about it now, that doesn't seem realistic either. I may have been trying to make the language assume less about the reader knowing about conditions when EOF is generated. I'll see if I can rephrase it (no promises).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to speak precisely about how unison behaves based on return values from read(2) syscall on fd 0, and not try to have it make sense to people that do not understand, when that has any fuzziness.
man/unison.1.in
Outdated
.Sy ^D | ||
(0x04) from the input terminal or a redirected standard input all have the same | ||
effect. If the standard input is redirected from a regular file, is not open or | ||
is not open for reading already at Unison startup then it is ignored and only |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"is not open" means presumably that read on fd 0 returns some errno that the fd is not valid. Which is a very odd way to invoke a program. And not open for reading I don't follow, do you mean read returns EPERM?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"is not open" means presumably that read on fd 0 returns some errno that the fd is not valid. Which is a very odd way to invoke a program.
Yes, basically getting EBADF. If it is too much to be included in the manual like this, let me know. It is hard for my to judge myself.
And not open for reading I don't follow, do you mean read returns EPERM?
I don't remember the exact errno(s) anymore, but yes, this is the idea. This is something that some versions of nohup
do, for example. https://git.savannah.gnu.org/cgit/coreutils.git/tree/src/nohup.c?id=01755d36e7389aef48f712193976205e7381115b#n117
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So "In repeat mode, unison tries to read from stdin immediately on startup. If that results in EOF, or returns any other error, then stdin is ignored and it is not possible to cause a graceful stop via stdin."
any source providing a lot of input (such as a device or a pipe that produces | ||
large quantities of data). In such cases, Unison would keep reading all the | ||
input looking for the stop condition. | ||
.Ed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, but it seems that whatever is provided for stdin should produce essentially nothing until it either sends EOF or 0x4, and anything else is odd. This phrasing hints at usages with moderate amounts of data making sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is somewhat unnecessary, yes. I'll try to rephrase or get rid of it completely. I thought about warning users not to shoot themselves in the foot but the same is true for any stdin redirection, so nothing special here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, just dropping it seems fine.
standard input is redirected from a regular file, is not open or is not open | ||
for reading already at Unison startup then it is ignored and only signals can | ||
be used to send the stop request. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to say really clearly that if stdin is /dev/null, then the stdin checking process is not used, as a guarantee.
| { st_kind = S_REG; _ } -> false | ||
| st -> begin try fdIsReadable Unix.stdin with Unix.Unix_error _ -> true end && notEOF st | ||
| exception Unix.Unix_error _ -> false | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the /dev/null check, that if you get EOF on the very first read, then it isn't used? I didn't figure that out from the manual, and it provides a race condition where termination might be signalled before unison does the first read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct. Yes, this is the check and yes, there is a race condition. However, I decided to accept this race condition because it is extremely unlikely to occur with "normal" usage. I don't know of another way of checking specifically for /dev/null and distinguishing it from something else that just happens to already return EOF.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That may or may not be ok, but the documentation did not explain the behavior or caution, and did not hint at a race.
not interpreted as a stop request when standard input is redirected from a | ||
regular file. If the standard input is not open or is not open for reading | ||
already at Unison startup then it is ignored and only signals can be used to | ||
send the stop request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to have some idea that the regular file will be written, and that unison will be doing rewind and seeing the content again.
And, if there is an empty/32byte rule, it should be in the manual. Perhaps the manual should say that it must be a 0-byte file and stopping must write a single 0x4, if that's how it is, or maybe 0x4\n is ok too. The specification of what to do should be narrow. We also need to avoid UTF-8/encoding issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to have some idea that the regular file will be written, and that unison will be doing rewind and seeing the content again.
Need to improve the language. That's actually not what will happen; there will be no rewind, only appended content is read. (In fact, I thought about rewinding but it seems too much...)
And, if there is an empty/32byte rule, it should be in the manual. Perhaps the manual should say that it must be a 0-byte file and stopping must write a single 0x4, if that's how it is, or maybe 0x4\n is ok too. The specification of what to do should be narrow. We also need to avoid UTF-8/encoding issues.
Can add the 0-byte rule. Currently (can be changed, of course), only 0x4 is checked but any other input before and after that byte is ignored. So, 0x4\n will work but so will \n0x4. Encodings will cause any issues here because it's just the raw byte 0x4 that is checked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So read(2) will return EOF, and then when a byte is written a further read(2) call with return that byte? Is that guaranteed?
Now that this has settled I looked through in more detail. I'm totally ok with merging the USR2 stuff as I see no downsides. I think the stdin stuff is ok, but I would like to see tighter interface specs to avoid limiting us later, and more clarity about the rules; I find "stdin is open" to be confusing. There is also some plan presumably (didn't read the code super carefully) to recheck a regular file via seek/read every N seconds. I hope this does not update atime, and in general I'm a little worried about the process doing things and using cpu and fs ops all the time when unison should be more or less quiescent. All of this makes me wonder about a command-line arg (not a preference) to enable the stdin checking behavior. |
Ok, I may very well split the PR to get the foundations out of the way.
Ok, let's work on this. I don't quite remember the origins of "stdin is open" (is this possible on POSIX at all?). It may be a Windows thing where it is possible not to have stdin at all but that only applies to GUI applications, which is clearly not the case here. If we conclude that stdin is always open (although, as you can see with
It's not as bad. select(2) is used, so no extra cpu or fs ops are consumed constantly/regularly. I specifically made sure that the process wouldn't do anything extra. If it does then it's a bug. I don't know about updating the atime but if you have redirected a file to stdin then you have requested yourself for that file to be read, so I'd say that's not a worry. Edit: select(2) doesn't work on regular files -- of course... -- so there is a loop with 1 second interval to read from stdin if it's a regular file. That loop is active only during the wait between sync cycles; but this wait can be several hours long. |
Please do split USR2 and we can merge that right away and maybe get some user testing. It strikes me that the stdin mechanism is ending up a bit awkward. I see the reason for each step but it is still complicated. I wonder if there is a way (that works on windows) to instead of the stdin mechanism, do:
and have that under the covers (but the above is the interface spec) open a fifo (or regular file if it has to be on windows) in $UNISONHOME. It doesn't have the "fg^D" simplicity, but it is very scriptable and straighforward. It avoids the race condition (assuming the 2nd command waits until the fifo appears, or gives up after a minute). |
Lwt_unix does not implement all functions on Win32. Make it possible to differentiate the Lwt_unix implementation at run time to avoid calling unimplemented functions. Previously this differentiation was done by probing the compiler OS platform (Win32? with Cygwin?) but this is not reliable because it does not indicate the actual Lwt_unix implementation that's linked in.
Add a new stop request method to terminate repeat mode. See commit e72c1c5. Monitor stdin to read any of the several stop conditions. Using stdin for checking the repeat mode stop request opens up a wide range of possibilities for the user, including manual and automated, local and remote. This method is usable only if stdin is an interactive terminal or is redirected from a readable source that is not a regular file.
Improve the method from commit a1bc7be to allow Unison run as a background job in an interactive terminal. Reading from stdin while not a foreground process will normally result in the process being stopped (or receiving EIO if ignoring the stop signal). Do not attempt to read from stdin while not in foreground. Guard the process against being stopped when trying to read from stdin at the wrong moment.
Add a new stop request method to terminate repeat mode. See commit e72c1c5. Enable the method from commit a1bc7be to work with stdin redirected from a regular file. Regular files differ from other sources: - Since a regular file is always ready for reading, limit the reads to once a second. - Reaching EOF in a regular file is not considered a stop request. This enables the process to be started with input redirected from an empty file and then waiting for data to be written into the file. - To prevent reading massive amounts of data, only enable the regular file method if the file is initially empty or at most 32 bytes in size.
Background and motivation in #554. Closes #554
cc @bruceg
This PR adds several methods to fix the original request from #554, namely a "graceful shutdown". Not all of these methods need to be merged. They all have their pros and cons. I'm hoping for a discussion to reach a decision on which methods (possibly all) will be merged. I've kept commits separate for easier review and selection.
"Graceful shutdown" means that a stop can be requested at any time but the process is stopped only after any ongoing sync is completed (it does not need to be successful, just finished for the moment). This is complementary to #810.
In this PR the following methods for a "greaceful shutdown" are included:
/dev/null
)), orFor more details please also see the changes to the manual in this PR.
With the
stdin
method one has basically endless possibilities and flexibility to send a stop request. Here are some examples to get the imagination going:sleep 60 | unison ... -repeat
-- automatically request to stop the repeat loop after 60 seconds; it stops when the current sync is completedecho | unison -batch ...
-- guaranteed to stop after one sync, even if there is a repeat preference in the (default) profile; could be used in a script to guarantee termination when not all preference values are knownmkfifo fifoname ; cat fifoname | unison ... -repeat
-- request to stop the repeat loop by writing 0x03 or 0x04 into the named pipe (it doesn't even matter what is written because EOF/closing will also work)ncat -l 45678 | unison ... -repeat
-- request to stop the repeat loop by opening a TCP connection to port 45678 and writing 0x04 into the socket (it doesn't even matter what is written because EOF/closing will also work) (inbash
you can even doecho quit > /dev/tcp/yourhost/45678
)An overview of different methods and some of their strong and weak sides
I've excluded files that are initially larger than 32 bytes (arbitrary limit).