Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When moving a file, sync on two local replicas does a copy from replica1 rather than from the file in replica2 #472

Closed
davidde opened this issue Jan 28, 2021 · 28 comments
Labels
effort-high issue is likely to require >20h of effort, perhaps much more enhancement issue is a request for a feature, and not a defect impact-low low importance

Comments

@davidde
Copy link

davidde commented Jan 28, 2021

I'm mostly drawn to unison because of its wonderful feature to keep 2 directories in sync without excessively copying and deleting. In other words, because it can be smart about file moves.

After some testing, it appears this smart behavior is only available when I specify at least 1 remote replica, even for syncing 2 local directories. For example:

unison -auto /home/david/Documents ssh://localhost//backup/Documents

Obviously, this requires an SSH daemon to be running, which is not really desirable if you don't need it. It is also kind of obscure to track down why the "smart moving" feature doesn't work by default, and not very intuitive for new users.

I don't know much about OCaml, or how this is implemented internally, so there may be a valid reason for this behavior. But if there is no specific blocker, wouldn't it make more sense to drop the SSH requirement?

@gdt gdt added effort-high issue is likely to require >20h of effort, perhaps much more enhancement issue is a request for a feature, and not a defect impact-low low importance labels Jan 28, 2021
@gdt gdt changed the title Feat: Do not require SSH for local "smart moves" Add ability for continuous sync to work on two local replicas Jan 28, 2021
@gdt
Copy link
Collaborator

gdt commented Jan 28, 2021

I think this is not about ssh but about some notion of local and remote. But if you can find out details from reading the code, please update the issue.

@davidde
Copy link
Author

davidde commented Jan 28, 2021

I may not have been clear enough about what I meant.

Continuous sync does work on 2 local replicas. It's just not very efficient because it will delete moved files first and then copy them again. I mention SSH because it works as it should when using SSH. That is, it moves the files in the destination directly, without any deleting/copying. This makes a huge difference when backing up large files/directories, as well as reducing wear on the disks.

I'm not sure the current title reflects what I'm trying to express.

@davidde
Copy link
Author

davidde commented Jan 28, 2021

Also I'm not sure it should be tagged effort-high without verifying it's not a simple oversight in the code; after all, the functionality is already there.

What I'm basically asking for is for the command:

unison -auto /home/david/Documents /backup/Documents

(which works, but just is not "smart" about file moves), to work identical to the command:

unison -auto /home/david/Documents ssh://localhost//backup/Documents

Which is smart about file moves. The big benefit then, is that users are no longer required to run sshd for no reason at all.

@tleedjarv
Copy link
Contributor

@davidde Can you write up a repeatable test case that demonstrates the issue (is it just that a simple mv will result in delete + copy in the other replica?). I will have a look at it.

@gdt gdt changed the title Add ability for continuous sync to work on two local replicas When moving a file, sync on two local replicas does a remove/copy rather than a rename in the other replica Jan 28, 2021
@gdt
Copy link
Collaborator

gdt commented Jan 28, 2021

I adjusted the title; I had a hard time understanding your original report, and I understand better now. The requested test case will help a lot for clarity. Also, you are saying "ssh", but there is local, there is unison over a socket, and there is unison over some program that executes remote commands. Presumably rsh would be the same, and we don't know about over the socket. Narrowing down exactly where the behavior happens is helpful for finding the right place in the code.

My guess it's going to take 20h total to find out exactly what's wrong, read the code, prepare a patch, test it, and review it. I tend to estimate high when there are a lot of unknowns. In any case, when somebody prepares a PR, it will get looked at, and if not, this won't get fixed, so the label doesn't have any real effect.

@gdt
Copy link
Collaborator

gdt commented Jan 28, 2021

Also, using "smart" to describe moves is not very precise. I understand now that what you mean is that if in one replica there has been "mv A B" then in the other replica in one case unison just executes the rename(2) system call, and in the other it does something different, which is unlink(2) on A and either having copied A to B in that replica or sending B from the first replica. This is different in a practical sense mostly if either replica is on a remote filesystem, but that's important because an sshfs mount can work around not being able to build unison.

@davidde
Copy link
Author

davidde commented Jan 28, 2021

@tleedjarv

Correct. That's exactly what I mean.

So for a test case:

# Create replica 1:
mkdir -p replica1/dir1
echo fileA > replica1/fileA
echo fileB > replica1/fileB
echo fileC > replica1/dir1/fileC

# Sync with unison to replica 2:
unison -auto replica1 replica2

# Move file in replica 1:
mv replica1/fileB replica1/dir1/fileB

# Sync again without ssh, optionally with debug:
unison -auto replica1 replica2 -debug copy
# This will delete and copy:
# replica1       replica2
# new file ----> dir1/fileB
# deleted  ----> fileB

Doing the same with the ssh syntax will move the file in replica2. I believe tryCopyMovedFile of the -debug copy output lets us verify if the file was moved correctly or not.

Let me know if there's anything else I can do to help.

@davidde
Copy link
Author

davidde commented Jan 28, 2021

@gdt

You're totally right that my original report was not very precise. My apologies for the confusion.

Thanks for the clarification about the tagging. I was personally just worried that the effort-high tag would minimize the chances of it being fixed. Did not intend any disrespect. You're all doing an amazing job here; I wish I could be of more help.

@tleedjarv
Copy link
Contributor

Thank you for the test case.

I have verified that tryCopyMoveFiled is only present in the remote code path. But there is a reason for that. The feature here is to not transfer the file again to the remote replica. But locally at the remote replica still copy + delete are done. So there is actually no functionality to prevent copying; only to prevent transferring (presumably over a slow link).

Could you paste the output of your sync run with ssh?

@davidde
Copy link
Author

davidde commented Jan 28, 2021

Sure.

Output of unison -auto replica1 ssh://localhost//home/david/Data/dev/testground/unison/replica2 -debug copy:

Connected [//arch-desktop//home/david/Data/dev/testground/unison/replica1 -> //arch-desktop//home/david/Data/dev/testground/unison/replica2]
Looking for changes
  
  Waiting for changes from server
Reconciling changes

local          arch-desktop       
new file ---->            dir1/fileA  
deleted  ---->            fileA  

Proceed with propagating updates? [] y
Propagating updates


UNISON 2.51.2 (OCAML 4.11.0) started propagating changes at 20:00:05.63 on 28 Jan 2021
[BGN] Copying dir1/fileA from /home/david/Data/dev/testground/unison/replica1 to //arch-desktop//home/david/Data/dev/testground/unison/replica2
[copy] copyRegFile(/home/david/Data/dev/testground/unison/replica1,dir1/fileA) -> (//arch-desktop//home/david/Data/dev/testground/unison/replica2,fileA,/home/david/Data/dev/testground/unison/replica2/dir1,.unison.fileA.5f5d4f620a8f7980ba9c4a36da20b3cf.unison.tmp,modified on 2021-01-28 at 19:10:09  size 6         rw-r--r--)
[server: copy] tryCopyMovedFile: -> .unison.fileA.5f5d4f620a8f7980ba9c4a36da20b3cf.unison.tmp /(017715e60b9a9450d604e0d489ebc83a,)/
[server: copy] tryCopyMovedFile: found match at /home/david/Data/dev/testground/unison/replica2,fileA. Try local copying
[server: copy] Copy.localFile /home/david/Data/dev/testground/unison/replica2 / fileA to /home/david/Data/dev/testground/unison/replica2/dir1 / .unison.fileA.5f5d4f620a8f7980ba9c4a36da20b3cf.unison.tmp
[server: copy] tryCopyMoveFile: success.
Shortcut: copied /home/david/Data/dev/testground/unison/replica2/dir1/fileA from local file /home/david/Data/dev/testground/unison/replica2/fileA
[END] Copying dir1/fileA
[BGN] Deleting fileA from //arch-desktop//home/david/Data/dev/testground/unison/replica2
[END] Deleting fileA
UNISON 2.51.2 (OCAML 4.11.0) finished propagating changes at 20:00:05.65 on 28 Jan 2021


Saving synchronizer state
Synchronization complete at 20:00:05  (2 items transferred, 0 skipped, 0 failed)

So if I understand correctly, you're saying there is no benefit in using SSH for local files?

@davidde
Copy link
Author

davidde commented Jan 28, 2021

For completeness sake, the output of the first unison -auto replica1 replica2 -debug copy (without SSH):

Unison 2.51.2 (ocaml 4.11.0): Contacting server...
Looking for changes
  fileA
Reconciling changes

replica1       replica2           
new file ---->            dir1/fileB  
deleted  ---->            fileB  

Proceed with propagating updates? [] y
Propagating updates


UNISON 2.51.2 (OCAML 4.11.0) started propagating changes at 19:16:28.03 on 28 Jan 2021
[BGN] Copying dir1/fileB from /home/david/Data/dev/testground/unison/replica1 to /home/david/Data/dev/testground/unison/replica2
[copy] copyRegFile(/home/david/Data/dev/testground/unison/replica1,dir1/fileB) -> (/home/david/Data/dev/testground/unison/replica2,fileB,/home/david/Data/dev/testground/unison/replica2/dir1,.unison.fileB.5f5d4f620a8f7980ba9c4a36da20b3cf.unison.tmp,modified on 2021-01-28 at 19:05:55  size 6         rw-r--r--)
[copy] Copy.localFile /home/david/Data/dev/testground/unison/replica1 / dir1/fileB to /home/david/Data/dev/testground/unison/replica2/dir1 / .unison.fileB.5f5d4f620a8f7980ba9c4a36da20b3cf.unison.tmp
[END] Copying dir1/fileB
[BGN] Deleting fileB from /home/david/Data/dev/testground/unison/replica2
[END] Deleting fileB
UNISON 2.51.2 (OCAML 4.11.0) finished propagating changes at 19:16:28.03 on 28 Jan 2021


Saving synchronizer state
Synchronization complete at 19:16:28  (2 items transferred, 0 skipped, 0 failed)

@gdt
Copy link
Collaborator

gdt commented Jan 28, 2021

So it looks like it used the remote file to copy based on finding it somehow (sha1 match?), saving the trip across the network connection.

Given all that, I don't see that you are getting anything from using ssh.

Am I missing something?

@tleedjarv
Copy link
Contributor

It seems to work exactly like this.

As you can see from the debug output, tryCopyMovedFile ends up calling Copy.localFile, which is exactly what the local run without ssh is doing.

@davidde
Copy link
Author

davidde commented Jan 28, 2021

@gdt I don't think the output is very readable, but indeed, after @tleedjarv findings, it doesn't look like there is any benefit in using ssh. Note this was a tip on a user forum when I was searching for how to prevent excessive copying.

I have not tested it more elaborately on a larger sync, because I wasn't sure I was going to use it in this way if I'm forced to keep sshd running. I guess that dilemma is off the table now ;)

The effort-high is all the more relevant then, I guess. Good call on that. No one is probably going to implement the "move behavior" now though, since it is totally "start from scratch" ... Too bad :(

@tleedjarv Thank you for verifying this!

Thank you both for the prompt reply and clarifications.

@davidde
Copy link
Author

davidde commented Jan 28, 2021

Wait, wait, people.

I just noticed that in the first (non ssh) output, Copy.localFile copies from replica1 to replica2, but in the second (ssh) output, it copies from replica2 to replica2. Depending on how the copy is implemented, this could still make a huge difference when syncing from e.g. your internal disk to an external drive, and may explain why this suggestion was made on the forum in the first place.

I guess it would still be best to do a more elaborate test with larger files, and see if there is any difference in execution time.

@gdt, @tleedjarv What do you think?

@gdt
Copy link
Collaborator

gdt commented Jan 28, 2021

I think unison is already complicated and that trying to optimize runtime for moved files by trying to guess which source is faster (replica1, replica2) when both are local, is not a wise use of developer time, to be blunt. There are many more problems to solve that will make a far bigger deal to users, like the pinned issues -- but obviously that's my opinion because I pinned them.

An optimization for an actual move (rename) might be good. But it's pretty low on my personal list sorted by value/effort. Don't let that stop you then - learn ocaml and start hacking!

@gdt gdt changed the title When moving a file, sync on two local replicas does a remove/copy rather than a rename in the other replica When moving a file, sync on two local replicas does a copy from replica1 rather than from the file in replica2 Jan 28, 2021
@gdt
Copy link
Collaborator

gdt commented Jan 28, 2021

I'm in favor of closing this, as I don't see anything to fix, absent a true rename(2) optimization which is something different.

it was a useful exercise, as I understand what happens better now.

@davidde
Copy link
Author

davidde commented Jan 29, 2021

I understand the sentiment, but if the above is true, then there is hardly any need for guessing. Unison could simply prefer the files that are already available in the same replica over the files that should be copied over from the other replica.

And since this behavior is already the default for unison over ssh, it may be straightforward to just port it over to the local unison variant. I feel this could bring a huge usability improvement, since external storage backups seem to me an ideal use case for unison. Currently, every backup without SSH makes needless copies after moving files, complicating this use case (e.g. media center backup). This may fix that!

Could you keep the issue open until next week? I'll try to verify whether the above is the case with a more elaborate test over the weekend.

@gdt
Copy link
Collaborator

gdt commented Jan 29, 2021

I really don't understand. First, I don't do "mv" of huge files very often. Second, if you mount an external disk on /mnt and then use unison, and you did a rename, I do not understand why it matters if the data source for the new name in the external disk replica is the pre-renamed file in the external disk or the computer's internal disk. Arguably, using the computer's copy is better because that disk is likely faster, between being more likely to be SSD and better bus attachment.

Making this use rename(2) would be better, but you've more or less said this isn't about that.

I also don't understand "huge usability" improvement. Do you have an actual use case where the current behavior is making the sync slow enough to notice, and where doing the copy from the external disk would be faster? For me, unison is almost always fast, except when I add a few GB of data to a replica and sync it to some machine far away across the net, and then it runs at the speed of sending that new data.

I don't mind leaving it open a bit while you are actively trying to explain and/or make a repro recipe. It was just that at this point I don't see the problem.

@davidde
Copy link
Author

davidde commented Jan 29, 2021

Thanks for keeping it open for now.

I'm not sure yet if it does make a difference, I'm just finding it hard to believe that people on the forum would advise using SSH if they didn't notice it making a difference. Still not impossible I guess. I also think I've seen the same suggestion on Stack Overflow somewhere. But since it's late here, I'll test things out over the weekend and get to the bottom of it then.

I'll let you know if it turns out a dud or an actual improvement.

@tleedjarv
Copy link
Contributor

I feel this could bring a huge usability improvement, since external storage backups seem to me an ideal use case for unison. Currently, every backup without SSH makes needless copies after moving files, complicating this use case (e.g. media center backup).

I echo @gdt's sentiment here. Wouldn't this case be exactly where you don't want a replica-local copy? Assuming spinning disks over USB, you wouldn't want to read and write on the same disk.

But I still think it's good to keep the issue open (and renamed) as detecting an actual rename instead of copy+delete may be an interesting feature.

And since this behavior is already the default for unison over ssh, it may be straightforward to just port it over to the local unison variant.

Not that I have verified, but I think you are right in that it most likely is very straightforward.

@gdt
Copy link
Collaborator

gdt commented Jan 29, 2021

Let's leave this issue being about the different behavior for remote vs local.

As for using the rename system call instead, that's separate and should be a new issue, if it's in the tracker. I'd prefer to close issues where we've figured things out and just have an issue that crisply and correctly states what's going on, as when looking at open issues, things with lots of discussion that could be summarized take much longer to re-read.

@tleedjarv
Copy link
Contributor

I had a look to see if the replica-local copy can be enabled even with two local replicas. It is not entirely straightforward, and it is intentionally disabled for local replicas because it would entail additional processing for all files, all the time. I can't think that it would be noticable in practice, though.

If there is a good case to be made that this is useful functionality to users, I think it could be implemented with reasonable effort, based on the preliminary look.

@davidde
Copy link
Author

davidde commented Jan 30, 2021

So I've tested things out more elaborately with larger files over different drives, and verified for myself now that there is indeed no benefit whatsoever in using SSH. It seems this suggestion was completely unfounded (that, or it did work this way at some point). This also means there is currently no benefit in porting the SSH behavior to the local variant.

However, if the rename system call was to be implemented (as I initially thought it already was), there would be a huge benefit in porting it (thank you @tleedjarv for verifying this is possible in theory). Currently (without rename), something as simple as renaming a directory in replica1 results in its full contents (including subdirectories!) being copied in replica2 (both for ssh and local unison), after which the original directory is deleted. This can be very expensive when working with large media directories, and fixing this would qualify as a huge usability improvement in my book. (What's more, for moving directories we are actually still better off using drag and drop in a file explorer over using Unison, which feels like a waste of potential to me.)

So I am starting to wonder about a possible rename implementation. Would that entail a lot of work, or is it mostly a matter of replacing the system calls used? What are your estimates regarding complexity and man-hours? This is definitely something I would want to contribute to if I can be of any help.

@gdt
Copy link
Collaborator

gdt commented Jan 30, 2021

It sounds like there is no problem to solve regarding the copy behavior, so I'm going to close this. Thanks for doing the tests and figuring out what's going on.

We are really trying to keep the tracker being for bug reports and concise feature requests. Thus, I decline to engage in anything other than those two categories within github. If you would like to discuss the issues around extending the codebase to use rename -- and that sounds like a useful improvement if someone writes the code -- please join the hackers list and I'll be happy to comment there. There is a wiki page that explains the lists at https://github.com/bcpierce00/unison/wiki/Mailing-Lists

@gdt gdt closed this as completed Jan 30, 2021
@tleedjarv
Copy link
Contributor

So I am starting to wonder about a possible rename implementation. Would that entail a lot of work, or is it mostly a matter of replacing the system calls used? What are your estimates regarding complexity and man-hours? This is definitely something I would want to contribute to if I can be of any help.

I didn't see this topic in the mailing list, so just for the record here, I think this is going to be a very complex task. I don't claim to know too much about this functionality but it is some of the most complex code in Unison.

@tleedjarv
Copy link
Contributor

This is the feature request: #23

@tleedjarv
Copy link
Contributor

@davidde, there is now a PR with some support for rename syncing. See if #576 would work for your use cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
effort-high issue is likely to require >20h of effort, perhaps much more enhancement issue is a request for a feature, and not a defect impact-low low importance
Projects
None yet
Development

No branches or pull requests

3 participants