Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: make push_path open in binary mode so it works on non-text files #1458

Merged
merged 1 commit into from
Nov 20, 2024

Conversation

benhoyt
Copy link
Collaborator

@benhoyt benhoyt commented Nov 19, 2024

There was a bug in push_path where it called open with the default (text) mode on the local files, causing it to not work on binary files (more specifically, if they weren't valid UTF-8). This is the fix for that -- push_path should treat the files as raw bytes, so read in binary mode.

Also change the test to ensure binary / non-UTF-8 files work as expected.

Fixes #1455.

Copy link
Contributor

@james-garner-canonical james-garner-canonical left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

I wondered if defaulting to reading bytes would be problematic, but it's not, because previously with a text file pebble.Client.push would just read the text and encode it as bytes before sending (defaulting to UTF-8 -- and Container.push_path doesn't care about or allow the user to specify encoding anyway). Likewise, Container.pull_path always pulls with encoding=None, so it just works with the bytes directly as well.

So this is indeed just a straightforward bugfix.

Copy link
Contributor

@tonyandrewmeyer tonyandrewmeyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix looks good to me, and "just push everything as bytes" is also the fix that I would have gone with (and matches pull_path)

There is a small behaviour change for text files. Previously, line endings would get normalised to \n, since Python does that when opening in text mode. I'm not sure if we consider that a bug or not - it does seem like unexpected behaviour if you're thinking of push_path as "copy this exact thing" (like an shutil.copyfile that works across the containers), although less so if you're thinking very specifically about Python and text files. I lean towards "it was also a bug".

With regards to the test: would it be worth keeping a text-only test and adding a binary (non-utf8-encodable) one instead? I'm going with "approve" because I think the text file change is a bug and so I can't think of any case where one would break and the other wouldn't, but raising this as something to consider.

@benhoyt
Copy link
Collaborator Author

benhoyt commented Nov 20, 2024

Previously, line endings would get normalised to \n, since Python does that when opening in text mode.

Yeah, I'd consider that a subtle but nasty bug. I was involved in the original design of push and pull, and the intention with push and pull was very much (and especially the recursive push_path and pull_path versions) was copying files, not modifying the data or doing line ending normalisation. In Go it's all bytes anyway. In Python, push and pull themselves allow supply strings which are then encoded as UTF-8, but that's just to make passing strings/TextIO easier in Python, not for normalisation reasons.

I think one test is probably fine here as the intent is that there's only code path, so going to run with this. Thanks!

@benhoyt benhoyt merged commit 6a17bf9 into canonical:main Nov 20, 2024
31 checks passed
@benhoyt benhoyt deleted the fix-push_path branch November 20, 2024 03:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Container.push_path() only supports text files
3 participants