Upload the package zip in chunks for GHA #1043
base: main
Conversation
Sorry for the noise, trying to verify that the transition over to GitHub Actions for the PR bot is working...
You could pass the data via stdin.
I could but I honestly have no idea how.
There is now #1134 :)
I've switched it to use stdin instead, and it still works 😄. This PR is again ready for review
Seems like the formatting errors are caused by GitHub upgrading their clang-format, and not my fault. 😄
I also always build the format target locally so that the formatting is always right (if you use clang-format 16).
Co-authored-by: autoantwort <41973254+autoantwort@users.noreply.github.com>
@BillyONeal friendly ping 😄
What is the status of this? 🤔
I fixed the merge conflicts that accumulated. Just needs someone from the team to review 😄
Co-authored-by: Thomas1664 <46387399+Thomas1664@users.noreply.github.com>
base_cmd.string_arg(url);
auto file_ptr = fs.open_for_read(file, VCPKG_LINE_INFO);
std::vector<char> buffer(chunk_size);
I see that the default size is 450 MB, and no limits.
Is there an alternative to reading it into a buffer first just to forward it to another command's stdin?
(Remembering all those Raspi people who struggle to build vcpkg due to low memory...)
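Not the vcpkg code, just to illustrate the idea: a POSIX-only sketch that streams a file into another command's stdin using a small fixed IO buffer instead of a chunk-sized allocation. The 64 KiB block size, the popen approach, and the function name are assumptions for the example.

```cpp
// Sketch only: stream a file into a child process's stdin in small blocks
// instead of reading a whole chunk into memory first (POSIX popen/pclose).
#include <cstdio>
#include <fstream>
#include <vector>

bool stream_file_to_command(const char* path, const char* command)
{
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;

    // "w" mode attaches the returned FILE* to the child's stdin.
    FILE* child = popen(command, "w");
    if (!child) return false;

    std::vector<char> block(64 * 1024); // small IO buffer, not the whole chunk
    while (in)
    {
        in.read(block.data(), static_cast<std::streamsize>(block.size()));
        const std::streamsize got = in.gcount();
        if (got > 0 &&
            std::fwrite(block.data(), 1, static_cast<size_t>(got), child) != static_cast<size_t>(got))
        {
            pclose(child);
            return false;
        }
    }
    return pclose(child) == 0;
}
```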
The original way was to split the file on disk, but that's pretty hacky I think.
But I can decrease the buffer size. I'm not sure what you mean by limit.
Are people really running a GitHub Runner server on a Raspi lmao 😄
Are people really running a GitHub Runner server on a Raspi lmao
Well, this is only the tool uploading the artifacts. Caching large artifacts is more important when build machine power is low.
Ah okay. What buffer size do you think I should use? I can't make it really small or else the upload will be way slower than it would otherwise be.
I don't know. I see the trade-offs and barriers.
- Can't make curl read chunks directly from (within) a large file.
- Can't feed (the vcpkg function running) curl with piecewise input. (IO buffer smaller than network chunks.)
Changing curl (tool) is out of scope here.
If the interface remains running curl instead of calling into libcurl, then it would be best to fix the second point.
If this is too intrusive, it might be helpful to have a way for the user to change the buffer size, or at least to turn off the buffering in case of trouble.
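As a sketch of that last suggestion only: the buffer size could be made user-configurable, for example through an environment variable, keeping the current 450 MB as the default. The variable name VCPKG_GHA_CHUNK_SIZE is made up for this example and is not an existing vcpkg option.

```cpp
// Illustrative only: allow overriding the chunk/buffer size via a
// hypothetical environment variable, falling back to the 450 MB default.
#include <cstddef>
#include <cstdlib>
#include <string>

std::size_t chunk_size_from_env()
{
    constexpr std::size_t default_chunk = 450ull * 1024 * 1024; // matches the PR's default
    if (const char* value = std::getenv("VCPKG_GHA_CHUNK_SIZE"))
    {
        try
        {
            const unsigned long long parsed = std::stoull(value);
            if (parsed > 0) return static_cast<std::size_t>(parsed);
        }
        catch (const std::exception&)
        {
            // fall through to the default on malformed input
        }
    }
    return default_chunk;
}
```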
std::size_t bytes_read = 0;
for (std::size_t i = 0; i < file_size; i += bytes_read)
{
    bytes_read = file_ptr.read(buffer.data(), sizeof(decltype(buffer)::value_type), chunk_size);
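For context, a self-contained sketch of the same read loop, using standard C file IO rather than the vcpkg Filesystem API; upload_chunk is a hypothetical stand-in for the per-chunk curl call.

```cpp
// Sketch: read the archive one chunk at a time and hand each chunk to an
// upload function. Not the actual vcpkg implementation.
#include <cstddef>
#include <cstdio>
#include <vector>

bool upload_chunk(const char* data, std::size_t size, std::size_t offset); // hypothetical

bool upload_in_chunks(const char* path, std::size_t file_size, std::size_t chunk_size)
{
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return false;

    std::vector<char> buffer(chunk_size);
    bool ok = true;
    for (std::size_t offset = 0; ok && offset < file_size;)
    {
        // Read at most one chunk; the final chunk is usually shorter.
        const std::size_t bytes_read = std::fread(buffer.data(), 1, chunk_size, f);
        if (bytes_read == 0) { ok = false; break; }
        ok = upload_chunk(buffer.data(), bytes_read, offset);
        offset += bytes_read;
    }
    std::fclose(f);
    return ok;
}
```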
I think a whole curl process launch per chunk like this is kind of a problem. I don't see reasonable ways to achieve the effect this PR wants without linking with libcurl.
I suppose it could still be done like this, but the chunk sizes would have to be bigger than makes sense to hold in a single contiguous memory buffer; there should be more than one read/write per curl launch, etc.
And that sounds like a lot more work than linking with libcurl.
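To make the comparison concrete, a rough sketch of what an upload through libcurl could look like, with a read callback so curl pulls directly from the file and no chunk-sized buffer is needed. Error handling is trimmed, and this is not the code that #1422 adds.

```cpp
// Sketch only: upload a file with libcurl, letting curl read the data via a
// callback instead of a pre-filled in-memory chunk.
#include <cstdio>
#include <curl/curl.h>

static size_t read_from_file(char* buffer, size_t size, size_t nitems, void* userdata)
{
    return std::fread(buffer, size, nitems, static_cast<std::FILE*>(userdata));
}

bool upload_with_libcurl(const char* url, const char* path, curl_off_t file_size)
{
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return false;

    CURL* curl = curl_easy_init();
    if (!curl) { std::fclose(f); return false; }

    curl_easy_setopt(curl, CURLOPT_URL, url);
    curl_easy_setopt(curl, CURLOPT_UPLOAD, 1L);
    curl_easy_setopt(curl, CURLOPT_READFUNCTION, read_from_file);
    curl_easy_setopt(curl, CURLOPT_READDATA, f);
    curl_easy_setopt(curl, CURLOPT_INFILESIZE_LARGE, file_size);

    const CURLcode result = curl_easy_perform(curl);
    curl_easy_cleanup(curl);
    std::fclose(f);
    return result == CURLE_OK;
}
```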
#1422 pulls in libcurl.
@quyykk Is there any way to start using this in GitHub Actions at the moment? It is sorely needed to make CI builds that depend on qtbase bearable. (maybe checking out a specific vcpkg branch in-tree? applying a patch and working dirty?)
@pedroraimundo-ao Probably not the answer you were looking for, but I switched to Conan for my dependencies.
This fixes microsoft/vcpkg#31072 and microsoft/vcpkg#31132.
It changes the upload to the cache so that it happens in chunks of 450MB, instead of all at once. This is because GitHub rejects uploads bigger than ~500MB (educated guess) to its cache.
The implementation is a bit hacky, but I haven't found a better solution: it now reads 450 MB chunks from the file at a time (previously it split the file into multiple 450 MB chunk files on disk).
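To illustrate what "in chunks" means at the HTTP level: each chunk covers a distinct byte range of the archive, which can be described with a Content-Range style header per request. The header format below is typical of ranged uploads and is only illustrative, not copied from this PR.

```cpp
// Illustration only: compute per-chunk byte ranges for a given chunk size and
// format a Content-Range style header for each chunk.
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

std::vector<std::string> chunk_range_headers(std::size_t file_size, std::size_t chunk_size)
{
    std::vector<std::string> headers;
    for (std::size_t offset = 0; offset < file_size; offset += chunk_size)
    {
        const std::size_t last = std::min(offset + chunk_size, file_size) - 1;
        char header[128];
        std::snprintf(header, sizeof(header), "Content-Range: bytes %zu-%zu/*", offset, last);
        headers.emplace_back(header);
    }
    return headers;
}
```

Each header would then accompany the upload request for its chunk, with the final chunk usually shorter than the others.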